It is possible to integrate AltaVista Search Engine (AVSE) with a third-party web server (3PWS). The most common reason to do this is to leverage the 3PWS's security features, such as SSL encryption and authentication, which are not provided by AVSE's own web server, Mhttpd.
Integration is accomplished by means of a module called AvsProxy, which plugs into the 3PWS. The module must comply with the plugin architecture of the 3PWS; for instance, to plug into Microsoft's Internet Information Server (IIS), you need a version of AvsProxy that fits the ISAPI extension interface.
AvsProxy is not a replacement for Mhttpd. As its name implies, AvsProxy works as a specialized HTTP proxy. The 3PWS is configured to let the AvsProxy module handle all end-user requests for certain web pages. When such a request comes in, AvsProxy connects to a specially configured Mhttpd and asks it to produce the page. AvsProxy then takes the results, modifies them to make it look like they were really produced on the 3PWS, and sends them back to the end-user. Because AvsProxy communicates with Mhttpd over a standard Internet connection, Mhttpd and the 3PWS can run on different machines and even different operating systems.
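To make the "modifies them" step more concrete, here is a minimal Python sketch of one way a proxy could rewrite absolute links so that a page appears to come from the front-end server. It is an illustration only, not AvsProxy's actual code: the function name rewrite_links and the plain string substitution are assumptions, and the hostnames are the placeholder names used in the configuration examples in this document.

```python
def rewrite_links(body: str, mhttpd_base: str, public_base: str) -> str:
    """Point absolute links at the public server instead of the back-end Mhttpd."""
    # Normalize both bases to end with a slash so only whole URL prefixes match.
    backend = mhttpd_base if mhttpd_base.endswith("/") else mhttpd_base + "/"
    public = public_base if public_base.endswith("/") else public_base + "/"
    return body.replace(backend, public)


# Hypothetical page fragment as Mhttpd might return it.
page = '<a href="http://mhttpd-server.company.com:9000/cgi-bin/query?q=x">next</a>'
print(rewrite_links(page,
                    "http://mhttpd-server.company.com:9000",
                    "http://apache-server.company.com/"))
```

A real proxy also has to handle response headers such as redirects; the sketch covers only links in the page body.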
Out of the box, we provide versions of AvsProxy which work with IIS and the popular free web server Apache. We also provide a utility library and full source code which you can use to port AvsProxy to work with any other 3PWS you need, such as Netscape's Enterprise Server, Allaire's ColdFusion, or Lotus's Domino.
The Apache version of AvsProxy was designed and tested with Apache version 1.3.12 on both Unix and Windows. It may require slight modification to work with other versions. Note that Apache must be built with DSO (dynamic shared object) support, which is the default for recent versions.
Locate Apache's modules directory. On most Unix systems, the default is /usr/local/apache/libexec. On Red Hat Linux, the default is /etc/httpd/modules. On Windows, the default is c:\program files\apache group\apache\modules.
Copy the AvsProxy module to the modules directory. On Unix, it is called mod_avsproxy.so. On Windows, it is called mod_avsproxy.dll. This file is a specialized proxy module which provides AVSE-specific functionality not available in Apache's generic caching proxy. Make sure its owner and permissions match the other modules in the directory.
Locate Apache's configuration file. On most Unix systems, the default is /usr/local/apache/conf/httpd.conf. On Red Hat Linux, the default is /etc/httpd/conf/httpd.conf. On Windows, the default is c:\program files\apache group\apache\conf\httpd.conf.
After the existing LoadModule lines in httpd.conf, add an entry for AvsProxy. For most Unix systems, this should be:
LoadModule avsproxy_module libexec/mod_avsproxy.so
For Red Hat Linux, it should be:
LoadModule avsproxy_module modules/mod_avsproxy.so
For Windows, it should be:
LoadModule avsproxy_module modules/mod_avsproxy.dll
After the existing AddModule lines in httpd.conf, add an entry for AvsProxy. On all platforms, this should be:
AddModule mod_avsproxy.c
Combined with the LoadModule directive, this will make Apache load and recognize AvsProxy as a plugin module.
Add the following line to the end of the httpd.conf file:
AvsProxyMhttpd mhttpd-server.company.com 9000
This directive tells AvsProxy where Mhttpd is running. You should substitute the appropriate hostname and port number for your site.
Add the following line to the end of the httpd.conf file:
AvsProxyBasehost http://apache-server.company.com/
This directive tells AvsProxy where to tell the end-user that the pages came from. In other words, it should be the URL the end users specify to reach Apache. Specify the appropriate URL for your site, and remember to substitute https for http if you plan to use SSL. Note that the trailing slash is required.
If you would like to control the level of logging, add the following line to the end of the httpd.conf file:
AvsProxyLogLevel verbose
You can choose verbose to monitor each query (helpful in determining configuration errors), normal to report severe errors only, or none.
AvsProxy writes its messages to Apache's error log. On most Unix systems, the default location of this file is /usr/local/apache/logs/error_log. On Red Hat Linux, it is /etc/httpd/logs/error_log. On Windows, it is c:\program files\apache group\apache\logs\error.log.
Add the following lines to the end of httpd.conf to route AVSE-related requests through AvsProxy.
<Location /cgi-bin/query>
    SetHandler avsproxy
</Location>
<Location /main>
    SetHandler avsproxy
</Location>
AVSE uses /cgi-bin/query for dynamically generated pages, and /main for static pages and graphics.
If you are using a single installation of Apache to serve more than one hostname or port, make sure you place these lines inside the appropriate <VirtualHost> block.
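Because the Apache-side setup is spread over several edits to httpd.conf, a quick sanity check can be useful before restarting the server. The following Python sketch scans the text of a configuration file for the directives described in this document and for the required trailing slash on AvsProxyBasehost. The function name check_avsproxy_conf is hypothetical, and the check is deliberately naive: it does not follow Include files or verify that the SetHandler lines sit inside the right block.

```python
# Directives this guide asks you to add; None means "any argument".
REQUIRED = [
    ("LoadModule", "avsproxy_module"),
    ("AddModule", "mod_avsproxy.c"),
    ("AvsProxyMhttpd", None),
    ("AvsProxyBasehost", None),
    ("SetHandler", "avsproxy"),
]


def check_avsproxy_conf(conf_text):
    """Return a list of human-readable problems found in httpd.conf text."""
    directives = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # skip blanks and comments
            continue
        parts = line.split()
        # Apache directive names are case-insensitive, so compare in lowercase.
        directives.append((parts[0].lower(), parts[1:]))

    problems = []
    for name, first_arg in REQUIRED:
        found = any(
            d == name.lower() and (first_arg is None or args[:1] == [first_arg])
            for d, args in directives
        )
        if not found:
            label = name if first_arg is None else name + " " + first_arg
            problems.append("missing directive: " + label)
    for d, args in directives:
        if d == "avsproxybasehost" and args and not args[0].endswith("/"):
            problems.append("AvsProxyBasehost URL must end with a slash")
    return problems


if __name__ == "__main__":
    import sys
    # Usage: python check_conf.py /path/to/httpd.conf
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as f:
            for problem in check_avsproxy_conf(f.read()):
                print(problem)
```

An empty result only means that the expected lines are present; Apache's own error log remains the authoritative source for configuration errors.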
Find the IP address of the machine running Apache. On Unix, you can find it using this command:
nslookup `hostname`
On Windows, you can find it by running the ipconfig program from the command line.
On the machine running Mhttpd, locate the $(AVSDIR)/httpd/config file.
Add the following line to the config file:
ProxyHost 127.0.0.1
This directive tells Mhttpd to respond only to requests which come from AvsProxy; otherwise malicious users could circumvent the security which AvsProxy adds. Substitute the IP address of the machine running Apache as appropriate for your site; 127.0.0.1 is correct only when Apache and Mhttpd run on the same machine. Note that by default there is an example of this directive in the config file, but it is commented out.
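If you prefer a single method that works on both Unix and Windows, the address lookup and the resulting config line can be sketched in Python as follows. The helper name proxyhost_line is hypothetical; socket.gethostbyname is a standard library call that resolves a hostname to an IPv4 address and returns an IPv4 literal unchanged. The address 192.0.2.10 is a documentation placeholder, not a real server.

```python
import socket


def proxyhost_line(apache_host: str) -> str:
    """Build the ProxyHost line for Mhttpd's config from Apache's hostname or IP."""
    # Resolves names through the system resolver, much like the nslookup step;
    # an IPv4 literal is passed through as-is.
    ip = socket.gethostbyname(apache_host)
    return "ProxyHost " + ip


# Run on the Apache machine with socket.gethostname() as the argument,
# or pass the Apache machine's name or address explicitly.
print(proxyhost_line("192.0.2.10"))
```

On a machine with several network interfaces, the resolved address may not be the one Apache uses to reach Mhttpd, so check the result against the ipconfig or nslookup output.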
Many administrators would like to protect their search site with a password. Some would like to present a different interface or a different list of indexes to different users. AvsProxy works with Apache to address both these needs.
Create a password file avs.htpasswd in the same directory as httpd.conf. Apache stores its passwords in a file format similar to the standard Unix passwd file, but keeps its own list of users. This means that your search users do not need system accounts on the machine running Apache.
Apache comes with a utility called htpasswd for managing its password files. Use it in the following manner:
htpasswd -c avs.htpasswd Alice
htpasswd avs.htpasswd Bob
htpasswd avs.htpasswd Claude
The -c tells htpasswd to create a new file, so it should only be used the first time. It will prompt you to enter the password for each user.
Create a group file avs.htgroup in the same directory as httpd.conf. Just like with passwords, Apache stores group data in a file format similar to the standard Unix group file. There is no automatic utility for managing the file, so you will have to edit it yourself:
avsusers: Alice Bob Claude
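Since the group file is edited by hand, it is easy to list a user who was never added with htpasswd. The Python sketch below cross-checks the two files. The function name users_missing_passwords is hypothetical, and the parsing assumes the simple layouts shown in this document: one "user:hash" entry per line in the password file and one "group: member member ..." entry per line in the group file.

```python
def users_missing_passwords(htgroup_text: str, htpasswd_text: str) -> dict:
    """Map each group name to the members that have no password file entry."""
    # Collect the user names from the password file ("user:hash" per line).
    known = {
        line.split(":", 1)[0]
        for line in htpasswd_text.splitlines()
        if ":" in line
    }
    missing = {}
    for line in htgroup_text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        group, members = line.split(":", 1)
        absent = [m for m in members.split() if m not in known]
        if absent:
            missing[group.strip()] = absent
    return missing


# Placeholder hashes; real entries are produced by htpasswd.
passwords = "Alice:xxxx\nBob:yyyy\n"
groups = "avsusers: Alice Bob Claude\n"
print(users_missing_passwords(groups, passwords))
```

In this example Claude is reported, because he appears in the group but has no password entry yet.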
In the httpd.conf file, modify the <Location> block for queries, so that it references the new password and group files. On most Unix systems, this should be:
<Location /cgi-bin/query>
    SetHandler avsproxy
    AuthType Basic
    AuthName "AltaVista Search Engine"
    AuthUserFile /usr/local/apache/conf/avs.htpasswd
    AuthGroupFile /usr/local/apache/conf/avs.htgroup
    require group avsusers
</Location>
On Red Hat Linux it should be:
<Location /cgi-bin/query>
    SetHandler avsproxy
    AuthType Basic
    AuthName "AltaVista Search Engine"
    AuthUserFile /etc/httpd/conf/avs.htpasswd
    AuthGroupFile /etc/httpd/conf/avs.htgroup
    require group avsusers
</Location>
On Windows it should be:
<Location /cgi-bin/query>
    SetHandler avsproxy
    AuthType Basic
    AuthName "AltaVista Search Engine"
    AuthUserFile conf/avs.htpasswd
    AuthGroupFile conf/avs.htgroup
    require group avsusers
</Location>
To give different users access to different interfaces, indexes, or features of the search page, you must continue with a few more steps.
Add the following lines to the end of the httpd.conf file:
AuthMappingScheme user_name uil
AuthMapping Alice eniso
AuthMapping Bob eniso
AuthMapping Claude friso
The AuthMappingScheme directive tells AvsProxy to set a CGI variable based on which user is accessing the site. The first parameter tells AvsProxy what to look for: user_name, server_name (if Apache is configured to listen on multiple hostnames), or server_port (if Apache is configured to listen on multiple ports). The second parameter specifies which CGI variable to set. AvsProxy will guard this variable to prevent malicious users from trying to override its changes.
The AuthMapping directive tells AvsProxy what the value of that CGI variable should be when the user name, server name, or server port matches, according to the mapping scheme. In this example, the mapped CGI variable is uil, which the default AVSHE templates use to set the language in which the page is displayed. When Claude accesses the site, he will see it in French, whereas Alice and Bob will see it in English.
Other popular choices for the mapped CGI variable are mss (which file to use as the starting template) and i (which index to search).
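The mapping behavior can be pictured with a small Python sketch. It is a model of the idea only, not AvsProxy's implementation: the function name apply_mapping is hypothetical, and the choice to discard a user-supplied value even when no mapping matches is an assumption about how the guarding works.

```python
def apply_mapping(params, cgi_var, mappings, key):
    """Return a copy of the query parameters with the mapped CGI variable enforced.

    params   -- CGI variables supplied by the end-user's request
    cgi_var  -- variable named by the second AuthMappingScheme parameter
    mappings -- table built from the AuthMapping lines
    key      -- user name, server name, or server port of this request
    """
    # Guarding (assumed behavior): never trust a user-supplied value.
    guarded = {k: v for k, v in params.items() if k != cgi_var}
    if key in mappings:
        guarded[cgi_var] = mappings[key]
    return guarded


# Table equivalent to the AuthMapping lines in the example above.
mappings = {"Alice": "eniso", "Bob": "eniso", "Claude": "friso"}

# Claude tries to request the English interface but still gets French.
print(apply_mapping({"q": "chat", "uil": "eniso"}, "uil", mappings, "Claude"))
```

With the server_port scheme, the same function would be called with the port as the key, for example "80" or "81".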
You can also choose to use a custom variable rather than one from the default templates, as in the following example:
AuthMappingScheme server_port myvariable
AuthMapping 80 foo
AuthMapping 81 bar
You can then modify your AVSHE templates to perform different actions based on the value of the new variable:
<!-- #avinclude if="${cgi.myvariable} eq foo" file="foostuff" -->
This can be used, for instance, to display a different list of indexes for different users.
AvsProxy does not require any special configuration to handle encryption, but can take advantage of Apache's built-in and plugin encryption features. It has been tested with mod_ssl (available from http://www.modssl.org/), but should work with other SSL implementations with little or no modification.