Chapter 2

Choosing a Server Platform


In computer jargon, the word platform can refer to many things: the type of hardware being used, the operating system, a set of network protocols, or some combination of these.

In this book we define a computing platform as the matched set of hardware and operating system used to run selected software applications. In particular, an Intranet server is the combination of hardware, operating system and HTTP software used to run an internal Web. Because of its openness and wide availability, HTTP has been ported to almost every computer platform, including desktop environments like Microsoft Windows 3.x and Apple MacOS. These environments lack the memory protection and preemptive multitasking of 32-bit server systems and are therefore not suitable for critical business use. But they can be a good place to learn Web server basics.

Here, the focus is on robust, 32-bit Web servers with preemptive multitasking. Three operating systems dominate this market, shown next together with the hardware they support.

Operating system      Runs on
UNIX (many brands)    RISC processors, Intel PCs, Apple Macintosh
Windows NT            Intel PCs, selected RISC machines (for example, Digital Alpha, Hewlett-Packard PA-RISC)
Novell NetWare        Intel PCs; limited support for RISC systems

As you can see, UNIX runs on the widest variety of hardware, from high-performance RISC machines to Intel-based PCs. As an example, Sun Microsystems offers versions of its Solaris operating system for both its native SPARC processor and for Intel hardware. To compete with such diversity, Microsoft has had to step outside the Intel box, porting Windows NT to several vendors' RISC systems as well as Intel PCs. Novell, meanwhile, continues to aim its market-leading NetWare product at Intel-based networks, reserving multi-platform support for its UNIX brand (UNIXWare).

NOTE
RISC stands for Reduced Instruction Set Computer, a type of microprocessor design associated with high-performance UNIX workstations. Examples of popular RISC processors are Digital Alpha, Silicon Graphics MIPS, and IBM PowerPC.
Early Intel processors such as the 80286 used a different, slower design called CISC, for Complex Instruction Set Computer. But with the addition of RISC features to its Pentium class processors, Intel has blurred the RISC/CISC distinction, in both design and performance.

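In this chapter you learn about the essential and desirable features of a Web server, and about Web server software for UNIX, Windows NT, and Novell NetWare.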

NOTE
The products mentioned in this chapter are just a sampling of the many commercial, shareware, and freeware Web servers available to meet your needs. For a complete, up-to-date listing, visit http://www.webcompare.com/.

Essential and Desirable Features of a Web Server

The purpose of a Web server is to respond to HTTP requests from Web clients (that is, browsers) by delivering a requested file, or executing a requested script. At a minimum, then, all Web servers must be able to interpret standard HyperText Transfer Protocol (HTTP) commands. In addition, to run programs, a server must conform to the Common Gateway Interface (CGI) standard. You can assume HTTP and CGI are present in any software calling itself a Web server.
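To make the CGI contract concrete, here is a minimal sketch in Python of the kind of script a server might launch. It assumes the standard CGI environment variables REQUEST_METHOD and QUERY_STRING; the function name cgi_response and the "name" form field are hypothetical, chosen only for illustration.

```python
import os
from html import escape
from urllib.parse import parse_qs


def cgi_response(environ):
    """Build a CGI reply (header, blank line, body) from CGI variables."""
    method = environ.get("REQUEST_METHOD", "GET")
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # Use the first "name" value if one was sent; escape it before echoing.
    name = escape(query.get("name", ["intranet user"])[0])
    body = f"<html><body><p>Hello, {name} ({method} request).</p></body></html>"
    # A CGI script emits at least a Content-type header, then a blank line.
    return f"Content-type: text/html\n\n{body}"


if __name__ == "__main__":
    # When launched by a Web server, os.environ carries the CGI variables.
    print(cgi_response(os.environ), end="")
```

The server's part of the bargain is to set the variables, run the script, and relay whatever the script writes to standard output back to the browser.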

Another essential server feature is the ability to recognize various file content types. This is necessary in order for the server to return the appropriate content-type header to the client, identifying the nature of the response. Web servers determine the content type by mapping certain file extensions (such as .MPEG or .HTML) to standard MIME (Multipurpose Internet Mail Extensions) content types. Every Web server must accordingly maintain a configuration file showing this mapping.

Occasionally new document types become available, such as the RealAudio format for streaming audio over the Internet. A server administrator can add new types and extensions to the MIME configuration file, enabling the server to handle them. Table 2.1 shows part of a typical MIME configuration.

Table 2.1  MIME Content Types and Associated File Extensions

MIME Type          File Extension
audio/basic        au snd
audio/x-aiff       aif aiff aifc
audio/wav          wav
image/gif          gif
image/jpeg         jpeg jpg jpe
image/tiff         tiff tif
text/html          html htm
text/plain         txt
text/richtext      rtx
video/mpeg         mpeg mpg mpe
video/quicktime    qt mov
video/msvideo      avi
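The lookup a server performs against such a table can be sketched in a few lines of Python. The entries below are a subset of Table 2.1; the function name content_type_for and the text/plain fallback are assumptions made for this example, not the behavior of any particular server.

```python
import os

# Extension-to-type entries copied from Table 2.1 (a subset only).
MIME_TYPES = {
    "au": "audio/basic", "snd": "audio/basic",
    "gif": "image/gif",
    "jpeg": "image/jpeg", "jpg": "image/jpeg", "jpe": "image/jpeg",
    "html": "text/html", "htm": "text/html",
    "txt": "text/plain",
    "mpeg": "video/mpeg", "mpg": "video/mpeg", "mpe": "video/mpeg",
}

# Assumed fallback for unknown extensions; real servers make this configurable.
DEFAULT_TYPE = "text/plain"


def content_type_for(path):
    """Return the MIME type to report for the given file path."""
    extension = os.path.splitext(path)[1].lstrip(".").lower()
    return MIME_TYPES.get(extension, DEFAULT_TYPE)
```

Seen this way, teaching the server a new document type amounts to adding one more entry to the table.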

All Web servers have the ability to record HTTP transactions in a log file. For now, the industry has settled on the so-called common log format. Conformance to this standard ensures logs can be read by third-party utilities, such as log analysis and reporting tools.
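As an illustration of why a shared format matters, the following Python sketch parses one common log format entry into named fields, much as a reporting tool might. The layout assumed here is host, ident, user, bracketed timestamp, quoted request, status code, and byte count; the sample entry and the name parse_clf are invented for the example.

```python
import re

# Common log format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_clf(line):
    """Split one log line into named fields, or return None if malformed."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A "-" in the size field means no body bytes were sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


# Hypothetical sample entry, invented for illustration.
SAMPLE = ('10.1.2.3 - jsmith [10/May/1996:13:55:36 -0500] '
          '"GET /index.html HTTP/1.0" 200 2326')
```

Calling parse_clf(SAMPLE) yields a dictionary whose "request" field holds the full request line and whose "status" and "size" fields are integers ready for totaling.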

In addition to HTTP, CGI, MIME and logging, a Web server may offer one or more of the following enhancements: server-side includes (SSI); authentication, authorization, and encryption; and proxy services.

Each of these is discussed briefly in the following sections.

Server-Side Includes (SSI)

SSI is an extension to standard HTTP that enables HTML authors to embed executable commands in their Web pages. These commands execute on the server after a page is requested, but before it's sent to the client. The embedded commands are called server-side includes.

NOTE
The essential reference for SSI is at NCSA's Web site: http://hoohoo.ncsa.uiuc.edu/docs/tutorials/includes.html.

SSI is a simple and direct form of interaction with a Web server. In a few lines of self-explanatory code, it makes possible effects that are much trickier to achieve with CGI scripting. For instance, the following code snippet causes the current date/time to be displayed on a Web page:

<p>
Local intranet time is: <!--#echo var="DATE_LOCAL" -->.
</p>

Those familiar with HTML will recognize what looks like a comment (anything set off with <!--comment -->) in the middle of the text. SSI command syntax is the same as that for comments, except that the keyword must be one of six allowable words and preceded by a '#' sign. SSI-capable server software also makes available a set of SSI environment variables, including DATE_LOCAL as shown above. Others record the IP address or domain name of the client. The complete list of standard SSI commands is posted on NCSA's Web site.
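To show what the server does with such a page, here is a toy Python sketch of the scanning step. It handles only the echo command, assumes the NCSA directive form <!--#echo var="NAME" -->, and is in no way the code of a real server; the name expand_echo and the "(none)" placeholder are choices made for this example.

```python
import re
from datetime import datetime

# Matches directives of the form <!--#echo var="NAME" -->
ECHO_DIRECTIVE = re.compile(r'<!--#echo\s+var="(?P<name>[A-Z_]+)"\s*-->')


def expand_echo(html, variables):
    """Replace each SSI echo directive with the value of its variable."""
    def substitute(match):
        # Unknown variables are rendered as "(none)" in this sketch.
        return variables.get(match.group("name"), "(none)")
    return ECHO_DIRECTIVE.sub(substitute, html)


page = '<p>Local intranet time is: <!--#echo var="DATE_LOCAL" -->.</p>'
variables = {"DATE_LOCAL": datetime.now().strftime("%A, %d-%b-%Y %H:%M:%S")}
print(expand_echo(page, variables))
```

Ordinary HTML comments pass through untouched, because only comments that begin with the '#' sign and a recognized keyword match the pattern.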

This seems like a pretty good way to greet users by name, report the current time, or check for a certain browser version. And it is. But as with most neat tricks, there's a catch.

Two catches, actually. SSI loads the Web server heavily, requiring the server to scan each document for SSI code before transporting it. On top of that, it's the least secure of all the interactive modes. SSI in effect allows users to run embedded code on the server without restriction.

To mitigate these risks, server software typically provides configuration options for disallowing some or all SSI commands, selectively enabling SSI by user or directory. Further, to alleviate the parsing burden on the server, a MIME configuration option is available to identify those files that should be scanned for SSI code.

As an example, the following line tells the UNIX-based NCSA HTTPd server to parse files ending in .shtml for includes:

AddType text/x-server-parsed-html .shtml

The NCSA documentation contains additional details.

Authentication, Authorization, and Encryption

HTTP servers pose unique security problems for network administrators, stemming from the fact that file permissions have to be fairly open to enable users to access Web pages and run CGI scripts. For instance, processes launched on a server from a Web browser can delete files, send e-mail, even format disks. Scripts therefore represent a major security hole on any Web. Fortunately, techniques for closing this hole are well-documented. For a thorough treatment check out Lincoln Stein's comprehensive WWW Security FAQ at http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html.

In addition to writing secure scripts, you may want to restrict access at the Web server based on a client's IP address or username. Most server software provides some type of access control mechanism by which "authorized" users receive access to protected information and others are denied.
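Restricting access by client address reduces to a membership test, sketched below in Python. The network ranges and the name client_allowed are hypothetical; a real server reads its allow and deny rules from its own configuration files.

```python
import ipaddress

# Hypothetical allow-list: address ranges of the internal network.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.1.0/24"),
]


def client_allowed(client_ip):
    """Return True if the client's address falls inside an allowed network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in ALLOWED_NETWORKS)
```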

For this to work, the server needs to ascertain two things: that a user is who she says she is, and that the user has permission to access the requested document. The process of verifying the identity of a user is called authentication. The process of verifying an authenticated user's rights is called authorization. The server software described later in this chapter can do both.
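The two checks are easy to confuse, so a small Python sketch may help keep them apart. Everything here is invented for illustration: the user table, the path rules, and the salted SHA-256 password digest, which stands in for whatever scheme a given server actually uses. The numeric results borrow the HTTP status codes 401 (identity not established) and 403 (identity known, access denied).

```python
import hashlib
import hmac


def _digest(password, salt):
    """Salted digest used only by this sketch to avoid storing plain passwords."""
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()


# Hypothetical user and permission tables.
USERS = {
    "alice": ("s1", _digest("wonderland", "s1")),
    "bob": ("s2", _digest("builder", "s2")),
}
PERMISSIONS = {"/payroll/": {"alice"}, "/handbook/": {"alice", "bob"}}


def authenticate(username, password):
    """Authentication: is the user who she says she is?"""
    record = USERS.get(username)
    if record is None:
        return False
    salt, stored = record
    return hmac.compare_digest(stored, _digest(password, salt))


def authorize(username, path):
    """Authorization: may this already-authenticated user read the path?"""
    for prefix, allowed in PERMISSIONS.items():
        if path.startswith(prefix):
            return username in allowed
    return True  # paths with no rule are treated as open in this sketch


def check_request(username, password, path):
    if not authenticate(username, password):
        return 401
    if not authorize(username, path):
        return 403
    return 200
```

Note the order: authorization is only meaningful once authentication has succeeded, which is why check_request tests identity first.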

Access control is an essential facet of Intranet security, but it leaves a flank exposed. All exchanges on a Web, even those with restricted access, consist of open data transmissions on the network. It is technically trivial to monitor these transmissions, in essence neutralizing access control schemes.

Enter encryption. "In a multi-user setting," writes RSA Laboratories, a leading source of cryptographic tools and services, "encryption allows secure communication over an insecure channel."

Encryption works by encoding the text of a message with a key, which is just a very long number. Typical keys are 40, 64, 80 or 128 bits long, with the longer keys affording stronger encryption. In the parlance of cryptographers, a key's strength refers to the amount of computation required to crack it. The idea is that strong keys require more computing power to crack than can be practically harnessed.
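A little arithmetic shows why length matters so much. The Python sketch below assumes key lengths measured in bits, an attacker who simply tries every possible key, and a guessing rate of one billion keys per second; that rate is an arbitrary figure chosen for illustration, not a measurement.

```python
# Assumed attacker speed, chosen only for illustration.
GUESSES_PER_SECOND = 1_000_000_000
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def keyspace(bits):
    """Number of distinct keys of the given length in bits."""
    return 2 ** bits


def years_to_search(bits, rate=GUESSES_PER_SECOND):
    """Years needed to try every key at the given guessing rate."""
    return keyspace(bits) / rate / SECONDS_PER_YEAR


for bits in (40, 64, 80, 128):
    print(f"{bits:3d}-bit key: {keyspace(bits):.3e} keys, "
          f"{years_to_search(bits):.3e} years to exhaust")
```

Each added bit doubles the search. At the assumed rate a 40-bit key space is exhausted in under twenty minutes, while a 128-bit key space would take vastly longer than any practical attacker can wait.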

Secure Web servers provide encryption using a particularly strong technique called public keys. In a public key system, everyone owns a unique pair of keys. One is called the public key, and is widely distributed to anyone who wants a copy. The other, called the private key, is kept secret.

Under this system, a person who needs to send a message to a recipient encrypts the message with the recipient's public key. So encrypted, the message can only be read by decrypting it with (you guessed it) the recipient's private key. This way, anyone can send a secure message, but only the intended party can read it. This solves the problem of message interception, as well as a few other subtler ones. If you plan to carry sensitive material on your Intranet, consider installing a secure Web server (one with public key encryption) from the outset.
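The exchange can be demonstrated with a toy example in Python. It uses RSA, one widely used public key algorithm, chosen here as an assumption because the text does not name one. The primes are deliberately tiny so the numbers stay readable, which also makes the example useless for real security.

```python
# Toy key pair built from two small primes (real keys use far larger ones).
p, q = 61, 53
n = p * q                      # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, chosen coprime with phi
d = pow(e, -1, phi)            # private exponent: inverse of e mod phi (Python 3.8+)


def encrypt(message, public_key):
    """Anyone holding the public key can perform this step."""
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, private_key):
    """Only the holder of the private key can reverse the encryption."""
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)


public_key, private_key = (e, n), (d, n)
message = 65                   # a message is a number smaller than n
ciphertext = encrypt(message, public_key)
assert decrypt(ciphertext, private_key) == message
```

The sender needs only the pair (e, n), which can be published freely; recovering d from it requires factoring n, which is easy for 3233 but impractical for properly sized keys.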

Proxy Services

Many organizations find that the value of Intranet and Internet technology is maximized when they work together across a firewall. The firewall is a device, typically a router or server, that provides secure communications between private, trusted networks (such as an Intranet), and public, untrusted networks (such as the Internet). Firewalls can also mediate between workgroups within an organization. This might be useful, for instance, in a business with regulated and unregulated subsidiaries that share some resources, but are obligated to maintain a "Chinese wall" between others.

Importantly, the firewall can provide secure access to most Internet services, including HTTP, FTP, DNS (domain name services) and SMTP (e-mail). Each service provided in this way is called a proxy.

Technically, a proxy is a program that lives on the firewall and can see both sides of the interface, Intranet and Internet. Requests for outbound services from within the organization, such as a Web browser pointing at a remote URL, are caught by the proxy (HTTP, in this case) and, if allowed by the firewall's rule base, passed onto the Internet. Conversely, traffic from the Internet headed into the organization, such as e-mail, is captured by the proxy (SMTP, in this case) and, if allowed by the firewall's rule base, passed to the Intranet. A further advantage of firewall/proxy servers is that all services accessed through the firewall are logged, providing a built-in audit trail for transactions with the outside world.
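Reduced to its essentials, the decision a proxy makes for each request looks something like the following Python sketch. The rule base, the field names, and the function proxy_request are all hypothetical; commercial firewalls express their rules in far richer terms.

```python
# Hypothetical rule base: (direction, service) pairs the firewall permits.
RULE_BASE = {
    ("outbound", "http"),
    ("outbound", "ftp"),
    ("inbound", "smtp"),
}

# Every decision is recorded, giving the audit trail described above.
audit_log = []


def proxy_request(direction, service, source, destination):
    """Pass or block one request, and record the decision for auditing."""
    allowed = (direction, service) in RULE_BASE
    audit_log.append((direction, service, source, destination, allowed))
    return allowed
```

A browser inside the organization fetching a remote page would be passed as ("outbound", "http"), while an HTTP connection attempted from outside would be refused; both events end up in the log.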

Not all Web servers can act as proxies. For instance, CERN HTTPd can, but NCSA HTTPd cannot. If proxy services are important for your Intranet, add them to your checklist. Except for Apache and NetWare Web Server, the servers discussed next can function as proxies.

The remainder of this chapter illustrates how Web server software can provide essential and enhanced Intranet services, whether on UNIX, Windows NT or Novell NetWare.

Web Origins: UNIX HTTPd

There is no one UNIX. Rather, a welter of UNIX brands continues the tradition of open network computing begun at Bell Labs in 1969. Sun Microsystems Solaris, IBM AIX, Hewlett-Packard HP-UX, and freeware Linux issue from this common tradition.

But regardless of brand, every version of UNIX has a special relationship to the Internet, because the Internet was built on a UNIX foundation. TCP/IP is integral to the UNIX kernel, for instance, replete with the services commonly associated with cyberspace. And because the WWW began as a service of the Internet, it too has roots in UNIX soil.

As a matter of fact, the first Web server software was written to run on UNIX. From this tradition emerged two freeware programs that remain popular today: NCSA HTTPd, from the University of Illinois, and CERN HTTPd, maintained at MIT by the W3 Consortium.

Most commercial Web servers derive from these early programs. Netscape's server line and Microsoft IIS, for instance, build on NCSA HTTPd, adding secure transaction and proxy support. The freeware Apache HTTPd, too, is a descendant that aims to improve on the NCSA original. The next section will show you how.

Native to UNIX: The Apache Project

NOTE
See how popular your favorite server is on the Netcraft survey report, available on the WWW at http://www.netcraft.co.uk/Survey/Reports/.

Since August 1995 Netcraft, a British Web and network consultancy, has conducted a monthly survey of Web server software usage on Internet-connected computers. The May 1996 results show an astonishing 30 percent of 193,000 survey respondents using some version of Apache HTTPd. (NCSA HTTPd accounts for the next largest chunk at 25 percent, followed by Netscape, with a 15 percent share of respondents.)

The Apache project was organized in March 1995 in an attempt to answer some of the concerns regarding active development of a public domain HTTP server. According to the project's Web site at <http://www.apache.org/>, "the goal of the project is to provide a secure, efficient and extensible server that provides HTTP services in sync with the current HTTP standards." The home page is shown in figure 2.1.

Figure 2.1: The Apache Project's home page.

The software runs on UNIX and builds directly on the NCSA source code. It therefore meets conventional Web standards. Apache aims to provide superior speed and stability over the NCSA server, as well as enhanced features. Most of these are technical, such as the use of UNIX-standard DBM databases for authentication to improve server performance with large numbers of authorized users.

NOTE
The online documentation for Apache SSI complements the NCSA specification. See http://www.apache.org/docs/mod_include.html.

Apache supports SSI in the same manner as NCSA HTTPd. Security is also similar, with improvements as noted previously. The current production release (1.0.5) does not support encryption or proxies. A general API to allow native program interactions is under development.

From the decision-maker's standpoint, Apache is a better NCSA server with a large and growing following. Both are currently free. The Apache project, however, reports that NCSA is in the process of changing its license terms, and makes a point of stating that Apache will never charge for use or redistribution. If low cost is prominent on your checklist, this may be a selling point.

Offsetting this advantage somewhat is the lack of product support for freeware in general. Apache is supported mainly through the comp.infosystems.www.servers.unix newsgroup. The software comes in source code form; you must compile at the UNIX command line. Apache will therefore be most suitable in shops with good UNIX skills, where the sweat equity required in lieu of a purchase is not too burdensome.

Netscape on the Server

NOTE
Complete information on Netscape's server platforms is available at http://home.netscape.com/comprod/server_central/index.html.

The company best known for its market-leading Web browser also has an impressive line of server software. Netscape recently reoriented its server offerings to meet a broader spectrum of business needs. All products are available for both major UNIX platforms and Windows NT.

At the top of the line is Netscape Enterprise Server, a high performance, secure Web server intended as a hub for client/server application systems. Out of the box the server comes integrated with a full-text search engine and multi-user version control.

On the World Wide Web Netscape has earned a reputation for driving the HTML standard forward with its popular, proprietary extensions. The same thing is happening with Netscape's HTTP implementation. For instance, in addition to standard server-side includes, Enterprise Server supports a range of server-parsed enhancements. These are suitable for adding dynamic data to Web pages, such as live stock quotes, weather data, or manufacturing data from an assembly line. Netscape's vision is clearly to make the Web server a real-time decision support resource.

The Netscape Server API (NSAPI) is a means of extending or customizing the core functionality of the Netscape Web server. NSAPI provides a scalable, efficient mechanism for building interfaces between the HTTP server and back-end applications.

NOTE
In its white paper, "The NSAPI versus the CGI Interface," Netscape points out many NSAPI advantages, such as the ability to customize server behavior regarding on-the-fly document translation, non-standard encryption, or custom error handling. Netscape's paper is online at http://home.netscape.com/newsref/std/nsapi_vs_cgi.html.

Netscape's emphatic point concerning the implementation of NSAPI is that it modularizes their server product, in the spirit of user-configurable software. While Netscape has no intention of giving away its source code, NSAPI offers users almost the same discretion in tailoring product behavior. All of Netscape's servers support NSAPI. NSAPI, in turn, supports JavaScript on the server, making possible ODBC connections to relational databases.

In the areas of authentication, authorization and encryption, few vendors can rival Netscape, which drafted many of the Web security standards in current use. These include Secure Sockets Layer (SSL), a Netscape proposal widely adopted for Internet commerce.

Netscape Proxy Server, a peer product to Enterprise Server, enables companies to optimize Web performance across network boundaries, both Internet and internal. Proxy Server provides the basic gateway and security services mentioned earlier. In addition, Netscape has added replication to this server, making it a powerful tool for segmenting and controlling access to large webs. Automated replication is an important capability for distributed data systems. It accounts for much of the appeal of groupware products such as Lotus Notes. By offering replication at the Web server, Netscape further marginalizes these traditional venues.

Responding to the demand for integrated solutions, Netscape bundles Enterprise Server and Proxy Server with other Internet software in a package called SuiteSpot. The server suite also comes with Netscape's new Web development platform, LiveWire Pro.

For smaller installations Netscape offers FastTrack Server, optimized for easy setup and maintenance. The $295 product complements its enterprise-strength siblings at the workgroup or departmental level. FastTrack Server supports secure Web communications through the Secure Sockets Layer (SSL). The server also supports NSAPI, JavaScript and ODBC, in keeping with Netscape's Internet Application Framework.

Besides these recent releases, Netscape still supports its original product line, including Netscape Communications Server and Commerce Server. However, these employ previous-generation technology and should not be considered for new projects.

Gaining Fast: Windows NT Webs

Microsoft was late to market with the Internet. Server authors such as Rob Denny (now at O'Reilly & Associates) ported NCSA to Windows long before February 1996, when Microsoft finally released its Internet Information Server (IIS).

But timing is a second-order effect when a giant jumps in the pool. The splash is what gets your attention.

IIS is creating a splash for three reasons. First, Microsoft is giving it away through its WWW site at <www.microsoft.com>. As competitors have been quick to point out, Microsoft's fee-based support offsets the zero initial cost; but a freebie is compelling nonetheless. Second, IIS was designed from the ground up to run under Windows NT. The week of its release, reviewers were already acclaiming its superb performance, a consequence of single-minded design. Third, IIS bundles many of the tools needed to implement high-functioning Intranets out of the box. With most Web servers, the onus is on the developer to assemble a tool kit, learn multiple languages, and integrate them all to the customer's satisfaction. With IIS, most of this is turnkey.

These advantages come with the same Faustian bargain as all Microsoft products, namely vendor dependence. Where the free market (acting on the open technology of UNIX) brought about a rainbow of variants, there is only one Windows NT. And it has the assembled might of a $2B software giant and its army of VARs and technical partners behind it.

As an IT decision-maker assessing platforms, your chief concern is whether NT-based Web servers will meet your needs. The answer is that technically, they are at least the equal of UNIX httpd, although NT employs different methods to achieve the same ends. On the other hand, deciding whether the Microsoft Way is a good fit for your organization is "left as an exercise for the reader."

Intranets Based on Windows NT

Make no mistake: buying into IIS is a fork in the road. Many of the tools and standards synonymous with Web technology under UNIX (which is to say, canonical standards) are weakly supported or non-existent in the NT world.

For instance, the language of choice for many programming tasks under UNIX is Perl, a powerful and concise scripting language. While versions of the Perl interpreter have been written for many non-UNIX platforms, including NT, they tend to have bugs and lack key features of the UNIX original. Kinks like this are often smoothed with time. But the problem remains that Perl is a UNIX tool, requiring mastery of UNIX syntax and programming concepts (such as regular expressions) largely irrelevant to NT. So, while NT Perl may be a useful stepping stone for UNIX programmers making the leap to Windows, it will never be the language of choice for IIS.

Investigate NT Perl yourself by downloading it from http://www.process.com/resource/perl.htp (version 4.036, the classic) or http://www.perl.hip.com/ (v5.001, the latest).

If this were the only gap between IIS and UNIX HTTP, the two roads would run much closer. As it turns out, however, Perl dysfunction is just a symptom of the non-standard way all Windows-based HTTP servers implement the Web's programming interface, CGI.

Windows CGI ("Win-CGI") is Microsoft's native interface, which enables Web forms to launch processes and display result sets. The specification for Windows CGI v1.3a begins with the following disclaimer:

"It is not intended for this specification to enter the Internet standards track, as it is platform-specific to Microsoft Windows 95 and Windows NT." [Courtesy: Rob Denny]

Essentially Microsoft is writing a new standard in its own image. This won't come as a surprise to anyone who has watched IT vendors duke it out for the last ten years. Leveraging the low cost of open technology to dominate the market with proprietary enhancements is a winning strategy. It's precisely what Netscape has done with its browser: all those <FONT> and <CENTER> tags run roughshod over the HTML standards. Microsoft is merely following the leader with its own (different, naturally) browser extensions. The same holds true for its effort to redefine CGI.

NOTE
Anyone serious about creating an NT-based Intranet should read Rob Denny's Win-CGI specification, available online at http://solo.dc3.com/wsdocs/32demo/windows-cgi.html.

The Power of Proprietary Thinking

For some organizations, the idea that Microsoft will be the one-stop place to shop is comforting. Open standards put a lot of responsibility on the customer to understand technology components and integrate them. In an age of information overload, proprietary thinking may be just what the doctor ordered.

NT-based servers have some powerful advantages over their UNIX cousins. We'll focus on Microsoft IIS here, but many of these features are available with other products as well.

NOTE
For a fair-minded overview of the offerings, consult the list of Win32 servers and features online at http://www.webcompare.com/server-win.html.

The virtues of IIS fall into two categories: administrative ease and open database access. Setting up an NT-based server is straightforward. You simply follow the instructions, filling out a sequence of online forms while the server does the hard work behind the scenes. This is a distinct advantage over UNIX-based servers, which usually require some character-based dialogue via the command line. An NT server can be maintained graphically as well.

Enabling users to access data distributed throughout the enterprise is the Holy Grail of networking, and IIS makes a real contribution towards this goal. It does this in two ways. First, the proprietary Win-CGI interface described above provides a pipeline to so-called Rapid Application Development (RAD) tools like Visual Basic and PowerBuilder. These programming environments bring with them rich libraries of components that solve many standard problems, such as exchanging data with a database. In addition to this, IIS comes with its own Internet Database Connector (IDC).

IDC enables the developer to create templates for data access and retrieval. The problem with Web-to-database connectivity so far has been the need to maintain custom scripts for formatting query results in HTML tables. IDC automates this labor-intensive process. Furthermore, in conjunction with other IIS enhancements (namely, Microsoft's proprietary ISAPI), IDC promises to improve database access performance as well.
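To see the chore that such templates automate, consider this Python sketch of the kind of custom script a developer would otherwise maintain. It is not IDC syntax; it assumes a throwaway SQLite database with an invented staff table, and the function name rows_to_html_table is hypothetical.

```python
import sqlite3
from html import escape


def rows_to_html_table(cursor):
    """Format the rows of an executed query as an HTML table."""
    headers = [column[0] for column in cursor.description]
    lines = ["<table>"]
    lines.append("<tr>" + "".join(f"<th>{escape(h)}</th>" for h in headers) + "</tr>")
    for row in cursor:
        cells = "".join(f"<td>{escape(str(value))}</td>" for value in row)
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


# Hypothetical in-memory database standing in for an enterprise data source.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE staff (name TEXT, extension INTEGER)")
connection.executemany("INSERT INTO staff VALUES (?, ?)",
                       [("Ann", 4101), ("Raj", 4102)])
html_table = rows_to_html_table(
    connection.execute("SELECT name, extension FROM staff ORDER BY name"))
```

Every query whose results appear on a Web page needs formatting of this kind, which is why moving the work into a reusable template saves so much labor.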

On balance, Microsoft's IIS is a very strong product that can solve major Intranet problems, albeit with proprietary solutions. For the price (free), it's well worth comparing to other NT-based servers.

Leveraging the Novell LAN: NetWare Web Server

NOTE
For more information on NetWare Web Server, visit Novell's Web site at http://iamg.novell.com/iamg/internet/nws.htm.

If your enterprise LAN is based on NetWare 4.1, you need to consider Novell's Web Server for NetWare. (Novell has no plans to support earlier versions of NetWare.) Implemented as a set of NetWare Loadable Modules (NLMs), NetWare Web Server can host HTML files natively, eliminating the need for a UNIX or NT platform. It ships with a native Perl interpreter and features CGI support, access control and logging.

In its current release (2.1), Web Server for NetWare lacks a couple of desirable features, such as the ability to support multiple IP addresses on one server ("multi-homing") and proxy support. These will come, according to Novell.

Where the product shines, not surprisingly, is in the area of network integration. The Web Server authenticates users against NetWare Directory Services (NDS), simplifying security administration. The Web Server also supports standard NCSA-style Web security. SMTP gateways are available from Novell and other vendors to enable the Web server to process "mailto:" requests.

Many of the features for which NetWare is justly famous carry over to the Web server. For instance, you can ensure scalability in your Intranet (the ability to add processors without replacing whole machines) by hosting it on a NetWare 4.1 Symmetric MultiProcessing (SMP) platform. Alternatively, if your primary need is 24-hour, 7-day availability with no downtime, you can run the Web server on NetWare SFT III, Novell's highly rated fault-tolerant platform.

NOTE
Several companies besides Novell make server software for NetWare. Two you may wish to consider are:
American Internet Corporation (Bedford MA). AIC offers SiteBuilder™, a complete Web server solution for NetWare with features similar to Novell's entry. However, SiteBuilder runs on NetWare 3.11 or later, good news if you run a legacy LAN. Find out more at <http://www.american.com/>.
Electronic Dimensions (Australia), makers of Edime™ WebWare™, a line of Web servers for NetWare. Distinguishing features include powerful non-standard extensions to SSI, access to server serial ports for Web-based process control, and automatic formatting of database query results as HTML tables. An Enterprise Version adds support for SQL and EDI (Electronic Data Interchange) transactions. For more information consult Edime's Web site at http://www.edime.com.au/webware.html.