Chapter 1

Common Gateway Interface (CGI)




Thanks to the World Wide Web, almost anyone can provide information on the Internet in a visually pleasing and widely distributable manner. You have undoubtedly navigated the Web and looked at other people's sites, and by now you probably know that intimidating acronyms such as "HTTP" and "HTML" are simply fancy terms for "Web" and "way to express information on the Web." Perhaps you have some experience providing information over the Web as well.

The Web has proven to be an ideal medium for distributing information, as can be seen from its immense popularity and exponential growth. Although some have questioned the Web's utility and attributed its growth and popularity mostly to media hype, the Web is unquestionably an important means of providing all sorts of information. Not only are many up-to-the-minute news services (providing real-time news, weather, and sports) and reference materials available electronically, but vast amounts of other types of data exist as well. The Internal Revenue Service, which made all of its 1995 tax forms and other information available over the World Wide Web, recently remarked that it was actually receiving fan mail for its Web site. Who would have thought that the IRS could ever receive fan mail for anything? It was not because its site was good-looking, but because it was a genuinely useful tool for thousands, perhaps millions, of people.

What makes the Web unique and so appealing as an information server? First, it provides a hypermedia interface to data. Think about the hard disk drive on your own computer. Typically, data has been organized in a hierarchical fashion analogous to a filing system. For example, you have a bunch of folders, and within each folder, you have either documents or more folders (see Figure 1.1). The Web uses a different paradigm for expressing information called hypermedia. A hypertext interface consists of a document and links. Links are words you can click to see other documents or retrieve other types of information (see Figure 1.2). The Web extends the concept of hypertext to include other types of media such as graphics, sounds, and video (hence the name "hypermedia"). Selecting text or graphics in a document enables you to see related information, in any number of forms, about the item you selected.

Figure 1.1 : The file system representation of data.

Figure 1.2 : Hypermedia.

Almost every type of person benefits from this easy and unique way of representing and distributing information, from academics who want to immediately share data with their peers to business people who want to offer information about their company to anyone who is curious. However, although giving information is extremely important, over the past few years, many have realized that receiving information is just as important.

Although the Web provides a unique, hypermedia interface to information, there are many other effective ways to distribute data. For example, network services such as the File Transfer Protocol (FTP) and gopher existed long before the World Wide Web. E-mail has been the primary medium for communicating and exchanging information over the Internet and most other networks almost since the inception of these networks. Why did the Web become such a popular way to distribute information? The multimedia aspect of the Web clearly contributed to its wild success, but in order for the Web to become most effective, it had to be interactive.

Without the capability to receive input from users as well as provide information, the Web would be a completely static medium. Information would be available only in a format defined by the author. This seems to undermine one of the powers of computing in general: interactive information. For example, instead of forcing a user to browse through several documents as if he or she were flipping through a book or a dictionary, it would be better to let the user specify the keywords of the topic in which he or she is interested. Users can customize the presentation of the data rather than rely on a rigid structure defined by the content provider.

The term Web server can be confusing because it can refer to either the physical machine or the software running on it that makes it interact with Web browsers. When a browser queries a given Web address, it first makes a connection to the machine over the Internet, submitting the request for a document to the Web server software. This software runs constantly, waiting for such requests to come in and responding appropriately.

Although Web servers can send and receive data, the server itself has limited functionality. For example, the most basic Web server can only send the requested file to the browser. The server normally does not know what to do with any additional input. Unless the Web provider tells the server how to handle that additional information, the server most likely ignores the input.

In order for the server to do anything more advanced than retrieving and sending files to the Web browser, you must know how to extend the functionality of the Web server. For example, a Web server cannot search a database based on a keyword entered by a user and return several matching documents unless you have somehow programmed that capability into the server.

What Is CGI?

The Common Gateway Interface (CGI) is an interface to the Web server that enables you to extend the server's functionality. Using CGI, you can interact with users who access your site. On a theoretical level, CGI enables you to extend the capability of your server to parse (interpret) input from the browser and return information based on user input. On a practical level, CGI is an interface that enables the programmer to write programs that can easily communicate with the server.

Normally, if you wanted to extend the Web server's capabilities, you would have to modify the server yourself. This is an undesirable solution because it requires a low-level understanding of network programming over the Internet and of the World Wide Web protocol (HTTP). It would also require editing and recompiling the server source code or writing a custom server for each task. For example, suppose you want to extend your server to act as a Web-to-e-mail gateway that would take user input from the browser and e-mail it to another user. You would have to insert code into the server that would parse the input from the browser, e-mail the input to the other user, and send a response back to the browser over a network connection.

First, such a task requires having access to the server code, something that is not always possible. Second, it is difficult and requires extensive technical knowledge. Third, it works only for your specific server. If you want to move your Web server to a different platform, you would have to start over or at least spend a lot of time porting the code to that platform.

Why CGI?

CGI provides a portable and simple solution to these problems. The CGI protocol defines a standard way for programs to communicate with the Web server. Without much special knowledge, you can write a program in any computer language that interfaces and communicates with the Web server. This program will work with all Web servers that understand the CGI protocol.
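As a rough illustration, the Web-to-e-mail gateway described earlier could be sketched as a small stand-alone CGI program instead of a change to the server. The sketch below is in Python purely for brevity; it assumes the server passes URL-encoded form data on the standard input and reports its size in the CONTENT_LENGTH environment variable, which is the usual CGI convention for submitted forms. The field names subject and body, the function names, and the address webmaster@example.com are hypothetical, and the message is only composed, not sent.

```python
import os
import sys
from email.message import EmailMessage
from urllib.parse import parse_qs


def build_message(form_data: str, recipient: str) -> EmailMessage:
    """Turn URL-encoded form data into an e-mail message (composed, not sent)."""
    fields = parse_qs(form_data)
    message = EmailMessage()
    message["To"] = recipient
    # "subject" and "body" are hypothetical form field names.
    message["Subject"] = fields.get("subject", ["(no subject)"])[0]
    message.set_content(fields.get("body", [""])[0])
    return message


def confirmation(message: EmailMessage) -> str:
    """Build the response for the browser: a header, a blank line, then the body."""
    subject = message["Subject"]
    recipient = message["To"]
    return (
        "Content-Type: text/plain\n"
        "\n"
        f"Your message '{subject}' was addressed to {recipient}.\n"
    )


if __name__ == "__main__":
    # Read exactly as many characters as the server says were submitted.
    length = int(os.environ.get("CONTENT_LENGTH") or 0)
    message = build_message(sys.stdin.read(length), "webmaster@example.com")
    # Delivering the message (for example with smtplib) is left out of this sketch.
    sys.stdout.write(confirmation(message))
```

None of this code touches the server itself, so in principle the same program could be moved to any server that supports CGI.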

CGI communication is handled over the standard input and output, which means that if you know how to print and read data using your programming language, you can write a Web server application. Other than parsing the input and output, programming CGI applications is almost equivalent to programming any other application. For example, if you want to program a "Hello, world!" program, you use your language's print functions and the format defined for CGI programs to print the proper message.
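A minimal sketch of such a "Hello, world!" program follows, written in Python only for brevity; the same idea carries over to C or Perl. It assumes the usual CGI output format: a Content-Type header, a blank line, and then the document itself. The function name hello_response is made up for this example.

```python
import sys


def hello_response() -> str:
    """Build a complete CGI response: a header block, a blank line, then the body."""
    headers = "Content-Type: text/html\n"
    body = "<html><body><h1>Hello, world!</h1></body></html>\n"
    # The blank line between the header and the body marks where the document begins.
    return headers + "\n" + body


if __name__ == "__main__":
    # The Web server captures whatever the program writes to the standard output
    # and relays it to the browser.
    sys.stdout.write(hello_response())
```

Apart from the header line, this is an ordinary program that prints text.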

Choosing Your Language

Because CGI is a "common interface," you are not restricted to any specific computer language. A question many people ask is which programming languages you can use to program CGI. You can use any language that can do the following:

Print to the standard output

Read from the standard input

Read from environment variables

Almost all programming languages and many scripting languages perform these three activities, and you can use any one of them.
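To make these requirements concrete, here is a hedged sketch of a program that reads environment variables, reads the standard input, and prints to the standard output. It is written in Python for brevity and assumes the conventional CGI variables REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH; the function names read_request and render are illustrative only.

```python
import os
import sys
from urllib.parse import parse_qs


def read_request(environ, stdin):
    """Collect form fields from the environment and, for POST, from standard input."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # The server reports the size of the submitted data in CONTENT_LENGTH.
        # Reading characters is adequate here because URL-encoded data is ASCII.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # For GET, the encoded form data arrives in QUERY_STRING.
        raw = environ.get("QUERY_STRING", "")
    return method, parse_qs(raw)


def render(method, fields):
    """Format a plain-text CGI response listing the fields that were received."""
    lines = ["Content-Type: text/plain", "", f"Method: {method}"]
    for name in sorted(fields):
        lines.append(f"{name} = {', '.join(fields[name])}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    method, fields = read_request(os.environ, sys.stdin)
    sys.stdout.write(render(method, fields))
```

Passing the environment and the input stream as parameters is a design choice made so the functions can be exercised without a Web server; a real CGI program could just as well read os.environ and sys.stdin directly.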

Languages fall into one of the following two classes: compiled or interpreted. Programs written in a compiled language, such as C or C++, tend to be smaller and faster, whereas programs written in an interpreted language, such as Perl or Rexx, require loading a sometimes large interpreter upon startup. Additionally, you can distribute binaries (code compiled into machine language) without source code if your language is compiled. Distributing interpreted scripts normally means distributing the source code.

Before you choose your language, you must first consider your priorities. You need to balance the speed and efficiency gains of one programming language against the ease of programming in another. If you think you want to learn another language rather than use one you already know, carefully weigh the advantages and disadvantages of the two languages.

Perhaps the two most commonly used languages for CGI programming are C and Perl (both of which are covered in this book). Each has its own distinct advantages and disadvantages. Perl is a very high-level yet powerful language that is especially useful for parsing text. Although its ease of use, flexibility, and power make it an attractive language for CGI programming, its relatively large size and slower performance sometimes make it unsuitable for certain applications. C programs are smaller, more efficient, and offer more low-level control over the system, yet they are more difficult to write, do not have easy built-in text processing routines, and are more difficult to debug.

Which language is the superior CGI programming language? Whichever language you are most comfortable programming in. Both are equally effective for programming CGI applications, and with the proper libraries, both have similar capabilities. However, if you have a heavily accessed server, you might want to use smaller compiled C programs. If you need to quickly write an application that requires a lot of text processing, you might want to use Perl instead.

Caveats

There are some important alternatives to CGI applications. Many servers now include a programming API that makes it easier to program direct extensions to the server as opposed to separate CGI applications. Server APIs tend to be more efficient than CGI programs. Other servers include built-in functionality that can handle special features without CGI such as database interfacing. Finally, some applications can be handled by some new client-side (rather than server-side) technologies such as Java. With such rapid change in technology, is CGI rapidly becoming obsolete?

Probably not. CGI has several advantages over the newer technologies.

Summary

The Common Gateway Interface is the protocol by which programs interact with Web servers. The versatility of CGI gives programmers the opportunity to write gateway programs in almost any language, although there are many trade-offs associated with different languages. Without this ability, making interactive Web pages would be difficult at best, requiring modifications to the server and putting interactivity out of the reach of most programmers who are not also site administrators.