Chapter 3

HTML and Forms



While you, as the CGI developer, might spend weeks or even months developing complex CGI applications, the most visible aspect of your work is not the code itself, but the front end to your application. As any developer knows, good programs have good user interfaces.

Programming a good user interface can take months. Often, a CGI developer doesn't need to worry about programming an interface because someone (specifically, the author of the Web browser) has programmed the interface already. However, even though a CGI developer doesn't often program the user interface, he or she must be concerned with the underlying HTML used to code the interface.

The look and feel of the user interface is easily controlled using fairly simple HyperText Markup Language (HTML) form tags. This chapter provides a complete guide to creating forms using HTML. Because most CGI programs obtain data from forms, you learn the relationship between HTML and CGI. You also learn the data you can expect to gather from various form elements.

A Quick Review of HTML

This section briefly discusses some basic styles rather than offering a tutorial on HTML. I assume you already have some experience with HTML. I want to convey the importance of good style and emphasize certain elements of HTML. This section also briefly goes over some specific tags that you might find useful when you are programming your CGI applications.

HTML was invented primarily as a way to represent information, not as a way of controlling how the information looks. What is the difference between information representation and presentation?

Consider the drawing in Figure 3.1. I can describe this picture to you in several different ways. I can tell you that it consists of a big circle with three points and a curve inside. Given this information along with exactly where the points and curve should go, you can create an accurate reproduction.

Figure 3.1 : An example drawing.

A written description of this picture might look like what is shown in Listing 3.1.


Listing 3.1. A possible written description of Figure 3.1.
Circle with center at (0,0) and radius 4
Point at (-2,2)
Point at (2,2)
Point at (0,0)
Bottom half of semi-circle with center at (0,0) and radius 3

There are no inherent problems with this kind of visual description, but there are limitations. For example, at first glance this written description doesn't reveal what it is supposed to represent. After following the instructions and drawing the picture, you can infer that the listing is supposed to describe a happy face. However, you have no way of knowing whether this inference is correct. For all you know, I could be describing some mathematical function or the design for a new golf course. Indeed, I haven't even given you a scale. You could draw a circle with a four-inch radius or a four-foot radius.

If I am indeed talking about a happy face, I can describe it more effectively as a round head with two beady eyes, a small nose, and a smiling mouth. I could come up with a mark-up language similar to HTML to describe the face. I can write its representation as shown in Listing 3.2.


Listing 3.2. An example of a mark-up language.
<face>
  <head>round</head>
  <eye position=left>beady</eye>
  <eye position=right>beady</eye>
  <nose>small</nose>
  <mouth>smile</mouth>
</face>

Although this description does not tell you exactly what I am envisioning, there's no question that I'm describing a happy face. Given the description in Listing 3.2, you might draw the happy face in several different ways. A few possibilities are rendered in Figure 3.2.

Figure 3.2 : A few variations of the example drawing.

Although each depiction rendered in Figure 3.2 is different, every picture describes a happy face.

HTML Basic Font Tags

Although HTML is primarily a representative language, it gives you a large amount of visual control as well, especially with some of the proprietary extensions described in Chapter 14, "Proprietary Extensions." Good style and proper use of HTML are important in your documents. This means taking advantage of both the visual and representative tags.

Examples of visual tags include the bold tag <b>, the italics tag <i>, and of course, the preformatted tag <pre>. When you mark bold text in your HTML documents, you know exactly what the text will look like in a Web browser. Although visual tags give you more control over how your text looks, they have their limitations, and you should use them sparingly. Try to use style tags instead wherever possible. With representative or style tags such as headline tags (for example, <h1>) and list tags (for example, <ul>), you don't necessarily know how the text will be displayed in the Web browser. You assume that the browser knows best, and usually it does, but not always. Because style tags describe the elements within an HTML document, you will find that they are often more useful for a CGI developer than visual tags. See Appendix B, "HTML Guide," for a listing of the different tags.

HTML documents consist of two major elements: the head and the body. The <head> tag contains information about your HTML document, most of which is never displayed. The <body> tag contains all the information that will show up in your Web browser. All proper HTML documents follow this basic format, shown in Listing 3.3.


Listing 3.3. Bare bones of an HTML file.
<html>
<head>

</head>

<body>

</body>
</html>

Whether or not you use tags such as <html> or <head> in your HTML documents, your documents probably look fine in your Web browser. Nevertheless, you should still use the proper tags for a number of important reasons. Think about the English language or any other language. If I say, "Me food want," you can figure out that I'm hungry even though I didn't use proper English. However, as my thoughts become more complex, my sentences will become more indecipherable unless I use correct English. Proper grammar and style are important for clarity and communication.

Similarly, with a computer language, you can write obfuscated code that happens to compile and run on your machine. However, if you don't program correctly and use good style, your program might not be portable, it might do unexpected things, it might not do everything you want it to do, and it might be difficult to debug and understand.

Although some Web browsers can understand your documents and display what you expect, there is no guarantee that all Web browsers will understand your Web documents and display them correctly unless you use proper HTML. The importance of good style becomes more clear as the size and number of your documents grow and as you do more advanced tasks using CGI.

The <head> Tag

The <head> tag is extremely useful, especially for developers using CGI and HTML forms to communicate with databases. The HTML specification (RFC1866) defines the head of the document as "an unordered collection of information about the document." Most of the information enclosed in the <head> tag is never displayed. Instead, the head provides an area to describe the document. For example, you can define keywords for the document within the <head> tags or include the document's author and when it was last modified.

You can include five possible tags within the <head> tag. I discuss only the three that apply to CGI programming in this chapter: <base>, <isindex>, and <meta>. For a complete reference guide to HTML, refer to Appendix B.

The <base> Tag

Format:

<BASE HREF=" . . . " >

Note that <base> is a stand-alone tag. No </base> tag is required. The <base> tag tells the web browser where that document belongs in relation to the rest of the documents on the Web server. This is important for a number of reasons. Suppose you have a CGI program that returns the HTML shown in Listing 3.4.


Listing 3.4. HTML without using the <base> tag.
<html>
<head>
<title>A Big Bear</title>
</head>

<body>
<h1>A Big Bear</h1>

<img src="bigbear.gif">
</body>
</html>

When the client receives and parses this HTML file, it will look for the image bigbear.gif in the same directory as the CGI program. In all likelihood, the graphic isn't located in that directory. For example, you might keep all of your CGI programs in one directory and all of your images in another. Alternatively, you might have specified your CGI directory as only containing CGI programs (as discussed in Chapter 2, "The Basics," with servers such as NCSA). If this is the case, then when the browser tries to access /cgi-bin/bigbear.gif, it will try to run bigbear.gif rather than send the image.

Suppose that your machine name is yourserver.org, and that the CGI program that returns this HTML is in /cgi-bin/bigbear.cgi, and that bigbear.gif is in /images/bigbear.gif. You can either specify the complete path of the image as shown in Listing 3.5, or you can use the <base> tag as shown in Listing 3.6.


Listing 3.5. HTML using the complete path of the image.
<html>
<head>
<title>A Big Bear</title>
</head>

<body>
<h1>A Big Bear</h1>

<img src="/images/bigbear.gif">
</body>
</html>


Listing 3.6. Using the <base> tag.
<html>
<head>
<title>A Big Bear</title>
<base href="http://yourserver.org/images/">
</head>

<body>
<h1>A Big Bear</h1>

<img src="bigbear.gif">
</body>
</html>

Both of these solutions enable the browser to find the location of the image. In this example, there is no apparent advantage to using the <base> tag compared to specifying the complete path in the <img> tag. In fact, unless you have a specific reason to avoid using full paths in HTML files, specifying the full path might be the better solution in this case.
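
To see how a browser resolves the image location in each of these cases, here is a small sketch in Python (used for illustration only; it is not one of this book's languages). It relies on the standard library's urljoin function, which applies the usual relative-URL rules. The helper name resolve is hypothetical.

```python
from urllib.parse import urljoin

# URL of the CGI program that returned the HTML in the example.
CGI_URL = "http://yourserver.org/cgi-bin/bigbear.cgi"


def resolve(src, base=CGI_URL):
    """Resolve an <img src> value against a base URL, as a browser would."""
    return urljoin(base, src)


# Listing 3.4: relative path, no <base> tag. The browser looks in /cgi-bin/.
print(resolve("bigbear.gif"))
# http://yourserver.org/cgi-bin/bigbear.gif

# Listing 3.5: complete path in the <img> tag.
print(resolve("/images/bigbear.gif"))
# http://yourserver.org/images/bigbear.gif

# Listing 3.6: relative path, but <base> points at /images/.
print(resolve("bigbear.gif", "http://yourserver.org/images/"))
# http://yourserver.org/images/bigbear.gif
```

Only the first call produces a URL inside the CGI directory, which is the broken case described earlier.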

There are, however, important applications for the <base> tag. For example, you might have a program that does keyword searches of documents on your Web server. A useful application is one that adds bold to all of the keywords found in the returned HTML document.

Designing such a CGI program is fairly simple. A detailed example is given in Chapter 12, "Databases." For now, it is adequate simply to discuss the design of the application. Let's call this application print_bold.cgi. Given the URL of a document and a keyword, print_bold.cgi will access the document and surround all occurrences of the keyword with bold or some other emphasis tags. A call to print_bold.cgi looks like this:

http://yourserver.org/cgi-bin/print_bold.cgi/bigbear.html?fuzzy

In this case, print_bold.cgi displays the file bigbear.html located in the root directory of your document tree and highlights all instances of the word fuzzy. However, what happens if bigbear.html has an image that does not specify the complete path? Although you might include complete paths for images in all of your HTML documents, there are no guarantees that other authors on your server will do so as well. In this case, the base address is the following by default:

http://yourserver.org/cgi-bin/

This is clearly not adequate, because the image is in a different directory specified relative to the document root. In order to prevent broken images like this from occurring, edit your program, print_bold.cgi, to add the <base> tag with the given URL. Now, the following base tag is included in the returned HTML:

<BASE HREF="http://yourserver.org/">

The base address is now the document root of your directory tree, and all of the images and other inline links are properly displayed.
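
The core of such a program might look something like the following Python sketch. It is not the implementation given in Chapter 12; the function name print_bold and its parameters are assumptions made for illustration. It inserts a <base> tag after the opening <head> tag and wraps each whole-word occurrence of the keyword in <b> tags, leaving the contents of the tags themselves alone.

```python
import html
import re


def print_bold(document, keyword, base_url):
    """Return the document with a <base> tag added and the keyword in bold."""
    # Insert the <base> tag right after the opening <head> tag, if present.
    base_tag = '<base href="%s">' % html.escape(base_url, quote=True)
    document = re.sub(
        r"(<head(?:\s[^>]*)?>)",
        lambda match: match.group(1) + "\n" + base_tag,
        document,
        count=1,
        flags=re.IGNORECASE,
    )

    # Split into tags and text so that attribute values such as
    # src="fuzzy.gif" are never modified.
    pieces = re.split(r"(<[^>]*>)", document)
    pattern = re.compile(r"\b(%s)\b" % re.escape(keyword), re.IGNORECASE)
    for index, piece in enumerate(pieces):
        if not piece.startswith("<"):
            pieces[index] = pattern.sub(r"<b>\1</b>", piece)
    return "".join(pieces)
```

This sketch has obvious limits: it would also add bold tags to a keyword inside the <title>, and it does nothing if the document has no <head> tag. A real program would need to handle both.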

Note
Notice how the argument is passed to the CGI program.
http://yourserver.org/cgi-bin/print_bold.cgi/bigbear.html?fuzzy
The bigbear.html argument is separated from the fuzzy keyword argument by a question mark (?). Remember from Chapter 2 that encoding is the way arguments and information are passed to a CGI program. Arguments are passed differently than form data. You learn a little more about encoding in this chapter and later on in Chapter 5, "Input."

The <meta> Tag

Format:

<META [HTTP-EQUIV | NAME]=" . . . " CONTENT=" . . . " >

The <meta> tag, which is a stand-alone tag, is used to describe "meta-information," or information about the document such as keywords, type of content, and other elements. One of the original intents of the <meta> tag was to enable the document authors to define certain HTTP headers within an HTML document. As discussed in Chapter 2, a web server sends a set of HTTP headers to the client before it sends the actual document. Suppose you want to set an expiration date on your document. For example, if you provide a news service, you might want old news files to expire on a certain date. You can write a CGI program that sends an Expires header along with the other HTTP headers, such as the program shown in Listing 3.7.

Note
The expiration date can do two things. The expiration date tells browsers that if the present time is past the expiration date, the browser should not use any cached copies of the page. The Expires header can also affect certain server actions, but headers are discussed more a little later.


Listing 3.7. HTML without the <meta> tag.
#!/usr/local/bin/perl

print "Content-Type: text/html\r\n";
print "Expires: Sat, 30 Nov 1996 10:29:02 GMT\r\n\r\n";
print "<html><head>\n";
print "<title>Late-Breaking Story!</title>\n";
print "</head> <body>\n";
print "<h1>Late-Breaking Story!</h1>\n";
print "<p>This story won't be late-breaking for very long.\n";
print "</body> </html>";

It's not always feasible to write a CGI program for every document that needs an Expires header. One solution is to write one general CGI program which, given a document, will send the appropriate HTTP Expires header. However, this is still not the prettiest solution. It is better if you can specify the expiration information within the HTML document.

This is the purpose of the <meta> tag. Assuming your server preparses each HTML document, you can write an HTML file that has the same effect as the program in Listing 3.7. In this case, the server will parse the file for the expiration information and send that information before it sends the rest of the document (see Listing 3.8).


Listing 3.8. HTML with the <meta> tag.
<html> <head>
<title>Late-Breaking Story!</title>
<meta http-equiv="Expires" content="Sat, 30 Nov 1996 10:29:02 GMT">
</head>
<body>
<h1>Late-Breaking Story!</h1>

<p>This story won't be late-breaking for very long.</p>
</body>
</html>

The HTTP-EQUIV option in the <meta> tag tells the server to send the contents of this <meta> tag as an HTTP header.
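
To make the idea of preparsing concrete, here is a rough Python sketch of how a server (or a general-purpose CGI wrapper) could pull HTTP headers out of a document's <meta> tags. It uses the standard library's HTMLParser; the class and function names are hypothetical, and a real server's preparsing may work quite differently.

```python
from html.parser import HTMLParser


class MetaHeaderParser(HTMLParser):
    """Collect (header, value) pairs from <meta http-equiv=... content=...>."""

    def __init__(self):
        super().__init__()
        self.headers = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("http-equiv")
        content = attributes.get("content")
        if name and content is not None:
            self.headers.append((name, content))


def headers_from_html(document):
    """Return the HTTP headers requested by the document's <meta> tags."""
    parser = MetaHeaderParser()
    parser.feed(document)
    parser.close()
    return parser.headers
```

Given the file in Listing 3.8, headers_from_html returns a single pair: the header name Expires and the date string from the CONTENT attribute. The caller would then print each pair as a "Name: value" header line before sending the document.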

Unfortunately, this technique requires a server that preparses the HTML file appropriately, and most servers will not do this. However, even if your server doesn't preparse HTML documents for the <meta> tags, the <meta> tag is still a useful place to store information about the document that can be accessed by a CGI program. For example, you can define keywords within a document using the following tag:

<meta name="Keywords" content="milk duds">

Then you can write a CGI program that searches each HTML file for the keywords defined in a <meta> tag.

Note

The <meta> tag can also be used by clients for special purposes. Some browsers, most notably Netscape's, let you make the browser perform certain tasks, such as automatically refreshing the page, by including the appropriate <meta> tag. This is discussed in more detail in Chapter 14.

The <isindex> Tag

Early in the history of the World Wide Web, developers spoke of the need for a way to search some index of HTML or other documents on the server. To establish this search feature, two things were required: a way for a user to enter keywords and a way for the server to search all of the documents. The result was the <isindex> tag.

When placed within the <head> tags, <isindex> tells the browser that the user is able to search an index. The specification for this tag is purposefully vague. It does not mention which index should be searched, how to search the index, or how the client should enable the user to search the index. The <isindex> tag is stand-alone and affects the entire document.

The most common way browsers deal with an <isindex> HTML file is to display a form element with some prompt. For example, the Netscape prompt is This is a searchable index. Enter search keywords:. When you enter data and press Enter, the browser encodes the keywords just as it does normal CGI data and sends it to the server.

For example, suppose you have an HTML file named document.html, as shown in Listing 3.9. To access the file, you use the URL http://yourserver.org/document.html and get Figure 3.3.

Figure 3.3 : An example of a file using the <isindex> tag.


Listing 3.9. The document.html file.
<html>
<head>
<title>Some searchable file</title>
<isindex>
</head>

<body>
<h1>Search me!</h1>

<p>You can search an index using this HTML file.
</body>
</html>

Suppose you enter the following keywords:

pig blankets

The browser encodes this by replacing spaces with pluses (+) and URL encoding special characters. This encoded string then gets appended to the base URL:

http://yourserver.org/document.html?pig+blankets
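
The same encoding can be reproduced with Python's standard library, shown here only as an illustration of what the browser does. The function name isindex_url is hypothetical.

```python
from urllib.parse import quote_plus


def isindex_url(base_url, keywords):
    """Build the URL a browser requests for an <isindex> search."""
    # quote_plus turns spaces into '+' and percent-encodes special characters.
    return base_url + "?" + quote_plus(keywords)


print(isindex_url("http://yourserver.org/document.html", "pig blankets"))
# http://yourserver.org/document.html?pig+blankets

print(isindex_url("http://yourserver.org/document.html", "fish & chips"))
# http://yourserver.org/document.html?fish+%26+chips
```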

The server then decodes this string back to its original expression. What the server does next depends on the server. Some servers such as the CERN server enable you to specify which search program the server runs when it receives a URL like this.

If the URL refers to a CGI program, the server passes this string to the command line (argv) and to the environment variable QUERY_STRING. The CGI program can then do what it wants. There is a more in-depth discussion of <isindex> input in Chapter 5. There are CGI examples using the <isindex> tag in Chapter 11, "Gateways."
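
As a rough illustration, a CGI program could recover the keywords from QUERY_STRING along the following lines. This is a Python sketch with a hypothetical function name; it assumes the CGI convention that an <isindex> query contains no unencoded equal sign, which is how it is told apart from form data.

```python
from urllib.parse import unquote


def isindex_keywords(query_string):
    """Split an <isindex> query string into its decoded keywords."""
    # A query containing '=' is form data, not an <isindex> search.
    if "=" in query_string:
        return []
    # Keywords are separated by '+'; decode each one individually.
    return [unquote(word) for word in query_string.split("+") if word]
```

A program would typically call this as isindex_keywords(os.environ.get("QUERY_STRING", "")). For the query pig+blankets it returns the two keywords pig and blankets.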

Suppose the URL does not refer to a CGI program and your server doesn't enable you to specify a search program to run for all <isindex> queries. What is a general method for specifying a CGI program that searches some index? You can use the <base> tag described previously to specify an alternative base address.

For example, if you have the program search.cgi, which searches an index for keywords, you can modify document.html in Listing 3.9 to run search.cgi whenever someone attempts an <isindex> search, as shown in Listing 3.10.


Listing 3.10. The document2.html file.
<html>
<head>
<title>Some searchable file</title>
<base href="http://yourserver.org/cgi-bin/search.cgi">
<isindex>
</head>

<body>
<h1>Search me!</h1>

<p>You can search an index using this HTML file.
</body>
</html>

Now, if you enter the keywords pig blankets, the server will search for those keywords using search.cgi.

Forms

The early World Wide Web developers soon realized that the <isindex> tag was deficient in many ways and that a new way of accepting user input from the web browser was needed. After much discussion, a draft specification of HTML forms was proposed and later accepted.

Forms enable you to create a fairly sophisticated user interface to your CGI programs with text fields, checkboxes, menus, and other useful tools. The user interface, or front end, is controlled by HTML tags, which were briefly introduced in Chapter 2. This chapter next goes into more detail, but first gives you a quick review on how forms work. Each input, or form, field (no matter what the type) represents a variable with a name you provide. The user provides the value of these variables by filling in the field. You can also provide a default value.

After this information is submitted, the browser properly encodes the information and submits it to the server, which consequently passes the information to the CGI program. Certain form elements get passed to the CGI program in special ways. How this information is submitted for each type of tag is described later in this chapter. Regardless of the input type, whether it is a text field or a radio button, the only information that is returned to the server and CGI program is the name of the input field and the value, which always consists of some text.
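
The following Python sketch shows what this encoding looks like for a pair of fields, using the standard library's urlencode and parse_qsl functions. The field names and values are made up for the example.

```python
from urllib.parse import parse_qsl, urlencode

# Hypothetical fields as the user filled them in: (name, value) pairs.
fields = [("name", "Jane Doe"), ("email", "jane@example.org")]

# What the browser sends: name=value pairs joined by '&', with spaces
# turned into '+' and special characters percent-encoded.
encoded = urlencode(fields)
print(encoded)
# name=Jane+Doe&email=jane%40example.org

# What the CGI program recovers after decoding: plain text again.
decoded = parse_qsl(encoded)
print(decoded)
# [('name', 'Jane Doe'), ('email', 'jane@example.org')]
```

Nothing in the encoded string says whether a value came from a text field, a radio button, or a menu; the program sees only names and text values.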

Although you can accomplish almost every type of user-entry task using HTML forms, you might be frustrated by the lack of certain types of input fields or options. Some vendors have included their own proprietary tags and options that extend the HTML form specification. These tags are described in Chapter 14, "Proprietary Extensions." New technology offers new ways of customizing the forms interface. You can read more about these technologies in Chapter 8, "Client/Server Issues."

The <FORM> Tag

Format:

<FORM ACTION=" . . . " METHOD=[POST | GET] [ENCTYPE=" . . . " ]>

The <form> tag is the essential element of the HTML form. All form elements are surrounded by this tag. Although you can have multiple forms on one page, you cannot nest forms.

ACTION tells the browser where to send the information after it has been submitted and encoded. Enter the location of the CGI program that will process the form data in ACTION. METHOD specifies how the information should be sent to the server. The POST method will send the information to the server using standard input, whereas the GET method passes the data through an environment variable.

Tip
In general, the POST method is the preferred way to send data to the server. The GET method is primarily useful for quickly passing information to a CGI program using the URL or for maintaining state. Because the GET method relies on the URL to pass the information and an environment variable to store it, the length of the information you can submit using the GET method has an upper bound. The POST method does not have an upper bound and is especially useful for large amounts of data.
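
A CGI program therefore has to look in a different place depending on the method. The Python sketch below illustrates one way to do this. The environment variable names REQUEST_METHOD, CONTENT_LENGTH, and QUERY_STRING come from the CGI specification rather than from this section, and the function name read_form_data is hypothetical.

```python
from urllib.parse import parse_qsl


def read_form_data(environ, stdin):
    """Return the submitted (name, value) pairs for a GET or POST request."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the encoded data arrives on standard input, and
        # CONTENT_LENGTH says how many bytes to read.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
        if isinstance(raw, bytes):
            raw = raw.decode("ascii")
    else:
        # GET: the encoded data is in the QUERY_STRING variable.
        raw = environ.get("QUERY_STRING", "")
    return parse_qsl(raw, keep_blank_values=True)
```

In a real CGI program you would pass os.environ and sys.stdin.buffer. Passing them in as arguments, rather than reading them inside the function, makes the function easy to try out with a plain dictionary and an in-memory stream.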

The ENCTYPE attribute is optional and is usually excluded. You use ENCTYPE to specify how the information from the form should be encoded by the browser. This attribute applies only to the POST method. All information passed via the GET method will be encoded in the standard way.

The default MIME type for ENCTYPE is application/x-www-form-urlencoded, which is simply the standard URL encoding method for CGI. Theoretically, programmers who write web browsers could implement all sorts of encoding types for form encoding, including secure encoding. Such implementations are rare to nonexistent, however. You learn some proprietary uses of this tag in Chapter 14.

The <INPUT> Tag

Format:

<INPUT TYPE=" . . . " [NAME=" . . . " ] [VALUE=" . . . " ] [SIZE=" . . . " ] [MAXLENGTH=" . . . " ] [SRC=" . . . " ] [CHECKED]>

The <input> tag is the most important and most flexible form element. It is a stand-alone tag; no </input> tag is required. It enables you to specify a number of different ways to input data. There are a number of different input types; each is described in detail in the following text. All input types except for submit and reset require a name. Otherwise, the variable represented by the <input> tag won't have an identifier. The other possible attributes to the <input> tag have different properties for different input types.

Caution
You can give several input fields the same name, but this is hardly ever a good idea; be careful when you do this. After the CGI program receives the encoded information, it has no way of distinguishing different values associated with the same name. For example, I can create a form with two text fields and a radio button, all with the name "street". When the CGI program receives this information, it has no way of telling which "street" values came from which input fields or input types. All the CGI program knows is that it has three different variables, all called "street".
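
The problem is easy to see once the data is decoded. In this Python sketch, the three "street" values are invented for the example:

```python
from urllib.parse import parse_qs

# Encoded data from a form in which three inputs share the name "street".
form_data = "street=Main&street=Elm&street=Oak"

fields = parse_qs(form_data)
print(fields)
# {'street': ['Main', 'Elm', 'Oak']}
```

The program gets three values under one name. Nothing indicates which value came from a text field and which came from the radio button.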

type=text

Syntax:

<INPUT TYPE=TEXT NAME=" . . . " [VALUE=" . . . " ] [MAXLENGTH=" . . . " ] [SIZE=" . . . " ]>

The most basic input type is the text field. This gives you an input box, sometimes called a form field, which enables the user to input a single line of text. You can specify the maximum length of the text that can be entered by using the MAXLENGTH attribute. If you do not specify MAXLENGTH, the browser will usually assume that there is no limit. The default size for a form field is 21 characters.

SIZE enables you to control the width of your text field. Finally, you can specify a default value to be displayed in the text field by using the VALUE attribute.

Note
If you do not include a TYPE attribute in your <input> tag, by default, it will assume the text type.

Caution
There is no way to restrict the kind of text that can be submitted. For example, you can't set the text field to accept only numbers or only some other set of characters. The only option is for your CGI program to check the submitted value and reject bad input with an error.
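
Checking the value in the CGI program might look something like this Python sketch for a field that should contain a whole number. The function name parse_quantity and the upper limit of 999 are assumptions chosen for the example.

```python
import re


def parse_quantity(value, maximum=999):
    """Return the submitted text as an integer, or raise ValueError."""
    text = value.strip()
    # Accept only one or more ASCII digits; anything else is rejected.
    if not re.fullmatch(r"[0-9]+", text):
        raise ValueError("expected a whole number, got %r" % value)
    number = int(text)
    if number > maximum:
        raise ValueError("number must be at most %d" % maximum)
    return number
```

The program would catch the ValueError and return an HTML page that explains the problem and asks the user to try again.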

Note
If you leave a text field empty, the name of the field will be submitted with an empty value field. For example, if you have the following field and the user leaves this field blank, when the form gets submitted, the value "street=" will be submitted: <input type=text name="street">

type=submit

Syntax:

<INPUT TYPE=SUBMIT [NAME=" . . . " ] [VALUE=" . . . " ]>

Almost every form requires a submit button in order to actually submit the information. There are two exceptions. If there is only one type=text field in your form, no separate submit button is required for certain browsers, such as Netscape. The user submits the data by pressing Enter in the field. If you are using a clickable imagemap (type=image), which is discussed later in this chapter, then no submit button is required.

You can control the label of the submit button by using the VALUE attribute. To have multiple submit buttons in one form, give each button the same NAME attribute and a different VALUE attribute.

For example, you can have a guestbook with two submit buttons: one to submit a private message and one to submit a public one. In this case, the segment of the form with the submit buttons looks like this:

You may either submit a <input type=submit name="message" value="private">
message which will be mailed to the administrator of this site or a
   <input type=submit name="message" value="public"> one which will be
posted to the guestbook for the general public to browse.

If you press the button labeled private, your CGI program will receive the name/value pair message=private. If you press the other button instead, then the pair message=public will be sent.
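
On the CGI side, the program only has to look at the value submitted under the name message. Here is a minimal Python sketch; the function name and the two return values are hypothetical placeholders for whatever the guestbook program actually does.

```python
from urllib.parse import parse_qs


def message_destination(form_data):
    """Decide where a guestbook message goes based on the button pressed."""
    fields = parse_qs(form_data)
    choice = fields.get("message", [""])[0]
    if choice == "private":
        return "mail"        # mail the message to the site administrator
    if choice == "public":
        return "guestbook"   # post the message to the public guestbook
    raise ValueError("no recognized submit button in the form data")
```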

type=reset

Syntax:

<INPUT TYPE=RESET [VALUE=" . . . " ]>

The reset type provides a button which, when pressed, resets every field in the form to its default value. You can change the label of the button by specifying a VALUE.

type=password

Syntax:

<INPUT TYPE=PASSWORD NAME=" . . . " [VALUE=" . . . " ]   [MAXLENGTH=" . . . " ] [SIZE=" . . . " ]>

The password type is exactly equivalent to the text type with one minor difference: when the user enters text, some character (usually an asterisk) appears instead of the actual text. This is to protect sensitive information from eyes peeking over your shoulder.

type=checkbox

Syntax:

<INPUT TYPE=CHECKBOX NAME=" . . . " VALUE=" . . . " [CHECKED]>

I now move on to non-text input fields. The checkbox type enables you to create a box that can be either on (checked) or off. If the box is checked, the associated name/value pair will be submitted; otherwise, the pair is never submitted.

Caution
Unlike the text type, a name/value pair may or may not be submitted with the checkbox type. You need to take this fact into consideration when you are writing your CGI program. If you want to see whether a box has been checked, check whether a value was submitted under the checkbox's name (the value of its NAME attribute). If no value was submitted, the box was not checked.

You can specify whether the box is checked or not by default using the CHECKED attribute.
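
The difference between an empty text field and an unchecked checkbox can be sketched in Python as follows. The field names street and newsletter, and the function name, are made up for the example.

```python
from urllib.parse import parse_qs


def describe_fields(form_data):
    """Report the street text and whether the newsletter box was checked."""
    # keep_blank_values=True preserves text fields submitted as "street=".
    fields = parse_qs(form_data, keep_blank_values=True)
    return {
        # An empty text field is still submitted, with an empty value.
        "street": fields.get("street", [""])[0],
        # An unchecked checkbox is simply absent from the data.
        "newsletter_checked": "newsletter" in fields,
    }


print(describe_fields("street=&newsletter=yes"))
# {'street': '', 'newsletter_checked': True}

print(describe_fields("street=12+Main+St"))
# {'street': '12 Main St', 'newsletter_checked': False}
```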

type=radio

Syntax:

<INPUT TYPE=RADIO NAME=" . . . " VALUE=" . . . " [CHECKED]>

The radio button is similar to the check box. If you specify multiple radio buttons with the same name, the user can check only one radio button (whereas you can check multiple check boxes with the same name).

If you don't specify a default CHECKED button, the user can submit a form without any of the radio buttons selected. In this case, the radio buttons work the same way as the check box; no name/value pair is submitted. However, once you select a radio button, one and only one radio button must always be selected. In other words, you cannot "uncheck" a radio button once it has been selected except by selecting another radio button with the same name.

type=image

Syntax:

<INPUT TYPE=IMAGE NAME=" . . . " SRC=" . . . " [ALIGN=" . . . " ]>

The image type enables you to select specific pixels on an image. This is a way to offer imagemaps using forms. The SRC and ALIGN attributes are equivalent to the SRC and ALIGN attributes of the <img> tag.

Selecting a pixel on the image is equivalent to pressing a submit button. After you click on a pixel with your mouse, the coordinates of that pixel are returned in name.x and name.y, where name is the value of the NAME attribute.
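
For an image input with the hypothetical name map, the CGI program could read the coordinates along these lines (a Python sketch; the function name is an assumption):

```python
from urllib.parse import parse_qs


def click_position(form_data, name):
    """Return the (x, y) pixel clicked on a type=image input called name."""
    fields = parse_qs(form_data)
    x = int(fields[name + ".x"][0])
    y = int(fields[name + ".y"][0])
    return (x, y)


print(click_position("map.x=120&map.y=45", "map"))
# (120, 45)
```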

type=hidden

Syntax:

<INPUT TYPE=HIDDEN NAME=" . . . " VALUE=" . . . " >

You can define variables within your form using the hidden field. The syntax is straightforward. As with the other input types, NAME and VALUE specify the name/value pair. The hidden field is useful for storing state information within a form.

The <SELECT> Tag

Format:

<SELECT NAME=" . . . " [SIZE=" . . . " ] [MULTIPLE]>

Format:

<OPTION [VALUE=" . . . " ] [SELECTED]>

The <select> tag enables you to create a menu of items. You can allow the user to select multiple items by using the attribute MULTIPLE. You can limit the displayed size of the list by using the SIZE attribute; if the list is longer than the value of SIZE, the browser will usually include scroll bars. A <select> list of size 1 is generally displayed as a pull-down menu. If a SIZE attribute isn't specified, a size of 1 is assumed.

The syntax of the <select> tag is similar to those of lists (<ul>, <ol>, <dl>). Each list item is specified by the <option> tag, which is a stand-alone tag, and the set of <option> tags are surrounded by the <select> tag. If VALUE isn't specified within the <option> tag, the value is the text that follows the option tag. You can use the SELECTED attribute to select an option by default.

With lists of size 1 (pull-down menus), one item is always selected by default. If you don't designate an option as selected, the first option is selected by default. However, with lists of size greater than 1, it is possible to have a <select> list with no options selected. In this case, just as with check boxes and radio buttons, no name/value pair is submitted.

The <TEXTAREA> Tag

Format:

<TEXTAREA NAME=" . . . " [ROWS=" . . . " ] [COLS=" . . . " ]>
Default text goes here
</TEXTAREA>

The <textarea> tag enables the user to enter multiple lines of text. You can specify the size of the text field by using the ROWS and COLS attributes. You can specify default text by surrounding the default text with the <textarea> tags.

Some Examples

The best way to demonstrate how you can use the various tags is by example. I have included several fictional but realistic scenarios along with both the listing and the displayed HTML form. Don't worry about how information is passed to and handled by the CGI program just yet; that is discussed in detail in Chapter 5.

Comments Form

Scenario: Most sites like to have a form for users to submit suggestions, questions, and other comments. It is possible to use the "mailto" URL for comments, but a form gives you more control over what sort of information the user provides. Plus, it looks better.

Our back-end script is called comments.cgi. The form we use is in Listing 3.11. The output can be seen in Figure 3.4.

Figure 3.4 : A simple comments form.


Listing 3.11. The comments.html example.
<html> <head>
<title>Comments Form</title>
</head>

<body>

<h1>Send us your comments</h1>

<form action="comments.cgi" method=POST>

<p>Full Name: <input name="name">

<p>Email Address: <input type=text name="email" size=50>

<p>Comments:

<textarea name="comments" rows=15 cols=70>
</textarea>

<input type=submit value="Submit comments">
<input type=reset value="Clear form">

</form>

</body> </html>

In this example, I used four different form elements: a text field, a text area, a submit button, and a reset button. Note that I excluded the TYPE attribute from the first <input> tag. By default, this tag is considered a text field. If I wanted to include some default text in the comments box, I could have inserted it between the <textarea> tags.
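As a brief preview of what the back-end script might see, here is a hedged Python sketch that decodes one possible submission from this form. The sample body is invented for illustration. It assumes the browser encodes the @ as %40 and the line break inside the text area as %0D%0A, which is how browsers normally encode form data. Python is my choice here and is not the language comments.cgi is written in.

```python
from urllib.parse import parse_qs

# Hypothetical body as comments.cgi might receive it.
# %40 is an encoded "@" and %0D%0A is an encoded line break from the text area.
body = ("name=Joe+Schmoe&email=jschmoe%40yourmachine.org"
        "&comments=Nice+site.%0D%0AKeep+it+up%21")

# keep_blank_values=True preserves fields the user left empty
fields = parse_qs(body, keep_blank_values=True)

name = fields["name"][0]
email = fields["email"][0]
comment_lines = fields["comments"][0].splitlines()

print(name)
print(email)
print(comment_lines)
```

Splitting the comment into lines is only one option; a script could just as well store the decoded text unchanged.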

Ordering Food

Consider this scenario. Greg owns an Italian restaurant, and he wants to set up a form so people can order pizza over the Web. Customers must be able to select their order from the menu and fill in the delivery information, including their credit card number. Assume that the CGI script used to process the information is called order.cgi. The solution is in Listing 3.12 and the output is shown in Figure 3.5.

Figure 3.5 : Pizza menu and order form.


Listing 3.12. The restaurant.html example.
<html> <head>
<title>Corbett's Zesty Italian Food</title>
</head>

<body>

<h1>Corbett's Zesty Italian Food</h1>

<form action="order.cgi" method=POST>

<h2>Cheese Pizzas</h2>

<p>How many pizzas? <input name="numpizzas" value="0" size=3 maxlength=3>

<p>
<input type=radio name="size" value="large" checked>Large<br>
<input type=radio name="size" value="medium">Medium<br>
<input type=radio name="size" value="small">Small

<h3>Extra Toppings</h3>

<p>
<input type=checkbox name="topping" value="pepperoni">Pepperoni<br>
<input type=checkbox name="topping" value="sausage">Sausage<br>
<input type=checkbox name="topping" value="mushroom">Mushroom<br>
<input type=checkbox name="topping" value="peppers">Peppers<br>
<input type=checkbox name="topping" value="onion">Onion<br>
<input type=checkbox name="topping" value="olives">Olives<br>

<p>Name: <input type=text name="name">
<p>Phone number: <input type=text name="phone">
<p>Address:
<textarea name="address" rows=6 cols=50>
</textarea>

<p>Credit card number: <input type=password name="creditcard" size=20></p>

<input type=submit value="Submit order">

</form>

</body> </html>

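Notice that for the credit card field, I use a password input type. This way, when you enter your credit card number, you see asterisks rather than the digits you type. Keep in mind that the password type only hides the value on the screen; the browser still sends it to order.cgi as ordinary, unencrypted form data.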
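Because all six check boxes in Listing 3.12 share the name topping, each checked box contributes its own topping=value pair. The sketch below, in Python for illustration only, shows one way a script could collect them. The function parse_order and the sample string are hypothetical, and the fallback values are my own assumptions, not the behavior of the actual order.cgi.

```python
from urllib.parse import parse_qs

def parse_order(body):
    """Pull the pizza count, size, and toppings out of an encoded order."""
    fields = parse_qs(body)

    # The text field accepts any characters, so validate before trusting it
    raw_count = fields.get("numpizzas", ["0"])[0]
    try:
        count = max(int(raw_count), 0)
    except ValueError:
        count = 0

    # Assumed fallback: "large" is the radio button marked CHECKED in the form
    size = fields.get("size", ["large"])[0]

    # One list entry per checked box; an empty list if none were checked
    toppings = fields.get("topping", [])
    return count, size, toppings

sample = "numpizzas=2&size=medium&topping=pepperoni&topping=olives&name=Joe"
print(parse_order(sample))
```

Treating a non-numeric count as zero is a simplification; a real script would more likely send the form back with an error message.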

Voting Booth/Poll

Here is another scenario. The city of Cambridge wants to hold mayoral elections over the Web. Your job is to design a form that lists the candidates and enables the voter to choose one. The code is shown in Listing 3.13 and the contents of the browser can be seen in Figure 3.6.

Figure 3.6 : An online ballot for a city's mayoral election.


Listing 3.13. The ballot.html example.
<html> <head>
<title>Cambridge Mayoral Elections</title>
</head>

<body>

<h1>Cambridge Mayoral Elections</h1>

<p>You can either vote for a candidate or view a candidate's position
paper.

<form action="vote.cgi" method=POST>

<select name="candidate">
<option>Thomas Kuhn
<option>Imre Lakatos
<option>Paul Feyerabend
</select>

<p>You may <input type=submit name="action" value="vote"> for the above
candidate, or you may <input type=submit name="action" value="view">
his position paper.

</form>

</body> </html>

Here, I use two kinds of elements: a <select> list and a pair of submit buttons. If the citizen selects Thomas Kuhn and presses the vote button, vote.cgi will receive the following:

candidate=Thomas+Kuhn&action=vote

whereas if the voter selects the view button, vote.cgi will receive the following:

candidate=Thomas+Kuhn&action=view

vote.cgi determines which submit button was pressed by looking at the "action" name/value pair and acts accordingly.
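That dispatch on the "action" pair can be sketched in a few lines. The following Python version is an illustration under my own assumptions: the function handle_ballot and the messages it returns are hypothetical stand-ins for whatever vote.cgi really does.

```python
from urllib.parse import parse_qs

def handle_ballot(body):
    """Decide what to do based on which submit button was pressed."""
    fields = parse_qs(body)
    candidate = fields.get("candidate", [""])[0]
    action = fields.get("action", [""])[0]

    if action == "vote":
        return f"Recorded a vote for {candidate}"
    if action == "view":
        return f"Showing the position paper of {candidate}"
    # Neither button's value arrived, so refuse to guess
    return "Unknown action"

print(handle_ballot("candidate=Thomas+Kuhn&action=vote"))
print(handle_ballot("candidate=Thomas+Kuhn&action=view"))
```

Only the pressed button's name/value pair is submitted, which is why a single comparison on action is enough.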

Shopping Cart

Here is a different scenario. A common Web application is the online shopping cart, a CGI program that lets you add various items to your "shopping basket" until you are satisfied and then lets you buy all your selected items at once. This is a fairly complicated application, one that is discussed in great detail in Chapter 13, "Multipart Forms and Maintaining State." One way to maintain state is to use a hidden field, as demonstrated here in Listing 3.14. A sample can be seen in Figure 3.7.

Figure 3.7 : A sample page from a shopping cart application.


Listing 3.14. The car.html example.
<html> <head>
<title>A New Car</title>
</head>

<body>

<h1>Item: A New Car</h1>

<h2>Specifications</h2>

<ul>
  <li>Goes really fast!
  <li>AM/FM radio, power steering, and windshield wipers all standard!
</ul>

<form action="shop.cgi" method=POST>
<input type=hidden name="item" value="newcar">

<input type=hidden name="basket" value="soap">
<input type=hidden name="basket" value="cd">
<input type=hidden name="basket" value="watch">

<hr>
<p><input type=submit name="action" value="Add"> this item to your basket.
<p><input type=submit name="action" value="View"> your shopping basket.
<hr>

</form>

</body> </html>

Here, I use hidden fields for two purposes: to identify the current item, and to keep track of all of the items currently in the user's shopping basket. Once again, I use multiple submit buttons to provide several different modes of action for the user.
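To make the idea concrete, here is a minimal Python sketch of how a script could rebuild the basket from the repeated hidden basket fields and add the current item when the Add button is pressed. The function updated_basket and the sample strings are hypothetical; the real shop.cgi is covered in Chapter 13 and may work quite differently.

```python
from urllib.parse import parse_qs

def updated_basket(body):
    """Return the basket contents after applying the requested action."""
    fields = parse_qs(body)

    # Repeated hidden fields named "basket" arrive as one list of values
    basket = fields.get("basket", [])
    item = fields.get("item", [""])[0]
    action = fields.get("action", [""])[0]

    if action == "Add" and item:
        basket = basket + [item]
    return basket

add_body = "item=newcar&basket=soap&basket=cd&basket=watch&action=Add"
view_body = "item=newcar&basket=soap&basket=cd&basket=watch&action=View"

print(updated_basket(add_body))
print(updated_basket(view_body))
```

The script would then write the updated list back into hidden fields on the next page, which is how the state travels from one request to the next.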

Map

Consider this scenario. I want to find out from which parts of the United States people are accessing my page. I'll set up a registration page with a map so users can fill in their personal information and click the region of the United States where they live. The code is in Listing 3.15 and the example is shown in Figure 3.8.

Figure 3.8 : A registration page with a clickable map.


Listing 3.15. The region.html example.
<html> <head>
<title>Where are you from?</title>
</head>

<body>

<h1>Where are you from?</h1>

<p>I want to know where you're from. Please fill out the following
form and click on the region of the United States where you live.
Thanks!

<form action="registermap.cgi" method=POST>

<p>Name: <input name="name" size=30>
<p>E-mail: <input name="email" size=50>

<input type=image name="region" src="usa.gif">

</form>

</body> </html>

Because I'm using a clickable map, a submit button is not required. Clicking the map has the same effect as pressing a submit button. If you enter Joe Schmoe as the name and jschmoe@yourmachine.org as the e-mail address, and then click the pixel located at (10,23) on the picture of the United States, the following information is sent to registermap.cgi (the @ in the address arrives encoded as %40):

name=Joe+Schmoe&email=jschmoe%40yourmachine.org&region.x=10&region.y=23

By parsing this string, I can figure out exactly where you clicked the map.
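A short Python sketch of that parsing step follows. It is an illustration, not the code of registermap.cgi. The function click_position is hypothetical, and the sample body shows the @ in its encoded form, %40.

```python
from urllib.parse import parse_qs

def click_position(body, image_name="region"):
    """Return the (x, y) pixel that was clicked on an image input."""
    fields = parse_qs(body)
    # An image input named "region" is submitted as region.x and region.y
    x = int(fields[f"{image_name}.x"][0])
    y = int(fields[f"{image_name}.y"][0])
    return x, y

body = ("name=Joe+Schmoe&email=jschmoe%40yourmachine.org"
        "&region.x=10&region.y=23")

print(click_position(body))
```

Mapping the pixel to a region of the country is a separate step that depends on the layout of usa.gif, so it is left out here.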

Summary

HTML and CGI are closely related. The most common way for CGI programs to get input is from an HTML form. Additionally, CGI programs usually return HTML files, sometimes after first preparsing the file for special tags.

This chapter provides a detailed description of the HTML tags pertinent to CGI programming, especially the form tags. Remember that good HTML style is important. It makes your documents easier to read and helps prevent unpredictable data from being sent to your CGI programs. Now that you have a thorough understanding of the necessary HTML, you are ready to learn CGI input and output in more detail.