Chapter 12

Guestbook/Graffiti Wall


Some people prefer to surf the Web in anonymity. They like to move from site to site, page to page, reading all the information available. They peruse the bulletin boards, maybe even listen in on the chat forums, but never actively participate in a discussion. In Cyberspace, the term for such individuals is lurkers, as they tend to haunt the recesses of the Internet as shadows in the background.

Other folks dive into the middle of the fray, proudly and loudly, proclaiming their position to anyone who cares to listen and even some who don't. These are the kinds of surfers who like to leave a message for the rest of the Web community. Something that says, "I've been here, and, while I'm at it, this is what I think…." For such individuals, creating a place on your site where they can make their personal comments part of the culture of the Web is an added feature that will be appreciated.

Guestbooks

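The principle behind a guestbook is simple: visitors submit a few details about themselves through a form, a script appends each submission to a database on the server, and a second script formats the accumulated entries for display.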

The Guestbook Database

Guestbooks are databases of information, much like the billboard database discussed in chapter 8, "Advertising with Billboards." As a refresher, the billboard database is a simple flat file on the server with one advertiser entry, called a record, per line. Each record has two fields: the URL of the graphic to display and the URL of the advertiser's site. Fields are separated by a delimiter, which is a printable character that doesn't appear in the data of any field, as in

eff_gry_lg.gif|http://www.eff.org/
cigjs.gif|http://www.visi.com/~sjwalter/javascript/

This same system can be used for a guestbook; all that's needed is to redefine the fields. For example, a simple guestbook could contain the following information for each entry: the visitor's name, e-mail address, and home page URL.

Using the same delimiter as before, this produces records such as the following:

Scott J. Walter|sjwalter@visi.com|http://www.visi.com/~sjwalter/

To parse the database into memory for processing into HTML, a Perl loop similar to the one shown in listing 12.1 can be used.


Listing 12.1  Parsing the Guestbook
open (GUESTBOOK, 'guestbook.dat') || die "Can't open guestbook.dat: $!";

while (<GUESTBOOK>) {
   chomp; # remove the trailing newline, if there is one

   # split the record into its three fields
   ($name, $email, $url) = split (/\|/);

   push (@Names  , $name);
   push (@Emails , $email);
   push (@URLs   , $url);
}

close (GUESTBOOK);
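
The parallel arrays in listing 12.1 are easy to follow, but keeping one record per guest can be simpler to pass around. The following is a minimal sketch rather than part of the original listings; the parse_guestbook name and the hash keys are assumptions chosen for illustration. It limits split to three fields so that a stray bar in the last field is kept rather than silently dropped.

```perl
use strict;
use warnings;

# Hypothetical helper: turn guestbook lines into a list of hash references.
sub parse_guestbook {
    my @lines = @_;
    my @guests;
    foreach my $line (@lines) {
        chomp $line;
        next if $line =~ /^\s*$/;              # skip blank lines
        # The limit of 3 keeps any extra bars inside the URL field.
        my ($name, $email, $url) = split /\|/, $line, 3;
        push @guests, { name => $name, email => $email, url => $url };
    }
    return @guests;
}

my @guests = parse_guestbook(
    "Scott J. Walter|sjwalter\@visi.com|http://www.visi.com/~sjwalter/\n",
);
print "$guests[0]{name} <$guests[0]{email}>\n";
```

In a real script the lines would come from the GUESTBOOK filehandle instead of a literal string.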

Displaying the Guestbook

With the guestbook entries loaded into memory, you are ready to generate the HTML document to display them. To add a little flair to the resulting document, use the tokenized HTML trick from chapter 8, and define an HTML comment line as the placeholder for the guestbook entries within the document. However, this requires that you wrap the HTML file with a call to the CGI script by specifying the Perl script as your link:

Look at the <A HREF="/cgi-bin/guestbook.pl">GuestBook</A>

A different approach is to use server-side includes, and call the guestbook script from within the HTML file. Listing 12.2 demonstrates this technique.

TIP
Server-side includes were first introduced in chapter 1, "Browser Identification."


Listing 12.2  guestbook.html: Guestbook Display File
<HTML>
<HEAD>
   <TITLE>Guestbook</TITLE>
</HEAD>
<BODY>
<CENTER><H1>Guestbook</H1></CENTER>

<HR>

Curious about who else has stopped by?  Here's a list of other
surfers who've visited these pages.

<HR>

<!-- the following line calls the guestbook script -->
<!--#exec cgi="/cgi-bin/guestbook.pl" -->

<HR>
Care to leave your mark here as well?  If so, please
<A HREF="signin.html">sign the guestbook</A> yourself!

<A HREF="/index.html">Back to My Home Page</A>
</BODY>
</HTML>

CAUTION
If you're not running your own Web server, it's a good idea to check with the site administrator to make sure that the server is configured to allow server-side includes. For security reasons, some sites don't allow them.

By using server-side includes, it's not necessary to parse the HTML file looking for a token. The file has already been opened by the server, which is waiting for your script to supply the text to insert in place of the #exec directive. The only looping necessary is that which steps through the lists in memory and constructs the lines of output. Because the resulting output is piped back to the Web server for processing, you can freely use HTML tags to format the final list. Listing 12.3 takes advantage of this to format the guestbook as a table, one guest per row. Each guest entry has two elements:

  1. The guest's name, formatted as a link to his or her corresponding e-mail address.
  2. The URL of the guest's home page, formatted as a link to his or her corresponding home page.

Listing 12.3  HTML Formatting from Perl
...
print "Content-type: text/html\n\n";
...
print "<CENTER><TABLE>";

$numNames = @Names;
for ($i=0; $i<$numNames; $i++) {
   print "<TR>";
   print "<TD><A HREF=\"mailto:$Emails[$i]\">$Names[$i]</A></TD>";
   print "<TD><A HREF=\"$URLs[$i]\">$URLs[$i]</A></TD>";
   print "</TR>";
}

print "</TABLE></CENTER>";
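
Listing 12.3 prints the stored fields straight into the page, so a character such as < or & inside an entry is interpreted as markup. One way to guard against that is to escape each field before printing it. The sketch below assumes two hypothetical helpers, escape_html and guest_row, which are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: make a string safe to print inside HTML text
# or a double-quoted attribute value.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must come first, or later entities get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Build one table row in the layout used by listing 12.3.
sub guest_row {
    my ($name, $email, $url) = map { escape_html($_) } @_;
    return "<TR>"
         . "<TD><A HREF=\"mailto:$email\">$name</A></TD>"
         . "<TD><A HREF=\"$url\">$url</A></TD>"
         . "</TR>";
}

print guest_row('Scott J. Walter', 'sjwalter@visi.com',
                'http://www.visi.com/~sjwalter/'), "\n";
```

Escaping at display time leaves the database untouched, so the same records can still be reused in other contexts.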

NOTE
Even though you're not creating an entire HTML document when using server-side includes, the first thing your script must print back to the server is always the Content-type: line. This informs the server how to treat what follows: in this case, HTML text.
Additionally, print the Content-type: line as early as possible in your script. Some servers limit how long they wait for output from a script. If your script takes a long time, as when loading a large guestbook database, the connection could time out before the script finishes. Sending the Content-type: line promptly reduces the risk of such a time-out.

Please Sign In…

After retrieving, formatting, and displaying the guestbook, the only thing left is finding a way for visitors to add their information to the database. Within the context of the Web, getting information from the user is handled through HTML forms. Expanding on the previous code listings in this chapter, listing 12.4 is an example of a simple form.


Listing 12.4  signin.html: Entering a Visitor's Information
<HTML>
<HEAD>
   <TITLE>Please Sign In...</TITLE>
</HEAD>

<BODY>
<CENTER><H1>Please Sign In...</H1></CENTER>

<HR>

Please fill in the blanks below, then click 
<B>Sign In!</B> to add your name to the guestbook.

<FORM METHOD=POST ACTION="/cgi-bin/signin.pl">
   Name: <INPUT TYPE=TEXT NAME=REALNAME SIZE=30><BR>
   E-Mail: <INPUT TYPE=TEXT NAME=EMAIL SIZE=40><BR>
   URL: <INPUT TYPE=TEXT NAME=URL SIZE=50><BR>

   <INPUT TYPE=SUBMIT VALUE="Sign In!"> * 
   <INPUT TYPE=RESET VALUE="Clear Form">
</FORM>

<HR>

<A HREF="/index.html">Back to the Home Page</A>

</BODY>
</HTML>

The form for the guestbook is the easy part. The script that takes the information from the form and adds it to the database does the most work.

Remember that when a form is submitted with the POST method, all fields and their values pass to the CGI program on standard input as a single string; the CONTENT_LENGTH environment variable holds the length of that string in bytes. Individual fields are separated within this string by ampersands (&), as in:

REALNAME=Scott+J.+Walter&EMAIL=sjwalter@visi.com&URL=...

Note also that any embedded spaces within a field convert to plus signs (+). To retrieve the data, read CONTENT_LENGTH bytes from standard input with a Perl fragment similar to that in listing 12.5.


Listing 12.5  Retrieving Form Data
...
read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});

# Split the name-value pairs
@pairs = split(/&/, $buffer);

foreach $pair (@pairs) {
   ($name, $value) = split(/=/, $pair);

   # Un-Webify plus signs and %-encoding
   $value =~ tr/+/ /;
   $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;

   # strip HTML comment tags
   $value =~ s/<!--(.|\n)*-->//g;

   if ($allow_html != 1) {
      $value =~ s/<([^>]|\n)*>//g;
   }

   $FORM{$name} = $value;
}

NOTE
The section of the foreach loop that "un-Webifies" the $value string is of special interest. When data transmits from a form back to the server, several character conversions occur. Embedded spaces convert to plus signs (+), and other special characters such as the tilde (~) translate to %-codes. This is a two-character hexadecimal representation of the character's ASCII value, prefaced by a percent sign. This permits data to traverse the Web and still maintain all the original information. Once received by the script, however, it's necessary to convert back to the original form.
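
The conversion described in this note can be tried in isolation. The following is a minimal sketch; url_decode is a hypothetical helper name, and the sample strings are illustrative. Plus signs are translated first, so that an encoded plus sign (%2B) still decodes to a literal plus.

```perl
use strict;
use warnings;

# Hypothetical helper: reverse the form encoding of a single value.
sub url_decode {
    my ($value) = @_;
    $value =~ tr/+/ /;                               # plus signs back to spaces
    $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;   # %7E back to "~", and so on
    return $value;
}

print url_decode('Scott+J.+Walter'), "\n";
print url_decode('http%3A%2F%2Fwww.visi.com%2F%7Esjwalter%2F'), "\n";
```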

Script Hacking
In listing 12.5, one line within the foreach loop strips HTML comments from the $value variable. This plugs a security hole that arises when Perl scripts are combined with server-side includes.
Remember that, from within HTML, a server-side include is activated by a line something like:
<!--#exec ...-->
Normally, the Web server ignores the content of comment tags. However, when the #exec directive is encountered, the comment is treated as a request to perform a server-side function, such as running a particular piece of Perl code. This creates a potential break in the security of a site, because a creative individual can embed an #exec directive within a form's field for processing by the server.
The signin.pl script writes the content of the form fields out to the guestbook file. This content is later read back into the system by guestbook.pl and passed back to the server as HTML code. At this point, it is formatted by the server, including processing any embedded directives. An embedded directive, such as another #exec, can do just about anything: retrieve a password file, add a new password to the system, or delete an entire directory tree.
To prevent this kind of hack, listing 12.5 includes the following line:
$value =~ s/<!--(.|\n)*-->//g;
This strips all HTML comment tags from the input field before the field is written to the database.
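
A single substitution pass can miss two cases: a comment that is opened but never closed, and a comment that only appears once another one has been removed from the middle of it. The sketch below is one possible stricter variant, not the method used by the original script; strip_comments is a hypothetical helper name. It repeats the removal until nothing is left to remove, then discards anything that follows an unterminated opening tag.

```perl
use strict;
use warnings;

# Hypothetical helper: remove HTML comments, including server-side
# include directives, from a submitted value.
sub strip_comments {
    my ($value) = @_;
    # Remove complete comments (shortest match, across lines) and repeat,
    # because deleting one comment can splice a new one together.
    1 while $value =~ s/<!--.*?-->//s;
    # Drop an opening <!-- that is never closed, along with what follows it.
    $value =~ s/<!--.*\z//s;
    return $value;
}

print strip_comments('Hello <!--#exec cmd="ls" -->world'), "\n";
```

Discarding the tail after an unclosed comment is a deliberately blunt choice: it loses some of the visitor's text, but it leaves nothing that could pair up with a later closing tag in the page.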

Once the form data has been retrieved and decoded, the final step is to add the new information to the guestbook database. This is done by appending the new information to the end of the existing file:

open (GUESTBOOK, ">>guestbook.dat") || die "Can't append to guestbook.dat: $!";
print GUESTBOOK "$FORM{'REALNAME'}|$FORM{'EMAIL'}|$FORM{'URL'}\n";
close(GUESTBOOK);
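
Two practical problems can arise with a plain append: a field that contains the delimiter or a line break corrupts the record, and two visitors signing in at the same moment can interleave their output. The following sketch shows one way to handle both, using Perl's standard Fcntl module for the lock constant. The clean_field and append_entry names are hypothetical and are not part of the original signin.pl.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: make a value safe to store as one field of a record.
sub clean_field {
    my ($value) = @_;
    $value = '' unless defined $value;
    $value =~ s/[\r\n]+/ /g;   # a record must stay on one line
    $value =~ s/\|//g;         # the delimiter must not appear inside a field
    return $value;
}

# Hypothetical helper: append one entry while holding an exclusive lock,
# so two visitors signing in at once cannot interleave their records.
sub append_entry {
    my ($file, @fields) = @_;
    open(my $fh, '>>', $file) or die "Cannot open $file: $!";
    flock($fh, LOCK_EX)       or die "Cannot lock $file: $!";
    print {$fh} join('|', map { clean_field($_) } @fields), "\n";
    close($fh)                or die "Cannot close $file: $!";
}

# Example call; the field values would normally come from %FORM.
# append_entry('guestbook.dat', $FORM{'REALNAME'}, $FORM{'EMAIL'}, $FORM{'URL'});
```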

For a more advanced guestbook, load the entire database into memory, then step through it to look for duplicate information. You can either ignore the new entry or replace the old.
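
As a rough illustration of the duplicate check, the sketch below treats the e-mail address as the key and replaces the older record when a match is found. The merge_entry name, the choice of key, and the array-reference record layout are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: add or replace an entry, keyed on the e-mail address.
# Each record is an array reference: [name, email, url].
sub merge_entry {
    my ($entries, $new) = @_;
    foreach my $entry (@$entries) {
        if (lc($entry->[1]) eq lc($new->[1])) {
            @$entry = @$new;          # replace the old record in place
            return 0;                 # nothing was added
        }
    }
    push @$entries, $new;
    return 1;                         # a new record was added
}

my @entries = ( ['Scott J. Walter', 'sjwalter@visi.com',
                 'http://www.visi.com/~sjwalter/'] );
merge_entry(\@entries, ['Scott Walter', 'SJWalter@visi.com',
                        'http://www.visi.com/~sjwalter/']);
print scalar(@entries), " entry on file\n";
```

To ignore the new entry instead of replacing the old one, the assignment inside the loop can simply be removed. After merging, the whole list has to be written back to the file rather than appended.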

A Graffiti Wall

You can extend the guestbook system to allow a visitor to scribble a comment for the enjoyment of others on the Web. All that's needed is to:

  1. Add a <TEXTAREA> field to the entry form to accept a multiple-line comment.
  2. Expand signin.pl to write the additional field out to the database.
  3. Derive a graffiti.pl script that works just like the guestbook script, with the exception that it also displays the comment field.
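
A multiple-line comment does not fit the one-record-per-line format as it stands, so the steps above need some way to fold the text onto a single line. The following sketch shows one possible encoding; encode_comment is a hypothetical helper, and the choice to store line breaks as <BR> tags is an assumption made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: fold a multi-line comment into one line of the record.
sub encode_comment {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;     # normalize browser line endings
    $text =~ s/\|/&#124;/g;    # keep the field delimiter out of the data
    $text =~ s/\n/<BR>/g;      # line breaks become HTML breaks for display
    return $text;
}

my $record = join('|', 'Scott J. Walter', 'sjwalter@visi.com',
                  'http://www.visi.com/~sjwalter/',
                  encode_comment("Nice site!\r\nI'll be back."));
print "$record\n";
```

Because the bar is stored as a character entity, the browser still shows it as a bar when graffiti.pl prints the comment.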

As an added feature, you could permit a visitor to embed HTML statements within the comment field and allow them to design a custom bit of graffiti complete with varying color, text style, horizontal rules, and fonts. If you choose to do this, it's vital that you preprocess the comment field to look for any potential embedded commands, such as server-side include calls. Otherwise, you're giving visitors more control over your site than you probably want them to have.

CAUTION
One final word of caution on graffiti walls: unless you preview every submission to the wall, you may end up with slogans or sayings displayed on your Web site that don't give the best impression of you. If you're a little leery of letting visitors say whatever they feel about anything, you may want to screen each new submission to your wall.
To screen new graffiti, you can redirect submissions to your mailbox, or a mailbox on your system, instead of the graffiti page. From there, if the graffiti is acceptable, it can be added to the wall.

A JavaScript Guestbook?

On the Internet now and then, someone asks whether it's possible to create a guestbook with JavaScript. The questioner is usually looking for a way to implement a guestbook, but doesn't have CGI-level access on his or her server.

Can it be done?

Remember that JavaScript can't read (open) other files on the server, nor can it write out to files on the server. This means that any guestbook database must be contained totally within the loaded HTML document. It also means that any data entering through a form cannot be saved by JavaScript for others to view.

You can use JavaScript for the following:

Validate the data within the fields of the sign-in form and prevent the CGI script from running if any necessary information is missing.
Check to be certain that the character your CGI script uses as a field delimiter isn't found in any field.
Pre-scan input fields for HTML directives and possibly warn the user that you don't permit it.
Customize the display of the guestbook data to utilize the best features of the user's browser.

You could also use a server-side include script that writes JavaScript code back out to the HTML document stream. Writing out the code to load a JavaScript array with the guestbook database is an example. From there, JavaScript processes the database internally.
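
A server-side script of this kind might look something like the sketch below, which builds a SCRIPT block that fills a JavaScript array with the guestbook records. The js_string and js_guest_array helpers and the guests array name are hypothetical. The escaping step matters: without it, a quote or a closing SCRIPT tag inside an entry would break the generated code.

```perl
use strict;
use warnings;

# Hypothetical helper: quote a Perl string as a JavaScript string literal.
sub js_string {
    my ($text) = @_;
    $text =~ s/\\/\\\\/g;        # backslashes first
    $text =~ s/"/\\"/g;          # then double quotes
    $text =~ s/\r?\n/\\n/g;      # line breaks become \n escapes
    $text =~ s/</\\x3C/g;        # keeps "</SCRIPT>" from ending the block early
    return "\"$text\"";
}

# Hypothetical helper: emit a SCRIPT block that fills an array named "guests".
sub js_guest_array {
    my @records = @_;            # each record: [name, email, url]
    my $js = "<SCRIPT LANGUAGE=\"JavaScript\">\nvar guests = new Array();\n";
    my $i = 0;
    foreach my $rec (@records) {
        my $fields = join(', ', map { js_string($_) } @$rec);
        $js .= "guests[$i] = new Array($fields);\n";
        $i++;
    }
    $js .= "</SCRIPT>\n";
    return $js;
}

print js_guest_array(['Scott J. Walter', 'sjwalter@visi.com',
                      'http://www.visi.com/~sjwalter/']);
```

JavaScript in the page can then loop over guests and write out whatever layout suits the visitor's browser.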

Work is under way on the Web to permit client-side languages like JavaScript and VBScript to access other files on the server in a secured way. One of these techniques is called data tainting. Until the bulk of the Web community establishes and adopts a standard, JavaScript is best left to formatting existing information for display rather than editing or updating it.

From Here…

This chapter took the tricks from chapter 8, "Advertising with Billboards," and applied them to the concept of a guestbook. Displaying a database of visitors is, fundamentally, no different from displaying a list of advertisers. If you're interested in digging further into this and similar techniques, check out: