Chapter 27 Power Scripting Toolkit

CONTENTS

Perl Tools
JavaScript Tools
From Here…

Now that you've made it to the end of this book, you've probably started recognizing patterns in the script code. Many operations, such as reading from files and writing HTML, are repeated so often that you may want to build a library of commonly used functions.

Perl Tools

The Perl examples presented throughout the book focused on the underlying mechanisms for various operations, such as reading files and parsing databases. Coding these by hand every time would be tedious if not for the pre-existing collections of functions developed by a variety of Perl gurus around the Web. What follows is an overview of several of the more popular libraries and what they offer.

All libraries and source files covered here are available on the companion CD-ROM or from any Perl Web site on the Internet. If you want to make sure you've got the latest version of any package, three of the best sites to check out are:

Matt's Script Archive (http://www.worldwidemart.com/scripts/)
Selena Sol's Public Domain CGI Script Archive and Resource Library (http://www.eff.org/~erict/Scripts/)
The Perl Language Home Page (http://www.perl.com/perl/)

cgi-lib.pl

Probably one of the most commonly used Perl libraries today is the cgi-lib collection, written by Steven E. Brenner. cgi-lib is a collection of routines that make manipulation of HTML extremely simple. Table 27.1 lists the functions available in the library.

Table 27.1 cgi-lib.pl Functions

Function	Purpose
`ReadParse`	Reads in data (from standard input for `POST` requests, `QUERY_STRING` for `GET` requests), unescapes the text, and builds an associative array `%in` containing the key/value pairs.
`PrintHeader`	Prints the `Content-type:` line and the blank line that marks the end of the HTTP header.
`HtmlTop`	Returns the `<HEAD>` of a document and opens the `<BODY>` tag. You can specify the document title and an `<H1>` header as a parameter.
`HtmlBot`	Closes the document by returning the closing `</BODY>` and `</HTML>` tags.
`SplitParam`	Splits a multi-valued parameter into a list of the constituent parameters. Optionally, the list will be converted into an array.
`MethGet`	Returns `true` if this CGI script was executed through a `GET` request, `false` if otherwise.
`MethPost`	Returns `true` if this CGI script was executed through a `POST` request, `false` if otherwise.
`MyBaseUrl`	Returns the base URL to the script (no path or query information).
`MyFullUrl`	Returns the full URL to the script, including path and query information.
`MyURL`	Returns the same information as `MyBaseUrl`.
`CgiError`	Formats and prints out an error message with HTML markup for a prettier look.
`CgiDie`	Calls `CgiError` to print an error message, then quits the script.
`PrintVariables`	Prints out the variables acquired through `ReadParse`.
`PrintEnv`	Prints out all environment variables in HTML format.
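
To make the `ReadParse` row concrete, here is a minimal sketch of the kind of decoding the table describes: splitting a query string into key/value pairs and unescaping each one. It is written in JavaScript to match the listings later in this chapter and is not cgi-lib.pl's own code. The name `parseQuery` is hypothetical, and joining repeated keys with a NUL character is an assumption about how a multi-valued parameter might be stored so that something like `SplitParam` could split it later.

```javascript
// Hypothetical helper: decode "a=1&b=two+words" into an object.
function parseQuery(query) {
   var result = {};
   if (query === "") {
      return result;
   }
   var pairs = query.split("&");
   for (var i = 0; i < pairs.length; i++) {
      var eq     = pairs[i].indexOf("=");
      var rawKey = (eq == -1) ? pairs[i] : pairs[i].substring(0, eq);
      var rawVal = (eq == -1) ? "" : pairs[i].substring(eq + 1);
      // "+" encodes a space; %XX sequences are unescaped afterward.
      var key = decodeURIComponent(rawKey.replace(/\+/g, " "));
      var val = decodeURIComponent(rawVal.replace(/\+/g, " "));
      if (Object.prototype.hasOwnProperty.call(result, key)) {
         result[key] += "\0" + val;   // assumption: NUL-joined multi-values
      } else {
         result[key] = val;
      }
   }
   return result;
}

var form = parseQuery("name=Jane+Doe&color=red&color=blue&note=50%25+off");
```

Here `form.name` holds the decoded name, and `form.color.split("\0")` recovers the two color values.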

libwww-perl

Originally written by Roy Fielding, libwww-perl started as a backend for MOMspider, a robot designed to help maintain wide-area webs. Since then, contributions from individuals around the world have helped this library evolve and ported it into versions for both Perl 4 and Perl 5.

The libwww-perl package is actually a collection of several different libraries, together providing a consistent programming interface to the World Wide Web. Among the packages included are:

A simple tool for making GET requests from the command line.
A package for implementing the robot exclusion protocol.

NOTE

Because of the popularity of robots and search engines on the Web, a standard has been proposed to control how deep a search engine from one site will be allowed into another. This standard is called the robot exclusion protocol.

A package for manipulating dates as used on the Web.
A package for doing local file requests (from localhost) and making them look as though they came from a Web server.
A collection of routines for reading, manipulating, and creating HTML documents.
Library utilities for handling MIME files and spawning off the proper viewers (based on content-type).
Utilities for working with URLs: parsing, creating, manipulating, canonicalizing, and so on.

cookie.lib

Written by Matt Wright, cookie.lib is a collection of Perl routines for manipulating cookies. Table 27.2 lists the available functions.

Table 27.2 cookie.lib Functions

Function	Purpose
`GetCookies`	Retrieves all cookies for the associated document.
`PrepareCookie`	Makes a cookie ready for setting.
`SetCookieExpDate`	Sets a cookie's expiration date.
`SetCookiePath`	Sets a cookie's path.
`SetCookieDomain`	Sets a cookie's domain.
`SetSecureCookie`	Identifies a cookie as secure.
`SetCookies`	Writes out the updated cookie information.

Perl 5 Tools

Perl 5, the latest version of this popular language, has shifted its focus toward an object-oriented design. To that end, many of the libraries and function collections are being ported to the new structure, and are found as packages on the various Perl archives.

The following sections look at some of the more popular packages.

CGI.pm Lincoln D. Stein took the cgi-lib.pl concept and ported its functionality to the object-oriented environment of Perl 5. For those doing Perl 5 scripting, this library is a must. This library is unique in that it has its documentation built into the script. You can create formatted documentation by feeding the library through pod2man or pod2html, both of which come with Perl 5.

http_lib.pl Written by Gunther Birznieks, http_lib.pl is a collection of routines allowing your script to connect to another Web site as a browser, retrieve information, and return that data to the original user.

sendmail_lib.pl Also written by Gunther Birznieks, sendmail_lib.pl simplifies connecting to a mail server and sending e-mail from within Perl. The code is UNIX specific, and is designed to interface with the sendmail program.

smtpmail_lib.pl Another of Gunther Birznieks' packages, smtpmail_lib.pl provides another route to sending e-mail, this one by directly using TCP/IP sockets to communicate via SMTP (Simple Mail Transfer Protocol).

date.pl A nice little date-manipulation package that works with dates by converting them to and from Julian format.
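
One reason to convert dates to a day count is that date arithmetic becomes plain subtraction. The sketch below assumes "Julian format" means a Julian Day Number (a running count of days); date.pl's own routines and names are not shown here, and `toJulianDay` is a hypothetical name. It uses a well-known integer formula for Gregorian calendar dates, written in JavaScript to match the listings later in this chapter.

```javascript
// Hypothetical helper: Gregorian calendar date -> Julian Day Number.
// month is 1-12, day is 1-31.
function toJulianDay(year, month, day) {
   var a = Math.floor((14 - month) / 12);   // 1 for Jan/Feb, else 0
   var y = year + 4800 - a;
   var m = month + 12 * a - 3;              // March = 0 ... February = 11
   return day + Math.floor((153 * m + 2) / 5) + 365 * y
        + Math.floor(y / 4) - Math.floor(y / 100)
        + Math.floor(y / 400) - 32045;
}

// Days between two dates is a simple difference of day numbers.
var daysApart = toJulianDay(1997, 3, 1) - toJulianDay(1997, 2, 1);
```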

JavaScript Tools

Even though JavaScript tends to be highly customized to the type and content of a given HTML document, there is still a standard collection of things you'll always find yourself doing. Just as with the Perl libraries discussed previously, you can find all the JavaScript routines covered here on the companion CD-ROM as well as on one of several JavaScript central sites on the Web.

If you're looking for new and different implementations of JavaScript to add to your tool chest, check out the following sites:

The JavaScript Index (http://www.c2.org/~andreww/javascript/)
Cut-N-Paste JavaScript (http://www.infohiway.com/javascript/)
The Web Scripting Secret Weapons HomeSite (http://www.visi.com/~sjwalter/webscripting/)

Browser Identification

Because of the differences between Navigator and Explorer, it's often necessary to know which browser (or which browser version) is being used. Listing 27.1 presents a set of functions that provide that information.

Listing 27.1 Browser Identification

// All functions return true or false.
//
function isNetscape() {
   return (navigator.appName.indexOf("Netscape") != -1);
}

// parseInt() reads the leading major version number from
// appVersion, so a "2." or "3." appearing later in the
// string (in a platform name, for example) can't cause
// a false match.
function isNav2() {
   return isNetscape() &&
          (parseInt(navigator.appVersion) == 2);
}

function isNav3() {
   return isNetscape() &&
          (parseInt(navigator.appVersion) == 3);
}

function isExplorer() {
   return (navigator.appName.indexOf("Microsoft") != -1);
}

NOTE

The necessity for testing the version of Navigator comes from the additional features Navigator 3.0 supports in JavaScript 1.1 that aren't supported in JavaScript 1.0 (the Navigator 2.0 version).

Object Browsing

New additions to JavaScript tend to come out faster than the documentation explaining them. Sometimes, you have to do your own digging to figure out what a new object is made of. Listing 27.2 is an example of an object-dump function that shows all the properties of an object and their contents.

Listing 27.2 Browsing the Object Tree

function Browse(strObject, maxNest) {
   var nest = 1;

   if(Browse.arguments.length == 3) {
      nest = parseInt(Browse.arguments[2]);
   }

   if(nest > maxNest) {
      return "";
   }

   var theObject = eval(strObject);

   var tStr = "<UL>\n";

   for (var propName in theObject) {
      tStr += "<LI>" + propName;

      var tTmp  = eval(strObject + "." + propName);
      var isObj = false;

      if(tTmp != null) {
         isObj = (tTmp.toString().indexOf("[obj") != -1);
      }

      if(!isObj) {
         tStr += " = " + eval("theObject." + propName);
      } else {
         if(tTmp.toString().indexOf("Array") != -1) {
            tStr += "[]";
         }
      }

      tStr += "</LI>\n";

      if(isObj) {
         tStr += Browse(strObject + "." + propName,
                        maxNest, nest + 1);
      }
   }

   tStr += "</UL>\n";

   return tStr;
}

This is also an example of a recursive function in that, whenever a property is encountered that is also an object, Browse() is called again to further explore down the tree. Because recursion chews up more and more memory as it searches farther and farther down, it's necessary to define the maximum number of levels to search; otherwise, you'll generate an error when JavaScript runs out of memory.

Nonetheless, this is a nice function to go exploring through the various JavaScript objects. Once you find an object that you're curious about, you can always explore it further by making it the "root" of your search. An example that would explore the window object to 4 levels is:

document.write("<H2>window</H2>" + Browse("window", 4));
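
The depth cap described above can also be sketched without eval(). This is a hypothetical variant, not the author's listing: `dumpObject` takes the object itself rather than its name, and it assumes an engine that supports bracket property access and the typeof operator. It shows why the cap matters: an object that refers back to itself would otherwise recurse without end.

```javascript
// Hypothetical eval-free variant: takes the object itself, not its name.
function dumpObject(obj, maxNest, nest) {
   if (nest == null) {
      nest = 1;
   }
   if (nest > maxNest) {
      return "";            // depth cap reached: stop recursing
   }
   var tStr = "<UL>\n";
   for (var propName in obj) {
      var value = obj[propName];
      var isObj = (value != null && typeof value == "object");
      tStr += "<LI>" + propName;
      if (!isObj) {
         tStr += " = " + value;
      }
      tStr += "</LI>\n";
      if (isObj) {
         tStr += dumpObject(value, maxNest, nest + 1);
      }
   }
   tStr += "</UL>\n";
   return tStr;
}

var sample    = { name: "root", child: { leaf: 1, deeper: { hidden: true } } };
var twoLevels = dumpObject(sample, 2);   // lists "deeper" but not its contents

// A self-referencing object still terminates because of the cap.
var loop = {};
loop.self = loop;
var capped = dumpObject(loop, 3);
```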

NOTE

This function won't work in Internet Explorer, as it doesn't support the for...in statement.

Arrays

JavaScript 1.1 has its own Array object that handles much of the behind-the-scenes construction. However, if you want to be as compatible as possible with the current mix of browsers on the Web, you'll want to at least account for the existence of Navigator 2.02. Listing 27.3 shows the standard Array creation functions that work in all versions of JavaScript (both Navigator and Explorer).

Listing 27.3 Arrays

function _MakeArray(size) {
   this.length = size;

   for(var i=1; i<=size; i++) {
      this[i] = null;
   }

   return this;
}

function MakeArray(size) {
   return new _MakeArray(size);
}

By wrapping the new _MakeArray() statement within its own function, you can place these functions in a parent document (for code centralization). Using them becomes as simple as this:

myArray = parent.MakeArray(5);

NOTE

The new operator can't recognize an object generation function that's located in another document, such as when you're doing code centralization. Therefore,

myArray = new parent.MakeArray(...);

wouldn't work. This makes the "wrapper" function both necessary and nice.
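
As a short usage sketch, the following fills and walks an array built this way. The two functions are repeated from listing 27.3 so that the example runs on its own; the variable names `days` and `joined` are made up for illustration. Note that the elements live at indexes 1 through length, so the loop runs from 1 and uses <= in its end condition.

```javascript
// Same constructor pair as listing 27.3, repeated so the sketch runs alone.
function _MakeArray(size) {
   this.length = size;
   for (var i = 1; i <= size; i++) {
      this[i] = null;
   }
   return this;
}

function MakeArray(size) {
   return new _MakeArray(size);
}

var days = MakeArray(3);
days[1] = "Mon";
days[2] = "Tue";
days[3] = "Wed";

// Elements live at 1..length, so the loop runs from 1 through length.
var joined = "";
for (var i = 1; i <= days.length; i++) {
   joined += days[i] + (i < days.length ? "," : "");
}
```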

To explore the contents of an array, you can use the function from listing 27.4.

Listing 27.4 Array Browsing

function BrowseArray(strArray, maxNest) {
   var nest = 1;

   if(maxNest == null) {
      maxNest = 1;
   }

   if(BrowseArray.arguments.length == 3) {
      nest = parseInt(BrowseArray.arguments[2]);
   }

   if(nest > maxNest) {
      return "";
   }

   var tTmp    = eval(strArray);
   var isArray = false;

   if(tTmp != null) {
      isArray = (tTmp.toString().indexOf("Array") != -1);
   }

   if(!isArray) {
      return "";
   }

   var tStr = "<UL>\n";

   for(var i=0; i<eval(strArray+".length"); i++) {
      tStr += "<LI>Element " + i + "</LI>\n";
      tStr += Browse(strArray + "[" + i + "]", 1);
   }

   tStr += "</UL>\n";

   return tStr;
}

NOTE

Many predefined JavaScript arrays start their indexing at 0 rather than 1, which is the recommended starting point for user-defined arrays. To browse a user-defined array, the for loop needs to be modified to:

for(var i=1; i<=eval(strArray+".length"); i++) {

keeping in mind that the end condition needs to change (as shown) to <= instead of just <, or you won't access the last element of the array.

User-Defined Objects

As discussed in chapter 8, "Advertising with Billboards," user-defined objects let you create and manipulate your own object constructions. Generally, for each different type of object, you have to create its own unique construction function. However, listing 27.5 demonstrates a way to have one function that creates any kind of object.

Listing 27.5 User-Defined Objects

// parameters are defined in pairs, the first
// being the property name, the second its
// value.
//
function Object() {
   var argc = Object.arguments.length;
   var argv = Object.arguments;

   for(var i=0; i<argc; i+=2) {
      var strInit = "this." + argv[i]
                  + " = \"" + argv[i+1] + "\";";

      eval(strInit);
   }

   return this;
}

As an example, the Billboard object creation function from Chapter 8:

function Billboard(strImageURL, strLinkURL) {
   this.imageURL = strImageURL;
   this.linkURL  = strLinkURL;
   return this;
}

and its corresponding object initializer:

myBillboard = new Billboard('ad.gif', 'http://mysite.com/');

could be replaced by:

myBillboard = new Object("imageURL", "ad.gif", 
                         "linkURL", "http://mysite.com/");

CAUTION

Internet Explorer 3.0 currently doesn't correctly handle a reference to this within the eval() function, and generates an error.
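
Where eval() is a problem, the same name/value pairing can be sketched with bracket property access instead. This is a hypothetical alternative, not the author's listing: the name `makeRecord` is made up, and it assumes an engine that supports object literals and the arguments object inside a function, which may not hold for the oldest browsers this chapter targets. A side benefit is that values containing quote characters are stored unchanged, since no command string is built.

```javascript
// Hypothetical alternative to listing 27.5: no eval(), so values that
// contain quotes are stored unchanged.
function makeRecord() {
   var record = {};
   // arguments come in pairs: property name, then its value
   for (var i = 0; i + 1 < arguments.length; i += 2) {
      record[arguments[i]] = arguments[i + 1];
   }
   return record;
}

var myBillboard = makeRecord("imageURL", "ad.gif",
                             "linkURL", "http://mysite.com/");
var quoted      = makeRecord("caption", 'Say "hi"');
```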

Databases

User-defined objects and arrays are the basis for JavaScript databases. Using the same eval() function trick from the previous section, listing 27.6 shows how to combine the entire database setup process into one function by building on various functions from the previous sections.

Listing 27.6 Database Creation

// the first parameter is the number of fields/record,
// followed by the field names, then the individual
// records.
//
function Database() {
   var argc = Database.arguments.length;
   var argv = Database.arguments;

   var nFields  = parseInt(argv[0]);
   var nRecords = (argc - 1 - nFields) / nFields;

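   // build the result in a local variable, because
   // "this" can't be reassigned and Database() is
   // called without the new operator
   var db = MakeArray(nRecords);

   for(var i=1; i<=nRecords; i++) {
      var which  = nFields + 1 + (i - 1) * nFields;
      var strCmd = "db[" + i + "] = new Object(";

      for(var j=1; j<=nFields; j++) {
         strCmd += '"' + argv[j] + '",' +
                   '"' + argv[which + j - 1] + '"';

         if(j < nFields) {
            strCmd += ',';
         }
      }

      strCmd += ");";   // close the Object(...) call
      eval(strCmd);
   }

   return db;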
}

As an example of this function, in chapter 8, "Advertising with Billboards," you created a billboard database as follows:

Billboards    = new Array(3);
Billboards[i] = new Billboard('ad.gif', 'http://mysite.com/');
...

With Database(), all you need do is:

BBoards = Database(2, // number of fields
                   'imageURL', 'linkURL', // properties
                   'ad.gif', 'http://mysite.com/',
                   ... // additional records as needed
                   );

NOTE

When creating complex arrays (arrays whose elements are objects themselves, not simple strings), always start the array indexing from 1, not 0.
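
The argument layout that Database() relies on is easier to follow without the command string. The sketch below is a hypothetical eval-free version under stated assumptions: the name `buildDatabase` is made up, the second record ('ad2.gif' and its URL) is invented sample data, and a plain object with a length property stands in for the MakeArray() result. The index arithmetic is the same as in listing 27.6.

```javascript
// Hypothetical eval-free sketch of the Database() argument layout:
// argv[0] = field count, argv[1..n] = field names, the rest = record values.
function buildDatabase() {
   var argv     = arguments;
   var nFields  = parseInt(argv[0], 10);
   var nRecords = (argv.length - 1 - nFields) / nFields;
   var db       = { length: nRecords };

   for (var i = 1; i <= nRecords; i++) {
      // index of this record's first value within argv
      var which  = nFields + 1 + (i - 1) * nFields;
      var record = {};

      for (var j = 1; j <= nFields; j++) {
         record[argv[j]] = argv[which + j - 1];
      }

      db[i] = record;        // records are stored from index 1
   }

   return db;
}

var ads = buildDatabase(2,                        // number of fields
                        "imageURL", "linkURL",    // properties
                        "ad.gif", "http://mysite.com/",
                        "ad2.gif", "http://example.com/");
```

With two fields and four values after the field names, `nRecords` works out to 2, and `ads[2].imageURL` is the second record's image.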

Common HTML Creation

If you find yourself creating a lot of dynamic HTML documents, as is done in the online catalog scripts in chapter 17, "Creating Online Catalogs," you may find the functions in listing 27.7 useful.

Listing 27.7 Commonly Used HTML

function HTMLHeader(strTitle) {
   return '<HTML><HEAD><TITLE>' + strTitle + '</TITLE></HEAD>';
}

// an optional third parameter specifies a default
// selection.  objList is a simple array of strings.
//
function HTMLSelect(strName, objList) {
   var tStr = "<SELECT NAME=\"" + strName + "\">";

   if(HTMLSelect.arguments.length == 3) {
      tStr += "<OPTION SELECTED>" + 
              HTMLSelect.arguments[2];
   }

   for(var i=1; i<=objList.length; i++) {
      tStr += "<OPTION>" + objList[i];
   }

   tStr += "</SELECT>";

   return tStr;
}

// The optional 3rd argument indicates which
// radio button is the default (1 == first,
// 2 == second, etc.).
//
function HTMLRadio(strName, objList) {
   var tStr  = "";
   var which = 0;

   if(HTMLRadio.arguments.length == 3) {
      which = parseInt(HTMLRadio.arguments[2]);
   }

   for(var i=1; i<=objList.length; i++) {
      tStr += '<INPUT TYPE=RADIO NAME="'
            + strName + '" VALUE="'
            + objList[i] + '"';

      if(i == which) {
         tStr += ' CHECKED';
      }
 
      tStr += '>\n';
   }

   return tStr;
}

//  objList is assumed to be an array of strings
//
function HTMLList(objList) {
   var tStr = '<UL>\n';

   for(var i=1; i<=objList.length; i++) {
      tStr += '<LI>' + objList[i] + '</LI>\n';
   }   

   tStr += '</UL>\n';

   return tStr;
}

Other HTML tags can be wrapped in much the same way.
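
As one illustration of wrapping another tag, here is a sketch of a table builder in the same style. `HTMLTable` is a hypothetical function, not part of the chapter's code. It assumes its input follows the 1-based array convention used by the other functions here, with each row itself a 1-based array of cell strings; plain objects with a length property stand in for MakeArray() results so that the example runs on its own.

```javascript
// Hypothetical wrapper in the style of listing 27.7.  rows is a
// 1-based array (elements 1..length) whose elements are themselves
// 1-based arrays of cell strings.
function HTMLTable(rows) {
   var tStr = '<TABLE>\n';

   for (var i = 1; i <= rows.length; i++) {
      tStr += '<TR>';

      for (var j = 1; j <= rows[i].length; j++) {
         tStr += '<TD>' + rows[i][j] + '</TD>';
      }

      tStr += '</TR>\n';
   }

   tStr += '</TABLE>\n';

   return tStr;
}

// Plain objects with a length property stand in for MakeArray() results.
var table = HTMLTable({ length: 2,
                        1: { length: 2, 1: "Item",   2: "Price" },
                        2: { length: 2, 1: "Widget", 2: "$5" } });
```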

Status Bar Control

Displaying messages in the status bar at the bottom of the browser is a popular alternative to displaying the hyperlink's URL. Listing 27.8 provides a function that makes managing this easy.

Listing 27.8 Status Bar Control

var statusBarID = 0;

// an optional second parameter specifies the
// time (in milliseconds) the message should
// remain before it's cleared.
//
function StatusBar(strMessage) {
   if(statusBarID) {
      clearTimeout(statusBarID);
      statusBarID  = 0;
   }


   var iDelay = 5000;

   if(StatusBar.arguments.length == 2) {
      iDelay = parseInt(StatusBar.arguments[1]);
   }

   var strCmd = "self.status = '";

   // Netscape displays "Document: Done" while
   // Explorer simply displays "Done" by default
   //
   if(isNetscape()) {
      strCmd += "Document: ";
   }

   strCmd += "Done'";

   self.status = strMessage;
   statusBarID = setTimeout(strCmd, iDelay);
}

NOTE

It's necessary to implement a timer in the status bar control to turn off your special status message after a period of time.

This function is most often used in conjunction with the onMouseOver event of a hyperlink, which provides a "context-sensitive help" feature for your site.
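
A sketch of that pairing follows. `StatusLink` is a hypothetical helper, not part of the chapter's code, and the URL and message are made up. It builds a hyperlink whose onMouseOver handler calls StatusBar() from listing 27.8. The handler ends with return true on the assumption that the browser otherwise replaces the message with the link's URL, and the sketch assumes the message contains no quote characters.

```javascript
// Hypothetical helper: builds a hyperlink whose onMouseOver handler
// calls StatusBar() from listing 27.8.
// Assumes strMessage contains no quote characters.
function StatusLink(strURL, strText, strMessage) {
   return '<A HREF="' + strURL + '" '
        + 'onMouseOver="StatusBar(\'' + strMessage + '\'); return true">'
        + strText + '</A>';
}

var link = StatusLink("catalog.html", "Catalog", "Browse our products");
```

The resulting string can be passed to document.write() along with the rest of a dynamically generated page.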

Cookies

Cookies were explored in several chapters. Listing 27.9 presents the cookie manipulation routine set, with additional code to make cookies behave under Internet Explorer.

Listing 27.9 Cookies

function _getCookieVal(offset) {
   var iEnd  = document.cookie.indexOf(";",offset);

   if(iEnd == -1) {
      iEnd = document.cookie.length;
   }
  return unescape(document.cookie.substring(offset, iEnd));
}

function GetCookie(name) {
   if(isNetscape()) {
      var arg  = name + "=";
      var alen = arg.length;
      var clen = document.cookie.length;
      var i    = 0;

      while(i < clen) {
         var j = i + alen;

         if(document.cookie.substring(i, j) == arg) {
            return _getCookieVal(j);

         }

         i = document.cookie.indexOf(" ", i) + 1;

         if(i == 0) {
            break;
         }
      }

      return null;
   } else {
      return eval(name);
   }
}

function SetCookie(name, value) {
   if(isNetscape()) {
      var argv    = SetCookie.arguments;
      var argc    = SetCookie.arguments.length;
      var expires = (argc > 3) ? new Date(argv[3]) : null;
      var path    = (argc > 4) ? argv[4] : null;
      var domain  = (argc > 5) ? argv[5] : null;
      var secure  = (argc > 6) ? argv[6] : false;

      document.cookie = name + "=" + escape(value)
         + ((expires == null) ? "" : ("; expires=" + 
                             expires.toGMTString()))
         + ((path == null) ? "" : ("; path=" + path))
         + ((domain == null) ? "" : ("; domain=" + domain))
         + ((secure == true) ? "; secure" : "");
   } else {
      eval(name + " = '" + value + "'");
   }
}

function DeleteCookie(name) {
   if(isNetscape()) {
      document.cookie = name + "=" 
         + escape(GetCookie(name))
         + "; expires=Thu, 01-Jan-70 00:00:01 GMT";
   } else {
      eval(name + " = ''");
   }
}

NOTE

The Internet Explorer "cookie patch" relies on having a collection of global variables that matches your cookie names. For example, if you have a "CIGJSPage" cookie, you'll need to define a global variable:

var CIGJSPage = "0";

in your JavaScript code.

Remember, also, that this patch has no persistence. The "cookie" values will only remain as long as the page that contains the global variables stays in memory. Once the page is unloaded (when the user leaves your site), the data is lost. This makes the technique best suited to framed sites, where the code and its global variables can be centralized within the parent document.
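
The scanning loop in GetCookie() can be tried outside a browser with a version that takes the cookie string as a parameter. This is a hypothetical sketch: `findCookie` is a made-up name, the sample cookie string is invented, and decodeURIComponent() is swapped in for unescape(). The logic is otherwise the same: compare the text at the current position against "name=", and if it doesn't match, jump to the character after the next space.

```javascript
// Hypothetical pure version of the lookup: works on any cookie string
// of the form "name1=value1; name2=value2".
function findCookie(cookieString, name) {
   var arg = name + "=";
   var i   = 0;

   while (i < cookieString.length) {
      var j = i + arg.length;

      if (cookieString.substring(i, j) == arg) {
         // the value runs to the next ";" or the end of the string
         var iEnd = cookieString.indexOf(";", j);
         if (iEnd == -1) {
            iEnd = cookieString.length;
         }
         return decodeURIComponent(cookieString.substring(j, iEnd));
      }

      // jump to the character after the next space (the next pair)
      i = cookieString.indexOf(" ", i) + 1;

      if (i == 0) {
         break;               // no more pairs to check
      }
   }

   return null;
}

var jar  = "user=Jane%20Doe; CIGJSPage=3";
var page = findCookie(jar, "CIGJSPage");
var user = findCookie(jar, "user");
```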

From Here…

The code presented in this chapter came from various sources and authors across the Internet. In many cases, a basic function was posted by one person, and has been adapted and expanded for the purpose of this book. As with all other code in this book, you'll find full source for all these functions (and much more) on the companion CD-ROM, including detailed comments within the code to help you better understand what tricks are being used. You'll also find a collection of links to the various online resources that spawned many of the ideas presented here, and they will probably send you off on a few new avenues of your own.

Congratulations, you've made it! You now have a collection of scripts and scripting tricks in Perl and JavaScript that you can:

Use "out of the box" to implement guest books, visitor tracking, database access, and much more.
Use as a basis for ideas not discussed or even thought of.

What you do now is entirely up to you.

Happy scripting!