Appendix B

HTML Guide



This book has assumed that you have at least a working knowledge of HyperText Markup Language (HTML). This appendix provides a complete reference to HTML as well as some conceptual information. It is organized by category of tag; each section documents its tags and, in some cases, provides an example.

HTML is currently in a state of flux. The current HTML standard (v2.0) is documented in RFC 1866, although it is already outdated. Many HTML tags are not part of the official standard, but they are widely adopted in practice and will most likely become part of the new HTML specification currently being drafted. The Internet Engineering Task Force (IETF) draft of the new (v3.0) specification has expired and has been replaced by several more specialized drafts. For example, there are separate drafts for tables, file upload, and client-side imagemaps.

Because of the constantly evolving nature of HTML, this reference document will in all likelihood be outdated by the time you read it. However, good, regularly updated HTML references are available on the Web.

Because I want to present HTML tags other than those in the official 2.0 standard, this appendix is inherently subjective. These tags should work properly in almost all browsers. Additionally, extended and new tags supported by Netscape and Microsoft's Internet Explorer are included.

General Structure

The general format for all HTML documents is

<HTML>
<HEAD>
   "metainformation" goes here
</HEAD>

<BODY>
   displayed content goes here
</BODY>
</HTML>

Metatags

Metatags go inside of the <HEAD> tag. The <META> tag is used to embed document meta-information not defined by other HTML tags. Information within the <META> tag can be extracted by servers and/or clients for use in cataloging, identifying, and indexing document meta-information.

<META NAME="..." HTTP-EQUIV="..." CONTENT="...">

The <META> tag can simulate an HTTP header. NAME is used to name properties such as author or publication date. If the NAME attribute is absent, the name can be assumed to be the value of HTTP-EQUIV. HTTP-EQUIV binds the element to an HTTP response header, and the value of that header is given in the CONTENT attribute.
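As a rough illustration of the two uses described above, the following sketch places one NAME-style and one HTTP-EQUIV-style metatag in a document head. The author name, the expiration date, and the title are made-up values, and whether a particular server or client acts on them is an assumption that varies by software.

```html
<HTML>
<HEAD>
<TITLE>Metatag Example</TITLE>
<!-- Hypothetical property: names the author of the document -->
<META NAME="author" CONTENT="Jane Doe">
<!-- Hypothetical header: software that honors HTTP-EQUIV may treat
     this like the response header
     Expires: Wed, 04 Dec 1996 21:29:02 GMT -->
<META HTTP-EQUIV="Expires" CONTENT="Wed, 04 Dec 1996 21:29:02 GMT">
</HEAD>
<BODY>
<P>Document content goes here.</P>
</BODY>
</HTML>
```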

<TITLE>...</TITLE>

This is the title of the HTML document; it must occur within the head of the document. The contents of the <TITLE> tag usually appear somewhere on the browser window, such as in the title bar.

<ISINDEX>

<ISINDEX> informs the HTML user agent that the document is an index document. The reader can read the document or enter a keyword search. The default URL used for processing queries can be overridden with the HREF attribute. (See Chapter 5, "Input," for more details.)

Netscape has made additional, backward-compatible tags and attributes available on top of the HTML 1.0 specification. Some of Netscape's extensions have since been incorporated into HTML 2.0 and 3.0. The following attribute to the <ISINDEX> tag is a Netscape extension:

PROMPT="..."

The PROMPT attribute enables you to change the default prompt supplied by the browser.

<BASE HREF="...">

The <BASE> tag defines the base URL of the current document. All relative URLs in this document are relative to the URL in HREF. By default, the value of HREF depends on the current location of the document.
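The sketch below shows how a relative link might be resolved against a base URL. The host name www.example.com and the file names are placeholders invented for this illustration.

```html
<HTML>
<HEAD>
<TITLE>Base Example</TITLE>
<!-- Hypothetical base URL for this document -->
<BASE HREF="http://www.example.com/docs/index.html">
</HEAD>
<BODY>
<!-- With the base above, this relative link resolves to
     http://www.example.com/docs/chapter1.html -->
<A HREF="chapter1.html">Chapter 1</A>
</BODY>
</HTML>
```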

The following is the Netscape extension:

TARGET="..."

TARGET enables you to specify the appropriate Netscape browser window or frame to target the output of all links on the current document.

<LINK REV="..." REL="..." HREF="...">

<LINK> specifies links, documents, and information related to the current document. REL specifies the relationship this link has to the current document, and REV specifies the same relationship in reverse. HREF specifies the location of the link.

For example, suppose you had two chapters of a book: chapter1.html and chapter2.html. Chapter1.html might contain one of the following links:

<LINK REL="next" HREF="chapter2.html">

or

<LINK REV="previous" HREF="chapter2.html">

<BODY>

<BODY> tags contain the information that is displayed in the Web browser. To make notes within the <BODY> tag that the browser ignores, you can use the comment tag:

<!-- ... -->

Netscape Navigator and Microsoft Internet Explorer support the following proprietary extensions to the <BODY> tag:

Tip
For links with information about colors, see the following:
http://werbach.com/web/wwwhelp.html#color
For more information on HTML 2.0 and 3.0, see the following:
http://home.netscape.com/assist/net_sites/

Structural

The following text blocking statements demonstrate how you can lay out the text in the body of your HTML page:

<P [ALIGN=LEFT|CENTER|RIGHT]>...</P>

The preceding tag defines a paragraph. ALIGN specifies the alignment of the text within the <P> tags.

Lists

There are several different ways to present a list of items. In HTML, each list consists of a tag that specifies the kind of list and a tag (or series of tags) that specifies each item in the list.

<LI>

<LI> stands for "list item." <LI> is the most common way of marking items in a list, and it is used in unordered lists, ordered lists, menus, and directory lists. The following are Netscape extensions that can be used with list item tags:

<ol>
  <li value=3>item #3
</ol>

Unordered Lists

Unordered lists differ from ordered lists in that, instead of each list item being labeled numerically, bullets are used.

<UL>...</UL>

The <UL> tag creates lists with generic bullets preceding each unordered list item.

Ordered Lists

List items in ordered lists are automatically preceded by a sequential number or letter, beginning with the number 1 or letter A and incrementing by 1 with each new list item.

<OL>...</OL>

Each item within the ordered list tag is labeled with a number or letter, and the style of the label depends on how deeply the list is nested.

TYPE=A|a|I|i|1

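TYPE='...' is a Netscape extension that defines the default numbering style: A for uppercase letters, a for lowercase letters, I for uppercase Roman numerals, i for lowercase Roman numerals, and 1 for numbers. See "<LI>."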
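A minimal sketch of an ordered list that uses the TYPE extension follows. The item text is invented, and a browser that does not recognize TYPE can be expected to fall back to plain numbers.

```html
<!-- TYPE=A asks for uppercase letters: A, B, C -->
<OL TYPE=A>
  <LI>First item
  <LI>Second item
    <!-- Nested list labeled with lowercase Roman numerals: i, ii -->
    <OL TYPE=i>
      <LI>Nested item
      <LI>Another nested item
    </OL>
  <LI>Third item
</OL>
```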

Menus

Menus are a list of items. This tag has the same effect as the <UL> tag.

<MENU>...</MENU>

The <MENU> tag indicates a menu of items. The output usually looks similar or equivalent to <UL>. Menus cannot be nested.

Directory

Directories are specified with the <DIR> tag. The output is the same as the <UL> tag.

<DIR>...</DIR>

The <DIR> tag indicates a directory of items. The output usually looks similar or equivalent to <UL>. Directories cannot be nested.

Definitions

Definitions are specified with the definition list tag. The following is a definition list, where <DT> is the name or title of the item to be defined and <DD> is the definition.

<DL>
  <DT>...
  <DD>...
</DL>
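For instance, a short glossary could be marked up as follows. The terms and wording are only sample content.

```html
<DL>
  <!-- Each DT names a term; the DD that follows defines it -->
  <DT>HTML
  <DD>HyperText Markup Language, the language used to write Web pages.
  <DT>URL
  <DD>Uniform Resource Locator, the address of a resource on the Web.
</DL>
```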

Preformatted

Source code and other text can be displayed with monowidth fonts with the <PRE> tag. The text within the <PRE> tags appears exactly as it is typed, usually using a monowidth font.

<PRE>...</PRE>

Division

Division defines a container of information.

<DIV [ALIGN=LEFT|RIGHT|CENTER|JUSTIFY]>...</DIV>

<DIV> defines a container of information. ALIGN specifies the alignment of all information in that container. This tag is the preferred way of aligning elements in your HTML documents.

Center

Text can be centered within the browser window with the <CENTER> tag.

<CENTER>...</CENTER>

The <CENTER> tag centers the content between the tags. This tag is a proprietary Netscape extension, although most, if not all, current browsers support it. It is a good idea to use <DIV ALIGN=CENTER> instead of the <CENTER> tag for the benefit of newer browsers.
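The two approaches can be compared side by side in a sketch like the following. The heading and paragraph text are invented, and the exact rendering depends on the browser.

```html
<!-- Preferred form: everything inside the container is centered -->
<DIV ALIGN=CENTER>
<H2>Centered Heading</H2>
<P>This paragraph is centered along with the heading.</P>
</DIV>

<!-- Proprietary form that requests the same result -->
<CENTER>
<P>This paragraph is centered with the CENTER tag.</P>
</CENTER>
```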

Text Formatting

The following tags describe the element between the tags. The appearance of the element is not as important as the actual definition. For example, a user should be able to specify how his or her browser displays any heading in an HTML file; these headings should be defined as headings rather than as bold text in a large font.

Headings

Headings are usually used for section headings; the alignment is specified by ALIGN. There are six headings: <H1>, <H2>, <H3>, <H4>, <H5>, and <H6>.

<H1 [ALIGN=LEFT|CENTER|RIGHT]>...</H1>

Emphasis

The <EM> tag emphasizes the text between the tags. The emphasized text is usually (but not necessarily) displayed as italics.

<EM>...</EM>

Strong Emphasis

The <STRONG> tag strongly emphasizes text between the tags. The emphasized text is usually (but not necessarily) displayed as bold.

<STRONG>...</STRONG>

Block Quotes

You can block-quote selected text with the <BLOCKQUOTE> tag. This sets off the text between tags, usually by indenting or changing the margin and centering.

<BLOCKQUOTE>...</BLOCKQUOTE>

Citation

You use citations when you are referring to another printed document, such as a book. Text within the <CITE> tag is usually italic.

<CITE>...</CITE>

Address

E-mail addresses are usually wrapped in the <ADDRESS> tag.

<ADDRESS>...</ADDRESS>

The preceding defines an e-mail address, usually in italics.

Source Code

Computer source code is usually surrounded by the <CODE> tag.

<CODE>...</CODE>

The preceding defines a source code excerpt and uses a fixed-width font.

Sample Output

Sample output of a program can be formatted with the <SAMP> tag.

<SAMP>...</SAMP>

The preceding defines sample output from a program.

Keyboard Input

The keyboard input tag will mark text that the user is to type on the keyboard. It is normally rendered in a fixed-width font.

<KBD>...</KBD>

Variable

The variable tag is used to mark a variable used in a mathematical formula or computer program. It is normally displayed in italics.

<VAR>...</VAR>

Definition

Definitions are usually formatted differently than other text. Use the <DFN> tag to display definitions.

<DFN>...</DFN>
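The fragment below sketches how several of these logical tags might be combined in one passage. The script name hello.pl, the variable name, the book title, and the e-mail address are all hypothetical, and the fonts actually used depend on the browser.

```html
<H2 ALIGN=CENTER>Running the Script</H2>
<!-- KBD marks what the user types; SAMP marks what the program prints -->
<P>Type <KBD>perl hello.pl</KBD> at the prompt. The program stores its
message in the variable <VAR>greeting</VAR> with the statement
<CODE>$greeting = "Hello";</CODE> and prints <SAMP>Hello</SAMP>.</P>
<P><EM>Note:</EM> the script <STRONG>must</STRONG> be run from the
directory that contains it. A <DFN>script</DFN> is a program that is
interpreted rather than compiled. See <CITE>The Example Handbook</CITE>
for details.</P>
<ADDRESS>webmaster@example.com</ADDRESS>
```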

Physical Formatting

Physical formatting tags specify exactly how text should appear rather than what the text means. Physical formatting has become very popular because of this literal style.

Bold

Text can be rendered bold with the <B> tag.

<B>...</B>

Italics

Text can be displayed in italics with the <I> tag.

<I>...</I>

Typewriter

The typewriter tag displays text in a typewriter-looking font.

<TT>...</TT>

Underline

Text can be underlined with the following tag:

<U>...</U>

Strikeout

Text can be displayed with a line through the middle with the <S> tag to indicate strikeout.

<S>...</S>

Subscript

Subscript renders the text slightly below the baseline, usually in a font smaller than the normal font.

<SUB>...</SUB>

Superscript

Superscript works the same as subscript in that it displays the text smaller than the normal text, but it raises the text above the baseline instead of lowering it.

<SUP>...</SUP>

Netscape Extensions

The <BLINK> tag makes the text within the tags blink. This is not recommended because of the way it affects different browsers.

<BLINK>...</BLINK>

The size and color attributes of the <FONT> tag define the size or color of the text. SIZE is a number between 1 and 7 (the default size is 3). You can specify the font to be relatively larger or smaller than the preceding font by preceding the number with either a plus (+) or minus sign (-).

<FONT SIZE=n|+n|-n COLOR="...">...</FONT>

<BASEFONT SIZE=n> defines the default size of the fonts. The default value is 3.
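A brief sketch of these tags follows. The sizes and the color value are arbitrary choices, and it assumes a browser that supports the Netscape font extensions and applies relative sizes to the base font.

```html
<!-- Raise the default size from 3 to 4 for the rest of the page -->
<BASEFONT SIZE=4>
<P>This text uses the base size.
<!-- Relative size: two steps larger than the base size -->
<FONT SIZE=+2>This text is larger.</FONT>
<!-- Absolute size 2, colored red with a hexadecimal RGB value -->
<FONT SIZE=2 COLOR="#FF0000">This text is small and red.</FONT>
</P>
```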

Links

Text can be linked to other text with a click of the mouse; text linked in this way is called hypertext.

<A HREF="...">...</A>

When the user selects the link, the browser goes to the location specified by HREF. The value of HREF can be either a full URL or a relative path.

<A NAME="...">

The <A NAME="..."> tag sets a marker in the HTML page. NAME is the name of the marker. To reference that marker, use the following:

<A HREF="filename.html#markername">...</A>
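As an illustration, the following sketch sets a marker and then links to it in two ways. The file name glossary.html and the marker name cgi are hypothetical.

```html
<!-- In glossary.html: set a marker named "cgi" -->
<H2><A NAME="cgi">CGI</A></H2>

<!-- In another page: jump straight to that marker -->
<A HREF="glossary.html#cgi">See the glossary entry for CGI</A>

<!-- Within glossary.html itself, the filename can be left out -->
<A HREF="#cgi">Jump to the CGI entry</A>
```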

The following is the Netscape extension:

TARGET="..."

The TARGET attribute enables you to specify the Netscape browser window or frame in which the linked document is displayed.

Inline Multimedia

This tag places an inline image within an HTML document with the SRC attribute.

<IMG SRC="..." [ALIGN=TOP|BOTTOM|MIDDLE]
     [ALIGN=LEFT|RIGHT|TEXTTOP|ABSMIDDLE|BASELINE|ABSBOTTOM]
     [ALT="..."] [ISMAP] [USEMAP="..."]>

SRC defines the location of the image, either a URL or a relative path. ALIGN specifies the alignment of both the image and the text or graphics following the image. ALT specifies alternative text to display if the image is not displayed. ISMAP is used for server-side imagemaps, and USEMAP is used for client-side imagemaps.

Client-side imagemaps are defined using the <MAP> tag.

<MAP NAME="...">
  <AREA SHAPE="..." COORDS="..." HREF="..."|NOHREF>
</MAP>

NAME is the name of the map (similar to <A NAME="...">), and the <AREA> tags define the areas of the map. SHAPE is the shape of the area, and COORDS is a list of coordinates that define it. HREF defines where to go if that area is selected. If you specify NOHREF, the browser ignores clicks in that region.
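The following sketch ties an image to a client-side map. The image navbar.gif, its assumed size of 200 by 50 pixels, and the target page home.html are all invented for this example.

```html
<!-- USEMAP points at the map by name, prefixed with # -->
<IMG SRC="navbar.gif" ALT="Navigation bar" USEMAP="#navbar">

<MAP NAME="navbar">
  <!-- Left half of the image; rectangle coordinates are x1,y1,x2,y2 -->
  <AREA SHAPE="RECT" COORDS="0,0,99,49" HREF="home.html">
  <!-- Right half of the image; clicks here are ignored -->
  <AREA SHAPE="RECT" COORDS="100,0,199,49" NOHREF>
</MAP>
```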

The following are Netscape extensions to the <IMG> tag:

Dividers

You can insert line breaks using the <BR> tag. Using this tag is the same as pressing Enter to start a new line of text.

<BR>

The <BR> tag indicates a line break. In Netscape, <NOBR>...</NOBR> prevents line breaks in the text between the tags, and <WBR> indicates where to break the line if needed.

<HR>

The <HR> tag indicates a horizontal line, also known as a horizontal rule. Netscape extensions to the <HR> tag are the attributes SIZE=number, WIDTH=[number|percent], ALIGN=[left|right|center], and NOSHADE.
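A short sketch of the divider tags follows. The text and the attribute values are arbitrary, and browsers without the Netscape extensions may ignore <NOBR>, <WBR>, and the <HR> attributes.

```html
<P>First line of an address<BR>
Second line of the same address</P>

<!-- The text is kept on one line, but may break at WBR if it has to -->
<NOBR>This long line is kept together, but may break here<WBR>if needed.</NOBR>

<!-- A centered rule, 4 pixels thick, half the window wide, unshaded -->
<HR SIZE=4 WIDTH="50%" ALIGN=CENTER NOSHADE>
```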

Forms

You can use the <FORM> tag to make your Web pages interactive by accepting input from the user. For more detailed information on HTML forms, see Chapter 3, "HTML and Forms."

<FORM ACTION="..." METHOD=GET|POST ENCTYPE="...">...</FORM>

The ACTION, METHOD, and ENCTYPE attributes define the form action, method, and encoding type.

<INPUT TYPE="..." NAME="..." VALUE="..." SIZE=n MAXLENGTH=n
       CHECKED>

The other attributes are all dependent on the TYPE attribute (see Chapter 3). TYPE is one of the following:

The <SELECT> tag lets you define a menu of items from which to select. The following is an example of the <SELECT> tag:

<SELECT NAME="..." SIZE=n [MULTIPLE]>
   <OPTION [SELECTED]>
   ...
</SELECT>

The <TEXTAREA> tag defines a textual area where the user can type in multiple lines of text.

<TEXTAREA NAME="..." ROWS=n COLS=n>...</TEXTAREA>
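Putting these tags together, a small form might look like the sketch below. The CGI program /cgi-bin/survey.cgi and all field names are hypothetical, and the TYPE values shown (text, checkbox, and submit) are ones defined in HTML 2.0.

```html
<!-- The ACTION URL is a placeholder for a real CGI program -->
<FORM ACTION="/cgi-bin/survey.cgi" METHOD=POST>
  <P>Name: <INPUT TYPE="text" NAME="name" SIZE=30 MAXLENGTH=60>
  <P><INPUT TYPE="checkbox" NAME="subscribe" VALUE="yes" CHECKED>
     Send me updates
  <P>Favorite browser:
  <!-- SIZE=1 displays the choices as a single-line menu -->
  <SELECT NAME="browser" SIZE=1>
    <OPTION SELECTED>Netscape Navigator
    <OPTION>Internet Explorer
    <OPTION>Other
  </SELECT>
  <P>Comments:<BR>
  <TEXTAREA NAME="comments" ROWS=4 COLS=40>Type here.</TEXTAREA>
  <P><INPUT TYPE="submit" VALUE="Send">
</FORM>
```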

Tables

Tables are defined by rows and cells in those rows.

<TABLE [BORDER]>...</TABLE>

The <TABLE> tag defines a table. If you specify BORDER, a border will be drawn around the table.

The following are the Netscape extensions to the <TABLE> tag:

Table Rows

You can use the <TR> tag to specify table rows.

<TR [ALIGN=LEFT|RIGHT|CENTER] [VALIGN=TOP|MIDDLE|BOTTOM]>...</TR>

The preceding defines a row within the table. ALIGN specifies the horizontal alignment of the elements within the row and VALIGN specifies the vertical alignment.

Table Data

You can specify the elements of a table cell with the <TD> tag as follows:

<TD [ALIGN=LEFT|RIGHT|CENTER] [VALIGN=TOP|MIDDLE|BOTTOM] [COLSPAN=n]
    [ROWSPAN=n]>...</TD>

This code specifies a table cell within a row. Normally, a cell occupies one column of one row. However, you can extend it across additional columns or rows using the COLSPAN or ROWSPAN attribute, where n is the number of columns or rows that the cell spans.

Table Headings

Use the <TH> tag to place headings within a table.

<TH [ALIGN=LEFT|RIGHT|CENTER] [VALIGN=TOP|MIDDLE|BOTTOM] [COLSPAN=n]
    [ROWSPAN=n]>...</TH>

<TH> tags are equivalent to <TD> except they are used as table headers. The contents of table heading tags are normally bold.

Captions

Captions can be inserted into a table as follows:

<CAPTION [ALIGN=TOP|BOTTOM]>...</CAPTION>

This code describes a caption in the table.
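The table tags above can be combined as in the following sketch. The figures are made up, and every row accounts for three columns once the spanning cells are counted.

```html
<TABLE BORDER>
  <CAPTION ALIGN=TOP>Quarterly sales (made-up figures)</CAPTION>
  <TR>
    <TH>Region</TH>
    <!-- This heading stretches across the two columns to its right -->
    <TH COLSPAN=2>Sales</TH>
  </TR>
  <TR ALIGN=RIGHT>
    <!-- This cell extends down into the next row -->
    <TD ROWSPAN=2 VALIGN=MIDDLE>East</TD>
    <TD>Q1</TD>
    <TD>100</TD>
  </TR>
  <TR ALIGN=RIGHT>
    <TD>Q2</TD>
    <TD>120</TD>
  </TR>
</TABLE>
```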

Frames

Frames are a Netscape enhancement that enables you to divide the browser window into several different components. For more detailed information on frames, see Chapter 14, "Proprietary Extensions."

The following shows the basic frame element. The example defines either a row or column of frames. You may embed multiple <FRAMESET> tags within each other.

<FRAMESET ROWS=n|COLS=n>...</FRAMESET>

Frame Elements

The <BODY> tag is replaced by the <FRAMESET> tag in a framed HTML page.

<FRAME [SRC="..."] [NAME="..."] [MARGINWIDTH=n] [MARGINHEIGHT=n]
       [SCROLLING="yes|no|auto"] [NORESIZE]>

The preceding tag defines the frame element within the <FRAMESET> tags. SRC is the location of the document that should appear in this frame. NAME is the name of the frame. SCROLLING defines whether or not to display a scrollbar. MARGINWIDTH and MARGINHEIGHT define the margin between the content of the frame and the frame in pixels. NORESIZE prevents the user from resizing the frame.

NOFRAMES

<NOFRAMES> defines the HTML to appear if the browser does not support frames. If the browser does support frames, everything within these tags is ignored.
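A complete framed page could be sketched as follows. The documents menu.html and main.html are hypothetical, and the comma-separated percentage list given to COLS is one common way of sizing the columns in browsers that support frames.

```html
<HTML>
<HEAD>
<TITLE>Framed Page</TITLE>
</HEAD>
<!-- Two columns: a narrow menu on the left, the main page on the right -->
<FRAMESET COLS="25%,75%">
  <FRAME SRC="menu.html" NAME="menu" SCROLLING="no" NORESIZE>
  <FRAME SRC="main.html" NAME="main" MARGINWIDTH=10 MARGINHEIGHT=10>
  <!-- Shown only by browsers that do not support frames -->
  <NOFRAMES>
  <BODY>
  <P>This page uses frames. <A HREF="main.html">View the main page
  without frames.</A></P>
  </BODY>
  </NOFRAMES>
</FRAMESET>
</HTML>
```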

Special Characters

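Table B.1 lists the HTML character entities that you insert into text to produce special characters, most of which are not usually on a 101-key keyboard. Each character can be written either as a numeric code or as an entity name.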

Table B.1. Non-alphanumeric characters.

Description                           Code     Entity Name
Quotation mark                        &#34;    &quot;
Ampersand                             &#38;    &amp;
Less-than sign                        &#60;    &lt;
Greater-than sign                     &#62;    &gt;
Non-breaking space                    &#160;   &nbsp;
Inverted exclamation                  &#161;   &iexcl;
Cent sign                             &#162;   &cent;
Pound sterling                        &#163;   &pound;
General currency sign                 &#164;   &curren;
Yen sign                              &#165;   &yen;
Broken vertical bar                   &#166;   &brvbar; or &brkbar;
Section sign                          &#167;   &sect;
Umlaut (dieresis)                     &#168;   &uml; or &die;
Copyright                             &#169;   &copy;
Feminine ordinal                      &#170;   &ordf;
Left angle quote, guillemotleft       &#171;   &laquo;
Not sign                              &#172;   &not;
Soft hyphen                           &#173;   &shy;
Registered trademark                  &#174;   &reg;
Macron accent                         &#175;   &macr; or &hibar;
Degree sign                           &#176;   &deg;
Plus or minus                         &#177;   &plusmn;
Superscript two                       &#178;   &sup2;
Superscript three                     &#179;   &sup3;
Acute accent                          &#180;   &acute;
Micro sign                            &#181;   &micro;
Paragraph sign                        &#182;   &para;
Middle dot                            &#183;   &middot;
Cedilla                               &#184;   &cedil;
Superscript one                       &#185;   &sup1;
Masculine ordinal                     &#186;   &ordm;
Right angle quote, guillemotright     &#187;   &raquo;
Fraction one-fourth                   &#188;   &frac14;
Fraction one-half                     &#189;   &frac12;
Fraction three-fourths                &#190;   &frac34;
Inverted question mark                &#191;   &iquest;
Capital A, grave accent               &#192;   &Agrave;
Capital A, acute accent               &#193;   &Aacute;
Capital A, circumflex accent          &#194;   &Acirc;
Capital A, tilde                      &#195;   &Atilde;
Capital A, dieresis or umlaut mark    &#196;   &Auml;
Capital A, ring                       &#197;   &Aring;
Capital AE diphthong (ligature)       &#198;   &AElig;
Capital C, cedilla                    &#199;   &Ccedil;
Capital E, grave accent               &#200;   &Egrave;
Capital E, acute accent               &#201;   &Eacute;
Capital E, circumflex accent          &#202;   &Ecirc;
Capital E, dieresis or umlaut mark    &#203;   &Euml;
Capital I, grave accent               &#204;   &Igrave;
Capital I, acute accent               &#205;   &Iacute;
Capital I, circumflex accent          &#206;   &Icirc;
Capital I, dieresis or umlaut mark    &#207;   &Iuml;
Capital Eth, Icelandic                &#208;   &ETH; or &Dstrok;
Capital N, tilde                      &#209;   &Ntilde;
Capital O, grave accent               &#210;   &Ograve;
Capital O, acute accent               &#211;   &Oacute;
Capital O, circumflex accent          &#212;   &Ocirc;
Capital O, tilde                      &#213;   &Otilde;
Capital O, dieresis or umlaut mark    &#214;   &Ouml;
Multiply sign                         &#215;   &times;
Capital O, slash                      &#216;   &Oslash;
Capital U, grave accent               &#217;   &Ugrave;
Capital U, acute accent               &#218;   &Uacute;
Capital U, circumflex accent          &#219;   &Ucirc;
Capital U, dieresis or umlaut mark    &#220;   &Uuml;
Capital Y, acute accent               &#221;   &Yacute;
Capital THORN, Icelandic              &#222;   &THORN;
Small sharp s, German (sz ligature)   &#223;   &szlig;
Small a, grave accent                 &#224;   &agrave;
Small a, acute accent                 &#225;   &aacute;
Small a, circumflex accent            &#226;   &acirc;
Small a, tilde                        &#227;   &atilde;
Small a, dieresis or umlaut mark      &#228;   &auml;
Small a, ring                         &#229;   &aring;
Small ae diphthong (ligature)         &#230;   &aelig;
Small c, cedilla                      &#231;   &ccedil;
Small e, grave accent                 &#232;   &egrave;
Small e, acute accent                 &#233;   &eacute;
Small e, circumflex accent            &#234;   &ecirc;
Small e, dieresis or umlaut mark      &#235;   &euml;
Small i, grave accent                 &#236;   &igrave;
Small i, acute accent                 &#237;   &iacute;
Small i, circumflex accent            &#238;   &icirc;
Small i, dieresis or umlaut mark      &#239;   &iuml;
Small eth, Icelandic                  &#240;   &eth;
Small n, tilde                        &#241;   &ntilde;
Small o, grave accent                 &#242;   &ograve;
Small o, acute accent                 &#243;   &oacute;
Small o, circumflex accent            &#244;   &ocirc;
Small o, tilde                        &#245;   &otilde;
Small o, dieresis or umlaut mark      &#246;   &ouml;
Division sign                         &#247;   &divide;
Small o, slash                        &#248;   &oslash;
Small u, grave accent                 &#249;   &ugrave;
Small u, acute accent                 &#250;   &uacute;
Small u, circumflex accent            &#251;   &ucirc;
Small u, dieresis or umlaut mark      &#252;   &uuml;
Small y, acute accent                 &#253;   &yacute;
Small thorn, Icelandic                &#254;   &thorn;
Small y, dieresis or umlaut mark      &#255;   &yuml;
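As a brief illustration of how these entities are used in running text, consider the following fragment. The sentences are sample content only, and each entity is written with its terminating semicolon.

```html
<!-- &lt; and &gt; keep the browser from treating P as a real tag -->
<P>In HTML, write &lt;P&gt; to show a literal paragraph tag, and
write &amp; to show an ampersand.</P>
<!-- Named entities for characters that are hard to type directly -->
<P>Copyright &copy; 1996. The caf&eacute; charges &pound;2 for tea,
or &frac12; of that for a refill.</P>
```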