THE HYPERTEXT MARKUP LANGUAGE (HTML) AND THE WORLD-WIDE WEB: RAISING ASCII TEXT TO A NEW LEVEL OF USABILITY

Jeff Barry
Cooperative Information Services Librarian
The University of Tennessee Libraries
Knoxville, TN 37996.
INTERNET: jeff@utkux.utcc.utk.edu.

BACKGROUND OF THE WEB

The World-Wide Web initiative originated with the European Laboratory for Particle Physics (CERN) in 1989 as an attempt to electronically distribute the literature of high-energy physics to researchers. 1 The World-Wide Web initiative was based on the hypertext concept. By creating computer linkages from the citations of an article to the corresponding source documents, users would be able to navigate through a body of related literature online simply by following the "electronic footnotes." In order to realize such a system, computer protocols and standards had to be created for describing the structure of documents, specification of links, and the transmission of documents over a computer network.

As the World-Wide Web developed, key supporting technologies were established. The HyperText Markup Language (HTML) describes the organization of a document so that certain structural elements can be uniquely identified and accessed over the Internet. Within a HTML document, links to other information on the Internet are specified through the use of Uniform Resource Locators (URLs). The actual process of transferring HTML documents over the network in the Web is accomplished by computers employing the HyperText Transfer Protocol (HTTP); the computers that deliver HTML documents to users of the Net are usually referred to as "Web servers." Individuals access documents on Web servers through client software on their local machines, such as Mosaic, Cello, and Lynx. Since the hypertext nature of the Web facilitates the browsing of networked resources, Web client software has generically come to be known as "browsers." Although HTTP is important to the operation of the Web, authors of HTML documents don't need to understand the details of the protocol.

The Web has evolved beyond being just a hypertext tool: it is now a hypermedia environment that incorporates images, sound, and even video. In fact, the diversity of documents found in the World-Wide Web has fostered the need for ongoing revisions of HTML. This process is largely supported by ad hoc volunteer efforts by many individuals around the world who are dedicated to seeing the Web evolve into a more mature and stable networked communications tool. As individuals have tried to apply the HTML tags to a variety of document types, the limitations of HTML have become very clear. An excellent overview of these limitations can be found in a recent paper by John Price-Wilkin. 2 Nevertheless, the Web is a precursor of the networked environment that will permeate libraries in the future. As HTML tags are explained in this tutorial, areas that might change with the next HTML specification are identified.

STRUCTURE OF HTML DOCUMENTS

A HTML document is simply an ASCII text file that has been marked up with standardized tags in order to provide structure to the text. One of the disadvantages of plain ASCII files is that they do not provide the reader with information about document structure or formatting. Whenever you convert a file created in a word processor to ASCII, the fonts, bullets, bold, italics, and other formatting information are lost during the conversion.

Although it utilizes ASCII files, HTML provides information about a document's structure (e.g., title, headings, and paragraphs) and format (e.g., bold and italics) through the use of standardized markup tags.

In HTML terminology, a document is composed of "elements." In simple terms, an element can be viewed as being either a part of a document, such as a title; a formatting code, such as bold; a hypertext link; or an image. In turn, elements are identified by markup tags. In general, this paper will simplify HTML terminology and use "tag" to refer to both elements and actual markup tags. For the details of HTML's complex document structure, consult the HyperText Markup Language (HTML), Version 2.0 specification. 3

For example, the tag to begin a title is < TITLE>, the tag to begin a first-level heading is < H1> , and the tag to begin a paragraph is < P> . The beginning tag is followed by the text of the title, heading, or paragraph respectively. Most HTML tags are used in pairs, although some tags can be used singularly. The beginning tag is usually called the "start tag." The ending tag is usually called the "end tag." Except for the addition of a forward slash, the end tag is the same as the start tag. For example, the tag to end a title is < /TITLE> , the tag to end a first-level heading is < /H1> , and the tag to end a paragraph is < /P> .

Figure 1.  Example HTML Document

	<TITLE>The Title of Your Document is Entered Here</TITLE>
	<H1>The Heading of Your Document is Entered Here</H1>
	<P>The text of the first paragraph of your document is entered here.</P>


All HTML tags have the same general format.

		Figure 2.  General HTML Tag Format

			<TAG>text</TAG>
			  |    |    |
			  |    |    End Tag
			  |    Content
			  Start Tag


Tags are not case sensitive. For example, < title> is equivalent to < TITLE>. (For the convenience of the reader of this paper, tag names, actual tags, and tag attributes are shown in uppercase.)
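
The start tag, content, and end tag structure, and the fact that tag names are not case sensitive, can be checked mechanically. The following is a minimal sketch using Python's standard html.parser module; it is an illustration only, not part of any HTML specification. The names TagLogger and tag_events are assumptions made for this example, and the tags are written without a space after the opening angle bracket, as parsing software expects.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record start tags, text content, and end tags in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # The parser reports tag names in lowercase, whatever case the source used.
        self.events.append(("start", tag))

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.events.append(("content", text))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def tag_events(markup):
    """Return the list of (kind, value) events found in the markup."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


upper = tag_events("<H1>Jeff's Home on the Net</H1>")
lower = tag_events("<h1>Jeff's Home on the Net</h1>")
print(upper)
print("Same structure regardless of case:", upper == lower)
```

Both calls should yield the same three events (start tag, content, end tag), which is the pattern shown in Figure 2.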

HTML documents can be created using either:

  1. Regular text editors in which the author enters the markup tags by hand or by means of a macro.

  2. Specialized HTML editors that automatically insert the appropriate markup tags at locations designated by the author.

  3. Conversion programs that take a word processing file and translate it into HTML.

Since HTML document editors and conversion programs are still being developed, the most common method of creating HTML documents has been to use a text editor and to manually insert markup tags. Since a relatively small number of tags are used in HTML, this method is not as tedious as it sounds. Regardless of the means used to create the document, a solid understanding of HTML tags is essential for authors preparing documents for the Web. The simplicity of HTML and the ability to generate such documents without specialized tools make it easy to enter the world of networked hypermedia.

In this tutorial, an example of creating a "home page" in HTML will be provided. A home page commonly refers to the first document a user sees when starting a Web browser. Many users create their own home pages for organizing information about their favorite Internet sites. By following the examples in this tutorial, you should be able to create your own home page.

STRUCTURAL ELEMENTS

A HTML document consists of at least three essential elements: a title, a heading, and the text that forms the body of the document. The body of the document can be in the form of paragraphs, lists, images, or a combination of elements. Figure 3 presents an example home page.

	Figure 3.  Example Home Page

		<TITLE>Jeff's Home Page</TITLE>

		<H1>Jeff's Home on the Net</H1>

		<P>Your "home page" may include links to
		other information sources on the network,
		information about yourself, and even your
		photograph.  HTML provides the flexibility of
		crafting a toolbox of networked resources
		that meets your needs.</P>


TITLE TAG (< TITLE> )

The TITLE tag (< TITLE> ) describes the content of the document. Rather than displaying the text of the TITLE tag as part of the document, Web browsers usually display it in an area above the document window; different types of browsers may display the title differently.

The TITLE tag should both describe the content of a document and provide the reader with an indication of the context of the document. If the HTML document is part of a multi-document work, then the parent document might also be part of the TITLE tag.

For example, the following title would be meaningless on its own: < TITLE> Introduction< /TITLE> . A better title would include the context of the document, such as: < TITLE> Dead Sea Scrolls Exhibit--Introduction< /TITLE> .

Providing meaningful titles to hypertext documents facilitates the web-like linkage of congruent resources. In HTML, the TITLE tag serves not only as a concise description of a document, but it also helps users to navigate among a set of documents that, in actuality, might exist on a number of different Web servers throughout the world.

While no limit is placed on the length of this tag, titles should be kept brief. Since documents may be displayed or utilized by many different types of client software, there is no guarantee that lengthy titles will not be truncated, possibly resulting in the loss of information. A "rule of thumb" is to keep document titles to less than 64 characters.

HEADING TAGS (< H1> TO < H6> )

HTML provides for six different levels of HEADING tags: < H1> , < H2> , < H3> , < H4> , < H5> , and < H6> . The most prominent heading is assigned the < H1> tag. A subsection would be marked with the < H2> tag. A section within the subsection would be designated with the < H3> tag.

The Web browser supplies the appropriate sized font for each heading level. The font used for an < H1> heading is more prominent than the font for an < H2> heading, whereas the font used for an < H2> heading is more prominent than the font used for an < H3> heading.

Heading numbers should not be mixed. For example, < H1> Jeff's Home on the Net< /H2> would not be a valid heading because the number of the end tag is not the same as the number for the start tag.
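
A mismatch of this kind is easy to detect automatically. The sketch below, which assumes Python's standard html.parser module, reports each heading whose end tag carries a different number than its start tag. The names HeadingChecker and mismatched_headings are hypothetical, and the check is deliberately simple: it only compares each heading end tag with the most recent heading start tag.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingChecker(HTMLParser):
    """Collect (start tag, end tag) pairs whose heading levels differ."""

    def __init__(self):
        super().__init__()
        self.open_heading = None
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.open_heading = tag

    def handle_endtag(self, tag):
        if tag in HEADINGS:
            # A missing start tag is reported as None in the first position.
            if tag != self.open_heading:
                self.problems.append((self.open_heading, tag))
            self.open_heading = None


def mismatched_headings(markup):
    """Return a list of (start, end) heading pairs that do not match."""
    checker = HeadingChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems


print(mismatched_headings("<H1>Jeff's Home on the Net</H2>"))
print(mismatched_headings("<H1>Jeff's Home on the Net</H1>"))
```

The first call should report the h1/h2 pair from the invalid example above; the second should report nothing.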

Since headings, unlike titles, are displayed by browsers with the text of the HTML document itself, the < H1> heading is the most visually prominent text displayed.

PARAGRAPH TAG (< P> )

A mild controversy exists among HTML authors about the PARAGRAPH tag (< P> ). Early HTML specifications used the PARAGRAPH tag to indicate a paragraph break. However, using the tag in this manner only provided formatting information to browsers (i.e., when to add blank space around text). Many HTML users viewed the PARAGRAPH tag as a start tag "containing" a block of text that functioned as a paragraph, just as a HEADING tag contained text that functioned as a header.

The latest revision of the HTML specification indicates that the PARAGRAPH tag represents a paragraph and not a paragraph break. The value of viewing the PARAGRAPH tag as a container, rather than a separator, is that a containing tag conveys structural information, whereas a separating tag simply implies formatting. Think of the PARAGRAPH tag as containing a block of text that functions as a paragraph and not as a tag that only separates one text block from another.

To keep documents in conformance with current HTML practice, it is best to place the < P> tag at the beginning of each paragraph. Note that the end tag (< /P> ) is optional and is usually left out.

The use of the PARAGRAPH tag to force the addition of white space around text that is not a paragraph is strongly discouraged.

PRESENTATION TAGS

One of the primary hurdles that authors face in preparing HTML documents is moving from a presentation perspective to a structural one. Word processors focus the author's energies on the presentation of a document. When preparing documents in a word processor, the author considers fonts and other presentation characteristics such as bold, underlining, italics, and bullets.

HTML was initially designed to allow the author to focus on a document's content rather than its presentation. The software that displays HTML documents is responsible for rendering the document for the appropriate display device. The intention of HTML, as with other structural markup languages, is to relieve the author from presentation considerations. HTML documents should be platform independent: the same HTML document should look just as good with NCSA Mosaic for the Macintosh as with NCSA Mosaic for Microsoft Windows.

One of the problems that many people have in learning HTML is that not all Web browsers support the same conventions. For example, early versions of NCSA Mosaic for Microsoft Windows only supported a limited set of tags. When creating HTML documents, the author faces a trade-off. If all HTML tags are used, some current browsers may display only a subset of them. If the markup is restricted to accommodate current browser limitations, it will need to be upgraded as these browsers become more sophisticated.

HTML presentation elements are divided into two sets: logical style and physical style. Logical tags describe the role that the text plays in a document, such as a citation, a definition, or an emphasized statement. Physical tags simply indicate the desired appearance of text, such as bold or italics. While it is natural in a word processor to indicate when text should be in bold or italics, the author of a HTML document should use logical rather than physical elements whenever possible. As Coombs et al. note: 4

Using descriptive markup to identify the logical elements of a document not only simplifies composition, maintenance, collaboration, and publication, it also enables authors to apply a wide range of tools for composition assistance. This feature must be exploited if text processing is going to fulfill its original promise to significantly assist scholarly composition and become more than just improved typing.

LOGICAL TAGS

Table 1 describes the major HTML logical tags.

Table 1.  Major Logical Tags

	<EM> </EM>                  Emphasis, usually italics.
	<STRONG> </STRONG>          Strong emphasis, usually bold.
	<DFN> </DFN>                Definition term, usually bold.
	<CITE> </CITE>              Citation, usually italics.
	<BLOCKQUOTE> </BLOCKQUOTE>  Quotation, usually italics.

To see how logical tags are used, let's add a couple of them to the example home page from Figure 3. Figure 4 shows the modified home page.

Figure 4.  Home Page With Logical Tags

	<TITLE>Jeff's Home Page</TITLE>

	<H1>Jeff's Home on the Net</H1>

	<P>Your <EM>home page</EM> may include links
	to other information sources on the network, information
	about yourself, and even your photograph. HTML provides
	the flexibility of crafting a <STRONG>toolbox of networked
	resources</STRONG> that meets your needs.

	<BLOCKQUOTE>If a man does not keep pace with his
	companions, perhaps it is because he hears a different drummer.
	Let him step to the music which he hears, however measured
	or far away.</BLOCKQUOTE>

	<CITE>Henry David Thoreau.  Walden, 1854.</CITE>


Additional logical tags that provide information about the text contained in the tagged element are < CODE> for examples of computer programming code, <KBD> for examples of text typed from a keyboard, < SAMP> to indicate a sample sequence of characters, and < VAR> to specify the enclosed text as the name of a variable. Appendix A describes these tags.

PHYSICAL TAGS

The BOLD tag (< B> ), which is a physical tag, most closely corresponds to the STRONG tag (< STRONG> ), which is a logical tag; and the ITALICS tag (< I> ), which is a physical tag, corresponds to the EMPHASIS tag (< EM> ), which is a logical tag. However, it should not be assumed that text marked with the EMPHASIS tag will be in italics. The physical tags can be used to force the desired type of presentation. Of course, the display device must be capable of presenting characters in the designated format. For instance, on a character-screen terminal, text marked with the ITALICS tag may actually be rendered as bold because italics cannot be displayed on such a monitor.

Whenever a Web browser encounters tags that it does not understand, those tags will be ignored; however, the text within the tags will still be displayed. Some primitive Web browsers do not understand logical tags and simply display text enclosed within logical tags without any highlighting.
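
The "ignore the tag, keep the text" behavior can be imitated in a few lines. The following sketch assumes Python's standard html.parser module and an imaginary browser that understands only the BOLD and ITALICS tags; the set KNOWN_TAGS, the class PrimitiveBrowser, and the bracketed output notation are all inventions for this illustration, not a description of any real browser.

```python
from html.parser import HTMLParser

# Hypothetical: the only tags this imaginary early browser understands.
KNOWN_TAGS = {"b", "i"}


class PrimitiveBrowser(HTMLParser):
    """Mark up known tags in brackets; drop unknown tags but keep their text."""

    def __init__(self):
        super().__init__()
        self.output = []

    def handle_starttag(self, tag, attrs):
        if tag in KNOWN_TAGS:
            self.output.append(f"[{tag}]")

    def handle_endtag(self, tag):
        if tag in KNOWN_TAGS:
            self.output.append(f"[/{tag}]")

    def handle_data(self, data):
        # Text is always displayed, whether or not the enclosing tag is known.
        self.output.append(data)


def render(markup):
    """Return the text as the imaginary browser would present it."""
    browser = PrimitiveBrowser()
    browser.feed(markup)
    browser.close()
    return "".join(browser.output)


print(render("Your <EM>home page</EM> is a <B>toolbox</B>."))
```

In the output, the text inside the unrecognized EMPHASIS tag still appears, but without any highlighting, while the text inside the BOLD tag is marked.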

The seemingly conflicting recommendation to use logical tags over physical tags, even though not all Web browsers yet support the former, reflects the early stage of development of HTML and the Web itself. As Web technology matures, more scalable solutions will evolve. The movement away from physical tags towards logical tags reflects this ongoing evolution.

PREFORMATTED TEXT TAG (< PRE> )

The PREFORMATTED TEXT tag (< PRE> ) instructs Web browsers to preserve the formatting (i.e., character and line spacing) of the enclosed text and present it in a standard, monospace font. This tag is used for text that would become unintelligible if displayed in a proportional font. Because HTML does not currently provide for the display of tables or matrices, the preformatted text element serves as an easy way to display tabular information in HTML documents.

Since the PREFORMATTED TEXT tag retains the hard returns of the original ASCII text, neither the PARAGRAPH tag nor any of the highlighting tags should be used within preformatted text; however, hypertext links may be included.

Actually, the easiest way to create a HTML document is through the use of the PREFORMATTED TEXT tag. By inserting < PRE> at the beginning of a document and < /PRE> at the end of a document, you can create a HTML file. This quick and dirty approach creates correspondingly unattractive, but readable, documents. As with all HTML documents, the file extension must be ".html" (or ".htm" if the documents are being served from a Microsoft DOS or Windows machine).
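
The quick and dirty approach can be automated. The sketch below is one possible way to do it in Python and is not taken from any HTML tool; the function name wrap_preformatted is hypothetical. One step goes beyond the description above and is an assumption of this example: the characters "<", ">", and "&" in the original text are escaped so that a browser does not mistake them for markup.

```python
from html import escape


def wrap_preformatted(text):
    """Wrap plain ASCII text in PRE tags so it can be served as HTML."""
    # Assumption: escape <, >, and & so they are displayed rather than
    # interpreted as markup. Quotation marks are left unchanged.
    return "<PRE>\n" + escape(text, quote=False) + "\n</PRE>\n"


sample = "Item      Count\nBooks     < 100\nJournals  > 200"
print(wrap_preformatted(sample))
```

The result could then be saved under a file name ending in ".html" (or ".htm").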

LINE BREAK TAG (< BR> )

It's often difficult for new HTML users to figure out how to control the line spacing of a document. Many tags, such as HEADING tags, add an extra space below their end tags. If you simply want to simulate the appearance of a carriage return, use the LINE BREAK tag (< BR> ). Figure 5 illustrates the use of the LINE BREAK tag.

Figure 5. LINE BREAK Tag Example

<P>Jeff Barry<BR>
Cooperative Information Services Librarian<BR>
The University of Tennessee Libraries<BR>
Knoxville, Tennessee<BR>

The LINE BREAK tag is useful in displaying addresses. Notice that if the LINE BREAK tag were omitted, the Web client would display the lines as if they flowed together without any separation. Inserting a regular carriage return in an ASCII text file has no effect on the way a HTML document is displayed (except in preformatted text).
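
The difference between a carriage return in the source file and a LINE BREAK tag can be imitated with a short program. This is a simplified sketch in Python, assuming the standard html.parser module; the names LineBreakRenderer and render_lines are hypothetical, and a real browser's handling of white space is more involved than this.

```python
import re
from html.parser import HTMLParser


class LineBreakRenderer(HTMLParser):
    """Start a new display line only where a BR tag occurs."""

    def __init__(self):
        super().__init__()
        self.lines = [""]

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.lines.append("")

    def handle_data(self, data):
        self.lines[-1] += data


def render_lines(markup):
    """Return the display lines, with source white space collapsed."""
    renderer = LineBreakRenderer()
    renderer.feed(markup)
    renderer.close()
    # Carriage returns and runs of spaces in the source become single spaces.
    collapsed = [re.sub(r"\s+", " ", line).strip() for line in renderer.lines]
    return [line for line in collapsed if line]


with_breaks = "<P>Jeff Barry<BR>\nThe University of Tennessee Libraries<BR>\n"
without_breaks = "<P>Jeff Barry\nThe University of Tennessee Libraries\n"
print(render_lines(with_breaks))
print(render_lines(without_breaks))
```

With the LINE BREAK tags, the address is displayed on two lines; without them, the two source lines flow together into one.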

HORIZONTAL RULE TAG (< HR> )

Many document authors take advantage of the HORIZONTAL RULE tag (< HR> ) to provide a visual means of dividing their documents. Whenever a Web browser encounters a HORIZONTAL RULE tag, it displays a horizontal divider line across the screen.

Figure 6 presents an example of the HORIZONTAL RULE tag.

	Figure 6.  HORIZONTAL RULE Tag Example

		<TITLE>Jeff's Home Page</TITLE>

		<H1>Jeff's Home on the Net</H1>
		<P>Your <EM>home page</EM> may include
		links to other information sources on the network,
		information about yourself, and even your photograph.
		HTML provides the flexibility of crafting a <STRONG>
		toolbox of networked resources</STRONG> that meets
		your needs.

		<HR>

CREATING HYPERTEXT LINKS

One of the most exciting aspects of HTML is its ability to create hypertext links. Links can be created to items within the same document, to other documents on the same server, or to any document on the Internet. Links are used to relate one document to another. The format for specifying a hypertext link consists of at least three parts: the ANCHOR tag, the network address of the document to be linked, and the text to be displayed in the formatted document. Links are anchored to specific text within a document.

ANCHOR TAG (< A> )

The first step in creating a hypertext link is to determine the text that will represent the link. The text provided for the link should give the user an indication about the content of the link.

Since many HTML documents also serve as printed documentation, hypertext links should provide meaning and readability in the context of the surrounding text without incorporating computer specific actions such as clicking a mouse. Don't create a link that says: "For a hypermedia interface to the Library of Congress' 1492: An Ongoing Voyage Exhibit, click here." Rather, make a link that says: "A hypermedia interface to the Library of Congress' 1492: An Ongoing Voyage Exhibit is available."

Enclose text within an ANCHOR tag to designate it as a hypertext link. The text between the starting tag (< A> ) and ending tag (< /A> ) will be displayed by the Web browser with special emphasis, usually underlined and in a separate color from the other text. Using the ANCHOR tag alone, however, does not constitute a valid link. A network address, in the form of a Uniform Resource Locator (URL), that specifies the document to be retrieved must be included as part of the hypertext link.

In HTML, many start tags can have optional attributes (a description of attributes for all tags is provided in Appendix A). An attribute consists of a name, followed by an equal sign, followed by a value for that attribute. The value of the attribute should be enclosed in double quotes. An important attribute of the starting ANCHOR tag is named HREF (think of hypertext reference). The value of HREF is the location of the document to be retrieved.

Figure 7.  ANCHOR Tag Example

	<A HREF="http://sunsite.unc.edu/expo/1492.exhibit/Intro.html">
	1492: An Ongoing Voyage</A>

Notice that the value of the HREF attribute is always in the form of a URL.
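
The name, equal sign, and quoted value of an attribute can be read back out of a document by software. As a rough illustration, the sketch below uses Python's standard html.parser module to list each hypertext link in a piece of markup as a pair of HREF value and link text. The names LinkCollector and collect_links are assumptions made for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (HREF value, link text) pairs from ANCHOR tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs; names arrive in lowercase.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse line breaks and extra spaces inside the link text.
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def collect_links(markup):
    """Return a list of (url, text) pairs for the links in the markup."""
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.links


markup = (
    '<A HREF="http://sunsite.unc.edu/expo/1492.exhibit/Intro.html">\n'
    "1492: An Ongoing Voyage</A>"
)
print(collect_links(markup))
```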

Uniform Resource Locators (URLs)

The Web uses Uniform Resource Locators (URLs) as its standard way of referencing information on the Internet. The use of a standard addressing scheme by authors of HTML documents allows computer programs to interpret the address, use the appropriate Internet protocol (e.g., FTP, Telnet, and HTTP), and automate the retrieval of the specified item with the "click of a button." Two forms of specifying URL syntax are available to authors: absolute URLs that contain the full addressing syntax and partial URLs.

Absolute URLs

Absolute URLs are the most common type of URL, and they should always be used to link to documents on Gopher servers. A URL is divided into three parts: the protocol, the machine name, and the path (i.e., protocol://machine.name[:port]/path). The first part of the URL names the Internet protocol used for accessing the document, such as "ftp," "gopher," "http," "telnet," and other supported protocols. The second part of the URL identifies the name of the document server, such as "sunsite.unc.edu." (Some servers run protocols on nonstandard ports; if so, the alternate port number, preceded by a colon, follows the machine name.) The final part of the URL represents the path of the document to be retrieved. Separating the protocol from the machine name in the URL is a colon (:) and two forward slashes (//). Separating the machine name from the path is one forward slash (/).

	Figure 8.  Example URL

		http://sunsite.unc.edu/expo/1492.exhibit/Intro.html
		|      |              |
		|      |              Path
		|      Machine Name
		Protocol
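
The same three-part division can be obtained in software. The following sketch uses the urlsplit function from Python's standard urllib.parse module; the helper name url_parts and the dictionary keys are choices made for this illustration. Note that urlsplit reports the path with its leading forward slash, and that the port is None when the URL does not name one.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split an absolute URL into protocol, machine name, port, and path."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "machine": parts.hostname,
        "port": parts.port,  # None unless the URL names a port
        "path": parts.path,
    }


for url in (
    "http://sunsite.unc.edu/expo/1492.exhibit/Intro.html",
    "telnet://database.carl.org/",
):
    print(url_parts(url))
```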

Occasionally, a reference to a URL has no path (e.g., http://www.cityscape.co.uk/). In most instances, this will be an acceptable URL. Depending upon the configuration of the particular Web server at that destination, either an index of files in the server's root directory will be generated or a HTML document named "index.html" will be retrieved. The document named "index.html" may not be an index per se, but a default home page that is delivered whenever a path is not specified.

HTTP URL

The URL in Figure 9 specifies the protocol as the HyperText Transfer Protocol ("http"). The document is located on the machine with the host name of "sunsite.unc.edu." The "expo/1492.exhibit/Intro.html" part of the URL represents the path of that document on the Web server.

	Figure 9.  Example HTTP URL

		http://sunsite.unc.edu/expo/1492.exhibit/Intro.html

Gopher URL

Figure 10 is an example of a URL pointing to a Gopher server. This particular server is hosted by the University of Tennessee Libraries. The URL points to the Smoky Mountain Database, which contains information about biodiversity and environmental issues in the Appalachians.

Figure 10.  Example Gopher URL

	gopher://www.lib.utk.edu/11/Information-by-Subject/S%3a/smokies
	|        |              |
	|        |              Path
	|        Machine Name
	Protocol


Telnet URL

Since the Telnet protocol opens an interactive terminal session, a path is not needed. The URL in Figure 11 connects via Telnet to the CARL UnCover system.

		Figure 11.  Example Telnet URL

			telnet://database.carl.org/
			|        |
			|        Machine Name
			Protocol

Anonymous FTP URL

Documents that reside on anonymous FTP servers can be accessed by use of a URL like the one in Figure 12.

	Figure 12.  Example Anonymous FTP URL

		ftp://ftp.cni.org/pub/LITA/tiip-forum/proceedings.html
		|     |          |
		|     |          Path
		|     Machine Name
		Protocol


It is also possible to link to subdirectories rather than to a specific document. For example, the URL in Figure 13 links to a subdirectory that contains the document Principles for the Development of the National Information Infrastructure in various formats. The HTML document returned by this link is an index of the files in that subdirectory. Each file name in the directory listing becomes a link to the specific document. In this manner, the Web provides an easy way of retrieving documents from anonymous FTP servers.

Figure 13.  Example Anonymous FTP Subdirectory URL

<P>The proceedings <A HREF="ftp://ftp.cni.org/pub/LITA/tiip-forum/proceedings.html">Principles for the Development of the National Information Infrastructure</A> from ALA's Telecommunications and Information Infrastructure Policy Forum are available on the Internet.

Partial (or Relative) URLs

A partial URL specifies the path of a link relative to the originating document. When encountering a partial URL, the Web software assumes that the protocol and the machine name for the destination of the hypertext link are the same as that of the document that contains the link.

LINKING TO OTHER DOCUMENTS ON THE SAME SERVER

Hypertext documents on the Web often consist of links among multiple files on the same server. The following example shows how partial URLs can be used to create hypertext links to documents on the same server. The originating document, the homepage.html file, is in the webfiles directory. The destination links are the htmlguides.html file, which is also in the webfiles directory; the editors.html file, which is in the tools subdirectory of webfiles; and the userguide.html file, which is in the lbryfiles directory. (Note that the lbryfiles directory is not a subdirectory of webfiles.) The directory structure for the files is shown in Figure 14, and the marked up text of the homepage.html file is shown in Figure 15.

	Figure 14.  Example Directory Structure

	+-------------------------------------------+
	|                                           |
	webfiles                                    lbryfiles
	|                                           |
	+---------------+-----------------+         userguide.html
	|               |                 |
	homepage.html   htmlguides.html   tools
	                                  |
	                                  editors.html

Figure 15.  Contents of Homepage.html

	<P>More information about creating documents for the Web can be
	found in <A HREF="htmlguides.html">Guides to HTML</A>.  To
	facilitate the authoring of HTML documents a number of <A
	HREF="tools/editors.html">HTML editors</A> are being
	developed. <A HREF="../lbryfiles/userguide.html">Ways of Using
	Networked Resources</A> in the library is another document for
	learning to use the Internet.

The path of a URL is a hierarchical "naming space" similar to a directory and file name structure. The conventions used for referencing names are patterned after the UNIX file system. Each forward slash (/) in the URL's path statement is a division of the hierarchy. Items to the left of a forward slash have a greater precedence in the hierarchy than items to the right of the forward slash. For example, in Figure 14, editors.html is a part of tools which, in turn, is a part of webfiles. A convention for navigating the UNIX filesystem is that following the change directory command with a space and two periods (cd ..) moves the user up one level in the directory hierarchy. Consequently, the user could type "cd ../new_directory_name" to move up one directory level and then move into a new directory that branches off of that same level. This capability can be expressed in relative URLs, as shown in the example of moving from the document homepage.html to the document userguide.html (../lbryfiles/userguide.html).
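
The way Web software completes a partial URL can be demonstrated with the urljoin function from Python's standard urllib.parse module. In this sketch, the machine name www.example.edu and the resulting base URL are hypothetical; only the directory and file names come from Figures 14 and 15. The helper name resolve is also an assumption of this example.

```python
from urllib.parse import urljoin

# Hypothetical location of the originating document from Figure 14.
BASE = "http://www.example.edu/webfiles/homepage.html"


def resolve(partial, base=BASE):
    """Return the absolute URL that a partial URL refers to."""
    return urljoin(base, partial)


for partial in (
    "htmlguides.html",
    "tools/editors.html",
    "../lbryfiles/userguide.html",
):
    print(f"{partial:30} -> {resolve(partial)}")
```

The protocol and machine name are taken from the base URL in every case, and the two periods in the third partial URL move up one level, out of webfiles, before descending into lbryfiles.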

The importance of hierarchical naming and relative URLs is that their use allows HTML documents to be constructed on one machine and easily moved to another. This capability is very useful for authors who do not have user access to a Web server. Through the use of partial URLs, it is quite common for documents to be written and marked up on a PC (or Macintosh) and then FTP'd to a UNIX machine functioning as a Web server. For instance, the text files for four of the Library of Congress' Web exhibits were written in Washington, D.C.; the HTML markup of these files was done in the Netherlands and in Tennessee; and the resulting HTML files were transferred to a Web server in North Carolina. The use of relative URLs made this world-wide endeavor much easier.

FRAGMENT IDENTIFIERS

Fragment identifiers are established by using the NAME attribute of the ANCHOR (< A> ) tag. Normally, the value of the NAME attribute is a mnemonic for the anchored text. Whenever the NAME attribute is used, the anchored text can be the destination of a link, and it is a means of identifying a fragment of the document. It is possible to create links to specific sections of a document only when those sections have been anchored and a value has been given for the NAME attribute. It is a good idea to use the NAME attribute so that future authors can create links to specific areas of your documents. (Of course, regular hypertext links to the documents can always be created.)

The use of the NAME attribute also permits the creation of links within the same document. This can be done at the top of a large file to permit users to quickly access relevant sections of the document rather than forcing users to scroll through the entire document.

Figure 16.  Example Fragment Identifiers

	<P>The <A HREF="documentname#coombs">
	article by Coombs et al.</A> is an
	excellent overview of markup practices for
	scholarly texts.

	<P>More text could go here.  Notice
	how fragment identifiers can be used to create footnotes.

	<HR>

	<P><A NAME="coombs">James H. Coombs,
	Allen H. Renear, and Steven J. DeRose</A>,
	"Markup Systems and the Future of Scholarly Text
	Processing," Communications of the ACM 30
	(November 1987): 933-947.
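
Because a link to a fragment only works when a matching NAME value exists, it can be worthwhile to check a document for fragments that lead nowhere. The sketch below is one possible check, written in Python with the standard html.parser and urllib.parse modules. The names AnchorIndex and dangling_fragments are hypothetical, and the sketch makes a simplifying assumption: every HREF that carries a fragment is treated as pointing into the same document being checked.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class AnchorIndex(HTMLParser):
    """Record NAME values and the fragments referenced by HREF values."""

    def __init__(self):
        super().__init__()
        self.names = set()
        self.fragments = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        if attributes.get("name"):
            self.names.add(attributes["name"])
        if attributes.get("href"):
            # urldefrag separates "documentname#coombs" into the URL
            # ("documentname") and the fragment ("coombs").
            fragment = urldefrag(attributes["href"]).fragment
            if fragment:
                self.fragments.add(fragment)


def dangling_fragments(markup):
    """Return referenced fragments that have no matching NAME anchor."""
    index = AnchorIndex()
    index.feed(markup)
    index.close()
    return sorted(index.fragments - index.names)


document = (
    '<P>The <A HREF="documentname#coombs">article by Coombs et al.</A>\n'
    '<HR>\n'
    '<P><A NAME="coombs">James H. Coombs</A>'
)
print(dangling_fragments(document))
```

For the markup of Figure 16, every referenced fragment has a NAME anchor, so the list of dangling fragments is empty.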

IMG TAG (< IMG> )

Images that are placed within HTML documents are called "in-lined images." Supported image formats include GIF, JPEG, and bitmaps. Images can be scanned photos or original graphics created with a paint program. The IMG tag (< IMG> ) indicates that an image should be included in a HTML document. The IMG tag has three attributes. The most important attribute is SRC (think of source), which has as its value the URL of the image. The SRC value may be a partial or absolute URL depending upon the location of the image. The second attribute of the IMG tag is ALT. The value of ALT is the text that should be displayed in web browsers that do not support in-lined images, such as Lynx. The third attribute is ALIGN, which indicates whether to align the text alongside the top, middle, or bottom of the image when the document is displayed. Legal values for ALIGN are "top," "middle," or "bottom"; the default is "bottom."

Figure 17 shows the use of the IMG tag.

Figure 17.  Example IMG Tag

	< IMG SRC="machu_p.gif" ALT="Ruins at Machu Picchu, Peru"> 
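
The ALIGN attribute can be added to the same kind of tag. As a minimal sketch, the variation below asks the browser to align the adjacent text with the top of the image rather than with the default bottom; the sentence following the image is only illustrative.

```html
<!-- ALIGN accepts "top", "middle", or "bottom"; "bottom" is the default -->
<P><IMG SRC="machu_p.gif" ALT="Ruins at Machu Picchu, Peru" ALIGN="top">
This text begins alongside the top of the image.
```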


When using an image as a link, the IMG tag is inserted alongside or in place of the text of the anchor. For example, Figure 18 links a small image to a larger version of the same image.

Figure 18.  Example In-lined Image Linked to a Larger Image

	< A HREF="../full-images/lg.machu_p.gif"> < IMG
	SRC="machu_p.gif"> Ruins at Machu Picchu, Peru< /A>


Whenever a user clicks the small image, the link retrieves the larger one. Notice that the text is also a link to the image so that the user may click on either the image or the text to activate the link. Browsers usually display images functioning as links with a thicker border than those that surround decorative images. A frequent mistake that HTML authors make with this kind of link is to forget to add the ending ANCHOR tag (< /A> ).

The thumbnail images that are an essential part of so many Web documents can be created with a number of tools. Most image viewers include a way to reduce the size of an image. If scanning from a photo, one might want to consider using a Kodak Photo CD, which provides five different resolutions (one of which is excellent for Web documents).

LISTS

The hierarchical structure of the Gopher software has proven to be very useful, and authors can retain a hierarchical menu when creating Web documents. A drawback of Gopher is that only limited information can be provided within the hierarchical menu itself. Readers normally have to select a Gopher menu item and then view a README file (or some other documenting file) to determine the system's scope. Since HTML provides the capability for hypertext links to be included within text, it's easy to provide some descriptive information about a document before the link is selected. In some ways, one can look at the Web as simply extending the capabilities of Gopher to the next logical level.

A good example of using related text with a hypertext link is NCSA's "What's New" pages (http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/whats-new.html). The entries on the "What's New" page are simply separated into paragraphs in order to create a list. Another approach would be to use the HTML list tags.

UNORDERED LIST TAG (< UL> )

In HTML, lists are simply sequences of paragraphs that may be prefaced with special characters. A common means of organizing home pages is to separate the different items into an UNORDERED LIST. A bullet preceding each item calls attention to that item. The tag to begin an UNORDERED LIST is < UL> . Think of "UL" as representing "unordered list." The tag to end an UNORDERED LIST is < /UL> . Each item in a list must be preceded by an < LI> tag. This tag represents the list entry. A list can contain many separate items. The < LI> tag does not require an end tag.

In Figure 19, hypertext links to documents on the Internet are presented as an UNORDERED LIST.

Figure 19.  Example UNORDERED LIST

	< UL>
 
	< LI>  Visit the 
	< A HREF="http://sunsite.unc.edu/expo/1492.exhibit/Intro.html"> 
	1492: An Ongoing Voyage< /A>  Exhibit by the 
	Library of Congress to learn about the early exploration 
	of the Western Hemisphere.

	< LI>  Biodiversity and environmental issues in the 
	Appalachians are the themes of the 
	< A HREF="gopher://www.lib.utk.edu/11/Information-by-Subject/S%3a/smokies"> 
	Smoky Mountain Database< /A>.

	< LI> The < A HREF="telnet://database.carl.org/"> 
	CARL Corporation< /A> provides an excellent 
	interactive service accessible over the Internet.

	< LI>  The proceedings 
	< A HREF="ftp://ftp.cni.org/pub/LITA/tiip-forum/proceedings.html"> 
	Principles for the Development of the National Information 
	Infrastructure< /A>  from ALA's Telecommunications and
	Information Infrastructure Policy Forum are available 
	on the Internet.

	< /UL> 

MENU LIST TAG (< MENU> )

A MENU LIST is appropriate for very short items. A MENU LIST groups the items more closely together, and there is normally only one item per line. Because of the line length limitations, a MENU LIST can be considered to be the Web's equivalent of a Gopher hierarchical menu. To start a MENU LIST, use the < MENU> tag. Each item in the menu list is designated by the < LI> tag. To close a MENU LIST, use the < /MENU> tag.

Figure 20.  Example MENU LIST

	< MENU>

	< LI> < A HREF="http://sunsite.unc.edu/expo/1492.exhibit/Intro.html"> 
	1492: An Ongoing Voyage Exhibit< /A> 

	< LI> < A HREF="gopher://www.lib.utk.edu/11/Information-by-Subject/S%3a/smokies"> 
	Smoky Mountain Database< /A> 

	< LI> < A HREF="telnet://database.carl.org/"> CARL Corporation< /A>

	< LI> < A HREF="ftp://ftp.cni.org/pub/LITA/tiip-forum/proceedings.html"> 
	Principles for the Development 
	of the National Information Infrastructure< /A>

	< /MENU>


ORDERED LIST TAG (< OL> )

A third type of HTML list is the ORDERED LIST. The tag to begin an ORDERED LIST is < OL> . Each item in the list is designated by the < LI> tag. When an ORDERED LIST is displayed in a Web browser, each < LI> tag is replaced by an Arabic numeral, so that the items are numbered in sequence. Authors do not need to enter the item number when creating an ordered list; Web browsers supply the numbers automatically. A key benefit of an ORDERED LIST is that items can be added in the middle of the list without the author having to manually correct the numerical order. The end of an ORDERED LIST is identified by the < /OL> tag.

Figure 21.  Example ORDERED LIST

	< H1> Creating Hypertext Links in HTML< /H1> 

	< OL> 

	< LI> Identify the destination of links.
	< LI> Determine text to be anchored as the start of a link.
	< LI> Surround text with anchor tags.
	< LI> Insert within the starting anchor tag the URL of the link's
	destination as the value of the HREF attribute.
	< LI> Use the NAME attribute so that the anchored text may also 
	be the destination of a link.
	< LI> Be sure to close the anchor with the end tag.
	< LI> Test the link in a browser.

	< /OL> 


The only differences in the markup for an UNORDERED LIST and an ORDERED LIST are the tags that open and close the list. Use < UL> < /UL> for an UNORDERED LIST and < OL> < /OL> for an ORDERED LIST.

DEFINITION LIST TAG (< DL> )

A DEFINITION LIST provides information in a glossary format. As with other types of lists, a DEFINITION LIST begins and ends with a unique tag. To begin a DEFINITION LIST, use the < DL> tag. Each term in the list is identified by the < DT> tag. In essence, the < DT> tag serves the same function as the < LI> tag in an ORDERED or an UNORDERED LIST. A DEFINITION LIST also needs a tag to identify the definition of the term itself: the < DD> tag.

Figure 22.  Example DEFINITION LIST

	< H1> Glossary< /H1>

	< DL> 

	< DT> URL< DD> Uniform Resource Locator is the standard used to
	refer to documents and their locations on the Internet.

	< DT> NCSA< DD> The National Center for Supercomputing
	Applications created the Mosaic browser, which was 
	instrumental in bringing greater attention to 
	the World-Wide Web.

	< DT> WWW< DD> The World-Wide Web, originating 
	out of CERN in Switzerland, provides hypertext on 
	the Internet through the use of HTTP and HTML.

	< DT> SGML< DD> The Standard Generalized 
	Markup Language is an international standard 
	that describes the structure of a document.

	< DT> DTD< DD> A Document Type 
	Definition (e.g., HTML), specified according to the 
	rules of SGML, describes a document's structure for 
	the purposes of a particular application such as the Web.

	< DT> CERN< DD> The Swiss organization that 
	started the World-Wide Web initiative. The words 
	of the acronym translate into English as the
	European Laboratory for Particle Physics. 

	< /DL>


Neither the < DT> nor < DD> tags require end tags. Remember to use the < DT> and < DD> tags in pairs. Be sure to use the proper closing tag < /DL> for the list.

NESTED LISTS

An extremely useful HTML feature is the ability to nest lists. A nested list can serve as an outline or as a way to show multiple levels of a hierarchical structure of documents. The nested list is most often used with an UNORDERED LIST. Depending on the capabilities of the Web browser, the nested items will be displayed with different types of bullets from items at higher levels of the hierarchy.
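
As a sketch of how nesting works, the outline below places one UNORDERED LIST inside an item of another. The two grouping headings are hypothetical; the item names are taken from resources mentioned elsewhere in this paper.

```html
<UL>
<LI>Library of Congress Exhibits
    <!-- A complete list, with its own start and end tags, inside an item -->
    <UL>
    <LI>1492: An Ongoing Voyage
    <LI>Rome Reborn
    <LI>Dead Sea Scrolls
    </UL>
<LI>Other Resources
    <UL>
    <LI>Smoky Mountain Database
    <LI>CARL Corporation
    </UL>
</UL>
```

Each inner list must be closed with its own < /UL> tag before the next item of the outer list begins. The indentation in the file is only for the author's benefit; the browser determines the levels from the tags.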

ADVANCED HTML FEATURES

The previous sections of this paper described the tags needed to create usable HTML documents; this section provides information about additional structural elements of HTML.

PROLOGUE

The PROLOGUE appears at the very beginning of a HTML file, and it identifies that file as being a HTML document. The primary purpose of the PROLOGUE is to allow software to distinguish HTML documents from other types of SGML documents. All HTML documents written according to the current HTML specifications have the PROLOGUE shown in Figure 23.

Figure 23.  Standard PROLOGUE

	< !doctype html public "-//W3O//DTD W3 HTML 2.0//EN"> 

THE HIERARCHY OF A HTML DOCUMENT

A HTML document is composed of a hierarchy of structural elements. At the top of the hierarchy is the HTML element itself. This element encompasses all other elements; therefore, the < HTML> tag comes at the beginning of the document (just after the PROLOGUE) and the < /HTML> tag appears at the end of the document. Below the HTML element in the hierarchy are the HEAD and BODY elements. The HEAD includes elements that describe the document, such as TITLE. Additional elements (BASE, ISINDEX, LINK, and NEXTID) of the HEAD are described in Appendix A. While the HEAD and related elements are useful to software that processes HTML documents, their use is not required. The BODY tag identifies the primary information content of the document.
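
The hierarchy described above can be sketched as a skeleton document. The title and text are placeholders, not taken from any actual document.

```html
<!-- The PROLOGUE shown in Figure 23 would precede the HTML start tag -->
<HTML>

<HEAD>
<!-- Elements that describe the document -->
<TITLE>Sample Document</TITLE>
</HEAD>

<BODY>
<!-- The primary information content of the document -->
<H1>Sample Document</H1>
<P>The text to be displayed goes here.
</BODY>

</HTML>
```

The HEAD and BODY elements sit side by side within the HTML element, and neither is placed inside the other.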

SPECIAL CHARACTERS

Non-ASCII characters can be displayed using HTML. To represent these characters in a HTML document file, use an ampersand (&), followed by the designated name for the desired character and a closing semicolon; for example, &eacute; represents a lowercase "e" with an acute accent. The ampersand instructs the Web browser to ignore the regular meaning of the letters that follow and to insert the designated character. A full list of special characters is available from the following URL: http://info.cern.ch/hypertext/WWW/MarkUp/ISOlat1.html.
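
A short sketch of special characters in use follows. The sentences are invented for illustration. The first line uses two names from the ISO Latin 1 set; the remaining lines assume the same mechanism for the characters that HTML itself reserves for markup (the left angle bracket, the right angle bracket, and the ampersand), which cannot be typed literally as ordinary text.

```html
<!-- &eacute; is e with an acute accent; &uuml; is u with an umlaut -->
<P>The caf&eacute; in Z&uuml;rich is open.

<!-- &lt; and &gt; display the angle brackets instead of starting a tag -->
<P>A paragraph begins with the &lt;P&gt; tag.

<!-- &amp; displays a literal ampersand -->
<P>Research &amp; Development
```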

ORGANIZING HTML DOCUMENTS

Understanding how to use HTML markup tags is only one aspect of creating an effective hypermedia resource; organizing and linking the documents is far more time-consuming. Unlike printed materials designed to be read sequentially, hypertext documents can be read in an unspecified order. A common effect hypertext has on readers is the sensation of being "lost in hyperspace." Without any navigational aids or clues from the author, readers may lose their sense of orientation as to which documents should be read next and which documents have already been read.

All Web clients should provide some sort of navigational features, such as backtracking and history. Backtracking allows the user to go back to each previous document until he or she has regained a "sense of place." The history feature provides a list of documents that the reader has visited. Clicking on any of the documents listed in the history should return the reader to that point. However, HTML document authors should be aware that each Web browser may implement these features differently and, in some cases, the navigational aids of a browser may not be fully reliable. When providing a large number of related hypertext documents, it is always wise to embed navigational links directly into the documents. Authors will often provide links in the form of images for moving forward and backward among related documents. These links serve as a safeguard against the possibility that a browser may not support the proper means of navigation. A good example of this safeguard is the HTML version of "Entering the World-Wide Web: A Guide to Cyberspace" (http://www.eit.com/web/www.guide/). In addition to providing clues as to where to go next, document authors should always provide the reader with a way to escape out of a series of documents without using the backtrack button. Most often, this escape feature is in the form of a link back to the site's home page.
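
Embedded navigational links of this kind might be sketched as follows for the second document in a series of three. The file names are hypothetical, and text links are used here for simplicity; small images could be placed inside the anchors instead.

```html
<HR>

<!-- Backward and forward links among the related documents -->
<P><A HREF="chapter1.html">Previous</A> |
<A HREF="chapter3.html">Next</A> |

<!-- The "escape" link back to the site's home page -->
<A HREF="index.html">Home Page</A>
```

Placing the same block at the bottom of every document in the series gives readers a consistent way to move around, regardless of what their browser provides.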

Author-supplied navigational aids may take other forms, such as a guided tour that suggests the order in which documents should be read. The Library of Congress' Rome Reborn Exhibit (http://sunsite.unc.edu/expo/vatican.exhibit/Vatican.exhibit.html) demonstrates the effects of using a "virtual tour guide." Another approach is to supply an outline with direct links to selected documents. This outline technique simply nests links, and it was employed for the Library of Congress exhibits: see the outline documents for "1492: An Ongoing Voyage" (http://sunsite.unc.edu/expo/1492.exhibit/overview.html) and the Dead Sea Scrolls Exhibit (http://sunsite.unc.edu/expo/deadsea.scrolls.exhibit/overview.html).

One advanced feature of HTML is the ability to create graphical images and then specify certain areas of the image as links to different documents with the ISMAP attribute of the IMG tag. The University of Tennessee's Office of Research Services has created a HTML document that effectively illustrates the use of ISMAP capabilities (http://solar.rtd.utk.edu/default.html).
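
A minimal sketch of this technique follows. It assumes a hypothetical image file (campusmap.gif) and a hypothetical server program at /cgi-bin/imagemap/campus that translates the location of a click into a destination document. ISMAP is written as an attribute of the IMG tag, and the image is placed inside an anchor whose HREF names the server program.

```html
<!-- The HREF names the (hypothetical) server program that interprets clicks -->
<A HREF="/cgi-bin/imagemap/campus">
<IMG SRC="campusmap.gif" ALT="Map of the campus" ISMAP></A>
```

When the reader clicks on the image, a browser that supports ISMAP sends the coordinates of the click to the URL given in the HREF. The assignment of image areas to documents is configured on the server, not in the HTML file.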

HTML AND SGML

HTML is only one type of markup language based on the Standard Generalized Markup Language (SGML). As an international standard, SGML provides a way of creating markup languages tailored for different types of documents. The defining characteristic of documents marked up using HTML is that they may contain hypertext links to other documents located on a computer network. The "grammar" (or set of permissible tags and their uses) for any SGML-based markup language is defined in a Document Type Definition (DTD). In the strictest sense, HTML is not a markup language, but a specific DTD for SGML. Just as HTML serves the purposes of hypermedia, other DTDs exist for other purposes. For example, the Text Encoding Initiative (TEI) has produced a DTD for the markup of humanities text files. It is possible for gateways to be developed between one DTD and another DTD; however, any functionality found in one DTD, but not in the other, would be lost as the document traversed the gateway.

WHAT'S AHEAD FOR HTML: SOLIDIFYING THE STANDARD

After the issuance of the HTML specification as an Internet Draft in June 1993, the release of several Web browsers resulted in a dramatic increase of HTML usage as well as the emergence of new Web capabilities. As it became evident that the 1993 HTML specification was not providing certain features that authors needed, an effort to extend HTML was begun. The new specification that evolved from this effort was known as HTML+.

However, at the same time, Web browser developers faced increased difficulties in interpreting the HTML tags being used by authors; many HTML documents on the Web failed to even comply with the original specification.

Some of the conditions that resulted in non-conformance with the HTML standard were widespread use of rapidly changing freeware that did not fully support all HTML elements, unclear (and sometimes conflicting) documentation for authors and software implementors about how to use and interpret HTML tags, and authors' persistence in validating HTML documents with one particular Web browser. As a result, some HTML documents only looked good on a specific browser. For many users, HTML was defined not by its specification, but by whatever features their favorite browser supported.

With the advent of commercial Web software and the wide exposure that the Web garnered in many publications, the scalability of the Web was in jeopardy without a strengthening of the HTML specification to define a stable standard for current practice. Based on discussions on Internet mailing lists devoted to the Web and at the First International WWW Conference in Geneva (held in May 1994), a mechanism for recasting the HTML specification was established. The designation HTML+ was dropped in favor of a means for identifying different levels of conformance to the HTML specification.

Table 2. HTML Conformance Levels

	Level 0    Indicates the minimum conformance level.
	Level 1    Includes Level 0 features, plus features such as highlighting and images.
	Level 2    Includes all Level 0 and Level 1 features, plus forms.

HTML elements that conform to Level 0 are implemented in every browser and constitute the basic set of tags needed to create HTML documents.

HTML elements that conform to Level 1, but are not in Level 0, may not necessarily be implemented in the same manner in every browser.

HTML elements that conform to Level 2, but are not in Level 1, are the tags that are used to create forms.

Elements that were to be in the former HTML+ specification will most likely be included in a future HTML 3.0 specification. The aims of HTML 2.0 are to document valid uses of tags in order to provide guidance for authors and software developers and to enable the interoperability of HTML documents among a variety of Web applications.

The fact that the process of solidifying the HTML standard has been a tumultuous one reflects the participatory nature of the Internet and the difficulties of reaching ad hoc consensus among a wide range of users. A more formal means for maintaining the HTML standard was established in July 1994 under the guidance of the Internet Engineering Task Force (IETF), which has responsibility for developing and reviewing Internet standards. The IETF Working Group on HTML is chaired by Tim Berners-Lee, who has been the creative and inspirational force behind the World-Wide Web since its inception.

BRINGING GRAPHIC DESIGN TO THE INTERNET

Often, Web providers initially create servers to promote networked communications to their parent institutions. Tools such as NCSA's Mosaic have become excellent PR devices for the Internet. The ability to place scanned images and original graphics alongside text can make a strong impression on an institution's senior officers. Yet, effectively demonstrating that these tools are worthy of an institutional commitment of time and funds depends upon a quality presentation.

Graphical browsers that display images and multiple fonts have brought graphical design to the Internet; however, the flexibility that HTML allows authors in controlling the appearance of their documents is quite limited. Future work on the HTML standard will likely address these weaknesses with the use of style sheets that allow authors to "specify formatting . . . without distorting the logical structure markup." 5 The visual impact of a document also involves a significant navigational issue or what Jakob Nielsen refers to as the "homogeneity problem":

The differences in graphical design are intended to reduce the homogeneity problem in on-line text, which basically is that on-line text always looks the same. . . . On-line text does not have the variety which traditional text has, due to variations in typefont, book size, color, etc., or even the basic differences in physical looks between, say a real book, a newspaper, and a note written on a napkin in the cafeteria during a lunch break. 6

One HTML example of how the traditional bookcase can be used as a means of orienting users to different online documents can be found at Novell Online Services (http://www.novell.com/). On its home page, Novell has an image of books sitting on a shelf. The spine of each "book" has a title that matches each service that is accessible through the company's Web server. A caption reads "Click on a book to enter the specified area." The image of the bookshelf employs the ISMAP attribute of the IMAGE element. Each area of the server has a smaller version of the same "bookcase" image. The developer of Novell's Web server handled the problem of homogeneity by creatively using the graphic design elements of HTML. Navigating through this Web server is as easy as "pulling a book from the shelf" by clicking on it. However, as Nielsen observes: 7

One might argue that homogeneity could be desirable because it emphasizes the book metaphor and because readers can assimilate information faster when they encounter a familiar format. Of course, this is true to some extent but we would actually want to avoid the book metaphor in our future hypertext designs because it seems to limit the conceptual models of the search potential of hypertext and non-linear navigation of the information system.

For the small information spaces that are often found in the Web, the homogeneity problem can be resolved through the use of familiar organizational metaphors. For larger information spaces, these familiar concepts may not transfer to an electronic environment. What may be needed for large collections of research material is nothing less than a Copernican revolution in the way we look at the written expressions of language. While the World-Wide Web itself will not likely produce such a radical change, the Web does foster the environment for such thinking.

LIBRARY USES OF HTML

HTML increases the librarian's ability to deliver value-added networked information. Rather than only providing printed materials at Internet training sessions, librarians can prepare customized HTML documents for each group of users. As new Internet sites are discovered and added, these documents can become dynamic guides to network resources. Using HTML as the medium for a presentation, rather than an overhead transparency or presentation software, provides a new degree of training versatility. Configuring Web servers so that subject specialists and trainers have control over updating their own HTML documents lessens the system administrator's burden of maintaining documents stored on the server.

Another training impact could be the delivery of hypertext-based instruction using the Web. Previously, this kind of instruction has only been available through software such as HyperCard and ToolBook. While HTML does not have all the advanced features of these packages, it can be used to create instructional environments that not only transcend a single computer platform, but also extend over the campus network and allow students to use the program from locations other than the library. Also, as teachers develop their own networked instructional multimedia materials, there are opportunities to more effectively integrate library resources into the curriculum. By actively exploring the possible uses of HTML, a library can identify needs that can only be met with networked hypermedia.

In addition to enabling new training methods, HTML is a suitable tool for publishing electronic journals and newsletters. While HTML does not scale to fit the complexity of many publications, it does provide enough flexibility for the ASCII-based electronic journals that are currently in existence. Since many articles in these electronic journals already reference other electronic materials, the use of hypertext links would facilitate the retrieval of related information. Furthermore, HTML would allow electronic journals to expand beyond ASCII limitations and include photographs and other images. Finally, with the movement of electronic journals toward HTML, the electronic resources provided by today's libraries might begin to more clearly resemble the networked environment of the future. Indeed, as Ross Atkinson advises:

In considering the future of scholarly information exchange, we must therefore take into account not only the facility of the network but also the effects of computers on scholarly reading and writing. Certainly one of the best approaches to such an assessment is to focus on the phenomenon of hypertext because it is through the concept (if not yet the reality) of hypertext that we begin to sense the most fundamental and far-reaching effects of the computer on the communications in general and scholarly information exchange in particular. 8

CONCLUSION

The Web and HTML bring the reality of hypertext into everyday life for many Internet users. As Web browsers become more sophisticated and pervasive, HTML raises ASCII text to new levels of usability. HTML is becoming a common denominator for accessing electronic services in a networked environment. Through the use of the enabling technologies found in the World-Wide Web, librarians can explore the obstacles and barriers that will need to be overcome in implementing future library services.

NOTES

1. T. J. Berners-Lee, R. Cailliau, J. F. Groff, and B. Pollermann, "World-Wide Web: An Information Infrastructure for High-Energy Physics" (Presented at Software Engineering, Artificial Intelligence, and Expert Systems for High Energy and Nuclear Physics, La Londe-les-Maures, France, January 1992). (Preprint available by anonymous ftp; URL: ftp://info.cern.ch/pub/www/doc/www-for-hep.ps.Z.)

2. John Price-Wilkin, "Using the World-Wide Web to Deliver Complex Electronic Documents: Implications for Libraries," The Public-Access Computer Systems Review 5, no. 3 (1994): 5-21. (To retrieve this article, send the following e-mail message to listserv@uhupvm1.uh.edu: GET PRICEWIL PRV5N3 F=MAIL.)

3. The latest edition of the HyperText Markup Language (HTML), Version 2.0 specification is available from the following URL: http://www.hal.com/~connolly/html-spec.

4. James H. Coombs, Allen H. Renear, and Steven J. DeRose, "Markup Systems and the Future of Scholarly Text Processing," Communications of the ACM 30 (November 1987): 944.

5. Charles F. Goldfarb, The SGML Handbook (Oxford: Clarendon Press, 1990), 93.

6. Jakob Nielsen, "The Art of Navigating Through Hypertext," Communications of the ACM 33 (March 1990): 299.

7. Ibid., 300.

8. Ross Atkinson, "Networks, Hypertext, and Academic Information Services: Some Longer-Range Implications," College & Research Libraries 54 (May 1993): 202.


APPENDIX A: HTML TAGS

In the following entries, the level description is for HTML 2.0.

1.0 ADDRESS

TAG: < ADDRESS> < /ADDRESS>

LEVEL: 0

FUNCTION:

Used to provide authorship information for HTML documents. Normally found at the bottom of documents.

OPTIONAL ATTRIBUTES:

None.

COMMENTS:

ADDRESS is a very useful tag that enables users to quickly identify a document's author. The ADDRESS tag may contain a link to another HTML document that provides additional information about the author. The text of this tag typically appears in italics when displayed.

EXAMPLE:

< ADDRESS> Jeff Barry jeff@utkux.utcc.utk.edu< /ADDRESS>

2.0 ANCHOR

TAG: < A> < /A>

LEVEL: 0

FUNCTION:

The start and end ANCHOR tags surround text that represents a hypertext link. When used with the HREF attribute, the ANCHOR tag represents the origin of a link. When used with the NAME attribute, the ANCHOR tag serves as the destination of a link.

ATTRIBUTES:

HREF: Specifies, in the form of a URL, the document to be retrieved when the link is selected.

NAME: Identifies the text as a specific location within the document. It can be the destination of a hypertext link.

Level 1 Attributes:

TITLE: The value for this attribute is the title of the document given by the URL in the HREF attribute.

URN: Specifies the Uniform Resource Name (URN) for the document given by the URL in the HREF attribute.

METHODS: Specifies functions for the document given by the URL in the HREF attribute. Prior to link activation, METHODS indicates to the user whether the linked document is searchable, is an image, or has some other special function. Depending on the method indicated, links may be displayed differently. Not all Web browsers support this attribute.

PROPOSED ATTRIBUTES:

REL: Provides the relationship between the originating document and the destination document.

REV: Provides the relationship between the destination document and the originating document.

COMMENTS:

For the ANCHOR tag to be functional, at least one of the HREF and NAME attributes must be used; both may be used together. The Level 1 and proposed attributes serve advanced uses of HTML and are by no means required.

EXAMPLE:

< A HREF="http://sunsite.unc.edu/expo/ticket_office.html"> EXPO, a showcase of online exhibits< /A>

3.0 BASE

TAG: < BASE>

LEVEL: 0

FUNCTION:

Records the URL of the document for use by partial URLs, which will be relative to this base URL.

REQUIRED ATTRIBUTE:

HREF: Specifies the URL of the document.

COMMENTS:

The BASE element is only used in the HEAD of a HTML document. When BASE is not present, relative URLs are resolved against the URL used to access the document. Use of the BASE element is not required. The BASE element does not have an end tag.

EXAMPLE:

< BASE HREF="http://sunsite.unc.edu/expo/deadsea.scrolls/Intro.html">

4.0 BLOCKQUOTE

TAG: < BLOCKQUOTE> < /BLOCKQUOTE>

LEVEL: 0

FUNCTION:

Renders enclosed text in a distinguishing manner to indicate a quotation.

OPTIONAL ATTRIBUTES:

None.

COMMENTS:

Specific rendering depends upon the browser, but the text is usually either in italics or with left and right margins indented.

EXAMPLE:

< BLOCKQUOTE> If a man does not keep pace with his companions, perhaps it is because he hears a different drummer. Let him step to the music which he hears, however measured or far away.< /BLOCKQUOTE>

5.0 BODY

TAG: < BODY> < /BODY>

LEVEL: 0

FUNCTION:

The BODY element represents one of the two main hierarchical divisions of a HTML document (the other division being the HEAD). Essentially, the BODY is the text that is to be displayed.

COMMENTS:

The BODY can include the following elements and types of elements: ADDRESS, ANCHOR, BLOCKQUOTE, FORM, Heading elements (e.g., HEADING 1), Highlighting elements (e.g., < B> , < I> , < EM> , and < STRONG> ), HORIZONTAL RULE, IMG, LINE BREAK, List elements (e.g., ORDERED LIST), PARAGRAPH, PREFORMATTED TEXT, and special characters.

EXAMPLE:

< BODY>

Your < EM> home page< /EM> may include links to other information sources on the network, information about yourself, and even your photograph. < /BODY>

6.0 BOLD

TAG: < B> < /B>

LEVEL: 1

FUNCTION:

Indicates that enclosed text should be highlighted using a bold font style.

OPTIONAL ATTRIBUTES:

None.

COMMENTS:

The BOLD element is a physical tag that only provides formatting information. Preferably, use either the < EM> or < STRONG> tags to denote emphasis or strong emphasis.

EXAMPLE:

HTML provides the flexibility of crafting a < B> toolbox of networked resources< /B> that meets your needs.

7.0 CITE

TAG: < CITE> < /CITE>

LEVEL: 1

FUNCTION:

Renders the enclosed text in a distinguishing style to indicate a citation.

COMMENTS:

Usually displayed in italics.

EXAMPLE:

< CITE> Henry David Thoreau. Walden, 1854.< /CITE>

8.0 CODE

TAG: < CODE> < /CODE>

LEVEL: 1

FUNCTION:

A highlighting feature that renders the enclosed text in a distinguishing style to indicate a sample of computer code.

COMMENTS:

Usually displayed in a monospace font.

EXAMPLE:

< CODE>

#!/usr/bin/perl

print "This is a sample of computer code.\n";

< /CODE>

9.0 DEFINITION "DEFINED"

TAG: < DD>

LEVEL: 0

FUNCTION:

Indicates that the designated text is a term definition that is in a definition list.

COMMENTS:

Must follow the content of a < DT> tag. Can only be used within a DEFINITION LIST (< DL> ). Use of a < /DD> end tag is optional.

EXAMPLE:

< DL>

< DT> NCSA< DD>
The National Center for Supercomputing Applications created the Mosaic browser, which was instrumental in bringing greater attention to the World-Wide Web. < /DL>

10.0 DEFINITION LIST

TAG: < DL> < /DL>

LEVEL: 0

FUNCTION:

Formats terms like a glossary.

OPTIONAL ATTRIBUTE:

COMPACT: Reduces the amount of white space between terms. Used as < DL COMPACT> . This attribute has no value.

COMMENTS:

Must be used with the < DT> and < DD> tags to indicate the term to be defined and the definition of that term. Terms are listed in the left column, with the definitions in the right column. Long definitions will wrap to succeeding lines. The < DL> tag must be followed immediately by the < DT> tag.

EXAMPLE:

< DL>

< DT> URL< DD> Uniform Resource Locator is the standard used to refer to documents and their locations on the Internet. < /DL>

11.0 DEFINITION TERM

TAG: < DT>

LEVEL: 0

FUNCTION:

Indicates that designated text is a term to be defined in a DEFINITION LIST.

COMMENTS:

A < DT> is followed by the term, which is then followed by a < DD> tag to indicate the actual definition of the term. The defined term is usually displayed along the left margin. Use of a < /DT> end tag is optional.

EXAMPLE:

< DL>

< DT> WWW< DD> The World-Wide Web, originating out of CERN in Switzerland, provides hypertext on the Internet through the use of HTTP and HTML. < /DL>

12.0 DIRECTORY LIST

TAG: < DIR> < /DIR>

LEVEL: 0

FUNCTION:

A listing feature that displays short items (20 characters or less) in columns.

COMMENTS:

Used in conjunction with the < LI> tags to designate the items in each column. Depending upon its interpretation by browsers and the length of listed items, the DIRECTORY LIST may yield unexpected results. This element is seldom used.

EXAMPLE:

< DIR>

< LI> first column
< LI> second column

< /DIR>

    13.0 EMPHASIS

    TAG: < EM> < /EM>

    LEVEL: 1

    FUNCTION:

    The enclosed text will be highlighted.

    COMMENTS:

    The appearance of text surrounded with the < EM> < /EM> tags is determined by the Web browser. However, the text is normally displayed in italics.

    EXAMPLE:

    Your < EM> home page< /EM> may include links to other information sources on the network, information about yourself, and even your photograph.

    14.0 FORM

    TAG: < FORM> < /FORM>

    LEVEL: 2

    FUNCTION:

    Indicates that a form is included within a HTML document.

    COMMENTS:

    Not all Web browsers support the FORM tag.
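
    EXAMPLE:

    A minimal sketch of a form with one text field and two buttons. The ACTION URL /cgi-bin/search and the field name "title" are hypothetical; a real form needs a program on the Web server that accepts the submitted data.

    ```html
    <!-- ACTION names the server program that receives the data (hypothetical URL) -->
    <FORM METHOD="GET" ACTION="/cgi-bin/search">
    <!-- A one-line text field; NAME labels the value sent to the server -->
    <P>Title: <INPUT TYPE="text" NAME="title" SIZE="40">
    <!-- Buttons to send the form or to clear what was typed -->
    <P><INPUT TYPE="submit" VALUE="Search"> <INPUT TYPE="reset" VALUE="Clear">
    </FORM>
    ```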

    15.0 HEADER

    TAG: < HEAD> < /HEAD>

    LEVEL: 0

    FUNCTION:

    Contains information describing the HTML document.

    COMMENTS:

    Elements contained in the HEADER of a HTML document are not displayed by a browser with the document. Use of a HEADER is recommended, but not required. The following tags are allowed within the < HEAD> tag: BASE, ISINDEX, LINK, NEXTID, TITLE.

    EXAMPLE:

    < HEAD>

    < TITLE> Sample Home Page< /TITLE>

    < /HEAD>

    16.0 HEADING 1

    TAG: < H1> < /H1>

    LEVEL: 0

    FUNCTION:

    The first level heading, represented in a very large bold font and centered.

    COMMENTS:

    The Web browser determines font representation, although the above rendering is recommended by the HTML specifications. Do not use the PARAGRAPH tag (< P> ) within or around headings.

    EXAMPLE:

    < H1> The most prominent text displayed in a document< /H1>

    17.0 HEADING 2

    TAG: < H2> < /H2>

    LEVEL: 0

    FUNCTION:

    The second level heading, represented in a large, bold font with no indentation.

    COMMENTS:

    The Web browser determines font representation, although the above rendering is recommended by the HTML specifications. Do not use the PARAGRAPH tag (< P> ) within or around headings.

    EXAMPLE:

    < H2> Text describing the second level of a document< /H2>

    18.0 HEADING 3

    TAG: < H3> < /H3>

    LEVEL: 0

    FUNCTION:

    The third level heading, represented in a large italic font that is slightly indented.

    COMMENTS:

    The Web browser determines font representation, although the above rendering is recommended by the HTML specifications. Do not use the PARAGRAPH tag (< P> ) within or around headings.

    EXAMPLE:

    < H3> Text describing the third level of a document< /H3>

    19.0 HEADING 4

    TAG: < H4> < /H4>

    LEVEL: 0

    FUNCTION:

    The fourth level heading, represented in a bold normal font and indented.

    COMMENTS:

    The Web browser determines font representation, although the above rendering is recommended by the HTML specifications. Do not use the PARAGRAPH tag (< P> ) within or around headings.

    EXAMPLE:

    < H4> Text describing the fourth level of a document< /H4>

    20.0 HEADING 5

    TAG: < H5> < /H5>

    LEVEL: 0

    FUNCTION:

    The fifth level heading, represented in a normal italic font and indented.

    COMMENTS:

    The Web browser determines font representation, although the above rendering is recommended by the HTML specifications. Do not use the PARAGRAPH tag (< P> ) within or around headings.

    EXAMPLE:

    < H5> Text describing the fifth level of a document< /H5>

    21.0 HEADING 6

    TAG: < H6> < /H6>

    LEVEL: 0

    FUNCTION:

    The sixth level heading, represented in a normal bold font and indented.

    COMMENTS:

    The Web browser determines font representation, although the above rendering is recommended by the HTML specifications. Do not use the PARAGRAPH tag (< P> ) within or around headings.

    EXAMPLE:

    < H6> Text describing the sixth level of a document< /H6>

    22.0 HORIZONTAL RULE

    TAG: < HR>

    LEVEL: 0

    FUNCTION:

    Indicates to a Web browser that a horizontal divider line should be displayed at the designated location within the HTML document.

    COMMENTS:

    No end tag required.

    EXAMPLE:

    < P> Additional information about the < A HREF="http://info.cern.ch/hypertext/WWW/TheProject.html"> World-Wide Web Project< /A> is available for those interested in learning more about networked hypermedia. < HR>


    23.0 HTML LABEL

    TAG: < HTML> < /HTML>

    LEVEL: 0

    FUNCTION:

    Labels the file as being a HTML document.

    COMMENTS:

    The < HTML> tag comes at the beginning of a HTML document and the < /HTML> tag comes at the end. Use of this element is recommended, but not required.

    24.0 IMG

    TAG: < IMG>

    LEVEL: 1

    FUNCTION:

    Indicates that an image is to be included in the HTML document.

    COMMENTS:

    No end tag required.

    MANDATORY ATTRIBUTE:

    SRC Identifies the source of the image file using URL syntax.

    OPTIONAL ATTRIBUTES:

    ALIGN Indicates whether text should be aligned with the top, middle, or bottom of the image. Valid values: top, middle, bottom.

    ALT Specifies the text that should be displayed by web browsers that do not support images.

    ISMAP Indicates that the image is a graphical map that has "hot spots" corresponding to hypertext links.

    EXAMPLE:

    < IMG SRC="machu_p.gif" ALT="Ruins at Machu Picchu, Peru">
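
    The optional attributes can be sketched as follows. The second image, its file name campus.gif, and the URL /cgi-bin/imagemap/campus are hypothetical; ISMAP is useful only when the image is the anchor of a link to a server program that interprets the location of the user's click.

    ```html
    <!-- ALIGN="middle" lines up the adjacent text with the middle of the image -->
    <P><IMG SRC="machu_p.gif" ALIGN="middle" ALT="Ruins at Machu Picchu, Peru"> Ruins at Machu Picchu

    <!-- An ISMAP image placed inside an anchor (hypothetical file and URL) -->
    <P><A HREF="/cgi-bin/imagemap/campus"><IMG SRC="campus.gif" ISMAP ALT="Campus map"></A>
    ```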

    25.0 ISINDEX

    TAG: < ISINDEX>

    LEVEL: 0

    FUNCTION:

    The inclusion of this tag indicates that the HTML document is searchable, provided that the Web server supports this capability.

    COMMENTS:

    No end tag required.
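
    EXAMPLE:

    A sketch of a HEADER that marks a document as searchable. The title text is illustrative, and the tag has an effect only when the Web server is set up to answer queries for the document.

    ```html
    <HEAD>
    <TITLE>Library Staff Directory</TITLE>
    <!-- ISINDEX takes no end tag and encloses no text -->
    <ISINDEX>
    </HEAD>
    ```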

    26.0 ITALICS

    TAG: < I> < /I>

    LEVEL: 1

    FUNCTION:

    These tags indicate that the enclosed text is to be displayed in an italics font.

    COMMENTS:

    The < I> tag represents a formatting instruction to Web browsers. It is recommended that an < EM> tag be used rather than the < I> tag to indicate emphasis.

    EXAMPLE:

    Your < I> home page< /I> may include links to other information sources on the network, information about yourself, and even your photograph.

    27.0 KEYBOARD SAMPLE

    TAG: < KBD> < /KBD>

    LEVEL: 1

    FUNCTION:

    Indicates that the enclosed text is a sample of a keyboard entry that should be highlighted.

    EXAMPLE:

    To search the online catalog for a book by its title, type < KBD> til< /KBD> and press the Enter key.

    28.0 LINE BREAK

    TAG: < BR>

    LEVEL: 0

    FUNCTION:

    Indicates that a new line is to be started instead of automatic line wrap.

    COMMENTS:

    Used to simulate single line spacing when additional white space is not desired. No end tag required.

    EXAMPLE:

    < P> Jeff Barry< BR>
    Cooperative Information Services Librarian< BR>
    The University of Tennessee Libraries< BR>
    Knoxville, Tennessee< BR>

    29.0 LINK

    TAG: < LINK>

    LEVEL: 1

    FUNCTION:

    Indicates a relationship between the HTML document and another document.

    COMMENTS:

    Used only within the HEADER of a HTML document. The LINK element is not widely used at present. No end tag required.
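
    EXAMPLE:

    A sketch of one common use, assuming the browser recognizes the REV="made" convention for identifying the author of a document. The mailto URL simply reuses the address given in the sample document.

    ```html
    <HEAD>
    <TITLE>Sample Home Page</TITLE>
    <!-- REV="made" states that the linked resource is the maker of this document -->
    <LINK REV="made" HREF="mailto:jeff@utkux.utcc.utk.edu">
    </HEAD>
    ```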

    30.0 LIST ITEM

    TAG: < LI>

    LEVEL: 0

    FUNCTION:

    Indicates that the designated text represents an item in a list.

    COMMENTS:

    The < LI> tag is used as the item indicator with the following type of lists: DIRECTORY LIST (< DIR> ), MENU LIST (< MENU> ), ORDERED LIST (< OL> ), and UNORDERED LIST (< UL> ). The < LI> tag has no attributes. Use of a < /LI> end tag is optional.

    EXAMPLE:

    < UL>

    < /UL>
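
    As a brief sketch, list items may be written with or without the end tag. The item text is borrowed from the ORDERED LIST example for illustration.

    ```html
    <UL>
    <!-- End tag omitted: the item ends where the next LI begins -->
    <LI>Identify destination of links.
    <!-- End tag written out explicitly -->
    <LI>Determine text to be anchored as the start of a link.</LI>
    </UL>
    ```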

    31.0 MENU LIST

    TAG: < MENU> < /MENU>

    LEVEL: 0

    FUNCTION:

    This tag is used for a list that contains a small number of items.

    COMMENTS:

    Items in a MENU LIST are often hypertext links to other documents. Like a Gopher menu, each item is usually on a single line.

    EXAMPLE:

    < MENU>

    < LI> < A HREF="http://sunsite.unc.edu/expo/1492.exhibit/Intro.html"> 1492: An Ongoing Voyage Exhibit < /A>
    < LI> < A HREF="gopher://www.lib.utk.edu/11/Information-by-Subject/S%3a/smokies"> Smoky Mountain Database < /A>

    < /MENU>

    32.0 NEXTID

    TAG: < NEXTID>

    LEVEL: 0

    FUNCTION:

    A tag provided by HTML editors as a unique identifier for ANCHOR tags. Used in the HEAD of a HTML document.

    COMMENTS:

    This tag is almost never used by authors composing HTML documents by hand, and most users will not encounter it. Not all Web browsers support this tag.

    33.0 ORDERED LIST

    TAG: < OL> < /OL>

    LEVEL: 0

    FUNCTION:

    Used for a list that contains items in a designated order.

    COMMENTS:

    Items in an ORDERED LIST are automatically numbered when displayed in a Web browser.

    EXAMPLE:

    < OL>

    < LI> Identify destination of links.
    < LI> Determine text to be anchored as the start of a link.
    < /OL>

    34.0 PARAGRAPH

    TAG: < P> < /P>

    LEVEL: 0

    FUNCTION:

    Indicates that a block of text forms a paragraph.

    COMMENTS:

    Formerly used in HTML to separate paragraphs from each other and from other text elements. The PARAGRAPH element in HTML is now a container rather than a separator, and the < P> tag should come at the beginning of a paragraph. The end tag (< /P> ) is optional.

    EXAMPLE:

    < P> The text of the first paragraph of your document is entered here.

    35.0 PREFORMATTED TEXT

    TAG: < PRE> < /PRE>

    LEVEL: 0

    FUNCTION:

    Indicates that the formatting of enclosed text should be preserved and displayed in a standard monospace font.

    EXAMPLE:

    < PRE> 
    
    $275.43   $128.65   $345.89   $234.96   $674.12

    < /PRE>

    36.0 SAMPLE

    TAG: < SAMP> < /SAMP>

    LEVEL: 1

    FUNCTION:

    Used to indicate a sequence of literal characters.

    EXAMPLE:

    The PROLOGUE of a HTML document must be exactly as follows. < SAMP> < !doctype html public "-//W3O//DTD W3 HTML 2.0//EN"> < /SAMP>

    37.0 STRONG

    TAG: < STRONG> < /STRONG>

    LEVEL: 1

    FUNCTION:

    Indicates that the enclosed text is a statement with strong emphasis that should be displayed in an appropriate font.

    COMMENTS:

    Usually displayed in a normal bold font. Should be used rather than the physical tag < B> to indicate strong emphasis.

    EXAMPLE:

    HTML provides the flexibility of crafting a < STRONG> toolbox of networked resources< /STRONG> that meets your needs.

    38.0 TITLE

    TAG: < TITLE> < /TITLE>

    LEVEL: 0

    FUNCTION:

    Contains the title of the HTML document.

    COMMENTS:

    Used only within the HEAD of a HTML document.

    EXAMPLE:

    < TITLE> Sample Home Page< /TITLE>

    39.0 TYPEWRITER TYPE

    TAG: < TT> < /TT>

    LEVEL: 1

    FUNCTION:

    Indicates that the enclosed text should be displayed in a monospace typewriter font.

    COMMENTS:

    Not to be confused with the PREFORMATTED TEXT tag (< PRE> ). Use < TT> only to highlight a limited number of characters and not a block of text.

    EXAMPLE:

    The stock number for the catalog is < TT> S/N 030-000-00238-5< /TT> .

    40.0 UNORDERED LIST

    TAG: < UL> < /UL>

    LEVEL: 0

    FUNCTION:

    Displays items in no particular order.

    COMMENTS:

    Each item in an UNORDERED LIST is specified through use of the < LI> tag. When displayed in a Web browser, the < LI> tag is replaced by a bullet or some other marking.

    EXAMPLE:

    < UL>

    < /UL>
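
    A populated list might look like the following sketch, which uses the browsers named in the introduction as illustrative items.

    ```html
    <UL>
    <!-- Each LI is shown with a bullet or similar mark by the browser -->
    <LI>Mosaic
    <LI>Cello
    <LI>Lynx
    </UL>
    ```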

    41.0 VARIABLE

    TAG: < VAR> < /VAR>

    LEVEL: 1

    FUNCTION:

    Indicates that enclosed text is the name of a variable.

    COMMENTS:

    A logical highlighting element.

    EXAMPLE:

    The do while loop will execute as long as the value of < VAR> counter< /VAR> is less than or equal to 100.


    APPENDIX B: SAMPLE HTML DOCUMENT

    < !doctype html public "-//W3O//DTD W3 HTML 2.0//EN">

    < HTML> < HEAD> < TITLE> Sample Home Page< /TITLE> < /HEAD> < BODY>

    < H1> Jeff's Home on the Net< /H1>

    Your < EM> home page< /EM> may include links to other information sources on the network, information about yourself, and even your photograph. HTML provides the flexibility of crafting a < STRONG> toolbox of networked resources< /STRONG> that meets your needs.

    < P> Additional information about the < A HREF="http://info.cern.ch/hypertext/WWW/TheProject.html"> World-Wide Web Project< /A> is available for those interested in learning more about networked hypermedia. < HR>

    < H2> Example of Using Relative URLs< /H2>

    More information about creating documents for the Web can be found in < A HREF="htmlguides.html"> Guides to HTML< /A> . To facilitate the authoring of HTML documents a number of < A HREF="tools/editors.html"> HTML editors< /A> are being developed. < A HREF="../lbryfiles/userguide.html"> Ways of Using Networked Resources< /A> in the library is another document for learning to use the Internet.

    < H2> Examples of Using Lists in HTML< /H2>

    < H3> An Unordered List of Internet Resources< /H3>

    < UL>

    < LI> Visit the < A HREF="http://sunsite.unc.edu/expo/1492.exhibit/Intro.html"> 1492: An Ongoing Voyage< /A> Exhibit by the Library of Congress to learn about the early exploration of the Western Hemisphere.

    < LI> Biodiversity and environmental issues in the Appalachians are the themes of the < A HREF="gopher://www.lib.utk.edu/11/Information-by-Subject/S%3a/smokies"> Smoky Mountain Database< /A> .

    < LI> The < A HREF="telnet://database.carl.org/"> CARL Corporation< /A> provides an excellent interactive service accessible over the Internet.

    < LI> The proceedings < A HREF="ftp://ftp.cni.org/pub/LITA/tiip-forum/proceedings.html"> Principles for the Development of the National Information Infrastructure< /A> from ALA's Telecommunications and Information Infrastructure Policy Forum are available on the Internet. < /UL>

    < H3> Creating Hypertext Links in HTML< /H3>

    < OL>

    < LI> Identify the destination of links.
    < LI> Determine text to be anchored as the start of a link.
    < LI> Surround text with anchor tags.
    < LI> Insert within the starting anchor tag the URL of the link's destination as the value of the HREF attribute.
    < LI> Use the NAME attribute so that the anchored text may also be the destination of a link.
    < LI> Be sure to close the anchor with the end tag.
    < LI> Test the link in a browser.
    < /OL>

    < H3> Glossary of Selected Acronyms< /H3>

    < DL>

    < DT> URL< DD> Uniform Resource Locator is the standard used to refer to documents and their locations on the Internet.

    < DT> NCSA< DD> The National Center for Supercomputing Applications created the Mosaic browser, which was instrumental in bringing greater attention to the World-Wide Web.

    < DT> WWW< DD> The World-Wide Web, originating out of CERN in Switzerland, provides hypertext on the Internet through the use of HTTP and HTML.

    < DT> SGML< DD> The Standard Generalized Markup Language is an international standard that describes the structure of a document.

    < DT> DTD< DD> A Document Type Definition (e.g., HTML), specified according to the rules of SGML, describes a document's structure for the purposes of a particular application such as the Web.

    < DT> CERN< DD> The Swiss organization that started the World-Wide Web initiative. The words of the acronym translate into English as the European Laboratory for Particle Physics. < /DL>

    < ADDRESS> Jeff Barry jeff@utkux.utcc.utk.edu< /ADDRESS>

    < /BODY>
    < /HTML>


    APPENDIX C: SOURCES OF INFORMATION ABOUT HTML AND THE WEB

    More information about the following topics can be found at the indicated URLs.

    HTML

    http://info.cern.ch/hypertext/WWW/MarkUp/MarkUp.html

    World-Wide Web Initiative

    http://info.cern.ch/hypertext/WWW/TheProject.html

    Web-related mailing lists

    http://info.cern.ch/hypertext/WWW/Administration/Mailing/Overview.html

    Other useful sources of information include:

    o A list server, web4lib@library.berkeley.edu, about the delivery of library services via the Web (to subscribe, send an e-mail message to listserv@library.berkeley.edu that says "sub web4lib your first name your last name").

    o The World-Wide Web FAQ (Frequently Asked Questions), which is maintained by Thomas Boutell. The FAQ is posted to the comp.infosystems.www newsgroups and is available by anonymous FTP from rtfm.mit.edu in the directory /pub/usenet/news.answers/www/faq.