Chapter 16

Creative User Interaction


CONTENTS

JScript offers tremendous flexibility in interacting with the user. You can create entire documents on-the-fly. You can dynamically customize both the content of a document and its appearance according to user criteria and other factors. User input also benefits from this flexibility: Prompts can be dynamically generated, and even free-form input can be processed. In this chapter, you develop techniques for performing all these tasks.

This chapter builds on several of the objects you created in Chapter 15, "Visual Effects," and some of the techniques you developed there, so you might want to review that chapter before you begin, or refer to it as you go along.

Creating Dynamic Output

JScript provides two methods of updating the screen with dynamic content: You can use the document.write() function, or you can write the entire contents of a frame to the screen using the javascript: protocol. As in Chapter 15, you use the latter approach here because it is better suited to the examples in this chapter.

The skeleton frameset shown in Listing 16.1 is essentially the same as that used in Chapter 15.


Listing 16.1  The Skeleton Frameset
<html>
<head>
<title>Creative User Interaction</title>
<script language="JavaScript">
<!-- begin script
var emptyFrame = '<html></html>';
function headFrame () {
  return '<html><body bgcolor="#FFFFFF" text="#000000">' +
    '<h1 align="center">Creative User Interaction</h1>' +
    '</body></html>';
}
function initialize () {
  self.head.location = "javascript:parent.headFrame()";
}
// end script -->
</script>
<frameset rows="52,*" onLoad="initialize()">
  <frame name="head" src="javascript:parent.emptyFrame"
     marginwidth=1 marginheight=1 scrolling="no" noresize>
  <frame name="body" src="javascript:parent.emptyFrame">
</frameset>
<noframes>
<h2 align="center">JScript  or 
JavaScript enabled browser required</h2>
</noframes>
</html>

In this Listing, you start by initializing the frames to point to the emptyFrame variable in the FRAME SRC= tag and then set the true location from the initialize() function, which is the onLoad handler for the frameset. As noted in the preceding chapter, this gets around an alignment bug that appears when loading a frame directly using the javascript: protocol.

You can specify either a variable or a function name on the right side of the colon in a javascript: URL. Normally, you use a variable to return an unchanging, or static, value, whereas you use a function to return dynamic content. If a variable name refers to an object, however, and that object has a toString() method defined, the function associated with the toString() method is called when the object is referenced in a javascript: URL, in which case dynamic content may be returned. You see an example of this behavior in the section entitled "A Random Phrase Generator."

You have defined headFrame() as a function, although it currently returns a static value. You can jazz it up with a little bit of dynamism, as shown in Listing 16.2.


Listing 16.2  Returning Dynamic Content
function headFrame () {
  var now = new Date();
  var body;
  if (now.getHours() >= 6 && now.getHours() < 18)
    body = '<body bgcolor="#FFFFFF" text="#000000">';
  else
    body = '<body bgcolor="#000000" text="#FFFFFF">;
  return '<html>' + body +
    '<h1 align="center">Creative User Interaction</h1>' +
    '</body></html>';
}

Now, when headFrame() is loaded between 6:00 A.M. and 6:00 P.M., it comes up in "daylight" mode, with a white background. From 6:00 P.M. until 6:00 A.M., the function displays its nocturnal mode.

But why stop here? You can give the user an appropriate greeting as well, as shown in Listing 16.3.


Listing 16.3  Returning a Timely Greeting
function headFrame () {
  var now = new Date();
  var hour = now.getHours();
  var body;
  var greeting;
  if (hour >= 6 && hour < 18)
    body = '<body bgcolor="#FFFFFF" text="#000000">';
  else
    body = '<body bgcolor="#000000" text="#FFFFFF">;
  if (hour < 6)
    greeting = "Up late, or up early?";
  else if (hour < 12)
    greeting = "Good morning!";
  else if (hour < 18)
    greeting = "Good afternoon!";
  else
    greeting = "Good evening!";
  return '<html>' + body +
    '<h1 align="center">Creative User Interaction</h1>' +
    '<h3 align="center">' + greeting + '</h3>' +
    '</body></html>';
}

One result of this Listing is shown in Figure 16.1.

Figure 16.1 : The current time is used to generate an appropriate greeting.

Generating Random Numbers

In the preceding example, you used the time of day to determine which colors and messages to use in generating a dynamic header frame. You can take this example even further; for instance, you might define a different message for every hour of the day, or even for every minute of the hour. You can use this technique to give your pages many different looks.

For many applications in which a degree of apparent randomness is desirable, using the Date object alone yields acceptable results. When you need a series of "random" numbers, one after another, however, the Date object isn't of much help. Chances are, whatever calculations you're performing or effects you're creating will be finished before the Date object advances to a new value. A Random Number Generator (RNG) comes into play here.

In truth, no such thing as a software-generated "random" number really exists. Software is hopelessly logical-given a particular set of input values, a function performs a predefined set of steps in a predictable order, yielding predictable results. (Even a buggy function yields predictable, if undesirable, results.)

Even the best algorithms, such as the one presented in the next paragraph, don't generate truly random numbers. Instead, they generate very long sequences of numbers that simulate random behavior. However, eventually the sequences repeat. The numbers generated are therefore properly known as pseudo-random numbers.

The RNG shown in Listing 16.4 is an implementation of the Park-Miller algorithm. (See "Random Number Generators: Good Ones Are Hard to Find," by Stephen K. Park and Keith W. Miller, Communications of the ACM, 31(10):1192-1201, 1988.) The JScript version was written by David N. Smith of IBM's T.J. Watson Research Center. Mr. Smith notes that his version has not been subjected to the rigorous testing required of a mission-critical RNG.


Listing 16.4  The RandomNumberGenerator Object
function NextRandomNumber()  {
  var hi   = this.seed / this.Q;
  var lo   = this.seed % this.Q;
  var test = this.A * lo - this.R * hi;
  if (test > 0)
    this.seed = test;
  else
    this.seed = test + this.M;
  return (this.seed * this.oneOverM);
}
function RandomNumberGenerator() {
  var d = new Date();
  this.seed = 2345678901 +
    (d.getSeconds() * 0xFFFFFF) +
    (d.getMinutes() * 0xFFFF);
  this.A = 48271;
  this.M = 2147483647;
  this.Q = this.M / this.A;
  this.R = this.M % this.A;
  this.oneOverM = 1.0 / this.M;
  this.next = NextRandomNumber;
  return this;
}

In the preceding Listing, the RandomNumberGenerator() constructor uses the system time, in minutes and seconds, to "seed" itself, that is, to create the initial values from which it will generate a sequence of numbers. If you are familiar with random number generators, you might have reason to use some other value for the seed. Otherwise, you should probably not change it.

This RNG is implemented as an object. To use it, you create an instance of the object and then invoke its next() method to return a number. Each call to the next() method returns a new random number, as in the following:

var rand = new RandomNumberGenerator();

// Display five random numbers
for (var i = 0; i < 8; i++)
  document.write ("The number is: " + rand.next() + "<br>");

A simple example page which displays eight random numbers when a button is clicked is given in the CD-ROM file rng.htm. Like many random number generators, this RNG returns a fraction between 0 and 1; for example, .2755983265971, or something similar. To convert this number to an integer between 0 and n, you must multiply n times the random number and then round the result. Here's an example that returns a random number between 0 and 255:

function random255 () {
  return Math.round(255 * rand.next());
}

This example works fine if you need only random integers between 0 and 255. But you can rewrite it to return a number between 0 and any integer, as follows:

function random (n) {
  return Math.round(n * rand.next());
}

By now, you might be starting to wonder, why all the fuss about random numbers? What use are they, anyway? Well, they have many serious and important uses, in simulations, in software testing and verification, and so on. They also happen to be great for creating games and other fun things, as you see in the next section.

Note
You might have noticed that JScript's Math object includes a built-in random() method. The version presented here should work as well as, if not better than, the built-in implementations, and will work uniformly on all platforms.

A Random Phrase Generator

According to one popular theory, if you put enough monkeys in a room full of typewriters, eventually, by hit and miss, they will bang out the complete works of Shakespeare. Although we suspect it would take several generations of monkeys just to tap out a decent sonnet, try a similar, but less ambitious, experiment here, using the random number generator in place of monkeys.

The goal here is to generate some simple phrases by randomly combining words from lists of verbs, articles, adjectives, and nouns. In this exercise, you can cheat a little bit by imposing a structure on the phrases. Your formula is as follows:

verb + article (or possessive pronoun) + adjective + noun.

For example,

Have a nice day.

Random Words

Start creating the random phrase generator by defining an object to contain a list of words, as shown in Listing 16.5.


Listing 16.5  The Word Object Constructor
function Word () {
  var argv = Word.arguments;
  var argc = argv.length;
  this.list = new Object();
  for (var i = 0; i < argc; i++)
    this.list[i] = argv[i];
  this.count = argc;
  this.toString = WordString;
  return this;
}
function WordString () {
  var i = Math.round((this.count - 1) * rand.next());
  return this.list[i];
}

The Word() constructor is designed to accept a variable number of parameters, or arguments. Every JScript function has a built-in property called arguments. The arguments property is an array containing all the arguments passed to the function, indexed beginning at 0. The arguments property also has a length property, which is the number of arguments passed to the function. You assign them to variables argv and argc, primarily because variables by those names play a similar role in C programs.

The Word() constructor takes the words passed in the argument list and puts them in an internal array called list. To keep the numbering scheme simple, you can make list a separate object rather than make each word a property of the Word object itself.

The toString() method WordString() does the monkey's job of returning random words. As you learned in chapter 15, JScript automatically calls an object's toString() method any time the object is used in a context requiring a string. WordString() uses the random number object rand to generate an integer between 0 and the number of words in the list minus one. (If you have a list containing five words, the generated integer is 0-4.) This integer is then used as the index into the list array.

To start, use the Word() constructor to create a list of nouns:

var noun = new Word ("dog", "cat", "shoe", "doorknob", "umbrella");

Now, each time you refer to the noun object in a context requiring a string (thus invisibly calling its toString() method), you get a random selection from the list. For instance, consider this code snippet:

for (var i = 0; i < 5; i++)
  document.write ("Have a " + noun + "<br>");

It returns a list of offerings that might look something like the following:

Have a shoe
Have a umbrella
Have a cat
Have a dog
Have a doorknob

Sample output is shown in Figure 16.2. The source code which generated this output will be found in the CD-ROM file rword.htm.

Figure 16.2 : The RNG object is used to construct simple random sentences.

As you can see, just because you reference the noun object five times does not mean that each of the five words will be listed. Randomness means that you might get just one of the words repeated five times. Or you might get all five perfectly alphabetized. Over time, these instances will roughly average out, but you have no way of knowing what you'll get in any given sample.

Now go ahead and define the rest of the word types, and take a crack at some random phrases, as follows:

var verb = new Word ("have", "get", "eat", "pet", "feed");
var article = new Word ("a", "the", "my", "your");
var adjective = new Word ("nice", "pretty", "smelly", "hairy", "yellow");

for (var i = 0; i < 5; i++)
  document.write (verb + " " + article + " " + 
    adjective + " " + noun + "<br>");

To add variety to the phrases, you can include a couple of possessive pronouns in the article object, as shown in the preceding code. Here's one possible set of generated phrases:

get your nice shoe
pet a smelly umbrella
pet your hairy shoe
feed my yellow dog
eat the hairy doorknob

How many different phrases can you generate? Right now, you can calculate the number of phrases as the number of verbs (5) times the number of articles (4) times the number of adjectives (5) times the number of nouns (5), or a total of 500 phrases. But you're about to increase that number more than tenfold.

The Phrasemaker

Having a function that returns a complete phrase would be convenient. While you're at it, you can add the ability to vary the structure of the phrases and include some special handling for plural nouns, as well as adjectives or nouns that begin with vowels. Listing 16.6 gets you started.


Listing 16.6  The phrase() Function
function isVowel (ch) {
  if (ch == 'a' || ch == 'e' || ch == 'i' ||
      ch == 'o' || ch == 'u')
    return true;
  return false;
}
function phrase (adjs) {
  var plural = (rand.next() >= .5); // Sets plural to true or false
  var vb = verb.toString();
  var first = vb.charAt(0);
  var vb = first.toUpperCase() + vb.substring(1,vb.length);
  var art = article.toString();
  var adj = new Object();
  for (var i = 0; i < adjs; i++)
    adj[i] = adjective.toString();
  var nn = (plural) ? pluralNoun.toString(): singularNoun.toString();
  if (plural && art == "a")
    art = "some";
  if (art == "a") {
    if (adjs > 0)
      first = adj[0].charAt(0);
    else
      first = nn.charAt(0);
    if (isVowel(first))
      art = "an";
  }
  var ph = vb + " " + art + " ";
  for (i = 0; i < adjs; i++)
    ph += adj[i] + " ";
  ph += nn;
  return ph;
}
var singularNoun = new Word ("dog", "cat", "shoe", "doorknob", "umbrella");
var pluralNoun = new Word ("dogs", "cats", "shoes", "doorknobs", 
Â"umbrellas");

In this Listing, the phrase() function takes as a parameter the number of adjectives to be included in the phrase, and it returns a complete phrase with the first letter capitalized. The function uses the random number generator to decide whether the noun will be singular or plural. The article "a" is also given special handling: If the noun is plural, "a" is changed to "some"; if it precedes a word beginning with a vowel, it is changed to "an."

Notice that you call the toString() methods for the Word objects directly here. Because of the special handling you're doing, you aren't using them in a string context, so toString() isn't called automatically by JScript.

In this Listing, you also define new Word objects for singular and plural nouns. Although plural nouns could be created from the singular nouns used here by appending "s" to the end of each, many irregular nouns cannot be transformed easily. Note that the singularNoun and pluralNoun objects could contain completely different words, and even different numbers of words.

Here's an example of using the phrase() function:

document.write (phrase(0) + "<br>");
document.write (phrase(1) + "<br>");
document.write (phrase(2) + "<br>");

The source code which implements this simple example will be found in the CD-ROM file rphrase.htm. The phrases generated might look something like this:

Buy an umbrella
Eat your smelly cats
Have the nice yellow doorknob

Colors, Fonts, and All

Up to this point, you've been testing the phrase generator using simple document.write() calls. Put it in a frameset now, with all the trimmings. Because you're experimenting with random numbers, you can use them in the display process as well, to generate random color schemes and fonts for the phrases.

Briefly review a few of the objects you created in Chapter 15 for putting text on the screen. These objects simplify the process of creating random colors and fonts for the phrase generator. Listing 16.7 shows the Color object.


Listing 16.7  The Color Object Constructor
var hexchars = '0123456789ABCDEF';
function fromHex (str) {
  var high = str.charAt(0);
  var low = str.charAt(1);
  return (16 * hexchars.indexOf(high)) +
    hexchars.indexOf(low);
}
function toHex (num) {
  return hexchars.charAt(num >> 4) + hexchars.charAt(num & 0xF);
}
function Color (str) {
  this.red = fromHex(str.substring(0,2));
  this.green = fromHex(str.substring(2,4));
  this.blue = fromHex(str.substring(4,6));
  this.toString = ColorString;
  return this;
}
function ColorString () {
  return toHex(this.red) + toHex(this.green) + toHex(this.blue);
}

The Color object holds a color. It stores the red, green, and blue components as numbers (0-255). The Color() constructor accepts a hexadecimal triplet of the form RRGGBB, whereas the toString() method, ColorString(), converts the internal values back to this format.

Listing 16.8 shows the BodyColor object.


Listing 16.8  The BodyColor Object Constructor
function BodyColor (bgColor,fgColor,linkColor,vlinkColor,alinkColor) {
  this.bgColor = bgColor;
  this.fgColor = fgColor;
  this.linkColor = linkColor;
  this.vlinkColor = vlinkColor;
  this.alinkColor = alinkColor;
  this.toString = BodyColorString;
  return this;
}
function BodyColorString () {
  return '<body' +
    ((this.bgColor == null) ? '': ' bgcolor="#' + this.bgColor + '"') +
    ((this.fgColor == null) ? '': ' text="#' + this.fgColor + '"') +
    ((this.linkColor == null) ? '': ' link="#' + this.linkColor + '"') +
    ((this.vlinkColor == null) ? '': ' vlink="#' + this.vlinkColor + '"') +
    ((this.alinkColor == null) ? '': ' alink="#' + this.alinkColor + '"') +
    '>';
}

The BodyColor object contains one or more Color objects corresponding to the HTML color attributes that can be specified in a body tag. Its toString() method returns a formatted body tag, including any specified colors.

Listing 16.9 shows the Text constructor.


Listing 16.9  The Text Object Constructor
function Text (text, size, format, color) {
  this.text = text;
  this.length = text.length;
  this.size = size;
  this.format = format;
  this.color = color;
  this.toString = TextString;
  this.substring = TextString;
  return this;
}
function TextString (start, end) {
  with (this) {
    if (TextString.arguments.length < 2 || start >= length) start = 0;
    if (TextString.arguments.length < 2 || end > length) end = length;
    var str = text.substring(start,end);
    if (format != null) {
      if (format.indexOf("b") >= 0) str = str.bold();
      if (format.indexOf("i") >= 0) str = str.italics();
      if (format.indexOf("f") >= 0) str = str.fixed();
    }
    if (size != null) str = str.fontsize(size);
    if (color != null) {
      var colorstr = color.toString(); 
      str = str.fontcolor(colorstr);
    }
  }
  return str;
}

The Text object contains a text string, along with optional font size, font color, and format information. The format string can contain any combination of the letters "b," "i," or "f," corresponding to bold, italic, or fixed (<tt>) formatting. Text objects mimic some JScript string behaviors-they have a length property and a substring() method.

Listing 16.10 shows the Static object.


Listing 16.10  The Static Object Constructor
function Static (body, text) {
  this.body = body;
  this.text = text;
  this.toString = StaticString;
  return this;
}
function StaticString () {
  return '<html>' + this.body + this.text + '</body></html>';
}

The Static object holds a BodyColor object and any text, including HTML, that is to appear between the body and /body tags. Its toString() method returns a complete HTML page ready for display.

Now you can also create a new helper function, center(), to center the text on the page, as follows:

function center (text) {
  return '<table width=100% height=100% border=0 ' +
    'cellpadding=0 cellspacing=0>' +
    '<tr><td align="center" valign="center">' +
    text + '</td></tr></table>';
}

The center() function accepts a text parameter, which can be either a JScript string or any object that has a toString() method. The function returns the text embedded in a one-cell table with width and height set to 100 percent, and horizontal and vertical centering specified. This way, the text is centered within the frame.

To generate random colors for the display, create two new objects, called DarkColor and LightColor, as shown in Listing 16.11.


Listing 16.11  The DarkColor and LightColor Object Constructors
function DarkColor () {
  this.red = Math.round (127 * rand.next());
  this.green = Math.round (127 * rand.next());
  this.blue = Math.round (127 * rand.next());
  this.toString = ColorString;
  return this;
}
function LightColor () {
  this.red = Math.round (127 * rand.next()) + 128;
  this.green = Math.round (127 * rand.next()) + 128;
  this.blue = Math.round (127 * rand.next()) + 128;
  this.toString = ColorString;
  return this;
}

As you know, color component values can range from 0 to 255. The DarkColor() constructor generates a random color, each of the components of which has a value between 0 and 127. The LightColor() constructor generates a random color with component values between 128 and 255. When used together to create foreground and background colors, DarkColor and LightColor almost always produce a readable combination, though it may not always be an attractive combination.

The DarkColor and LightColor objects have an internal structure that is identical to that of the Color object, so they can be used anyplace a Color object can be used. Therefore, it is convenient to think of DarkColor and LightColor as simply being different constructors for a Color object.

Now, you can modify the phrase() function to return a complete HTML page, including random foreground and background colors, font size, and format. The modified function is shown in Listing 16.12. Note that in another application, placing the phrase-generation and page-generation code in separate functions might make sense; but for this example, a single function will do. This code is part of the CD-ROM file c16phr.htm.


Listing 16.12  c16phr.htm-The Modified phrase() Function Returns a Complete Page
function phrase (adjs) {
  var size = "" + (Math.round(rand.next() * 3) + 4);
  var format = " ";
  if (rand.next() >= .5)
    format += "b";
  if (rand.next() >= .5)
    format += "i";
  if (rand.next() >= .5)
    format += "f";
  var body;
  if (rand.next() >= .5)
    body = new BodyColor (new DarkColor(), new LightColor());
  else
    body = new BodyColor (new LightColor(), new DarkColor());
  var plural = (rand.next() >= .5);
  var vb = verb.toString();
  var first = vb.charAt(0);
  var vb = first.toUpperCase() + vb.substring(1,vb.length);
  var art = article.toString();
  var adj = new Object();
  for (var i = 0; i < adjs; i++)
    adj[i] = adjective.toString();
  var nn = (plural) ? pluralNoun.toString(): singularNoun.toString();
  if (plural && art == "a")
    art = "some";
  if (art == "a") {
    if (adjs > 0)
      first = adj[0].charAt(0);
    else
      first = nn.charAt(0);
    if (isVowel(first))
      art = "an";
  }
  var ph = vb + " " + art + " ";
  for (i = 0; i < adjs; i++)
    ph += adj[i] + " ";
  ph += nn;
  var screen = new Static (body,center(new Text(ph,size,format)));
  return screen.toString();
}

First, you generate a random font size from 4 to 7. Smaller fonts can be difficult to read, especially when shown in a fixed-width (<tt>) font or italics. Besides, the generated phrase is the only text in its frame, so you might as well make it big.

Next, you choose format specifiers. Note that you start with a string containing a single space. This is a workaround for the fact that the indexOf() method call in the Text object may generate an alert if it is called for an empty string. (It should simply return -1, indicating the substring was not found.)

For each format specifier, you generate a random number to determine whether it will be included. Testing for greater than or equal to .5 means you have a 50-50 chance that any specifier will be included.

You then decide whether to use a dark-on-light or light-on-dark color scheme, again by generating a random number and comparing it to .5. Note that you specify only the foreground and background colors in the BodyColor() constructor because you aren't using any links.

At the end of the phrase() function, you create a Static object, using the generated BodyColor object, and a centered Text object that includes the generated font size and format specifiers in addition to the generated phrase. You call the Static object's toString() method directly to return the completely formatted HTML page.

All that's left now is to provide a button so the user can request a new phrase and a selection list enabling the user to specify the number of adjectives to use in the phrase. You put these controls in a separate control frame (named head), as shown in Listing 16.13. This code will also be found in the CD-ROM file c16phr.htm.


Listing 16.13  c16phr.htm-The Control Frame
function printPhrase () {
  var adj = self.head.document.cont.adj.selectedIndex;
  if (adj == 3)
    adj = Math.round (rand.next()*2);
  self.body.location = "javascript:parent.phrase(" + adj + ")";
}
var controlFrame =
  '<html><body bgcolor="#808080" text="#FFFFFF">' +
  '<form name="cont">' +
  '<table width=100% height=100% border=0 cellpadding=0 cellspacing=0>' +
  '<tr align="center">' +
  '<td colspan=2><b>Number of Adjectives</b> <select name="adj">' +
  '<option>None' +
  '<option>1' +
  '<option>2' +
  '<option selected>Random' +
  '</select></td>' +
  '<td colspan=2><input type="button" value="Generate Phrase!" ' +
  'onclick="parent.printPhrase()"></td>' +
  '</tr>' +
  '</table>' +
  '</form>' +
  '</body></html>';

When the user presses the Generate Phrase! button, the printPhrase() function is called. This function determines how many adjectives to use by examining the selectedIndex property of the selection list, adj. If the user has selected the fourth option, Random, a random value between 0 and 2 is calculated. Then the frame where you show the phrase, body, is updated by using a javascript: URL that calls the phrase() function.

The complete Listing of the phrase generator is included on the CD-ROM in file c16phr.htm. Figure 16.3 shows an example of the output.

Figure 16.3 : The random phrase generator at work.

Adding and Sorting Words

The phrase generator can produce some pretty amusing results just as it is, but sooner or later, users will want to get into the act and add some words of their own. They also will want to see which words are already defined. In this section, you extend the Word object so that it can accept additional words and produce a sorted (alphabetized) list of its contents. You also add some additional controls to the control (head) frame so that the users can view and add words.

You start by taking a look at the sorting function. Many sorting algorithms are available. The simplest to comprehend and write is the bubble sort. The bubble sort algorithm makes n - 1 passes through a list of n items, comparing and, if necessary, exchanging adjacent pairs. After the last pass, the list is sorted.

Unfortunately, the bubble sort is also just about the slowest sorting algorithm available. If you're sorting only a handful of items, it doesn't make much difference which algorithm you use. But if you are sorting dozens or hundreds of items, or more, that difference becomes very significant. It could mean the difference between waiting a fraction of a second or a couple of minutes for your sort to complete.

Fortunately, much faster algorithms are available, and although they're not as easy to comprehend, you can implement them using only a little more code than a bubble sort. One of the best is the Quicksort algorithm. We won't get into the details of its operation in this chapter, except to say that it takes a divide-and-conquer approach to its comparisons and exchanges.

The JScript Quicksort algorithm shown in Listing 16.14 was written by Achille Hui of Stanford University.


Listing 16.14  A JScript Quicksort Implementation
function _pm_array_qsort(vec,lo,up,cmp_fun){
  var i, j, t;
  while(up > lo){
    i = lo;
    j = up;
    t = vec[lo];
    while(i < j){
      while(cmp_fun(vec[j],t) > 0)
        j -= 1;
      vec[i] = vec[j];
      while((i < j) && (cmp_fun(vec[i],t) <= 0))
        i++;
      vec[j] = vec[i];
    }
    vec[i] = t;
    if(i - lo < up - i){
     _pm_array_qsort(vec,lo,i-1,cmp_fun); lo = i+1;
    } else {
     _pm_array_qsort(vec,i+1,up,cmp_fun); up = i-1;
    }
  }
}
function _pm_array_defcmp(a,b){
  return (a == b) ? 0: (a > b) ? 1: -1;
}
function pm_array_qsort(vec,lo,hi,cmp_fun){
  if(vec == null){
    return;
  } else if(cmp_fun == null){
   _pm_array_qsort(vec,lo,hi,_pm_array_defcmp);
  } else {
   _pm_array_qsort(vec,lo,hi,cmp_fun);
  }
}

To use this Quicksort, call the pm_array_qsort() function, passing it an array object (vec), the starting item number in the array (lo), the ending item number (hi), and optionally, a comparison function (cmp_fun).

The lo and hi parameters you use depend on how your array object is constructed. If you use the first entry (entry[0]) to contain the array length, for instance, then you should pass 1 as the lo parameter and the number of entries in the array as the hi parameter. If you use a zero-based array, then you should pass 0 as the lo parameter, and the number of entries minus one as the hi parameter.

Your Quicksort function includes a default comparison function, which is used to compare two items. The default comparison function can be used if your array consists of strings or numeric items. If you are sorting an array of objects, however, you need to supply a comparison function. The comparison function compares two items, a and b. If they are equal, it returns 0. If a is greater than b, then the function returns 1. If a is less than b, it returns -1.

Now, you can beef up your Word object, as shown in Listing 16.15. The improved Word object will let you add new words, sort the words in the object, produce a list of sorted words, and find a particular word.


Listing 16.15  An Improved Word Object Constructor
function Word () {
  var argv = Word.arguments;
  var argc = argv.length;
  this.list = new Object();
  for (var i = 0; i < argc; i++)
    this.list[i] = argv[i];
  this.count = argc;
  pm_array_qsort (this.list,0,this.count-1);
  this.add = AddWord;
  this.find = FindWord;
  this.print = PrintWords;
  this.toString = WordString;
  return this;
}
function AddWord (str) {
  this.list[this.count++] = str;
  pm_array_qsort(this.list,0,this.count-1);
}
function FindWord (str) {
  for (var i = 0; i < this.count; i++)
    if (this.list[i] == str)
      return i;
  return -1;
}
function PrintWords () {
  var str = "";
  for (var i = 0; i < this.count; i++)
    str += this.list[i] + '<br>';
  return str;
}

The Word() constructor calls the pm_array_qsort function to sort the original list of words supplied when the object is constructed. You could keep them unsorted instead, and sort them only when output is required. In this case, the design decision is arbitrary, but sometimes it is desirable to maintain an array in sorted form at all times.

The add() method, AddWord(), adds a new word to the end of the array and then sorts the array back into alphabetical sequence.

The find() method, FindWord(), searches the array to see whether a word is present. You use this method in conjunction with the add() method to prevent the user from adding duplicate words.

The print() method, PrintWords(), returns the sorted list of words, separated by HTML <BR> tags.

In Listing 16.16, you add some functions that prompt the user for a new word. You also add functions to list all the words for each type.


Listing 16.16  Functions to Prompt the User and List Words
function addVerb () {
  var str = prompt ("Enter a verb (eat, kiss, bite, etc.):","");
  if (str == null || str == "")
    return;
  if (verb.find(str) != -1) {
    alert ("\nThat verb is already listed!");
    return;
  }
  verb.add(str);
  listVerbs();
}
function addAdjective () {
  var str = prompt ("Enter an adjective (pretty, smelly, nice, etc.):","");
  if (str == null || str == "")
    return;
  if (adjective.find(str) != -1) {
    alert ("\nThat adjective is already listed!");
    return;
  }
  adjective.add(str);
  listAdjectives();
}
function addSingular () {
  var str = prompt ("Enter a singular noun (dog, cat, knife, etc.):","");
  if (str == null || str == "")
    return;
  if (singularNoun.find(str) != -1) {
    alert ("\nThat noun is already listed!");
    return;
  }
  singularNoun.add(str);
  listSingular();
}
function addPlural () {
  var str = prompt ("Enter a plural noun (dogs, cats, knives, etc.):", "");
  if (str == null || str == "")
    return;
  if (pluralNoun.find(str) != -1) {
    alert ("\nThat noun is already listed!");
    return;
  }
  pluralNoun.add(str);
  listPlural();
}
function listVerbs () {
  self.body.location =
    "javascript:parent.showList('Verbs',parent.verb.print())";
}
function listAdjectives () {
  self.body.location =
    "javascript:parent.showList('Adjectives',parent.adjective.print())";
}
function listSingular () {
  self.body.location = 
    "javascript:parent.showList('Singular Nouns',parent.singularNoun.print())";
}
function listPlural () {
  self.body.location =
    "javascript:parent.showList('Plural Nouns',parent.pluralNoun.print())";
}
function showList (title,str) {
  return '<html><body bgcolor="#FFFFFF" text="#0000FF"><h1 align="center">' +
    title + '</h1><div align="center"><font size=5>' + str +
    '</font></div></body></html>';
}

The addVerb(), addAdjective(), addSingular(), and addPlural() functions prompt the user for a word. If the word is already present in the list, an error message is displayed. Otherwise, the word is added, and the updated list is displayed.

The listVerbs(), listAdjectives(), listSingular(), and listPlural() functions display the word lists by updating the body frame location using a javascript: URL. This URL includes a call to the showList() function, which returns a formatted HTML page listing the words for the given word type.

Finally, you update the control frame to include a set of buttons for adding and listing words, as shown in Listing 16.17.


Listing 16.17  The Updated Control Frame
var controlFrame =
  '<html><body bgcolor="#808080" text="#FFFFFF">' +
  '<form name="cont">' +
  '<table width=100% height=100% border=0 cellpadding=0 cellspacing=0>' +
  '<tr align="center">' +
  '<td colspan=2><b>Number of Adjectives</b> <select name="adj">' +
  '<option>None' +
  '<option>1' +
  '<option>2' +
  '<option selected>Random' +
  '</select></td>' +
  '<td colspan=2><input type="button" value="Generate Phrase!" ' +
  'onclick="parent.printPhrase()"></td>' +
  '</tr>' +
  '<tr align="center">' +
  '<td><input type="button" value="Add Verb" ' +
  'onclick="parent.addVerb()"></td>' +
  '<td><input type="button" value="Add Adjective" ' +
  'onclick="parent.addAdjective()"></td>' +
  '<td><input type="button" value="Add Singular Noun" ' +
  'onclick="parent.addSingular()"></td>' +
  '<td><input type="button" value="Add Plural Noun" ' +
  'onclick="parent.addPlural()"></td>' +
'</tr>' +
  '<tr align="center">' +
  '<td><input type="button" value="List Verbs" ' +
  'onclick="parent.listVerbs()"></td>' +
  '<td><input type="button" value="List Adjectives" ' +
  'onclick="parent.listAdjectives()"></td>' +
  '<td><input type="button" value="List Singular Nouns" ' +
  'onclick="parent.listSingular()"></td>' +
  '<td><input type="button" value="List Plural Nouns" ' +
  'onclick="parent.listPlural()"></td>' +
'</tr>' +
  '</table>' +
  '</form>' +
  '</body></html>';

The complete listing of the improved phrase generator is included on the accompanying CD-ROM in file c16phr2.htm. One example result after adding the adjective "spongy" is shown in Figure 16.4.

Figure 16.4 : The improved phrase generator includes controls to add and view words.

An Online Bookstore

In this section, you develop an online bookstore application. You create a small database of books and provide a way for the user to look up books by subject, author, or title. Along the way, you learn how to parse free-form user input and how to look up items in an indexed database.

Of course, in real life, you can't expect to keep the entire inventory of a bookstore in a single JScript document. But the techniques you develop here can be applied to many smaller databases or to results returned by a CGI program on the server.

Parsing Free-Form User Input

Usually, when you process input entered in a text field, the value is treated as a whole. If you are expecting a numeric value, you might check to ensure that the value is, indeed, numeric. You might also check to see that it falls within a certain range or ranges. If you expect an alphanumeric value, you might check it against a list of expected values.

But suppose you want to allow the user to enter a series of values in a single field but process the values individually? A good example is a database lookup or search function, where the user can enter a set of keywords. In that case, you need to parse the input field, that is, break it into a list of individual words or terms.

The process of parsing is fairly straightforward. The first step is to define the whitespace characters that can separate the terms in your input field. Whitespace is usually defined as blank, or space, characters, and tabs. It may also include carriage returns and linefeeds, as well as certain other nondisplaying characters.

The isWhitespace() function shown in Listing 16.18 decides whether the input character is whitespace.


Listing 16.18  The isWhitespace() Function
function isWhitespace (ch) {
  if (ch == ' ' || ch == '\n' || ch == '\r' || ch == '\t' || ch == '\f' ||
      ch == '\v' || ch == '\b')
    return true;
  return false;
}

You may also want to test for certain delimiter characters. Common delimiters include commas, forward or backward slashes, periods, and so on. Delimiters can be a meaningful part of the input, or they can be nonessential characters that can be discarded.

The isDelimiter() function shown in Listing 16.19 tests for delimiters.


Listing 16.19  The isDelimiter() Function
function isDelimiter (ch) {
  if (ch == ',' || ch == '?' || ch == '-' || ch == '.' ||
      ch == '\\' || ch == '/')
    return true;
  return false;
}

After you decide which whitespace and delimiter characters can separate your terms, you need a place to put the individual terms you extract from the input field. A simple array can do the trick. In this case, you can define a KeywordList object to hold them, shown in Listing 16.20. This object will come in handy in the bookstore example.


Listing 16.20  The KeywordList Object Constructor
function KeywordList () {
  this.count = 0;
  this.word = new Object ();
  this.add = AddKeyword;
  return this;
}
function AddKeyword (word) {
  for (var i = 0; i < this.count; i++)
    if (this.word[i] == word)
      return;
  this.word[this.count++] = word;
}

In the bookstore example, you want to allow only unique keywords. Therefore, the add() method, AddKeyword(), prevents duplicate keywords from being added. However, this method would not be appropriate in many applications. If you don't want to suppress duplicates in your application, then omit the first three lines in the body of the AddKeyword() function.

Now you're ready to write the parser itself, shown in Listing 16.21.


Listing 16.21  The parseKeywords() Function
function parseKeywords (str) {
  var list = new KeywordList ();
  var inword = false;
  var word = "";
  var len = str.length;
  for (var i = 0; i < len; i++) {
    var ch = str.charAt(i);
    if (isWhitespace(ch) || isDelimiter(ch)) {
      if (inword) {
        list.add(word);
        word = "";
        inword = false;
      }
    }
    else {
      word += ch;
      inword = true;
    }
    if (i + 1 == len && inword)
      list.add(word);
  }
  return list;
}

The parseKeywords() function accepts a string, which can be the contents of an input field. It returns a KeywordList object, containing a list of the extracted terms.

The parseKeywords() function examines each character in the input string to decide whether it is a whitespace or delimiter character. If it is not, the character is added to the current word. If so, the current word, if any, is added to the list, and preparation is made for a new word. You also add the current word to the list when you reach the last character of the input string.

The parseKeywords() function discards delimiter characters because they are not important to the bookstore application. However, delimiters can have special meaning to your application, in which case you might want to add them to the keyword list.

Building an Index

For the example, you want users of your online bookstore to be able to look up books by title, author, or subject. To do so, you must build some indexes for the book database. You use JScript's associative array feature to create the indexes.

Associative arrays enable you to associate a value with an array entry. You can then use that value to retrieve the desired entry. For example, suppose that you create a simple array named animal and associate a name with each entry, as follows:

var animal = new Object();
animal["dog"] = "Woof!";
animal["cat"] = "Meow!";
animal["pig"] = "Oink!";

To retrieve an item you use the value associated with it, as follows:

document.write("A dog says " + animal["dog"] + "<br>");

To take advantage of this capability, you index each word in the book titles, subjects, and author names by creating an entry in an associative array.

The only hitch is that you can have only one entry per value. What if, as is very likely, more than one of the books uses that value in its title, subject, or author? The solution, as it turns out, is fairly simple. Instead of associating an individual book object with each entry in the array, you create a special object that contains a list of the items that match a given value and associate that object with the value.

Listing 16.22 shows the index object, along with its methods and the internal list object:


Listing 16.22  The Index and IndexItemList Object Constructors
function IndexItemList () {
  this.count = 0;
  this.item = new Object();
  this.add = AddIndexItem;
  return this;
}
function AddIndexItem (object) {
  this.item[this.count++] = object;
}
function Index () {
  this.count = 0;
  this.item = new Object();
  this.add = AddToIndex;
  return this;
}
function AddToIndex (object, keywords) {
  for (var i = 0; i < keywords.count; i++) {
    var kw = keywords.word[i];
    var ilist = this.item[kw];
    if (ilist == null) {
      ilist = new IndexItemList();
      this.item[kw] = ilist;
    }
    ilist.add(object);
  }
}

The IndexItemList object is used internally to store a list of objects that contain a particular keyword value. In the bookstore example, it will contain references to one or more Book objects (which you define shortly) that share a given keyword. The IndexItemList object has a single method, AddIndexItem(). You don't need to access the IndexItemList object directly.

The Index object contains the associative array, item. This array contains a list of IndexItemList objects. Each entry in the array is associated with a value, in this case a keyword. The Index object includes a method to add items to the array. You will write another method to look up entries shortly.

The Index object's add() method, AddToIndex(), accepts an object to be indexed and a list of keywords in the form of a KeywordList object. For each keyword in the KeywordList, AddToIndex() first checks to see whether an IndexItemList object is associated with the keyword. Note that it uses the keyword value itself as an index into the array. If no IndexItemList object exists for the keyword, a new object is created and added to the array, again using the keyword as the index. Finally, the object to be indexed is added to the IndexItemList for that keyword value. This process is repeated for each keyword, so a given object can have several index entries.

Defining Book and Catalog Objects

Now you're ready to create the Book object. In Listing 16.23, you also create a Catalog object that contains a list of all books, plus subject, title, and author indexes.


Listing 16.23  The Book and Catalog Object Constructors
function Book (author, title, subject, code, price) {
  this.author = author;
  this.title = title;
  this.subject = subject;
  this.code = code;
  this.price = price;
  return this;
}
function Catalog () {
  this.count = 0;
  this.book = new Object;
  this.author = new Index();
  this.title = new Index();
  this.subject = new Index();
  this.add = AddToCatalog;
  return this;
}
function AddToCatalog (book) {
  this.book[this.count++] = book;
  this.author.add(book,parseKeywords(book.author));
  this.title.add(book,parseKeywords(book.title));
  this.subject.add(book,parseKeywords(book.subject));
}

The Book() constructor simply creates an object containing each of the relevant pieces of information about the book.

The Catalog() constructor creates a simple array, book, which contains a single entry for each book. It also creates three Index objects: author, title, and subject.

The Catalog object's add() method, AddToCatalog(), does the really interesting work. First, it adds the book object to the book array. Next, it updates the author, title, and subject indexes. For each of the indexes, it calls the parseKeywords function to create a list of keywords from the value of the field. The Index object's add() method then creates an index entry for each of these values.

In Listing 16.24, you create a catalog and add some books to it.


Listing 16.24  Creating a Catalog and Adding Books
var cat = new Catalog();
cat.add (new Book ("Kingsolver, Barbara", "Animal Dreams",
  "fiction animals dreams environment Native-American love",
  "ISBN 0-06-092114-5", 13.00));
cat.add (new Book ("Calasso, Roberto", "The Marriage of Cadmus and Harmony",
  "fiction Greek myth mythology Zeus Athens",
  "ISBN 0-679-73348-5", 13.00));
cat.add (new Book ("Le Carre, John", "The Night Manager",
  "fiction suspense spy arms drugs",
  "ISBN 0-345-38576-4", 6.99));
cat.add (new Book ("Rice, Anne", "Interview with the Vampire",
  "fiction vampire New Orleans gothic horror",
  "ISBN 0-345-33766-2", 4.95));
cat.add (new Book ("Garcia Marquez, Gabriel", "One Hundred Years of 
ÂSolitude",
  "fiction South America magic dreams war love",
  "ISBN 0-06-091965-5", 13.00));
cat.add (new Book ("Barkakati, Naba", "Object-Oriented Programming in C++",
  "nonfiction computer language programming object C",
  "ISBN 0-672-22800-9", 29.95));
cat.add (new Book ("Petzold, Charles", "Programming Windows",
  "nonfiction computer programming C windows",
  "ISBN 1-55615-264-7", 29.95));

Well, so far so good. You've got a catalog loaded up with books, and they're all cross-referenced by author, subject, and title. But how do you use this information?

What you need now is a search mechanism. You want the search to return multiple matches for a given set of keywords. Also, ideally, the results should be ranked by how well they match the keywords supplied.

To accomplish this task, you first create an object to hold a list of search results, as shown in Listing 16.25.


Listing 16.25  The Result and ResultList Object Constructors
function Result (object, score) {
  this.object = object;
  this.score = score;
  return this;
}
function ResultList () {
  this.count = 0;
  this.item = new Object();
  this.add = AddResult;
  this.sort = SortResults;
  return this;
}
function AddResult (object) {
  for (var i = 0; i < this.count; i++)
    if (this.item[i].object == object) {
      this.item[i].score++;
      return;
    }
  this.item[this.count++] = new Result (object,1);
}
function SortResults () {
  pm_array_qsort (this.item,0,this.count - 1,CompareResults);
}
function CompareResults (a,b) {
  return (a.score == b.score) ? 0: (a.score <  b.score) ? 1: -1;
}

The Result object holds an individual object-in this case, a Book object. It also contains a score field. This field indicates the number of "hits" the query gets for this particular object, that is, how many of the keywords specified in the query match this object.

The ResultList object contains a list of Result objects. It contains one Result for each object (book) that matches one or more of the specified keywords. The add() method, AddResult, searches the ResultList for a matching object. If it finds one, it increments the score by one. Otherwise, it creates a new Result entry for the object and sets the score to one.

The sort() method, SortResults(), sorts the Result objects in the ResultList in descending order according to score. That is, the objects with the highest score go to the top of the list. Because you are sorting objects rather than simple strings or numbers, you must supply a comparison function, CompareResults, to the pm_array_qsort() function.

Now, you can add the search function. In Listing 16.26, you update the Index object to make SearchIndex() the find() method.


Listing 16.26  Adding a find() Method to the Index Object
function Index () {
  this.count = 0;
  this.item = new Object();
  this.add = AddToIndex;
  this.find = SearchIndex;
  return this;
}
function SearchIndex (keywords) {
  var rlist = new ResultList();
  for (var i = 0; i < keywords.count; i++) {
    var kw = keywords.word[i];
    var ilist = this.item[kw];
    if (ilist != null)
      for (var j = 0; j < ilist.count; j++)
        rlist.add(ilist.item[j]);
  }
  rlist.sort();
  return rlist;
}

The find() method, SearchIndex(), takes a KeywordList object containing a list of words to search for. It first uses the keyword value to do an associative array lookup to retrieve the IndexItemList object, if any, for the given keyword. Then, it calls the ResultList object's add() method to add each matching object to the result list, or increment the score for that object if it was already added to the list. This process is repeated for each search term specified. Finally, the ResultList is sorted by score and returned to the caller.

The Bookstore

You've got all the tricky pieces worked out, so you can build an interface and open your bookstore. You can make a control frame with a selection list for author, title, or subject, and a text field for keyword entry. A Search button starts the search. In Listing 16.27, you write some functions to process this information and display the results.


Listing 16.27  Creating the Bookstore Interface
var controlFrame =
  '<html><body bgcolor="#808080" text="#FFFFFF">' +
  '<form name="cont">' +
  '<table border=0 width=100% height=100% cellpadding=0 cellspacing=0>' +
  '<tr align="center" valign="center">' +
  '<td><b>Search by: </b><select name="stype">' +
  '<option selected>Title' +
  '<option>Author' +
  '<option>Subject' +
  '</select></td>' +
  '<td><b>Keywords: </b><input size=30 name="keywords"></td>' +
  '<td><input type="button" value="Search" onclick="parent.doSearch()"></td>' +
  '</tr></table>' +
  '</form>' +
  '</body></html>';

var results = null;

function doSearch () {
  var index = self.head.document.cont.stype.selectedIndex;
  var keywords = parseKeywords (self.head.document.cont.keywords.value);
  if (index == 0)
    results = cat.title.find (keywords);
  else if (index == 1)
    results = cat.author.find (keywords);
  else
    results = cat.subject.find (keywords);
  self.body.location = "javascript:parent.showList()";
}

function showBook (item) {
  var book = results.item[item].object;
  var detail = book.author + '<br>' + book.title + '<br>' +
    book.subject + '<br>' + book.code + '<br>$' + book.price + '<br>' +
    '<h3><a href="javascript:parent.showList()">Return to list</h3>';
  return '<html><body bgcolor="#FFFFFF" text="#000000" link="#0000FF" ' +
    'alink="#FF0000"><div align="center"><table border=0><tr><td>' +
    detail + '</td></tr></table></body></html>';
}

function showList () {
  var list = "";
  for (var i = 0; i < results.count; i++)
    list += '<a href="javascript:parent.showBook(' + i + ')">' +
      '(' + results.item[i].score + ')&nbsp&nbsp' +
      results.item[i].object.author + ':&nbsp&nbsp' +
      results.item[i].object.title + '</a><br>';
  if (list.length == 0)
    list = '<h2 align="center">Sorry, no matches found</h2>';
  return '<html><body bgcolor="#FFFFFF" text="#000000" link="#0000FF" ' +
    'alink="#FF0000"><div align="center"><table border=0><tr><td>' +
    list + '</td></tr></table></body></html>';
}

Clicking the Search button calls the doSearch() function. This function examines the selection list and performs the appropriate search. The result list is placed in a global variable called results. The doSearch() function then loads the show frame with a javascript: URL that calls showList().

The showList() function, in turn, reads the result list and creates a one-line entry for each book, consisting of the score (number of matching terms), along with the book's author and title. Each entry is enclosed in an HREF that calls the showBook() function to display details about the book.

The showBook() function displays each of the fields in the Book object. It also includes an HREF back to the showList() function so that the user can return to the result list of the current search.

That's it! Your bookstore is open for business. The complete listing is included on the accompanying CD-ROM in file c16book.htm.

Figure 16.5 shows the resulting list of titles based on a subject search of the keyword love, and Figure 16.6 shows the complete author, title, and subject information for one book.

Figure 16.5 : Each book matching the search term is listed.

Figure 16.6 : Selecting a book shows a detailed listing.

Improved Indexing and Searching

The indexing and searching algorithms do a good job of finding matching entries, but they suffer from a couple of drawbacks.

First, the user must enter keywords exactly as they were specified when the Book object was created. One obvious improvement is to store all keywords in lowercase and convert search words to lowercase before beginning the search.

But what if the user enters the plural or past tense version of a word? What if the user includes (or fails to include) apostrophes, quotation marks, or other punctuation symbols? The search engine will break down and fail to return any matching items.

You can address this problem to some extent by normalizing all keywords before adding them to the index or performing a lookup. Normalizing a word means reducing it to something akin to a root word. This process is not easy; volumes have been written and fortunes spent on developing effective indexing and searching algorithms. But you can use a few simple techniques to improve your search results dramatically.

First, create a function to normalize a word, as shown in Listing 16.28.


Listing 16.28  The normalizeWord() Function
function normalizeWord (keyword) {
  var esc = escape (keyword.toLowerCase());
  var kw = "";
  for (var i=0; i < esc.length; i++) {
    var ch = esc.charAt(i);
    if (ch == '%')
      i += 2;
    else
      kw += ch;
  }
  var len = kw.length;
  if (kw.charAt(len-1) == "s" && kw.charAt(len-2) != "s") {
    kw = kw.substring(0,len-1);
    len--;
  }
  if (kw.substring(len-2,len) == "ly") {
    kw = kw.substring(0,len-2);
    len -= 2;
  }
  if (kw.substring(len-2,len) == "ed") {
    kw = kw.substring(0,len-1);
    len--;
  }
  if (kw.substring(len-2,len) == "er") {
    kw = kw.substring(0,len-1);
    len--;
  }
  if (kw.substring(len-2,len) == "ie") {
    kw = kw.substring(0,len-2) + "y";
    len--;
  }
  if (kw.substring(len-3,len) == "ing" && len > 5) {
    kw = kw.substring(0,len-3);
    len -= 3;
    if (isVowel(kw.charAt(len-2)) && !isVowel(kw.charAt(len-3))) {
      kw += "e";
      len++;
    }
  }
  if (kw.charAt(len-1) == "e")
    if (!isVowel(kw.charAt(len-3))) {
      kw = kw.substring(0,len-1);
      len--;
    }
  if (len > 1 && (kw.charAt(len-1) == kw.charAt(len-2))) {
    kw = kw.substring(0,len-1);
    len--;
  }
  return kw;
}

The normalizeWord() function starts by converting the keyword to lowercase. Next, it strips out any punctuation marks or other unusual characters. To strip the characters, it calls JScript's escape() function, which converts all unusual characters to a percent sign (%) followed by two ASCII characters. These characters are then removed.

The normalizeWord() function then makes a series of transformations based on the word ending. Note that the order of these transformations is important. We won't go into detail on each transformation here. The goal is to reach a root version of the word. It isn't necessarily the true English root, but as long as you perform the same transformations on both the indexed words and the search words, you should improve chances of getting a match.

Note
The transformations applied by the normalizeWord() function are useful only for English words. The call to escape() strips out accented letters, for instance, and the word ending transformations are meaningful only for English words. However, creating a similar function for other languages should be possible.

The other problem with this indexing scheme is that it indexes and searches for many words that are extraneous, such as "the," "and," "a," and so on. Most indexing and searching programs use a list of stop words to exclude extraneous words. These lists are often quite extensive, but you can write a simple function to deal with the worst offenders. Listing 16.29 shows a simple isStopword() function.


Listing 16.29  The isStopword() Function
function isStopword (word) {
  var wd = word.toLowerCase();
  if (wd == "a" || wd == "an" || wd == "and" ||
      wd == "or" || wd == "the")
    return true;
  return false;
}

Finally, you can modify the parseKeywords() function to call normalizeWord() and isStopword(), as shown in Listing 16.30.


Listing 16.30  The Improved parseKeywords() Function
function parseKeywords (str) {
  var list = new KeywordList ();
  var inword = false;
  var word = "";
  var len = str.length;
  for (var i = 0; i < len; i++) {
    var ch = str.charAt(i);
    if (isWhitespace(ch) || isDelimiter(ch)) {
      if (inword) {
        if (!isStopword(word))
          list.add(normalizeWord(word));
        word = "";
        inword = false;
      }
    }
    else {
      word += ch;
      inword = true;
    }
    if (i + 1 == len && inword)
      if (!isStopword(word))
        list.add(normalizeWord(word));
  }
  return list;
}

The improvements made here are included in file c16book2.htm on the CD-ROM. You can try this more sophisticated code by doing a subject search for "the love." The results which you will obtain will be identical to those shown in Figure 16.5, since the stop word "the" has been removed. A number of additional improvements could be made, including better handling of symbols, and Boolean AND and OR operations, to name a few.