.so page1.s
.Fc "Splay Tree Library" 1.1 "14th Sept 1988"
.H 1 Introduction
This document describes the \fIsplay\fR tree library. Splay trees
are a form of self-adjusting binary tree, designed to minimise the number
of comparisons needed when performing lookup (and hence insertion) operations
on binary trees.
.P
The code described in this document is based on the reference, [1],
and the public domain source code distributed on the Usenet network,
in the comp.sources.misc section in early 1988. Unfortunately
I have lost the names of the original authors of the software.
.P
This version of the splay tree library differs in that
substantial portions of the code were rewritten, to make it work
and to optimise it for the environment in
which it is being used, ie the CRISP editor.
.H 2 References
.RL
.LI
An Empirical Comparison of Priority-Queue and Event-Set Implementations,
by Douglas W. Jones, Comm. ACM 29, 4 (Apr. 1986) 300-311.
.LI
The Art of Computer Programming, D. E. Knuth, Vols 1-3. Addison-Wesley.
.LE
.H 1 Overview
The splay tree library is a set of functions which manipulate binary
trees; the binary tree nodes are represented by structures
with some vacant fields for use by the application. These fields
will typically point to strings or other structures, allowing the
application program to build up splay trees of arbitrary data types.
.P
The splay tree library attempts to \fIhide\fR the actual implementation
by not documenting fields used in the node structures which are of
use only to the library itself, and provides functions for
performing common operations.
.P
The following generic capabilities are provided for
in the library:
.AL
.LI
Initialising a new splay tree.
.LI
Adding a new entry to the tree.
.LI
Finding an entry within the tree.
.LI
Deleting a node from the tree.
.LI
Traversing the tree (allowing the application
to provide a callback function as each node is
visited).
.LI
Printing statistics on the performance of the tree access routines.
.LE
.P
Splay trees are essentially binary trees, except the root of the
tree is modified on each lookup operation. It is expected that
lookup operations will be performed more often than deletions or insertions.
.P
Splay trees are not as efficient a data structure for lookups as, say,
hash tables, but they do not suffer the problems of hash tables
in the pathological cases, eg when the table fills up,
or when a bucket is already full.
.P
In a normal balanced binary tree, every time the tree is modified,
the tree is re-organised so that the longest branch in the tree
is no longer than about log2(N) nodes, where N is the number of
nodes in the tree. Unbalanced trees suffer from
the problem that inserting entries into the tree in
alphabetic order causes the tree data structure to degenerate into
a linear list.
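.P
The degenerate case can be reproduced with a few lines of C.
The following is a minimal sketch of plain unbalanced insertion; the
names bst_node, bst_insert and bst_height are hypothetical and are
not part of the splay tree library.
.DS
```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type for this sketch; it is not the library's SPBLK. */
struct bst_node {
	const char	*key;
	struct bst_node	*left, *right;
};

/* Plain, unbalanced insertion: walk down to a vacant link and attach
 * the new node there. Duplicate keys are ignored. Returns the root. */
struct bst_node *
bst_insert(struct bst_node *root, const char *key)
{
	struct bst_node **link = &root;

	while (*link != NULL) {
		int c = strcmp(key, (*link)->key);
		if (c == 0)
			return root;
		link = c < 0 ? &(*link)->left : &(*link)->right;
	}
	*link = calloc(1, sizeof **link);
	if (*link != NULL)
		(*link)->key = key;
	return root;
}

/* Number of nodes on the longest path from the root down to a leaf. */
int
bst_height(const struct bst_node *n)
{
	int l, r;

	if (n == NULL)
		return 0;
	l = bst_height(n->left);
	r = bst_height(n->right);
	return 1 + (l > r ? l : r);
}
```
.DE
Inserting the seven keys a, b, c, d, e, f, g in that order gives a
height of 7 (a linear list), whereas the order d, b, f, a, c, e, g
gives a height of 3.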
.P
Balanced binary trees suffer from the amount of processing needed
to keep the tree structure balanced. Splay trees
offer a much faster \fIaverage\fR lookup rate, in return for
a worse worst-case lookup. This is done by moving a node
to the root of the tree whenever it is looked up,
following the rules given in [1]. This means that frequently looked
up items tend to cluster around the root of the tree. In
addition, the process tends to flatten the tree so that
it tends towards the form of a balanced tree.
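.P
The flattening effect can be seen in a small self-contained sketch.
It uses the well-known top-down splaying procedure of Sleator and Tarjan,
which is an assumption made for illustration and may differ in detail
from the rules in [1] that the library follows. The names sk_node,
sk_splay, sk_chain and sk_height are hypothetical.
.DS
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical node type for this sketch; it is not the library's SPBLK. */
struct sk_node {
	const char	*key;
	struct sk_node	*left, *right;
};

/* Top-down splay: bring the node holding key (or the last node reached
 * while searching for it) to the root. Returns the new root. */
struct sk_node *
sk_splay(struct sk_node *t, const char *key)
{
	struct sk_node hdr, *l, *r, *y;
	int c;

	if (t == NULL)
		return NULL;
	hdr.left = hdr.right = NULL;
	l = r = &hdr;
	for (;;) {
		c = strcmp(key, t->key);
		if (c < 0) {
			if (t->left == NULL)
				break;
			if (strcmp(key, t->left->key) < 0) {
				y = t->left;		/* rotate right */
				t->left = y->right;
				y->right = t;
				t = y;
				if (t->left == NULL)
					break;
			}
			r->left = t;			/* link right */
			r = t;
			t = t->left;
		} else if (c > 0) {
			if (t->right == NULL)
				break;
			if (strcmp(key, t->right->key) > 0) {
				y = t->right;		/* rotate left */
				t->right = y->left;
				y->left = t;
				t = y;
				if (t->right == NULL)
					break;
			}
			l->right = t;			/* link left */
			l = t;
			t = t->right;
		} else
			break;
	}
	l->right = t->left;				/* reassemble */
	r->left = t->right;
	t->left = hdr.right;
	t->right = hdr.left;
	return t;
}

/* Link nodes[0..n-1] into a right-leaning chain, the shape produced by
 * inserting keys in ascending order into an unbalanced tree. */
struct sk_node *
sk_chain(struct sk_node *nodes, const char **keys, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		nodes[i].key = keys[i];
		nodes[i].left = NULL;
		nodes[i].right = i + 1 < n ? &nodes[i + 1] : NULL;
	}
	return n > 0 ? &nodes[0] : NULL;
}

/* Number of nodes on the longest path from the root down to a leaf. */
int
sk_height(const struct sk_node *n)
{
	int l, r;

	if (n == NULL)
		return 0;
	l = sk_height(n->left);
	r = sk_height(n->right);
	return 1 + (l > r ? l : r);
}
```
.DE
For a chain of the seven keys a to g, splaying on g moves g to the
root and reduces the height of the tree from 7 to 5; the sorted
order of the keys is unchanged.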
.P
Splay trees, like balanced and unbalanced trees, have the
useful property of maintaining the sorted order of the items
inserted, allowing easy traversal of the structure for printing, etc.
.P
The remainder of this document consists of manual page
entries for the functions in the splay tree library.
.H 2 "Compiling and Linking"
The splay library comes with a \fImakefile\fR for building on the
following operating systems:
.AL a
.LI
Xenix 386
.LI
Xenix 286
.LI
IBM 6150 running AIX
.LI
Microport System V.2/286
.LI
Microport System V.3/386
.LI
SunOS 3.x/4.x
.LE
This \fImakefile\fR creates the file \fIlibsptree.a\fR. This file should
be placed in some convenient place on your system (eg /lib or /usr/lib).
.P
All C programs which use the library should contain the following line:
.DS I
# include    "sptree.h"
.DE
.H 2 Warning
The splay tree library makes use of the functions
.I chk_alloc(),
and
.I chk_free().
These are part of the 
.I foxlib.a
library. If you do not have access to these functions, simply
#define 
.I chk_alloc 
as
.I malloc,
and 
.I chk_free
as
.I free.
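.P
A minimal sketch of these definitions follows. It assumes that
.I chk_alloc()
takes a size and
.I chk_free()
takes a pointer, in the same way as the standard functions they are
mapped onto; where the definitions are placed (a header, or the
compiler command line) is left to the application.
.DS
```c
#include <stdlib.h>

/* Stand-ins for the foxlib.a allocator, assuming chk_alloc() takes a
 * size and chk_free() takes a pointer, as malloc() and free() do. */
#define chk_alloc	malloc
#define chk_free	free
```
.DE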
.H 1 "Trees & Nodes"
Two structure definitions in the include file
are needed for accessing the splay tree library -
SPTREE, which holds the information describing a tree, including
the pointer to the root node, and SPBLK, which describes an
individual node entry.
.P
The SPBLK structure contains two fields which may be accessed and
modified by the application -
.I key
and
.I data.
.I key
is defined as a 
\fIchar *\fR
and is used to contain a key for ordering the nodes in the tree.
.I data
is a pointer sized element, which may be used for any
purpose by the application, eg pointing to a data structure.
.P
Memory for SPBLKs and the data structure pointed to by
.I data
needs to be allocated by the calling application. SPTREE
structures are allocated via the 
.I spinit() 
function which is responsible for setting the initial contents
of the tree to reasonable values.
.P
Typically an application will declare a local data structure, such as:
.DS
struct mystruct {
	char	*name;
	 .
	 .
	 .
	};
.DE
and have
.I key
point to the 
.I name
field of the structure.
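.P
The arrangement might be sketched as follows. SKETCH_BLK is a
hypothetical stand-in showing only the two documented fields of SPBLK
(with
.I data
assumed to be a generic pointer), and make_node and the
.I value
field are likewise invented for illustration.
.DS
```c
#include <stdlib.h>

/* Stand-in for SPBLK showing only the two documented fields. The real
 * structure in sptree.h has further fields private to the library. */
typedef struct {
	char	*key;
	void	*data;		/* assumed type; documented only as pointer sized */
} SKETCH_BLK;

struct mystruct {
	char	*name;
	int	value;		/* hypothetical application field */
};

/* Allocate a node whose key is the record's name and whose data field
 * points back at the record. Returns NULL if allocation fails. */
SKETCH_BLK *
make_node(struct mystruct *ms)
{
	SKETCH_BLK *node = malloc(sizeof *node);

	if (node == NULL)
		return NULL;
	node->key = ms->name;
	node->data = ms;
	return node;
}
```
.DE
Since
.I key
only points at the name, the record must remain allocated for as long
as the node is in use.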
.bp
.Fo Name
spapply - apply a function to each node in a tree
.Fo Synopsis
.DS
\f(CW
# include	"sptree.h"

void spapply(SPTREE *tree, void (*func)(), int arg);
\fR
.DE
.Fo Description
.I spapply
is used to walk down all nodes in the tree
.I tree
(in alphabetic order),
and for each node visited, 
.I func
is called with two parameters - the first is a pointer
to the node being visited, and the second is the 
.I arg
argument.
.P
This function can be used to print out each node in the tree, etc.
.P
NOTE: do not try to modify the tree whilst walking it, as this
will confuse
.I spapply().
.I spapply()
sets a flag so that calls to
.I splookup()
made during the walk do not splay the tree.
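.P
A callback of the shape described above might look like the following
sketch. SKETCH_BLK is a hypothetical stand-in for SPBLK showing only its
documented fields, and show_node and nodes_seen are invented names.
.DS
```c
#include <stdio.h>

/* Stand-in for SPBLK showing only the two documented fields. */
typedef struct {
	char	*key;
	void	*data;		/* assumed type; documented only as pointer sized */
} SKETCH_BLK;

static int	nodes_seen;	/* number of nodes visited so far */

/* Callback: the first parameter is the node being visited, the second
 * is the arg value. Prints the key indented by arg columns. */
void
show_node(SKETCH_BLK *node, int arg)
{
	printf("%*s%s", arg, "", node->key);
	puts("");
	nodes_seen++;
}
```
.DE
With the real library the function would take an SPBLK pointer and
would be passed to
.I spapply()
as the
.I func
argument, with the indentation as
.I arg.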
.bp
.Fo Name
spempty - test whether a tree is empty
.Fo Synopsis
.DS
\f(CW
# include    "sptree.h"

int spempty(SPTREE *tree);
\fR
.DE
.Fo Description
This function returns TRUE if the tree 
.I tree
contains no elements; it returns FALSE otherwise.
.bp
.Fo Name
spdeq, spenq - add or delete nodes in a splay tree
.Fo Synopsis
.DS
\f(CW
# include    "sptree.h"

void spenq(SPBLK *node, SPTREE *tree);
spdeq(SPBLK *node, SPTREE *tree);
\fR
.DE
.Fo Description
.I spenq()
takes the node pointed to by
.I node
and inserts it into the correct position in 
.I tree.
The correct position is found by performing a 
.I strcmp()
on the 
.I node->key
field with other nodes in the tree.
Duplicate entries are met with a fatal error.
.P
Insertion into the tree is done by scanning the tree for a leaf node
and appending the entry onto the end of the tree. This has the currently
undesirable property that if insertions are performed in 
alphabetic order, then insertions degenerate into creating a linked
list.
.P
.I spdeq()
is used to delete a node from a splay tree. The node is given by
.I node.
.I node
must be in the tree
.I tree
otherwise an infinite loop will result.
.P
The removal of
.I node
will result in some re-organisation of the tree, in order to
maintain the sorted ordering.
.P
Removal of
.I node
is purely in terms of the tree - the memory allocated
to
.I node
is NOT freed, nor are any data objects referenced by
.I node.
.P
The algorithm used to delete the node from the tree is taken from
reference [2], Vol.3 Algorithm D, page 420.
.Fo Examples
The following example creates a new node, and inserts it into
the tree:
.DS I
\f(CW
SPTREE	*tree = spinit();
SPBLK	*node = chk_alloc(sizeof (SPBLK));

node->key = "hello, world!";
spenq(node, tree);

\fR
.DE
The following code may be used to clear a tree:
.DS
\f(CW
	SPTREE	*sym_tbl;	/* an existing tree created by spinit() */

	while (!spempty(sym_tbl)) {
		SPBLK *sp1 = sphead(sym_tbl);
		spdeq(sp1, sym_tbl);	/* unlinks sp1; its memory is not freed */
		}
\fR
.DE
.bp
.Fo Bugs
Duplicate entries should be signified by returning a result code;
aborting with a printf() is not a good programming practice!
.P
A future version will keep a count of insertions, and after some
arbitrary number (16), a resplay will be done. This will avoid
degenerate trees. Splays are not done on each insertion to
avoid the overhead incurred in re-splaying.
.bp
.Fo Name
sphead - return the root of the tree
.Fo Synopsis
.DS
\f(CW
# include    "sptree.h"

SPBLK	*sphead(SPTREE *tree);
\fR
.DE
.Fo Description
.I sphead()
returns a pointer to the root node of the tree
.I tree.
.bp
.Fo Name
spinit - create a new tree
.Fo Synopsis
.DS
\f(CW
# include    "sptree.h"

SPTREE *spinit();
\fR
.DE
.Fo Description
.I spinit()
allocates a new SPTREE structure which contains the information describing
a new splay tree. It is initialised to contain no entries.
.Fo Example
.DS
\f(CW
SPTREE	*tree_root = spinit();
\fR
.DE
.bp
.Fo Name
splookup - look up a node in the tree
.Fo Synopsis
.DS
\f(CW
# include	"sptree.h"

SPBLK	*splookup(char *key, SPTREE *tree);
\fR
.DE
.Fo Description
.I splookup()
is used to search the splay tree
.I tree
and return a pointer to the node containing 
.I key.
If no node contains
.I key,
then NULL is returned.
.P
.I splookup()
will automatically resplay the tree around the located node.
.bp
.Fo Name
splay - shuffle a node to the root position in the tree
.Fo Synopsis
.DS
\f(CW
# include	"sptree.h"

void	splay(SPBLK *node, SPTREE *tree);
\fR
.DE
.Fo Description
.I splay
is used to move an arbitrary node in
.I tree
so that it occupies the root position in the tree. In a totally
unbalanced tree this has the effect of 
.I flattening
the tree so that the worst case lookup operations will perform significantly
better.
.P
Re-splaying of a node is performed automatically by 
.I splookup().
Re-splaying of a node should not be necessary under other circumstances,
unless it can be determined that insertions & deletions
are bottlenecks in an application.
.P
Although reference [1] implies that resplaying should be applied
to all nodes in the tree, in practice,
.I splay()
will not resplay a node which is the root of
.I tree, 
or a child of the root. This is to avoid the case where 
.I splookup()
is alternately used to look up one of two or three nodes in the
tree, which would cause the root and its children to continually
oscillate positions.
.bp
.Fo Name
spstats - return a string containing performance statistics of a tree
.Fo Synopsis
.DS
\f(CW
# include	"sptree.h"

char	*spstats(SPTREE *tree);

\fR
.DE
.Fo Description
.I spstats
returns a pointer to a local static buffer which contains a string
similar to the following:
.DS
Lookups(147 3.99) Insertions(18 3.38) Splays(49 0.91)
.DE
.P
This function may be used to check whether the splay
operations are being effective,
or whether splays need to be done after all insertions or deletions.
.P
The first number in each set is the total number of operations performed
on the tree so far.
The second number is the number of
.I strcmp()
calls (and hence nodes traversed) divided by the first number.
Thus, in the example above, 147 lookups were performed, and each
lookup performed an average of 3.99 
.I strcmp() s.
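.P
An application wishing to monitor these figures could pick the string
apart with
.I sscanf().
The following sketch assumes the string has exactly the layout shown
above, which the library does not guarantee; sp_figures, parse_spstats
and total_compares are hypothetical names.
.DS
```c
#include <stdio.h>

struct sp_figures {
	int	count;		/* operations performed */
	double	avg;		/* comparisons per operation */
};

/* Parse a statistics string of the form shown above into three
 * count/average pairs. Returns 1 on success, 0 if the text does
 * not match the assumed layout. */
int
parse_spstats(const char *s, struct sp_figures *lookups,
    struct sp_figures *insertions, struct sp_figures *splays)
{
	return sscanf(s, "Lookups(%d %lf) Insertions(%d %lf) Splays(%d %lf)",
	    &lookups->count, &lookups->avg,
	    &insertions->count, &insertions->avg,
	    &splays->count, &splays->avg) == 6;
}

/* Approximate total number of comparisons behind one pair. */
double
total_compares(const struct sp_figures *f)
{
	return f->count * f->avg;
}
```
.DE
For the example string, the lookup figures work out at roughly
147 x 3.99 = 586.5 comparisons in total.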
.P
The number of splays should be watched. If the second number for
splays is much greater than one, then the tree is being used
badly, and maybe splay trees are not appropriate for this application.
.P
The above example is taken from the global symbol table for
CRISP when it is exited immediately after startup. The figures for
lookups & splays are reasonable -- although not optimal, due
to the fact that the symbol table has not been exercised very much
by the startup macros.
.TC
