Jade - James' DSSSL Engine

What is Jade?

Jade is an implementation of the DSSSL style language. The current version is 0.5, which is the first beta version.

For general information about DSSSL, see my DSSSL page.

Jade includes the following components:

Jade Copyright

Jade is licensed under the same terms as SP. This imposes almost no restrictions even for commercial use.

If you do use Jade in a commercial product, I would ask you, as a courtesy, to let me know about it and acknowledge the use of Jade.

Getting Jade

If you're using Windows 95 or Windows NT, then all you need is in the binary distribution.

Otherwise you will need to build it yourself from source. You need the sources for Jade. This package includes the sources for a compatible version of SP (which may be different from the latest released version of SP).

Create a new directory and unpack the sources file in it. The files in the sources zip file use CRLF-delimited lines, so use unzip -a to unpack on Unix. On Windows you must use an unzip that preserves long filenames, such as WinZip 6.1. You should also make sure that your unzip preserves the case of filenames; this requires using a -U option with some versions of unzip.

Building Jade

Win32

Only Microsoft Visual C++ 4.2 is supported. It worked with 4.1 last time I tried, but it's only been tested with 4.2.

Open SP.mak as a makefile and build at least the lib - UnicodeDebug and lib - UnicodeRelease configurations.

Then open jade.mak as a makefile and build the jade configurations.

With the debug configuration the DLLs and EXEs end up in the lib\UnicodeDebug directory; with the release configuration they end up in the bin directory.

Unix

Only gcc 2.7.2 is supported (2.6.3 and 2.7.1 may also work; don't use 2.7.0).

Building on Unix is tested irregularly, so be prepared to fix build glitches (typically the fix involves adding some template instantiations).

You'll need a make that supports the include directive (GNU make does).

Build with make -f Makefile.jade. Note that jade requires an SP compiled with -DSP_MULTI_BYTE. If you plan to do any development, also do make -f Makefile.jade depend.

Using Jade

Add the directory containing the jade binary to your path, change directory to the dsssl directory, and do

jade demo.sgm
nsgmls -s demo.fot

If everything is working, you shouldn't get any errors.

Jade supports the following options in addition to the normal SP options:

-d dsssl_spec
This specifies that dsssl_spec is the DSSSL specification to be used; dsssl_spec must be an SGML document conforming to the DSSSL architecture. For an example, see dsssl/demo.dsl. If the -d option is not specified, jade will expect to find the DSSSL specification in file.dsl for an SGML document file.sgm.
-t output_type
output_type specifies the type of output as follows:
  fot   An SGML representation of the flow object tree
  rtf   Microsoft's Rich Text Format
  html  Hypertext Markup Language
  tex   TeX
  sgml  SGML (used for SGML transformations)
-o output_file
Write output to output_file instead of the default. The default filename is the name of the last input file with its extension replaced by the name of the type of output; for example, jade -t rtf demo.sgm writes demo.rtf. If there is no input filename, then the extension is added onto jade-out.

With HTML output more than one output file will be created. The name of the first will be output_file; the names of the others will be created by adding a serial number before the extension, if any, on output_file.

-V variable
This is equivalent to doing
(define variable #t)
except that this definition will take priority over any definition of variable in a style-sheet.
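
As a rough illustration of how a style sheet might take advantage of -V, the sketch below guards a construction rule with a boolean variable. The variable name show-notes and the element name note are made up for this example; they are not part of Jade or of any particular DTD.

```scheme
;; show-notes is a hypothetical variable; it defaults to #f here.
(define show-notes #f)

;; note is a hypothetical element name.
(element note
  (if show-notes
      (make paragraph
        (process-children))
      (empty-sosofo)))
```

Running jade -V show-notes file.sgm should then behave as if show-notes had been defined as #t, so the note elements are formatted; without the option they are suppressed.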

Jade ignores the SP_CHARSET_FIXED and SP_SYSTEM_CHARSET environment variables and always uses Unicode as its internal character set, as if SP_CHARSET_FIXED was 1 and SP_SYSTEM_CHARSET was unset. Thus only the SP_ENCODING environment variable is relevant to Jade's handling of character sets.

Jade Extensions

The following external procedures are available. Each is described by a prototype in the same manner as in the standard. To use one of them, call the standard external-procedure procedure with the public identifier "UNREGISTERED::James Clark//Procedure::name", where name is the name given here. Typically this is done by including the following in the DSSSL specification:

(define name
  (external-procedure "UNREGISTERED::James Clark//Procedure::name"))

Note that external-procedure returns #f if it doesn't know about the specified public identifier. You can use this to enable your DSSSL specifications to work gracefully with other implementations which do not support these extensions.

Debugging

(debug obj)

Generates a message including the value of obj and then returns obj.
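
One possible way to use the #f result of external-procedure for graceful degradation is sketched below. The names jade-debug and trace are chosen for this example only; trace falls back to returning its argument unchanged when the extension is unavailable.

```scheme
;; jade-debug is bound to the Jade extension, or to #f in an
;; implementation that does not know the public identifier.
(define jade-debug
  (external-procedure "UNREGISTERED::James Clark//Procedure::debug"))

;; trace is a hypothetical wrapper: it reports obj when the
;; extension exists and simply returns obj otherwise.
(define (trace obj)
  (if jade-debug
      (jade-debug obj)
      obj))
```

A style sheet can then write (trace (gi)) in place of (gi) without depending on Jade.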

Simple-page-sequence header/footer control

(if-first-page sosofo1 sosofo2)

This can be used only in the specification of the value of one of the header/footer characteristics of simple-page-sequence. It returns a sosofo that will display as sosofo1 if the page is the first page of the simple-page-sequence and as sosofo2 otherwise.

(if-front-page sosofo1 sosofo2)

This can be used only in the specification of the value of one of the header/footer characteristics of simple-page-sequence. It returns a sosofo that will display as sosofo1 if the page is a front (i.e. recto, odd-numbered) page and as sosofo2 if it is a back (i.e. verso, even-numbered) page.
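
The following sketch shows how these two procedures might be combined in the header and footer characteristics of a simple-page-sequence. The element name report and the header text are assumptions made for the example.

```scheme
(define if-first-page
  (external-procedure "UNREGISTERED::James Clark//Procedure::if-first-page"))
(define if-front-page
  (external-procedure "UNREGISTERED::James Clark//Procedure::if-front-page"))

;; report is a hypothetical document element name.
(element report
  (make simple-page-sequence
    ;; No running header on the first page, a fixed title elsewhere.
    center-header: (if-first-page
                     (empty-sosofo)
                     (literal "Annual Report"))
    ;; Page number at the outer edge: right on front pages,
    ;; left on back pages.
    right-footer: (if-front-page
                    (page-number-sosofo)
                    (empty-sosofo))
    left-footer: (if-front-page
                   (empty-sosofo)
                   (page-number-sosofo))
    (process-children)))
```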

Current Jade Limitations

This section describes the limitations of the front-end (the general-purpose DSSSL engine): each backend also has its own limitations.

Only the DSSSL Online subset of DSSSL is implemented with the following additions (all part of full DSSSL)

Note that only inherited characteristics that are applicable to some DSSSL Online flow object can be specified.

Character/glyph handling

Jade supports only a single pre-defined character repertoire. A character name of the form U-XXXX, where XXXX are four upper-case hexadecimal digits, is recognized as referring to the Unicode character with that code. For many characters, it is also possible to use the ISO/IEC 10646 name in lower-case with words separated by hyphens.
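
As a small sketch of both naming forms, the definitions below use the standard #\ character syntax. Whether a given ISO/IEC 10646 name is among those Jade recognizes is an assumption here, and bullet-item is a made-up element name.

```scheme
;; Both definitions are intended to denote U+00E9
;; (LATIN SMALL LETTER E WITH ACUTE).
(define e-acute-by-code #\U-00E9)
;; Assumes this ISO/IEC 10646 name is among those Jade recognizes.
(define e-acute-by-name #\latin-small-letter-e-with-acute)

;; bullet-item is a hypothetical element name; U+2022 is BULLET.
(element bullet-item
  (make paragraph
    (make character char: #\U-2022)
    (process-children)))
```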

Some common SDATA entity names from the ISO entity sets are recognized and mapped to characters. In addition an SDATA entity name of the form U-XXXX, where XXXX are four upper-case hexadecimal digits, is mapped to the Unicode character with that code.

Jade does not make use of any of the declaration architectural forms related to characters and glyphs.

The following style language declarations (as well as the non-DSSSL Online declarations) are ignored:

declare-char-characteristic+property
declare-char-property
add-char-properties
define-language
declare-default-language

Validation

Several things that it would be desirable to have checked aren't checked:

Other limitations

The following primitives are just stubs:

char-script-case
Always returns last argument.
char-property
Always returns #f or specified default value.
address-visited?
Always returns #f.

Backends

RTF backend

Only the following flow object classes are implemented:

sequence
character
paragraph
paragraph-break
line-field
Only at the beginning of a paragraph.
display-group
simple-page-sequence
score
Only types after and through.
rule
Only horizontal orientation. Rules only show up in Page Layout View.
box
Changing indentation inside a box will not work.
leader
The content of the flow object is ignored: a dotted leader will always be used. The specified length is ignored: it always fills out the line.
external-graphic
link
Only destinations that are single elements in the same document.
table
table-part
table-column
The table-auto-width feature isn't properly supported: it's not really possible in RTF.
table-row
table-cell
table-border

Many DSSSL characteristics cannot be implemented in RTF. The backend does the best it can.

In order to get correct page numbers in Microsoft Word, type the following after opening the document:

  1. CTRL+END
  2. CTRL+A
  3. F9

The RTF backend supports some additional characteristics. To use a characteristic named here as C, declare it using declare-characteristic with the public identifier:

"UNREGISTERED::James Clark//Characteristic::C"

These characteristics are all applicable to simple-page-sequence:

page-number-format
Value is a string as for the format-number procedure. This controls the format of the number used by page-number-sosofo and current-page-number-sosofo for references to pages in the simple-page-sequence. The initial value is "1".
page-number-restart?
Value is a boolean. If true, then for the purposes of page-number-sosofo and current-page-number-sosofo, the page numbers for this simple-page-sequence will restart from 1. The initial value is #f.
page-n-columns
Value is a strictly positive integer, specifying the number of columns. The initial value is 1.
page-column-sep
Value is a length, specifying the separation between columns. The initial value is .5in.
page-balance-columns?
Value is a boolean. If true, the columns on the final page of the page-sequence should be balanced. The initial value is #f.
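
A sketch of declaring and using some of these characteristics follows. The default values passed to declare-characteristic repeat the initial values listed above; the element name appendix is an assumption made for the example.

```scheme
(declare-characteristic page-number-format
  "UNREGISTERED::James Clark//Characteristic::page-number-format" "1")
(declare-characteristic page-number-restart?
  "UNREGISTERED::James Clark//Characteristic::page-number-restart?" #f)
(declare-characteristic page-n-columns
  "UNREGISTERED::James Clark//Characteristic::page-n-columns" 1)
(declare-characteristic page-column-sep
  "UNREGISTERED::James Clark//Characteristic::page-column-sep" 0.5in)

;; appendix is a hypothetical element name.
(element appendix
  (make simple-page-sequence
    page-number-format: "i"      ; lower-case roman page numbers
    page-number-restart?: #t     ; numbering starts again from 1
    page-n-columns: 2            ; two-column layout
    page-column-sep: 0.25in      ; gap between the columns
    (process-children)))
```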

SGML Flow Object Tree backend

Most non-inherited characteristics for the character flow object aren't reported.

HTML backend

The HTML backend does not attempt to implement the semantics of most DSSSL flow objects in HTML. The widely deployed HTML browsers are not yet powerful enough for this. When support for CSS is more widespread this may be possible.

Instead the HTML backend provides an application-defined flow object class that allows a DSSSL specification to specify the HTML markup to be used. This flow object class can be declared as follows:

(declare-flow-object-class formatting-instruction
  "UNREGISTERED::James Clark//Flow Object Class::formatting-instruction")

This class has a single non-inherited characteristic, data:, which specifies a string to be inserted into the HTML output. The string will be inserted without change. Note that when character flow objects are output, occurrences of the characters <>& are translated into the appropriate entity references. For example, the following would map an XMP source element onto an HTML PRE element:

<![ CDATA [
(element XMP
  (sosofo-append
    (make formatting-instruction
          data: "<PRE>")
    (process-children)
    (make formatting-instruction
          data: "</PRE>")))
]]>

Obviously, procedures can be used to make this more convenient.

Note that this construction rule typically needs to be enclosed in a CDATA marked section to ensure that the PRE start- and end-tags are not recognized during the SGML parse of the style sheet.
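
One way such a convenience procedure might look is sketched below. The name html-element and the EMPH source element are hypothetical. Because "<" and "</" are built here as separate strings and are followed by a quote rather than a name character, an SGML parser should not take them for tag delimiters, but enclosing the definitions in a CDATA marked section remains the safer choice if in doubt.

```scheme
(declare-flow-object-class formatting-instruction
  "UNREGISTERED::James Clark//Flow Object Class::formatting-instruction")

;; html-element is a hypothetical helper: it wraps content in a
;; start-tag and end-tag for the HTML element named by gi (a string).
(define (html-element gi content)
  (sosofo-append
    (make formatting-instruction
      data: (string-append "<" gi ">"))
    content
    (make formatting-instruction
      data: (string-append "</" gi ">"))))

(element XMP (html-element "PRE" (process-children)))
;; EMPH is a hypothetical source element.
(element EMPH (html-element "EM" (process-children)))
```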

Each scroll flow object is mapped onto a separate HTML document. The document names are chosen according to the system described in Using Jade. Any flow objects not inside a scroll flow object will produce no output.

The HTML backend implements a few other flow objects and characteristics. The link flow object is mapped onto an A element with an HREF attribute. When the link destination is a node in the current document, then the link will refer to the start of the primary flow object for the node, as defined in section 12.4.2 of the DSSSL standard. The backend will automatically insert a matching A element with a NAME attribute at this point.

A paragraph flow object is mapped onto a DIV element. The font-posture:, font-weight:, color: and quadding: characteristics are also automatically mapped to the appropriate HTML markup.

The HTML backend also implements the application-defined characteristic with public identifier:

"UNREGISTERED::James Clark//Characteristic::scroll-title"

This applies to the scroll flow object and specifies a string to be used as the content of the TITLE element of the corresponding HTML document. If this title is to come from content within the document, you must create the scroll flow object in the document element construction rules and not in the root construction rule, as there is no mechanism currently implemented for accessing the document-element property of the sgml-document node. For example,

(element report
  (make scroll
    scroll-title: (data
                    (node-list-first
                      (select-elements
                        (children (current-node))
                        "TITLE")))))

The example assumes that the title of the HTML document corresponding to this scroll object should be the data content of the first child of the document element node that is a TITLE element.

Note that the HTML backend generates the HTML and BODY elements automatically, so the formatting-instruction flow object cannot be used to get markup inside the HTML HEAD element.

Reading the Jade Sources

Start with the following headers:

grove/Node.h
style/FOTBuilder.h
style/StyleEngine.h

Reporting Bugs in Jade

If you find a bug in Jade, please take the time to report it. If Jade crashes on any input whatever, that's a bug and I want to hear about it. If Jade fails to process a specification that conforms to the DSSSL standard in the manner required by the DSSSL standard in a way that is not documented here as a current limitation nor is documented in my list of DSSSL errata, that's a bug and I want to hear about it.

Please report bugs by email to me, jjc@jclark.com. Do not post them to comp.text.sgml nor to the sp-prog list.

I do not want to get bug reports about documented limitations, so please read the list of limitations carefully. However, feel free to let me know which of the current limitations you would most like to see addressed.

I also at this stage do not want to hear about bugs in your C++ compiler that prevent it from compiling Jade: if your compiler refuses to compile Jade, I want to hear about it only if

Before reporting a bug, please check that your snapshot is current.

The most important thing in reporting a bug is to include a complete set of files on which I can run jade and reproduce the problem. Also tell me what command line I should use, and what is incorrect about the behaviour of jade. If the files are large, package them up as a tar or zip file and upload them to ftp://ftp.jclark.com/incoming.

It is useful if you have a fix for a bug, but please don't delay sending in the bug while you work on a fix and don't send in a fix without giving me the files to reproduce the bug it fixes.

Contributing to Jade

Here are some ways you can contribute to Jade:

James Clark