TeX with SISISI-Hyphenation
===========================

Technical Guide for Expert Users
================================

(Note: This file contains technical information only. You don't have to
read it unless you are *really* interested in the problems of porting
SISISI to UNIX, or unless something went wrong with the normal
installation process.)


0) Motivation.
--------------

The SISISI-Hyphenation comes as a WEB changefile (sitex3.ch), which is
usually tangle'd into tex.web. The problem with the C version of TeX is
that the folks who wrote the web2c converter considered only a
WEB-specific subset of PASCAL, whereas sitex3.ch makes full use of
standard PASCAL constructs. For example, the web2c converter can't
handle statically nested procedures, which SISISI uses quite
extensively.

Therefore, for the C version of TeX, it was necessary to modify
sitex3.ch to allow conversion with web2c, and, since web2c shows some
strange and inexplicable behaviour (read: bugs :-), it is also necessary
to modify the C files that are generated by the converter. This work was
done by the computer science students Hans-J"urgen Szoldatics and
Florian Zwerina.


1) Incorporating SISISI into TeX.
---------------------------------

The first problem is that tangle takes a .web file and a single .ch
file, and generates a .p file. In the C version, however, we have two
changefiles, sitex3.ch and ctex.ch. We would first have to tangle
tex.web with sitex3.ch, and then the resulting file with ctex.ch, which
is not possible because tangle produces a .p file, not a new .web file.

Fortunately, there's a program called TIE, which can be used to
incorporate many changefiles into a .web master file, and which is
available in a C version from

    ftp.th-darmstadt.de [130.83.55.75]
    directory pub/programming/literate-programming/Tools
    file      tie-.tar.Z

(Thanks to Klaus Guntermann for providing this information.) For
convenience, tie-2.4.tar.Z is included in this distribution.
Also for convenience, this distribution contains a file called
sitex.web, which is the original tex.web (version 3.14) with sitex3.ch
tie'd into it. In case you want to install a newer (or maybe an older)
version of TeX, you will have to compile tie.c and tie sitex3.ch into
your tex.web, thus creating your own sitex.web.


2) Converting tex.web (with SISISI) into a C program.
-----------------------------------------------------

As soon as tex.web has been replaced with sitex.web, you can start
`make', just as with the normal installation of TeX. There are some
differences, however. SiTeX exhausts the symbol tables of tangle and
web2c, so these tables have to be enlarged. That's the reason why
tangle.ch and web2c.? are provided and copied to their appropriate
places by `install'. Some definitions in ctex.ch have to be changed, and
so this file is also provided. Finally, the makefiles have to be changed
to take into account the 9th C file generated by the conversion process
and the SISISI format files.


3) Now the problems start.
--------------------------

As soon as tex.web has been tangle'd into tex.p and tex.p has been
converted into several .c and .h files, the bugs in web2c strike. For
some strange reason, the sequence '] ] )' is twisted into ') ] ]', with
the obvious result of a C syntax error.

Also, the pass-by-value mechanism of PASCAL turns into pass-by-reference
as soon as arrays are used as actual parameters. This comes from the
fact that an array used as a function argument in C is converted to a
pointer to its first element. Therefore, instead of a copy of the entire
array (which is what happens in PASCAL), only the pointer to the first
element is passed to the function. Guess what happens when the array is
modified inside the function... As a consequence, a temporary array has
to be introduced into the function, which stores a copy of the original
parameter and is used to restore it at the end of the function.
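
A minimal sketch of this save-and-restore workaround is given here. The
array type word_t, its size and the function scaled_sum are hypothetical
names chosen for illustration only; they do not appear in the SiTeX
sources, and the real functions in tex6.c look different.

```c
#include <string.h>

#define WORD_LEN 8                  /* hypothetical array size */
typedef int word_t[WORD_LEN];       /* hypothetical array type */

/* PASCAL original (w is a value parameter, so the caller's array is
   never touched):
       function scaledsum(w: wordt; factor: integer): integer;
   In C, w is only a pointer to the caller's array, so the function
   saves the contents on entry and puts them back before returning. */
int scaled_sum(word_t w, int factor)
{
    word_t saved;                   /* temporary copy of the parameter */
    int i, sum = 0;

    /* sizeof(w) would be the size of a pointer here, so the size of
       the array type is used instead. */
    memcpy(saved, w, sizeof(word_t));

    for (i = 0; i < WORD_LEN; i++) {
        w[i] *= factor;             /* the body changes its "local" array */
        sum += w[i];
    }

    memcpy(w, saved, sizeof(word_t));   /* undo the changes for the caller */
    return sum;
}
```

After a call such as scaled_sum(a, 2), the array a holds the same values
as before the call, which is the behaviour the PASCAL code relies on.
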
Needless to say, the PASCAL assignment for arrays has to be converted
manually into a C `memcpy' library call.

Another bug is web2c's handling of input/output. PASCAL's write() is not
properly translated into C's fprintf() when a field width is specified.
That is, write(x:4) is translated into printf("%d",x), but really should
be printf("%4d",x). Similar problems arise with the conversion of read()
into fscanf() when non-standard data types are used. This causes
SISISI's hash table to be written to file hf3 in the wrong format, and
to be read back into memory incorrectly.

Other problems arise from the nifty mechanism web2c employs for the
handling of PASCAL's VAR-parameters. The PASCAL functions and procedures
are renamed, and C macros are introduced under the names of the original
functions and procedures. These macros apply the C address operator (&)
to the arguments that are passed by reference, and then call the renamed
function. Unfortunately, web2c overdoes it and also generates type casts
for the function parameters to ensure proper data types, which causes
GNU C to complain about `cast specifies array type' when casts to array
types are attempted. It is therefore necessary to remove these casts
from coerce.h, where macros and function prototypes are defined.

For convenience, the files tex6.c, coerce.h and texd.h (with all bugs
removed) are provided and can be copied into TEXSRC/tex as soon as the
`make' stops for the first time (usually with the aforementioned
complaint about casts to array types), replacing the buggy files. The
`install' script does this for you.

The danger here is that your tex6.c is not necessarily the same as my
tex6.c (i.e. the tex6.c I got out of my converter and applied the
necessary bug fixes to). If this should be the case, the compilation
will again terminate with an error, either that some symbols are
declared twice or that some symbols are missing. If this should happen,
please contact me immediately.
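
To make the VAR-parameter mechanism and the cast problem described
earlier in this section more concrete, here is a small sketch. All names
(vec_t, bump, zbump, first_elem, zfirst_elem) are hypothetical and only
mimic the general shape of such macros; this is not the actual content
of coerce.h.

```c
typedef int vec_t[4];               /* hypothetical array type */

/* PASCAL: procedure bump(var n: integer; by: integer);
   The C function gets a new name and takes a pointer for the VAR
   parameter. */
void zbump(int *n, int by)
{
    *n += by;
}

/* The macro keeps the original name, takes the address of the VAR
   argument and casts the value argument to its declared type. */
#define bump(n, by) zbump(&(n), (int) (by))

/* PASCAL: function firstelem(v: vect): integer; */
int zfirst_elem(vec_t v)
{
    return v[0];
}

/* A generated macro of the form
       #define first_elem(v) zfirst_elem((vec_t) (v))
   is rejected by GNU C with "cast specifies array type", because C does
   not allow casts to array types. With the cast removed, the array
   argument is simply converted to a pointer. */
#define first_elem(v) zfirst_elem(v)
```

With these definitions, bump(n, 3) changes the caller's variable n just
as the PASCAL procedure would, and first_elem(v) compiles without the
complaint about array casts.
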
If no duplicate or missing symbols are reported, compilation should
terminate normally and you should get initex and virtex.


4) Installing SiTeX on your system.
-----------------------------------

Once initex and virtex have been generated successfully, it is time to
install the program and format files on your system. To get sitex.fmt
and silatex.fmt, look at SISISIDIR/Makefile to see how these files are
generated. The `install' script performs this step automatically, so if
you used `install' to create SiTeX, you will already have the SISISI
format files. You will find all these files in your TEXDIR, along with a
file named sitex.pool and a file named hf3.

First, you should copy hf3 to your TEXINPUTS directory (more precisely:
into a directory contained in the TEXINPUTS path), and sitex.pool to the
TEXPOOL directory. Then copy sitex.fmt and silatex.fmt into TEXFORMATS,
and create a link to sitex.fmt named siplain.fmt in the same directory.

Now you have to decide whether you want to retain TeX with the original
hyphenation algorithm (virtex, tex and latex). If not, simply remove the
old virtex and initex and all the links to them (such as tex, latex,
slitex, glatex etc.) and copy the new initex and virtex into a directory
holding your executables. Then create links named sitex and silatex to
the new virtex.

If you want to keep the original TeX, you have to copy initex and virtex
into your standard search path under different names, preferably
siinitex and sivirtex, and create the sitex and silatex links as links
to sivirtex.

Note, however, that you can't use silatex for non-German documents, as
the hyphenation algorithm is designed for the German language only. So
if you want to work with both German and English documents, you have to
retain the original TeX.


5) Troubleshooting.
-------------------

I have tested the installation process on a 486 PC running Interactive
SVR3.2, and on an HP 9000/705 running HP-UX 8.07. silatex gives the same
results on both machines.
I had no opportunity, however, to build and test silatex on other
systems, in particular not on BSD systems or AIX. SiTeX must therefore
be regarded as still being in beta, and any bug reports, comments,
criticism or praise are welcome (no, seriously! If you have successfully
installed SiTeX, please let me know too!).

Send e-mail to heinz@eiunix.tuwien.ac.at.