JIS (Japanese Code Conversion Program)

Written by Ken R. Lunde, University of Wisconsin-Madison
klunde@vms.macc.wisc.edu OR klunde@wiscmacc.bitnet
January 3, 1989

This program converts between Japanese electronic codes. There are two
major types of Japanese codes: 7-bit and 8-bit. Japanese characters are
represented by two bytes; that is, one Japanese character occupies the
same space as two ASCII characters.

The 7-bit codes use escape sequences to signal a shift between one-byte
and two-byte character sequences. There are three major sets of escape
sequences, all of which are handled by this program. These are called
NEW-JIS, OLD-JIS, and NEC CODE.

The 8-bit codes do not require escape sequences; they depend on the
value of the first byte to distinguish one-byte from two-byte character
sequences. These codes are called SHIFT-JIS and Extended UNIX Code (EUC).

JIS Program Operation:
^^^^^^^^^^^^^^^^^^^^^

If you are compiling this program on your own, you will have to follow
the steps below.

1) Compile JIS.PAS file     $ pascal jis.pas
2) Link JIS.OBJ file        $ link jis
3) Run JIS.EXE file         $ run jis

If you are using SMAIL, namely if your account is on WIRCS2, you only
need to type the command "JIS" at the "$$" prompt.

     $$ jis

JIS Program Description:
^^^^^^^^^^^^^^^^^^^^^^^

This program performs several operations on the Japanese character set
as specified in JIS X 0208. It uses menus to select which conversion to
use, and allows the user to specify the infile and outfile names. These
are the operations performed by this program:

1) Convert an 8-bit Japanese file (choice of SHIFT-JIS or Extended UNIX
   Code) to 7-bit format (choice of NEW-JIS, OLD-JIS, or NEC CODE).
2) Convert a 7-bit Japanese file (recognizes NEW-JIS, OLD-JIS, and NEC
   CODE) to 8-bit format (choice of SHIFT-JIS or Extended UNIX Code).
3) Convert an 8-bit Japanese file (choice of SHIFT-JIS or Extended UNIX
   Code) into a different 8-bit Japanese code (choice of SHIFT-JIS or
   Extended UNIX Code).
4) Convert a 7-bit Japanese file (recognizes NEW-JIS, OLD-JIS, and NEC
   CODE) into a different 7-bit Japanese code (choice of NEW-JIS,
   OLD-JIS, or NEC CODE).

This program also removes line feed and form feed characters and
eliminates redundant escape sequences (some Japanese communication
software inserts extra escape sequences).

SHIFT-JIS files, when uploaded, may arrive as one long record (a
one-line file). This program recognizes this and splits the file into
separate lines. The program does not reverse this process when
converting to SHIFT-JIS format; it is not necessary.

JIS Program Menus:
^^^^^^^^^^^^^^^^^

1) First Menu

     $ run jis (or $$ jis)

     Japanese Code Converter version 1.0 (January 3, 1990)

     Convert SHIFT-JIS Japanese file to 7-bit format      -> "1"
     Convert SHIFT-JIS Japanese file to EUC format        -> "2"
     Convert EUC Japanese file to 7-bit format            -> "3"
     Convert EUC Japanese file to SHIFT-JIS format        -> "4"
     Convert 7-bit Japanese file to SHIFT-JIS format      -> "5"
     Convert 7-bit Japanese file to EUC format            -> "6"
     Convert 7-bit Japanese file to another 7-bit format  -> "7"

     MAKE YOUR SELECTION ->

This menu expects an input value from 1 to 7. If the value is not in
this range, an error message is displayed, and the menu reappears.

2) Second Menu

     Infile name ->
     Outfile name ->

This menu allows the user to specify the infile and outfile names. If
the user wishes to retain the same name for the outfile, he/she simply
needs to input a carriage return at the "Outfile name -> " prompt.

3) Third Menu

     * * * * * * * * Please Select 7-bit Japanese Code * * * * * * * *

     NEW-JIS  (KI = "$B"/KO = "(J")  -> "1"
     OLD-JIS  (KI = "$@"/KO = "(J")  -> "2"
     NEC CODE (KI = "K"/KO = "H")    -> "3"

     MAKE YOUR SELECTION ->

This menu appears only if the output of the program will be in 7-bit
Japanese code. It allows the user to select the escape sequences that
will be used in the output file. As before, if the selected value is not
in the correct range, an error message is displayed, and the menu
reappears.
Once the code is selected, a message appears indicating which code was
selected. For example, if "1" was selected, the following message will
appear:

     7-bit Japanese code is set at NEW-JIS

JIS Program Applications:
^^^^^^^^^^^^^^^^^^^^^^^^

The IBM world has not been blessed with a Japanese terminal emulator
program. Furthermore, there are only a handful of Japanese word
processors designed for the IBM, and these can only generate 8-bit
Japanese code (SHIFT-JIS). However, electronic mail communication in
Japanese requires the file to be in 7-bit format. The following steps
can be used to send and receive Japanese text using an IBM PC (or
clone), assuming that a Japanese word processing program is being used.

Sending 8-bit Japanese text:

1) Save Japanese text as a textfile
2) Upload file to VMS
3) Convert to 7-bit format using JIS Program
4) Send file using EMAIL

Receiving 7-bit Japanese text:

1) Save 7-bit Japanese message to a textfile (use the EXTRACT command)
2) Convert to 8-bit format using JIS Program
3) Download file to IBM PC
4) View file on Japanese word processing program

Several Japanese terminals recognize only certain 7-bit codes, so the
ability to convert the escape sequences in a Japanese file becomes a
necessary feature. For example, one of my terminals recognizes only NEC
CODE, another recognizes NEW-JIS and OLD-JIS, and still another
recognizes all three.
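
The escape sequence conversion described above can be sketched in Python
as follows. This is only an illustrative sketch and not the JIS.PAS
source: the names ESCAPES and recode_7bit are hypothetical, the byte
values are taken from the escape sequence table in this bulletin, and
accepting either ESC ( B or ESC ( J as kanji-out on input is an
assumption made here for tolerance.

```python
# Kanji-in / kanji-out pairs, using the byte values listed in the
# ESCAPE SEQUENCES table of this bulletin (ESC = 0x1B).
ESCAPES = {
    "NEW-JIS":  (b"\x1b$B", b"\x1b(B"),
    "OLD-JIS":  (b"\x1b$@", b"\x1b(J"),
    "NEC CODE": (b"\x1bK",  b"\x1bH"),
}

# Every sequence recognized on input, regardless of the source code.
KANJI_IN = tuple(ki for ki, _ in ESCAPES.values())
KANJI_OUT = tuple(ko for _, ko in ESCAPES.values())


def _match(data, pos, sequences):
    """Return the length of the sequence found at pos, or 0 if none."""
    for seq in sequences:
        if data.startswith(seq, pos):
            return len(seq)
    return 0


def recode_7bit(data, target):
    """Rewrite all escape sequences in data to those of the target code.

    Redundant sequences (a kanji-in while already in two-byte mode, or a
    kanji-out while in one-byte mode) are dropped.
    """
    ki, ko = ESCAPES[target]
    out = bytearray()
    in_kanji = False
    pos = 0
    while pos < len(data):
        n = _match(data, pos, KANJI_IN)
        if n:
            if not in_kanji:
                out += ki
                in_kanji = True
            pos += n
            continue
        n = _match(data, pos, KANJI_OUT)
        if n:
            if in_kanji:
                out += ko
                in_kanji = False
            pos += n
            continue
        # Ordinary byte: two-byte data is 0x21-0x7E and never contains
        # ESC, so it can be copied through one byte at a time.
        out.append(data[pos])
        pos += 1
    return bytes(out)
```

For example, recode_7bit(text, "NEC CODE") would prepare a NEW-JIS or
OLD-JIS file for a terminal that recognizes only NEC CODE, leaving the
two-byte character data itself unchanged.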

7- and 8-bit Japanese Electronic Code Specifications:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

SHIFT-JIS (8-bit Japanese Electronic Code):

                      DECIMAL            OCTAL              HEXADECIMAL
TYPE 1: first byte    129-159            201-237            81-9F
        second byte   064-158 (not 127)  100-236 (not 177)  40-9E (not 7F)
TYPE 2: first byte    224-239            340-357            E0-EF
        second byte   064-158 (not 127)  100-236 (not 177)  40-9E (not 7F)
TYPE 3: first byte    129-159            201-237            81-9F
        second byte   159-252            237-374            9F-FC
TYPE 4: first byte    224-239            340-357            E0-EF
        second byte   159-252            237-374            9F-FC

Extended UNIX Code (or EUC) (8-bit Japanese Electronic Code):

                      DECIMAL            OCTAL              HEXADECIMAL
        first byte    161-254            241-376            A1-FE
        second byte   161-254            241-376            A1-FE

NEW-JIS, OLD-JIS, and NEC CODE (7-bit Japanese Electronic Codes):

                      DECIMAL            OCTAL              HEXADECIMAL
        first byte    033-126            041-176            21-7E
        second byte   033-126            041-176            21-7E

ESCAPE SEQUENCES (For 7-bit Japanese Electronic Codes):

                      DECIMAL            OCTAL              HEXADECIMAL
NEW-JIS (1983):
        kanji-in      027 + 036 + 066    033 + 044 + 102    1B + 24 + 42
        kanji-out     027 + 040 + 066    033 + 050 + 102    1B + 28 + 42
OLD-JIS (1978):
        kanji-in      027 + 036 + 064    033 + 044 + 100    1B + 24 + 40
        kanji-out     027 + 040 + 074    033 + 050 + 112    1B + 28 + 4A
NEC CODE:
        kanji-in      027 + 075          033 + 113          1B + 4B
        kanji-out     027 + 072          033 + 110          1B + 48

CONCLUSION:
^^^^^^^^^^

I would appreciate any comments on this program. I can be contacted at
the electronic mail address given at the beginning of this information
bulletin.
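
As a supplement to the code specifications given in this bulletin, the
byte ranges can be turned into a small Python sketch. It is not taken
from JIS.PAS; the function names are hypothetical, and the arithmetic in
sjis_to_jis is the commonly used SHIFT-JIS to JIS mapping, which is an
assumption about how such a conversion may be done rather than a
description of this program's internals.

```python
def sjis_type(b1, b2):
    """Return the TYPE number (1-4) from the SHIFT-JIS table, or 0 if
    the byte pair falls outside every listed range."""
    low_first = 0x81 <= b1 <= 0x9F
    high_first = 0xE0 <= b1 <= 0xEF
    low_second = 0x40 <= b2 <= 0x9E and b2 != 0x7F
    high_second = 0x9F <= b2 <= 0xFC
    if low_first and low_second:
        return 1
    if high_first and low_second:
        return 2
    if low_first and high_second:
        return 3
    if high_first and high_second:
        return 4
    return 0


def sjis_to_jis(b1, b2):
    """Map one SHIFT-JIS byte pair to the 7-bit pair (21-7E, 21-7E)."""
    if sjis_type(b1, b2) == 0:
        raise ValueError("not a two-byte SHIFT-JIS character")
    # Second bytes 40-9E (TYPE 1 and 2) select the odd-numbered row of
    # each pair of rows; second bytes 9F-FC (TYPE 3 and 4) the even one.
    adjust = 1 if b2 < 0x9F else 0
    row_offset = 112 if b1 < 160 else 176
    if adjust:
        cell_offset = 32 if b2 > 127 else 31   # skips the unused 7F
    else:
        cell_offset = 126
    return ((b1 - row_offset) << 1) - adjust, b2 - cell_offset


def jis_to_euc(j1, j2):
    """EUC is the 7-bit pair with the high bit set on both bytes,
    which matches the A1-FE ranges in the EUC table."""
    return j1 | 0x80, j2 | 0x80
```

For instance, the SHIFT-JIS pair 93 FA is TYPE 3 and maps to the 7-bit
pair 46 7C, which becomes C6 FC in EUC.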