
How to create Japanese language documents under GNU/Linux using LaTeX

Mark Alford. Last updated: 2007-Jan-30.

If you just want to create a shift-JIS-encoded Japanese text file then skip to the section on Japanese input in emacs below.

Chinese, Japanese, and Korean language support is available in LaTeX via the CJK package. These are the steps that I followed to get Japanese. They work for CJK-4.7.0 (2006-10-17 version) under Fedora Core 6, and they worked previously for CJK-4.5.2 (2004-04-29 version) under Fedora Core 1 to 5, and for CJK-4.4.0 (Apr 2001) under RedHat 6.2, 7.x, 8.0.

I did everything as root, because I was adding CJK as a system-wide component, although I put it in a separate tree, /usr/local/share/texmf/, so that it is not mixed up with teTeX and does not get clobbered when teTeX is upgraded. But if the sysadmin has set up the TeX search paths correctly you should be able to install CJK for your personal use by putting everything in $HOME/texmf/ and $HOME/bin/, without being root.
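
For a non-root install, the per-user tree has to exist before anything is copied into it. The following is only a sketch: `make_user_tree` is a hypothetical helper, and the subdirectory names simply mirror the system-wide layout used in the steps below, under the assumption that your TeX setup really does search $HOME/texmf/.

```shell
# Sketch: create the skeleton of a per-user TeX tree for a non-root install.
# make_user_tree is a hypothetical helper, not part of CJK or teTeX.
# With no argument it works under $HOME; an argument selects another root.
make_user_tree() {
    root="${1:-$HOME}"
    # tex/latex will hold the CJK input files, fonts the unpacked kanji
    # fonts, and bin the hbf2gf binary (configure with --prefix="$root").
    mkdir -p "$root/texmf/tex/latex" "$root/texmf/fonts" "$root/bin"
}
```

After running `make_user_tree`, the later steps would use $HOME/texmf/ wherever they say /usr/local/share/texmf/. On many setups `kpsewhich -var-value=TEXMFHOME` shows which per-user tree TeX actually searches; if it prints nothing or a different path, adjust accordingly.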

  1. Add Japanese language capability ("CJK") to LaTeX.

    1. At the UK TeX archive you can find CJK and CJK fonts. The latest version of CJK is also supposed to be available at this URL. The only font file you need for Japanese is the kanji one, which should be called something like "kanji48.tar.gz" and be found at this URL. In these instructions I will assume that CJK has been downloaded to ~/cjk-current/, and the kanji font to ~/CJK_fonts/kanji48.tar.gz.
    2. Copy the CJK tex input files to a place where tex will find them:
      > cp -r ~/cjk-current/texinput /usr/local/share/texmf/tex/latex/CJK
      
    3. Install hbf2gf in /usr/local/bin
      > cd ~/cjk-current/utils/hbf2gf
      > ./configure --prefix=/usr/local/ \
                    --with-kpathsea-include=/usr/include \
                    --with-kpathsea-lib=/usr/lib
      > make
      > make install
      
      The with-kpathsea-include dir is wherever kpathsea/kpathsea.h (note: not kpathsea.h) lives. The with-kpathsea-lib dir is wherever libkpathsea.a lives. NB: the command "locate <filename>" is very useful. Make sure that the hbf2gf command is now accessible ("which hbf2gf").
    4. Install Japanese fonts in /usr/local/share/texmf/fonts/:
      > cd /usr/local/share/
      > cp ~/CJK_fonts/kanji48.tar.gz .
      > tar -zxf kanji48.tar.gz
      > rm kanji48.tar.gz
      
      Since kanji48.tar.gz contains the files in the proper directory structure, it is sufficient to unpack it in a directory that contains the target texmf directory.
    5. Tell teTeX where to find .hbf files, by modifying /usr/share/texmf/web2c/texmf.cnf:
      ----
      MISCFONTS = .;$TEXMF/fonts/misc//;$TEXMF/fonts/hbf//
      ----
      
      For non-root installation it may be possible to set this as an environment variable (see ~/cjk-current/doc/INSTALL and man hbf2gf). I expected to have to add a new variable, HBF2GFINPUTS = $TEXMF/hbf2gf// to texmf.cnf, which tells it where to find .cfg files, but for me it works fine without it.

      Important point: Here you have modified a system config file. If you later upgrade your tetex package or your operating system, this change may be overwritten, and you will have to make it again. Also note that the format of the texmf.cnf file changed dramatically from tetex-2.x to tetex-3.x (Fedora Core 3 to Fedora Core 4). If you just restore your old texmf.cnf file, or stop the upgrade from overwriting it (by modifying its first line), you may break your whole TeX system, producing bewildering errors like "fmtutil: format directory `/web2c' is not writable" and "I can't find the format file `latex.fmt'!" whenever you try to latex anything. Use the new texmf.cnf file, and just modify the MISCFONTS line as described above.

      Debugging: You can obtain debugging output from kpathsea by typing "setenv KPATHSEA_DEBUG -1" (csh/tcsh syntax; in bash use "export KPATHSEA_DEBUG=-1") before running latex or xdvi/dvips. You will want to redirect the output to a file. Look for the search that failed: search the output for "Couldn't find" or "failed", and then backtrack to see if it was looking in the right places. A useful command for checking the variable settings in your texmf.cnf is kpsewhich. For example, to see if MISCFONTS is set right:
      > kpsewhich -progname=hbf2gf -expand-var='$MISCFONTS'
      
    6. Update tex database
      > mktexlsr  # (same as "texhash")
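
The checks mentioned in the steps above (is hbf2gf on the PATH, where does kpathsea/kpathsea.h live, does MISCFONTS include the hbf directory) can be gathered into a few shell functions. This is a sketch, not part of CJK: all three function names are hypothetical, and the MISCFONTS check assumes that the expanded value still contains the literal text "fonts/hbf".

```shell
# Hypothetical helpers gathering the checks from the installation steps.

# Succeeds if the named command is on PATH (e.g. hbf2gf after "make install").
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Prints the first directory among the arguments that contains
# kpathsea/kpathsea.h, i.e. a candidate for --with-kpathsea-include.
find_kpathsea_include() {
    for dir in "$@"; do
        if [ -f "$dir/kpathsea/kpathsea.h" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Succeeds if an expanded MISCFONTS value mentions a fonts/hbf directory.
miscfonts_has_hbf() {
    case "$1" in
        *fonts/hbf*) return 0 ;;
        *)           return 1 ;;
    esac
}
```

Typical use would be `have_tool hbf2gf`, `find_kpathsea_include /usr/include /usr/local/include`, and `miscfonts_has_hbf "$(kpsewhich -progname=hbf2gf -expand-var='$MISCFONTS')"`, each followed by a test of the exit status.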
      

    Now see if it works. Unfortunately, the example files given in recent versions of CJK do not work! So try my example (japanese_template.cjk):

    > latex japanese_template.cjk
    

    You should see output indicating that it found the CJK files:

    (/usr/local/share/texmf/tex/latex/CJK/CJK.sty
    (/usr/local/share/texmf/tex/latex/CJK/mule/MULEenc.sty)
    (/usr/local/share/texmf/tex/latex/CJK/CJK.enc))
    (/usr/share/texmf/tex/latex/base/article.cls
    Document Class: article 2001/04/21 v1.4e Standard LaTeX document class
    (/usr/share/texmf/tex/latex/base/size12.clo))
    (/usr/local/share/texmf/tex/latex/CJK/ruby.sty) (./japanese_template.aux)
    (/usr/local/share/texmf/tex/latex/CJK/standard.bdg)
    (/usr/local/share/texmf/tex/latex/CJK/standard.enc)
    (/usr/local/share/texmf/tex/latex/CJK/standard.chr)
    (/usr/local/share/texmf/tex/latex/CJK/JIS/c40song.fd) [1]
    (./japanese_template.aux) )
    Output written on japanese_template.dvi (1 page, 632 bytes).
    Transcript written on japanese_template.log.
    

    Now look at the results:

    > xdvi japanese_template
    

    You should see Japanese!
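
If you would rather not read the transcript by eye, the same check can be scripted. The function below is a sketch with a hypothetical name. It assumes that the log contains the CJK.sty path unbroken on one line, which may fail if TeX wraps a long path across lines.

```shell
# Hypothetical helper: succeeds if a LaTeX transcript shows that CJK.sty was
# loaded and that a DVI file was written.
log_shows_cjk() {
    grep -q 'CJK\.sty' "$1" && grep -q '^Output written on .*\.dvi' "$1"
}
```

For the example above this would be called as `log_shows_cjk japanese_template.log`. A failure means either that CJK.sty was not found or that no DVI file was produced, so the next step is the kpathsea debugging output described earlier.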

  2. Use emacs (mule) to produce LaTeX files that include Japanese.

    If you just want to create shift-JIS encoded Japanese text files, you can skip steps 2 and 3 below, and ignore all mentions of ".cjk" files. If you want to do Japanese LaTeX then you will keep each file in two forms. There is file.tex, which is encoded however you like (I will use shift-JIS encoding, but emacs offers many others), and which you edit and work on. And there is file.cjk, which is JIS-encoded, and which can be LaTeXed.

    1. Make sure that the emacs-leim (library of emacs input methods) package is installed. It is part of the Fedora Core distribution but not necessarily installed by default.
    2. Install the necessary emacs lisp macros in /usr/local/share/emacs/site-lisp (create this dir if necessary).
      > cd ~/cjk-current/utils/lisp
      > cp cjktilde.el /usr/local/share/emacs/site-lisp/
      > cp emacs-20.3/cjk-enc.el /usr/local/share/emacs/site-lisp/
      
    3. Tell your emacs sessions where to find them. Put the following lines into your ~/.emacs file:
      ----
      (setq load-path (cons "/usr/local/share/emacs/site-lisp" load-path))
      (load-library "cjk-enc")
      (global-set-key "\C-c\C-w" 'cjk-write-file) ; CTRL C CTRL W writes CJK file
      ----
      
    4. You can use this template latex file (annotated version). Do not try to use the templates in the CJK examples directory, since they do not work.
      ----
      %-*- coding: japanese-shift-jis; current-input-method: japanese -*-
      \documentclass[12pt]{article}
      \usepackage[overlap, CJK]{ruby}
      \CJKencfamily{JIS}{song}
      \renewcommand{\rubysep}{-0.3ex}
      \begin{document}
      This is in English.
      これは日本語です。
      \end{document}
      ----
      
      Note that we specify the coding and input method in a TeX comment on the first line. (Alternatively, this can be done by explicit commands to emacs.) Possible codings include emacs-mule, japanese-euc, japanese-shift-jis, etc. When you load the file into emacs, it will automatically detect the coding and display the characters correctly.

      Shift-JIS is also recognized by most web browsers and many email programs, so you can use this method to create Japanese language web pages or send Japanese email. For those applications you may need to delete the "%-*- ..." initial line. The file will still be shift-JIS-encoded, but emacs will no longer pick up the coding and input method from that line, so you will need to set them with explicit emacs commands when you edit it.
    5. To go to Japanese input mode, you just type CTRL-\ twice (ignore the "value is nil" message). After that, typing CTRL-\ will toggle back and forth between English and Japanese input methods.
    6. How to enter Japanese text.

      Once you have set the input method to Japanese, you can type in romaji, and it will come out as hiragana, underlined. The underlined bit is the "conversion region", which is still malleable, i.e. not yet fixed. To fix it, and start a new conversion region, press return. To convert it to katakana, type "K" (uppercase). To toggle among a range of kanji for the conversion region, keep pressing space. (To go back and forth in the kanji list use CTRL-P and CTRL-N.) When you get the one you want, press Return to fix it. To get a Japanese "n" type "n" then some non-vowel like "q" (after that you can type "K" for Katakana) then Return to fix it. (For Hiragana "n", the non-vowel can be "n" itself.)
    7. To save the file in its shift-JIS-encoded form, just do the usual CTRL-X CTRL-S (or CTRL-X CTRL-C to save and exit). If you are creating a Japanese LaTeX file, you will want to save it using CTRL-C CTRL-W (see the binding in your .emacs file above). This invokes 'cjk-write-file, which saves file.tex, and also creates the JIS-encoded version file.cjk which can be LaTeXed.
    8. To latex the file,
      > latex file.cjk   # [NOT file.tex!]
      
      Then you can use xdvi, dvips, etc to view and print it.
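
Because the file you edit (file.tex) and the file you latex (file.cjk) are different, it is easy to latex a stale .cjk after saving with the ordinary save command instead of CTRL-C CTRL-W. The sketch below uses two hypothetical helper functions to guard against that. It assumes the .cjk file sits next to the .tex file and compares modification times only.

```shell
# Hypothetical helpers for the file.tex / file.cjk pair.

# Prints the name of the .cjk file that belongs to a .tex file.
cjk_name() {
    printf '%s\n' "${1%.tex}.cjk"
}

# Succeeds if the .cjk file exists and file.tex has not been modified since
# the .cjk file was written.
cjk_is_fresh() {
    cjk="$(cjk_name "$1")"
    [ -f "$cjk" ] && ! [ "$1" -nt "$cjk" ]
}
```

A typical call would be `cjk_is_fresh file.tex && latex "$(cjk_name file.tex)"`. If the check fails, go back to emacs and write the file again with CTRL-C CTRL-W.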

For more general information on Japanese and computing, see Jim Breen's Japanese page.

Please send comments, corrections, and improvements to alford(at)physics.wustl.edu. Thanks to those who have done so, including Paul Wyatt of Toshiba and Andrew A. Adams of Reading University.

