Non Latin Scripts

From Parallel Grammar Wiki

XLE supports all types of scripts. The relevant parts of the XLE documentation are the sections on Emacs and non-ASCII character sets (http://ling.uni-konstanz.de/pages/xle/doc/xle.html#SEC3) and Character Encodings (http://ling.uni-konstanz.de/pages/xle/doc/xle.html#SEC23).

Here is an example of how the Georgian grammar is set up.

In the Configuration file of the grammar, the following line should be added:

CHARACTERENCODING utf-8.

This tells XLE to expect utf-8.
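Once XLE is told to expect utf-8, it can help to confirm that the grammar files really are saved in that encoding. The following is a minimal sketch in Python, not part of XLE; the function names find_invalid_utf8 and check_file are made up for this illustration, and the script simply tries to decode each file it is given.

```python
import sys
from pathlib import Path


def find_invalid_utf8(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


def check_file(path):
    """Read a file as raw bytes and check it (hypothetical helper)."""
    return find_invalid_utf8(Path(path).read_bytes())


if __name__ == "__main__":
    # Usage (assumed file name): python check_utf8.py georgian.lfg
    for name in sys.argv[1:]:
        offset = check_file(name)
        if offset is None:
            print(f"{name}: valid UTF-8")
        else:
            print(f"{name}: invalid byte at offset {offset}")
```

A file saved in a legacy encoding such as Latin-1 will usually be reported as invalid as soon as it contains a non-ASCII character.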

If you are using Emacs to write your grammars, you can add the following line to the very top of the main grammar file (e.g., georgian.lfg):

   ";;; -*- Encoding: utf-8 -*-"

This tells Emacs that the file is in UTF-8.
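To check that a grammar file carries this header, and that it names the same encoding as the configuration, a small script can inspect the first line. This is only a sketch in Python: the function name declared_encoding and the regular expression are assumptions made for illustration, based on the header format shown above.

```python
import re

# Matches a header such as: ";;; -*- Encoding: utf-8 -*-"
HEADER_RE = re.compile(
    r"-\*-\s*Encoding:\s*([A-Za-z0-9_-]+)\s*-\*-", re.IGNORECASE
)


def declared_encoding(first_line: str):
    """Return the encoding named in the header line, lowercased, or None."""
    match = HEADER_RE.search(first_line)
    return match.group(1).lower() if match else None
```

For example, passing the first line of georgian.lfg to declared_encoding should return "utf-8" if the header is present, and None if it is missing.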

However, Emacs tends to be tricky about UTF-8, so you might also want to create a .emacs file in your home directory and put the following settings in it:

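 ;; Use UTF-8 as the default language environment.
 (set-language-environment "UTF-8")
 ;; Read and write UTF-8 when talking to these subprocesses.
 (setq process-coding-system-alist '(("xle" utf-8 . utf-8)
     ("shell" utf-8 . utf-8)
     ("slime" utf-8 . utf-8)))
 ;; Fall back to UTF-8 for all other subprocesses.
 (setq default-process-coding-system '(utf-8 . utf-8))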

If you already have a .emacs file, simply add these lines to it.

Another option is to avoid Emacs and write your test suite and grammar in a more UTF-8-friendly editor. In this case, you can access your test sentences with the "parse-testfile" command. For more information on this command, type the following in XLE:

 % help parse-testfile

The version we want takes two arguments: the name of the test suite file and the number of the sentence you want to parse. For example:

 % parse-testfile your-testsuite.lfg 3