Non-Latin Scripts
XLE supports all types of scripts. The relevant parts of the XLE documentation are the sections on Emacs and non-ASCII character sets (http://ling.uni-konstanz.de/pages/xle/doc/xle.html#SEC3) and Character Encodings (http://ling.uni-konstanz.de/pages/xle/doc/xle.html#SEC23).
Here is an example of how the Georgian grammar is set up.
In the Configuration file of the grammar, the following line should be added:
CHARACTERENCODING utf-8.
This tells XLE to expect utf-8.
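Once XLE has been told to expect utf-8, it can help to confirm that the grammar files really are saved in that encoding. The following is a minimal sketch in Python, not part of XLE; the function names is_valid_utf8 and first_invalid_offset are hypothetical helpers introduced here for illustration only.

```python
from pathlib import Path


def is_valid_utf8(path):
    """Return True if the file at `path` decodes cleanly as UTF-8."""
    data = Path(path).read_bytes()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def first_invalid_offset(path):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    data = Path(path).read_bytes()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte that could not be decoded.
        return err.start
    return None
```

For example, calling is_valid_utf8("georgian.lfg") before loading the grammar shows whether an editor has silently saved the file in a different encoding, and first_invalid_offset points to where the problem starts.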
If you are using Emacs to write your grammars, you can add the following line to the very top of the main grammar file (e.g., georgian.lfg):
";;; -*- Encoding: utf-8 -*-"
This tells Emacs that the file is in utf-8.
However, Emacs can be unreliable about utf-8, so you might also want to create a .emacs file and put the following settings in it:
(set-language-environment "UTF-8")
(setq process-coding-system-alist
      '(("xle" utf-8 . utf-8)
        ("shell" utf-8 . utf-8)
        ("slime" utf-8 . utf-8)))
(setq default-process-coding-system '(utf-8 . utf-8))
If you already have a .emacs file, simply add these settings to it.
Another option is not to use Emacs at all and to write your test suite and grammar in an editor that handles utf-8 more reliably. In this case, you can access your test sentences via the "parse-testfile" command. For more information on this command, type the following at the XLE prompt:
% help parse-testfile
The form of the command needed here takes two parameters: the name of the test suite file and the number of the sentence you want to parse. For example:
% parse-testfile your-testsuite.lfg 3
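To find out which number to pass to parse-testfile, it can be convenient to list the sentences of a test suite together with their positions. The sketch below is a Python helper, not an XLE command. It assumes that the test suite is saved in utf-8, that sentences are separated by blank lines, and that numbering starts at 1; it does not model comments or any other special lines, so check the numbers against XLE's own behaviour. The function name numbered_sentences is hypothetical.

```python
import re
from pathlib import Path


def numbered_sentences(path):
    """Return (number, sentence) pairs for a blank-line separated test file.

    Assumption: items are separated by one or more blank lines and are
    numbered from 1 in file order. Comment lines are not treated specially.
    """
    text = Path(path).read_text(encoding="utf-8")
    # Split on runs of blank lines, then drop empty chunks.
    blocks = [block.strip() for block in re.split(r"\n\s*\n", text)]
    sentences = [block for block in blocks if block]
    return list(enumerate(sentences, start=1))
```

Printing the result of numbered_sentences("your-testsuite.lfg") gives a quick overview of the file in an editor-independent way, which is useful when the sentences are in a script that the terminal or Emacs does not display correctly.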