Non Latin Scripts

From Parallel Grammar Wiki
Revision as of 10:52, 12 January 2023 by Jessica.2.zipf

XLE supports all types of scripts. The relevant parts of the XLE documentation are the sections on Emacs and non-ASCII character sets (http://ling.uni-konstanz.de/pages/xle/doc/xle.html#SEC3) and Character Encodings (http://ling.uni-konstanz.de/pages/xle/doc/xle.html#SEC23).

Here is an example of how the Georgian grammar is set up.

In the Configuration file of the grammar, the following line should be added:

CHARACTERENCODING utf-8.

This tells XLE to expect UTF-8 encoding.
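Because XLE is told to expect UTF-8, it can help to confirm that a grammar or testsuite file really is valid UTF-8 before loading it. The following is a minimal sketch in Python, not part of XLE; the function names and the file name in the usage comment are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def first_invalid_utf8_offset(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte that could not be decoded.
        return err.start
    return None


def check_file(path):
    """Read a file as raw bytes and report where UTF-8 decoding first fails."""
    return first_invalid_utf8_offset(Path(path).read_bytes())


# Hypothetical usage (the file name is an assumption):
#   offset = check_file("georgian.lfg")
#   if offset is not None:
#       print(f"not valid UTF-8 at byte {offset}")
```

A result of None only means that the bytes decode as UTF-8; it does not show that the file was saved in the encoding you intended, since plain ASCII text is also valid UTF-8.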

If you are using Emacs to write your grammars, you can add the following line to the very top of the main grammar file (e.g., georgian.lfg):

   ";;; -*- Encoding: utf-8 -*-"

This tells Emacs that the file is encoded in UTF-8.

However, Emacs can be tricky with UTF-8, so you may also want to create a .emacs file containing the following settings:

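 ;; Prefer UTF-8 throughout Emacs.
 (set-language-environment "UTF-8")
 ;; Use UTF-8 for (decoding . encoding) when talking to these subprocesses.
 (setq process-coding-system-alist '(("xle" utf-8 . utf-8)
                                     ("shell" utf-8 . utf-8)
                                     ("slime" utf-8 . utf-8)))
 ;; Default (decoding . encoding) for all other subprocesses.
 (setq default-process-coding-system '(utf-8 . utf-8))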

If you already have a .emacs file, simply add these lines to it.

Another option is to skip Emacs and write your test suite and grammar in an editor that handles UTF-8 more smoothly. In this case, you can access your test sentences via the "parse-testfile" command. For more information on this command, type the following while in XLE:

 % help parse-testfile

The version we want takes two parameters: the name of a testsuite and the number of the sentence you want to parse. For example:

 % parse-testfile your-testsuite.lfg 3
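To find which number to pass, it can be convenient to list the testsuite items with their positions. The sketch below is a Python helper, not an XLE feature. It assumes that test items are separated by blank lines and counted from 1; both are assumptions about the testsuite layout, so check the output of "help parse-testfile" before relying on the numbers. The function name is hypothetical.

```python
import re


def number_sentences(text):
    """Split text on blank lines and pair each non-empty item with its position.

    Assumption: items are separated by one or more blank lines and are
    counted starting from 1. XLE's own numbering may differ.
    """
    blocks = (block.strip() for block in re.split(r"\n\s*\n", text))
    return list(enumerate((b for b in blocks if b), start=1))


# Hypothetical usage (the file name is taken from the example above):
#   with open("your-testsuite.lfg", encoding="utf-8") as handle:
#       for index, sentence in number_sentences(handle.read()):
#           print(index, sentence)
```

Opening the file with an explicit encoding="utf-8" keeps the listing independent of the system's default encoding.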