Main Page: Difference between revisions

From Parallel Grammar Wiki
Here are some discussions and information about ParGram topics (in alphabetical order).


=== [[Auxiliaries]] ===

=== [[M-structure]] ===

=== [[CHECK feature]] ===

=== [[Complex Predicates]] ===

=== [[Discourse Functions]] ===

=== [[Modality]] ===

=== [[Negation]] ===

=== [[Tense Aspect Mood]] ===

=== [[Nouns]] ===

=== [[Testsuites]] ===

=== [[Questions]] ===

=== [[Predicatives]] ===

=== [[Non Latin Scripts]] ===


==Links to ParGram Groups==


*INESS treebanking infrastructure: http://clarino.uib.no/iness/page<br />
==Non Latin Scripts==
XLE supports all types of scripts.  The relevant parts of the XLE documentation are the sections on Emacs and non-ASCII character sets (http://ling.uni-konstanz.de/pages/xle/doc/xle.html#SEC3) and Character Encodings (http://ling.uni-konstanz.de/pages/xle/doc/xle.html#SEC23).
Here is an example of how the Georgian grammar is set up.
In the grammar's configuration file, add the following line:
  CHARACTERENCODING utf-8.
This tells XLE to expect UTF-8.
If you are using Emacs to write your grammars, you can add the following line to the very top of the main grammar file (e.g., georgian.lfg):
  ";;; -*- Encoding: utf-8 -*-"
This tells Emacs that the file is in UTF-8.
However, Emacs tends to be tricky about UTF-8, so you might also want to create a .emacs file and put the following settings in it:
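  ;; Use UTF-8 as the default language environment.
  (set-language-environment "UTF-8")
  ;; Use UTF-8 for input from and output to processes named xle, shell, or slime.
  (setq process-coding-system-alist '(("xle" utf-8 . utf-8)
                                      ("shell" utf-8 . utf-8)
                                      ("slime" utf-8 . utf-8)))
  ;; Use UTF-8 for every other subprocess as well.
  (setq default-process-coding-system '(utf-8 . utf-8))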
If you already have a .emacs file, simply add these settings to it.
Another option is to skip Emacs and write your test suite and grammar in a more UTF-8-friendly editor.  In this case, you can access your test sentences via the "parse-testfile" command.  For more information on this command, type the following while in XLE:
  % help parse-testfile
The version we want takes two parameters: the name of a testsuite and the number of the sentence you want to parse. For example:
  % parse-testfile your-testsuite.lfg 3
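
Since XLE is told to expect UTF-8, it can help to confirm that a grammar or testsuite file written in another editor really is saved as UTF-8 before loading it. The following is a minimal sketch in Python and not part of XLE; the helper name find_utf8_errors is hypothetical, and the file name in the usage note is only an illustration.

```python
import sys


def find_utf8_errors(path):
    """Return (line_number, message) pairs for lines that are not valid UTF-8.

    Hypothetical helper for illustration; it only reads the file and
    does not call XLE in any way.
    """
    problems = []
    # Read raw bytes so that decoding failures can be reported per line.
    with open(path, "rb") as handle:
        for line_number, raw_line in enumerate(handle, start=1):
            try:
                raw_line.decode("utf-8")
            except UnicodeDecodeError as error:
                problems.append((line_number, str(error)))
    return problems


if __name__ == "__main__":
    # Check every file named on the command line; do nothing if none is given.
    for file_name in sys.argv[1:]:
        issues = find_utf8_errors(file_name)
        if not issues:
            print(f"{file_name}: valid UTF-8")
        for line_number, message in issues:
            print(f"{file_name}:{line_number}: {message}")
```

Saved under a name of your choice (say, check_utf8.py), the script could be run as "python check_utf8.py your-testsuite.lfg". An empty report means every line decoded cleanly; a reported line number points to the first place to look for text saved in a different encoding.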



Latest revision as of 10:54, 12 January 2023

ParGramWiki

Welcome to ParGramWiki, a wiki for documenting the ParGram project and its process. The wiki is permanently under construction. If you would like to have a user account, please contact Jessica Zipf.

Useful Links

  • ParGram Homepage
  • ParGram Workspace: Here, you can find meeting notes, slides, and other material from past ParGram meetings, as well as a search interface. The site was implemented by Anja Leiderer. If you would like access to this site (it is currently password protected), please contact Jessica Zipf.
  • ParGram Starter Grammars: Some of the knowledge accumulated in the ParGram effort over the years has been included in the XLE Documentation in the form of a Starter Grammar, a Walk Through, and files containing the features and templates that the grammars have developed and use in common. Grammar writers who are just beginning work on a new language will find this repository of information helpful. In particular, common features and conventions developed within ParGram are explained as part of the Starter Grammar.
  • Starter Grammar, Walk Through and some useful tips: http://ling.uni-konstanz.de/pages/xle/doc/
  • Common Features & Common Templates: For files, see bottom of the page.

ParGram Topics

Members of ParGram have been meeting regularly since 1995 and have come together in so-called Feature Committee meetings, in which analyses across grammars are compared and discussed. As far as possible, common analyses and naming conventions are agreed upon. Part of the accumulated body of cross-linguistic grammar engineering knowledge was documented in the Grammar Writer's Cookbook (http://www.stanford.edu/group/cslipublications/cslipublications/site/1575861704.shtml). The material in this Wiki is an effort to share with the wider community further knowledge that we have established together over the years.

Here are some discussions and information about ParGram topics (in alphabetical order).

Auxiliaries

M-structure

CHECK feature

Complex Predicates

Discourse Functions

Modality

Negation

Tense Aspect Mood

Nouns

Testsuites

Questions

Predicatives

Non Latin Scripts

Links to ParGram Groups

Here are some links to the Wikis or sites of individual grammar groups. It might be useful to check out languages that are similar to the one you wish to work on (or are working on). If you would like to obtain a particular ParGram grammar, you should contact the groups directly. For example, the Polish grammar is available under the GNU General Public License (version 3).

XLE

XLE consists of cutting-edge algorithms for parsing and generating Lexical Functional Grammars (LFGs) along with a rich graphical user interface for writing and debugging such grammars. It is the basis for the ParGram project, which is developing industrial-strength grammars for English, French, German, Norwegian, Japanese, and Urdu. XLE is written in C and uses Tcl/Tk for the user interface. It currently runs on Solaris Unix, Linux, and Mac OS X.

More information on XLE including its availability can be found on the XLE homepage at: http://ling.uni-konstanz.de/pages/xle/

XLE-Web Interface

The XLE-Web Interface allows access to several ParGram grammars. One very good way of gaining an understanding of how different phenomena are treated within ParGram is to go to this website and parse example sentences online.

The XLE-Web interface along with several ParGram grammars is here:

ParGramBank

ParGramBank is a collection of parallel treebanks currently involving ten languages from six language families. All treebanks included in ParGramBank are constructed using output from individual ParGram grammars. The grammars produce output that is maximally parallelized across languages and language families. This output forms the basis of a parallel treebank covering a diverse set of phenomena.

The treebank is publicly available via the INESS treebanking environment, which also allows for the alignment of language pairs. ParGramBank is a unique, multilayered parallel treebank that represents more and different types of languages than are available in other treebanks, that represents deep linguistic knowledge and that allows for the alignment of sentences at several levels: dependency structures, constituency structures and POS information.

ParGramBank can be accessed and downloaded for free via the INESS treebanking infrastructure: