Testsuites

From Parallel Grammar Wiki

This page lists some interesting testsuite resources. Thanks go to Emily Bender and Dan Flickinger for naming some of the resources.

TSNLP testsuites

These testsuites were put together by the TSNLP project. They are very linguistically principled, but cover only a small range of languages:

http://www.dfki.de/lt/project.php?id=Project_380&l=en

(The testsuites are apparently no longer available for download from the project page. It might be worth contacting the people behind the project for a copy.)

The TSNLP testsuite is available from the ELRA catalogue:

http://islrn.org/resources/717-350-913-018-8/

The Konstanz site has obtained a copy via ELRA. Contact Sebastian Sulger for instructions on how to license the TSNLP package.

MRS testsuite (DELPH-IN)

There is also the MRS testsuite, created by DELPH-IN. This started as a resource for English and has been translated to a few languages. Its focus is on illustrating core semantic phenomena:

http://moin.delph-in.net/MatrixMrsTestSuite

This testsuite is also part of the [incr tsdb()] software package (http://www.delph-in.net/itsdb/) for several languages, but there is a more comprehensive collection online, accessible via the link above.

I have compiled a package of MatrixMRS testsuites for multiple languages here:

http://ling.uni-konstanz.de/pages/home/sulger/files/MatrixMRSTestSuite.tar.gz

The testsuites have varying formats since the source page presented them in differing formats.

Another semantically-oriented testsuite (to augment the MRS testsuite above)

Recent work on documenting the semantic analyses in the English Resource Grammar has led to another semantically-oriented testsuite, to augment the MRS testsuite. This one is monolingual, though.

http://moin.delph-in.net/ErgSemantics

http://svn.emmtee.net/trunk/uio/wesearch/esd.txt

FraCaS test suite

The FraCaS test suite focuses on linguistic phenomena related to logical inference. It is described in a technical report by Cooper et al. from 1996. Here is the reference and a link to the paper:

Robin Cooper, Dick Crouch, Jan Van Eijck, Chris Fox, Josef Van Genabith, Jan Jaspars, Hans Kamp, Manfred Pinkal, David Milward, Massimo Poesio, and Steve Pulman. 1996. Using the Framework. Technical report, FraCaS: A Framework for Computational Semantics. FraCaS deliverable D16.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.45.7694&rep=rep1&type=pdf

An XML version of this data is available for download from Bill MacCartney at this website:

http://www-nlp.stanford.edu/~wcmac/downloads/
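
For anyone who wants to work with the XML version programmatically, here is a minimal Python sketch of how such a file might be read. The element and attribute names (problem, p, q, h, fracas_answer) and the sample sentences are assumptions made for illustration, not taken from the downloaded file, so check them against the actual data before relying on this.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in an ASSUMED layout; the real file may use
# different element or attribute names.
SAMPLE = """<fracas-problems>
  <problem id="001" fracas_answer="yes">
    <p idx="1">A dog barked.</p>
    <q>Did a dog bark?</q>
    <h>A dog barked.</h>
  </problem>
</fracas-problems>"""


def load_problems(xml_text):
    """Return one dict per <problem> element found in xml_text."""
    root = ET.fromstring(xml_text)
    problems = []
    for node in root.iter("problem"):
        problems.append({
            "id": node.get("id"),
            "answer": node.get("fracas_answer"),
            # A problem may have several premises, so collect all <p> children.
            "premises": [(p.text or "").strip() for p in node.findall("p")],
            "question": (node.findtext("q") or "").strip(),
            "hypothesis": (node.findtext("h") or "").strip(),
        })
    return problems


problems = load_problems(SAMPLE)
for problem in problems:
    print(problem["id"], problem["answer"], problem["premises"])
```

To use it on the real data, read the downloaded file into a string and pass that to load_problems instead of SAMPLE, after adjusting the names to match the file.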

The FraCaS data was also used for a bilingual English-Swedish test suite by Peter Ljunglöf and Magdalena Siverbo, described in this paper:

http://gup.ub.gu.se/records/fulltext/168967/168967.pdf

HP NL Testsuite

Hewlett-Packard Natural Language Testsuite, originally by Dan Flickinger, Marilyn Friedman, Mark Gawron, John Nerbonne, Carl Pollard, Geoffrey Pullum, Ivan Sag, and Tom Wasow.

http://www.ual.es/personal/nperdu/hpsuite.htm

I put a copy of that testsuite up here, in case the above link stops working:

http://ling.uni-konstanz.de/pages/home/sulger/files/hp-nl-testsuite.txt

Testsuites inspired by Linguistic Fieldwork

Wayan Arka says the following:

Field linguists typically have a kind of opportunistic data collection techniques in the field but they often use Lingua questionnaires, or make use of the available elicitation materials e.g. created by the MPI (http://fieldmanuals.mpi.nl/), or create their own elicitation materials.

The attached questionnaires created for my NSF-funded Voice project seem to look like test suites that we use in ParGram. As you will see, they have English and Indonesian for each item. We can adapt the questionnaires, if you like.

I put up the testsuites sent by Wayan below.