Wraetlic tools
Wrætlic computational linguistics tools
The Wrætlic computational linguistics tools were used for the
research described in the thesis above. They include:
- Tokenisation and sentence splitting
- PoS tagging
- Morphological analysis
- Named Entity Recognition and Classification
- Chunking and partial parsing
- Word-Sense Disambiguation
- Extract and headline generation
The following people have contributed to the Wraetlic tools:
Version 2.0
Version 2.0 was finally packaged in September 2005. The following
features have been added:
- Correction of several errors reported for version 1.0.
- Correction of several incompatibilities with the Java 1.5 interpreter.
- Named Entity Identification (not yet retrainable).
- Automatic summary generation, with several possibilities:
selection of sentences, generation of headlines.
- Lesk-based Word Sense Disambiguation.
- Some of the modules have been trained for Spanish.
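As an illustration of the Lesk-based Word Sense Disambiguation listed above, here is a minimal sketch of the simplified Lesk idea: choose the sense whose dictionary gloss shares the most words with the context of the ambiguous word. This is not the Wraetlic implementation. The function names, the stopword list, and the toy glosses are assumptions made for the example, and the glosses are not taken from WordNet.

```python
import re

# Hypothetical toy sense inventory; the glosses are illustrative only
# and are not copied from WordNet.
SENSES = {
    "bank": {
        "bank_river": "sloping land beside a river or lake",
        "bank_finance": "an institution that accepts deposits and lends money",
    }
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "in", "on", "at"}


def tokens(text):
    """Lowercase the text, split on non-letters, and drop stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def simplified_lesk(word, context, senses=SENSES):
    """Return the sense id whose gloss shares the most words with the context.

    Ties, including the case of no overlap at all, go to the sense listed first.
    """
    context_words = tokens(context)
    best_sense, best_overlap = None, -1
    for sense_id, gloss in senses[word].items():
        overlap = len(tokens(gloss) & context_words)
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense


if __name__ == "__main__":
    print(simplified_lesk("bank", "She deposits money at the bank"))
    print(simplified_lesk("bank", "They fished from the bank of the river"))
```

A fuller version would typically read glosses from a lexical database such as WordNet and might also compare the glosses of neighbouring words, but the overlap count shown here is the core of the approach.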
The tools include a MySQL version of the Princeton WordNet, (c)
Princeton University, http://wordnet.princeton.edu/.
Note that Wraetlic is no longer maintained and is no longer available for
download. If you are interested in obtaining it, please send me an email.