The Wrætlic computational linguistics tools were used for the research described in the thesis above. They include:

  1. Tokenisation and sentence splitting
  2. PoS tagging
  3. Morphological analysis
  4. Named Entity Recognition and Classification
  5. Chunking and partial parsing
  6. Word-Sense Disambiguation
  7. Extract and headline generation
The following people have contributed to the Wraetlic tools:

Version 2.0

Version 2.0 was finally packaged in September 2005. New features include:
  • Correction of several errors reported for version 1.0.
  • Correction of several incompatibilities with the Java 1.5 interpreter.
  • Named Entity Identification (not yet retrainable).
  • Automatic summary generation, with several options, including sentence selection and headline generation.
  • Lesk-based Word Sense Disambiguation.
  • Some of the modules have been trained for Spanish.
The tools include a MySQL version of the Princeton WordNet, (c) Princeton University.

Note that Wraetlic is no longer maintained or available for download. If you are interested in obtaining it, please send me an email.