Ph.D. Thesis (2003)

An approach for automatic generation of on-line information systems based on the integration of Natural Language Processing and Adaptive Hypermedia techniques

Abstract

The Internet has become established as a widely used means of conveying information. It was soon recognised that different people access the web with different needs, which motivated the appearance of web sites that provide different information, structured in different ways, depending on the user. Nowadays many web-based systems store user profiles containing some characteristics of the users. These profiles are used to decide which information will be shown to each visitor, and how it will be organised.

Moreover, different kinds of applications need to know different characteristics of the users. For instance, e-commerce applications use the shopping history and the user's tastes to suggest further products; on-line educational systems keep track of the concepts that have already been studied and the tests that the student has successfully solved; and on-line information systems and retrieval applications have to know the user's information needs precisely in order to provide the most relevant data. In the same way, the procedures for deciding the contents and structure of the web sites according to the user profiles vary across applications.

Even though there are applications for authoring web sites, constructing them is not yet particularly easy. One limitation of current authoring tools for on-line information systems is that the kinds of information stored in the user profiles, and the rules for adaptation, are usually restricted to a few pre-defined types. Most importantly, these tools usually require the web author to write all the particular chunks of text that will be presented to the different users. The web author may therefore have to write as many versions of the same text as there are possible user profiles that affect the contents of the site.

This work describes a framework that combines techniques from different fields in order to create, in a fully automatic way, on-line information systems from linear texts in electronic format, such as textbooks. It borrows ideas from User Modelling and Adaptive Hypermedia for storing and updating the user profiles, and for changing the contents and the structure of the web site according to them. Natural Language Processing techniques are also applied to gather information automatically about the relevant terms found in the original texts, and to adapt the output contents of the site using automatic filtering and summarisation. The architecture is divided into two steps: an off-line processing step, which collects information about the original linear text, and an on-line step, which runs when a user connects to the system with a web browser and generates the contents and hyperlinks.
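
The two-step architecture can be illustrated with a toy sketch. It is not the Welkin implementation: the names UserProfile, offline_step and online_step, the stopword list, and the use of raw word frequency to pick relevant terms are all assumptions made for illustration. The off-line step indexes the sentences of a linear text by frequent terms; the on-line step filters and truncates the content for one user profile and proposes links to the remaining terms.

```python
import re
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "by", "for", "on"}


@dataclass
class UserProfile:
    """Hypothetical user profile: interests, a length budget, and a history."""
    interests: set
    max_sentences: int = 2
    seen_terms: set = field(default_factory=set)


def tokenize(text):
    """Lowercase a string and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def offline_step(text, top_n=5):
    """Off-line step: split the text into sentences and index them by term.

    Term relevance is approximated by raw frequency, which is only a
    stand-in for real term extraction.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    counts = Counter(w for w in tokenize(text) if w not in STOPWORDS)
    terms = [term for term, _ in counts.most_common(top_n)]
    index = {
        term: [i for i, s in enumerate(sentences) if term in tokenize(s)]
        for term in terms
    }
    return {"sentences": sentences, "index": index}


def online_step(knowledge, profile):
    """On-line step: build the page contents and links for one user."""
    index = knowledge["index"]
    # Filtering: keep only sentences that mention a term the user cares about.
    selected = sorted({i for t in profile.interests for i in index.get(t, [])})
    # Crude stand-in for summarisation: truncate to the user's length budget.
    body = [knowledge["sentences"][i] for i in selected[: profile.max_sentences]]
    # Hyperlinks point to indexed terms outside the user's stated interests.
    links = sorted(t for t in index if t not in profile.interests)
    # Update the profile with the terms that were just presented.
    profile.seen_terms |= profile.interests & set(index)
    return {"body": body, "links": links}


if __name__ == "__main__":
    text = ("Darwin studied finches. Finches live on islands. "
            "Darwin wrote books. Islands are remote.")
    knowledge = offline_step(text, top_n=3)
    print(online_step(knowledge, UserProfile(interests={"finches"})))
```

Keeping the index separate from the per-user generation means the costly analysis of the text runs once, while each visit only performs cheap selection against the stored index.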

The framework has been implemented as the Welkin system, which has been used to build three adaptive on-line information sites quickly and easily. Controlled experiments with real users have also been performed, yielding positive feedback on the implementation of the system.

To download the thesis, click here.