Tuesday, August 4, 2009

Colazzo and Costantino (1998)

Luigi Colazzo and Marco Costantino (1998): 'Multi-user hypertextual didactic glossaries' (International Journal of Artificial Intelligence in Education, 9: 111-127).

MOTIVATION: Traditional technical glossaries (printed or electronic, monolingual or bilingual) suffer from a number of limitations from the perspective of the user: (a) 'loss of reading context' - the user must 'move away' from the text he is reading to look a word up in the glossary; (b) it can take a non-native speaker of the language a significant amount of time to identify the correct 'citation form' of the word he wants to look up in the dictionary; (c) glossaries are traditionally 'closed' or 'static' - it is not possible for the user to MODIFY entries or ADD new entries; (d) users who are used to working with traditional printed glossaries often fail to make use of the most useful 'search' features of electronic glossaries (i.e. they lack a sophisticated 'mental model' of the glossary - Marchionini, 1989). There are also many problems for authors of 'hypertext glossaries', where words or phrases in the text are marked up as 'anchors' (i.e. hyperlinks) to glossary entries: (a) such annotation is very time-consuming when done by hand; (b) the glossary text may not be stored in a linkable format (e.g. binary doc files); (c) if annotation is to be done automatically, then the source text needs to be morphologically analysed to identify the underlying lexemes; (d) although overlapped/embedded anchors are occasionally desirable, they are impossible to code.

Four models of hypertext glossary lookup (Black, Wright, Black and Norman, 1992): (a) TABS - the glossary entry for the selected word completely replaces the source text (it is not possible to view both at the same time, so the user has to toggle between the two tabs); (b) POPUPS - the glossary entry for the selected word effaces only a small (hopefully unimportant) part of the source text; (c) SIDEBAR - there is a permanent sidebar for displaying the selected glossary entry; (d) PREDICTIVE SIDEBAR - there is a permanent sidebar containing the glossary entries for all relevant words in the current paragraph of the source text. However, none of these models makes it clear how to handle RECURSIVE lookups, i.e. where the glossary definition itself contains a word which needs to be looked up.

SOLUTIONS: (a) the glossary is indexed by 'stems' rather than particular word-forms; (b) the lookup method relies on a 'word-stemming algorithm' (what about irregular forms?); (c) links are created 'automatically' from the text to the glossary, i.e. no explicit 'anchors'; (d) the popup window containing the glossary definition should be a proper window, able to be moved around, iconised and destroyed independently of the main text window - this also provides a solution to the 'recursive lookup' problem. Also, the user can access the glossary in two distinct ways: (a) RETRIEVAL - send the glossary a text string and get back a definition; (b) BROWSING - either using an alphabetical index organised as a card file, or performing a sequential scan of entries. The word-stemming algorithm is dynamic (i.e. it can be extended by the user) and is based on a list of all the stems in Italian, together with the regular suffixes (but it cannot handle irregular morphology or do any POS disambiguation). The system also allows the glossary to be updated by multiple users (i.e. it is a wiki).
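To make the stem-indexed RETRIEVAL idea concrete, here is a minimal Python sketch. It is not the authors' implementation: the class name StemGlossary, its methods, the tiny suffix list and the two sample entries are all invented for illustration, and the stemming is plain longest-suffix stripping against the known stems, which is only a guess at how such a lookup might work.

```python
class StemGlossary:
    """Toy glossary indexed by stems (hypothetical, for illustration only)."""

    def __init__(self, suffixes):
        # Try longer suffixes first so that "io" is preferred over "o".
        # Empty suffixes are dropped: bare stems are matched directly below.
        self.suffixes = sorted((s for s in suffixes if s), key=len, reverse=True)
        self.entries = {}  # stem -> definition

    def add_entry(self, stem, definition):
        # Any user can add or overwrite an entry; this is the "dynamic" part.
        self.entries[stem.lower()] = definition

    def stem_of(self, word):
        """Return the known stem underlying word, or None if there is none."""
        word = word.lower()
        if word in self.entries:
            return word
        for suffix in self.suffixes:
            if word.endswith(suffix):
                candidate = word[: -len(suffix)]
                if candidate in self.entries:
                    return candidate
        return None

    def lookup(self, word):
        """RETRIEVAL: send a text string, get back a definition (or None)."""
        stem = self.stem_of(word)
        return None if stem is None else self.entries[stem]


# Assumed sample data: a few regular Italian noun endings and two stems.
glossary = StemGlossary(["o", "a", "i", "e", "io"])
glossary.add_entry("glossar", "a list of terms with their definitions")
glossary.add_entry("ipertest", "text containing links to other text")

print(glossary.lookup("glossario"))  # singular form
print(glossary.lookup("glossari"))   # plural form, same entry
print(glossary.lookup("sono"))       # irregular verb form: no entry found
```

Because every inflected form resolves to the same stem, no explicit anchors are needed in the source text, and a word inside a definition can be looked up in exactly the same way (the recursive case). As in the system described, irregular forms simply fail to match.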
