Saturday, August 29, 2009

MSc in Cognitive Computing at Goldsmiths

Goldsmiths' Department of Computing offers an interdisciplinary MSc in Cognitive Computing, aimed at humanities graduates. It claims to offer "a broad exploration of radical new theoretical approaches, characterised by their emphasis on embodiment, enactivism and European phenomenology". Some of the courses:

Cognitive science and its critics: This is the core module of the course and covers the history of cognitive science, from the British empiricists to 'mind as motion', second-order cybernetics and the embodied mind. You will look at: computing machinery and intelligence – the fundamentals of computing, program speedup, the limitations of computing, what a computer is; the philosophy of artificial intelligence – a critical review of key papers in the foundations of artificial intelligence; problems with computationalism – a review of critiques by Dreyfus, Searle, Varela, Brooks, Penrose, Putnam, van Gelder etc.

Human cognition: The focus of this course is on the experimental investigation of cognition. The topics covered will include: expertise, talent, and savants; implicit and explicit memory; and face recognition and naming. The course will draw on behavioural, neuro-imaging, and neuropsychological studies, developmental approaches, and computational modelling.

Topics in neuropsychology: This course covers a range of issues fundamental to understanding the neuropsychology of both normal and abnormal human functioning. Specific topics will include: causes and psychological sequelae of brain injury; dysfunctions of memory, perception, language, and executive processes; neuro-imaging techniques; disorders of motivation, behaviour, and mood; neuropharmacology of cognitive dysfunction.

Technology of thought/Artificial Intelligence: This course provides an introduction to some of the ideas and techniques of artificial intelligence. The course will concentrate upon formal approaches to artificial intelligence, where logic is used as a language for representing and reasoning about problems. The aim of the course is to encourage critical and analytical thinking.

Neural networks: This course introduces the theory and practice of neural computation. It covers the principles of neural computing with artificial neural networks, which are widely used to address real-world problems such as classification, regression, system identification, pattern recognition, data mining, and time-series prediction.

Tuesday, August 4, 2009

Colazzo and Costantino (1998)

Luigi Colazzo and Marco Costantino (1998): 'Multi-user hypertextual didactic glossaries' (International Journal of Artificial Intelligence in Education, 9: 111-127).

MOTIVATION: Traditional technical glossaries (whether printed or electronic, monolingual or bilingual) suffer from a number of limitations from the perspective of the user: (a) 'loss of reading context' - the user must 'move away' from the text he is reading to look a word up in the glossary; (b) it can take a non-native speaker of the language a significant amount of time to identify the correct 'citation form' of the word he wants to look up in the dictionary; (c) glossaries are traditionally 'closed' or 'static' - it is not possible for the user to MODIFY entries or ADD new entries; (d) users who are used to working with traditional printed glossaries often fail to make use of the most useful 'search' features of electronic glossaries (i.e. they lack a sophisticated 'mental model' of the glossary - Marchionini, 1989).

There are also many problems for authors of 'hypertext glossaries' (where words or phrases in the text are marked up as 'anchors' (i.e. hyperlinks) to glossary entries): (a) such annotation is very time-consuming when done by hand; (b) the glossary text may not be stored in a linkable format (e.g. binary doc files); (c) if annotation is to be done automatically, then the source text needs to be morphologically analysed to identify the underlying lexemes; (d) although overlapped/embedded anchors are occasionally desirable, they are impossible to code.

Four models of hypertext glossary lookup (Black, Wright, Black and Norman, 1992): (a) TABS - the glossary entry for the selected word completely replaces the source text (it is not possible to view both at the same time; the user must toggle between the two tabs); (b) POPUPS - the glossary entry for the selected word effaces only a small (hopefully unimportant) part of the source text; (c) SIDEBAR - there is a permanent sidebar for displaying the selected glossary entry; (d) PREDICTIVE SIDEBAR - there is a permanent sidebar containing the glossary entries for all relevant words in the current paragraph of the source text. However, none of these models makes it clear how to handle RECURSIVE lookups, i.e. where the glossary definition itself contains a word which needs to be looked up.
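To make the recursive-lookup problem concrete, here is a minimal Python sketch (the glossary entries and class names are invented, not taken from the paper): each definition lives in its own independently managed window object, so a lookup triggered from within a definition simply opens another window instead of replacing or effacing the first.

```python
# A minimal sketch (invented glossary entries; not from the paper) of why
# recursive lookup calls for independently managed windows: a definition may
# itself contain glossary terms, so one lookup can spawn further lookups.

GLOSSARY = {
    "anchor": "a region of text marked as the source of a hyperlink",
    "hyperlink": "a navigable reference from an anchor to another node",
    "node": "a self-contained unit of hypertext content",
}

class DefinitionWindow:
    """Stands in for a movable, iconisable, closable popup window."""
    def __init__(self, term, definition):
        self.term, self.definition = term, definition

open_windows = []  # windows live independently of the main text window

def look_up(term):
    definition = GLOSSARY.get(term.lower())
    if definition is None:
        return None
    window = DefinitionWindow(term, definition)
    open_windows.append(window)  # a POPUP would instead efface part of the text
    return window

# A recursive lookup chain: 'anchor' mentions 'hyperlink', which mentions
# 'node'; all three definitions can remain open side by side.
for term in ("anchor", "hyperlink", "node"):
    window = look_up(term)
    print(f"[{window.term}] {window.definition}")
```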

SOLUTIONS: (a) the glossary is indexed by 'stems' rather than particular word-forms; (b) the lookup method relies on a 'word-stemming algorithm' (what about irregular forms?); (c) links are created 'automatically' from the text to the glossary, i.e. there are no explicit 'anchors'; (d) the popup window containing the glossary definition should be a proper window, able to be moved around, iconised and destroyed independently of the main text window - this also provides a solution to the 'recursive lookup' problem. Also, the user can access the glossary in two distinct ways: (a) RETRIEVAL - send the glossary a text string and get back a definition; (b) BROWSING - either using an alphabetical index organised as a card file, or performing a sequential scan of entries. There is a dynamic word-stemming algorithm (i.e. it can be extended by the user), based on a list of all the stems in Italian, as well as the regular suffixes (but it cannot handle irregular morphology or do any POS disambiguation). The system also allows the glossary to be updated by multiple users (i.e. it is a wiki).
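A toy Python sketch of the stem-indexed lookup idea (the suffix list and glossary entries are invented simplifications of the paper's full Italian stem list): strip a regular suffix, then retrieve the entry by its stem. As the authors concede, irregular forms slip through, and no POS disambiguation is attempted.

```python
# Toy stem-indexed glossary lookup: strip a regular suffix, look up the stem.
# The suffix list and entries are invented; the real system used a full list
# of Italian stems plus the regular suffixes.

ITALIAN_SUFFIXES = ["amento", "zione", "mente", "are", "ere", "ire",
                    "ato", "uto", "ito", "o", "a", "i", "e"]

STEM_GLOSSARY = {  # indexed by stems, not by particular word-forms
    "calcol": "calculation; computing (shared by calcolo, calcolare, ...)",
    "programm": "program (shared by programma, programmi, programmare, ...)",
}

def stem(word):
    """Return the first glossary stem reachable by stripping a regular suffix."""
    word = word.lower()
    for suffix in sorted(ITALIAN_SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix):
            candidate = word[:-len(suffix)]
            if candidate in STEM_GLOSSARY:
                return candidate
    return None  # irregular forms (and unknown words) fail here

def retrieve(word_form):
    """RETRIEVAL access: send a text string, get back a definition (or None)."""
    s = stem(word_form)
    return STEM_GLOSSARY[s] if s is not None else None

print(retrieve("calcolare"))  # verb infinitive -> 'calcol' entry
print(retrieve("programmi"))  # plural noun     -> 'programm' entry
```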

Monday, August 3, 2009

Nerbonne, Dokter and Smit (1998)

John Nerbonne, Duco Dokter and Petra Smit (1998): 'Morphological processing and computer-assisted language learning' (Computer Assisted Language Learning, 11(5): 543-559).

MOTIVATION: CALL systems do not generally make use of NLP technology - they limit themselves to putting self-study courses into electronic form, and hence use hand-coded/hard-wired linguistic knowledge. However, there are many CALL subtasks which should benefit from state-of-the-art (almost 'error-free') morphological or phonological processing, e.g. as 'support tools' in the analysis of 'authentic materials' and in vocabulary acquisition.

SYSTEM DESCRIPTION: GLOSSER is a program which allows Dutch learners of French to import French texts, select individual words, and get information about these words. The main frame is a read-only text display for the source text. There are three minor frames giving information about the selected word: (a) dictionary definition (for the underlying lexeme) from the Van Dale French-Dutch dictionary; (b) (disambiguated) POS and morphological analysis; and (c) other example sentences using forms of the underlying lexeme (including some from bilingual corpora). Users can add NOTES to each selected word, to avoid multiple look-ups. The morphological analysis/POS disambiguation/lexeme identification is done using the state-of-the-art LOCOLEX software (from the Rank Xerox Research Centre).
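The lookup pipeline might be sketched roughly as follows (all names here are hypothetical stubs: the paper does not expose the LOCOLEX, Van Dale or corpus interfaces, and the toy lexicon and sentences are invented):

```python
# Rough sketch of a GLOSSER-style lookup: word-form -> lexeme + POS ->
# dictionary entry + corpus examples. The analyser, dictionary and corpus
# are invented stubs standing in for LOCOLEX, Van Dale and the bilingual
# corpora, whose real interfaces the paper does not expose.

from dataclasses import dataclass, field

@dataclass
class WordInfo:
    lexeme: str                   # underlying citation form
    pos: str                      # disambiguated part of speech
    definition: str               # French-Dutch dictionary entry
    examples: list = field(default_factory=list)
    notes: list = field(default_factory=list)  # user notes, to avoid re-lookups

def analyse(word_form):
    """Stub for morphological analysis + POS disambiguation (LOCOLEX's job)."""
    toy_lexicon = {"chevaux": ("cheval", "NOUN")}
    return toy_lexicon.get(word_form.lower(), (word_form.lower(), "UNKNOWN"))

def look_up(word_form, dictionary, corpus):
    lexeme, pos = analyse(word_form)
    return WordInfo(
        lexeme=lexeme,
        pos=pos,
        definition=dictionary.get(lexeme, "(not in dictionary)"),
        # Naive substring match; a real system would use the analyser to
        # find every inflected form of the lexeme in the corpus.
        examples=[s for s in corpus if lexeme in s.lower()],
    )

dictionary = {"cheval": "cheval (m.) -> paard"}        # invented entry
corpus = ["Le cheval court.", "Les chevaux courent."]  # invented sentences

info = look_up("chevaux", dictionary, corpus)
info.notes.append("irregular plural of cheval")
print(info)
```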

There is a simple evaluation of the system (from the perspective of lexical coverage and accuracy). There is also a small user study with 22 students, comparing the use of GLOSSER with traditional paper-dictionary lookup in a text comprehension task, with generally positive results.

COMMENT: The authors point out the problem of identifying 'multi-word lexemes' in source text as being worthy of future work. This is a big question. Integration with some kind of personal vocabulary database (flashcard generator?) would be a good idea. Can the idea be extended to spoken-language files or videos?
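As a sketch of that flashcard suggestion (entirely hypothetical: neither GLOSSER nor the paper defines any such export), each lookup could be logged during a session and then dumped as tab-separated front/back card pairs for a spaced-repetition tool:

```python
# Hypothetical flashcard export: log each lookup, then write one card per
# lexeme (front = lexeme, back = definition) as a tab-separated file.

import csv

lookup_log = []  # (word_form, lexeme, definition) triples for this session

def record_lookup(word_form, lexeme, definition):
    lookup_log.append((word_form, lexeme, definition))

def export_flashcards(path):
    """Write one card per looked-up lexeme, skipping duplicates."""
    seen = set()
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        for _, lexeme, definition in lookup_log:
            if lexeme not in seen:
                seen.add(lexeme)
                writer.writerow([lexeme, definition])

record_lookup("chevaux", "cheval", "paard")
record_lookup("cheval", "cheval", "paard")  # duplicate lexeme, exported once
export_flashcards("vocab_cards.tsv")
```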

The GLOSSER homepage.