Effects of context and recency in scaled word completion
Source
Computational Linguistics in the Netherlands Journal, vol. 1 (2011), pp. 79-94
Publication type
Article / Letter to editor
Organization
Communicatie- en informatiewetenschappen
Former Organization
Bedrijfscommunicatie
Journal title
Computational Linguistics in the Netherlands Journal
Volume
vol. 1
Page start
p. 79
Page end
p. 94
Subject
Professional Communication
Abstract
The commonly accepted method for fast and efficient word completion is storage and retrieval of character n-grams in tries. We perform learning curve experiments to measure the scaling performance of the trie approach, and present three extensions. First, we extend the trie to store characters of previous words. Second, we extend the trie to the double task of completing the current word and predicting the next word. Third, we augment the trie with a recent word buffer to account for the fact that recently used words have a high chance of recurring. Learning curve experiments on English and Dutch newspaper texts show that (1) storing the characters of previous words yields an increasing and substantial improvement over the baseline with more data, also when compared to a word-based text completion baseline; (2) simultaneously predicting the next word provides an additional small improvement; and (3) the initially large contribution of a recency model diminishes when the trie is trained on more background training data.
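The abstract describes character-trie word completion augmented with a recency buffer. The sketch below is a minimal illustration of that general idea, not the authors' system: it assumes whitespace-tokenized input, a single-word (no previous-word context) trie, a greedy most-frequent descent, and a hypothetical buffer size of 300 recent words.

```python
# Minimal sketch (not the authors' implementation) of trie-based word
# completion with a recency buffer over recently used words.
from collections import deque


class TrieNode:
    __slots__ = ("children", "count", "is_word")

    def __init__(self):
        self.children = {}    # char -> TrieNode
        self.count = 0        # how often this prefix was seen in training
        self.is_word = False  # True if a training word ends here


class CompletionTrie:
    def __init__(self, recency_size=300):
        self.root = TrieNode()
        self.recency = deque(maxlen=recency_size)  # recently used words

    def add_word(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
            node.count += 1
        node.is_word = True
        self.recency.append(word)

    def complete(self, prefix):
        """Return a completion for `prefix`: a recently used word if one
        matches, otherwise the greedily most frequent word in the trie."""
        # Recency model: recently used words have a high chance of recurring.
        for word in reversed(self.recency):
            if word.startswith(prefix) and word != prefix:
                return word
        # Fall back to the trie: follow the prefix, then the most frequent child.
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return None
            node = node.children[ch]
        completion = prefix
        while node.children:
            ch, node = max(node.children.items(), key=lambda kv: kv[1].count)
            completion += ch
            if node.is_word:
                return completion
        return completion if node.is_word else None


if __name__ == "__main__":
    trie = CompletionTrie()
    for w in "the their there the the then".split():
        trie.add_word(w)
    print(trie.complete("th"))  # -> "then": the most recent match wins over the more frequent "the"
```

The usage example shows the recency effect the abstract refers to: although "the" is the most frequent word in the toy training data, the buffer proposes the more recently seen "then" first.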
This item appears in the following Collection(s)
- Academic publications [246164]
- Electronic publications [133744]
- Faculty of Arts [29989]
- Open Access publications [107272]