Effects of context and recency in scaled word completion
Source: Computational Linguistics in the Netherlands Journal, 1 (2011), pp. 79-94
Article / Letter to editor
Communication and Information Sciences
The commonly accepted method for fast and efficient word completion is storage and retrieval of character n-grams in tries. We perform learning curve experiments to measure the scaling performance of the trie approach, and present three extensions. First, we extend the trie to store characters of previous words. Second, we extend the trie to the double task of completing the current word and predicting the next word. Third, we augment the trie with a recent word buffer to account for the fact that recently used words have a high chance of recurring. Learning curve experiments on English and Dutch newspaper texts show that (1) storing the characters of previous words yields an increasing and substantial improvement over the baseline with more data, also when compared to a word-based text completion baseline; (2) simultaneously predicting the next word provides an additional small improvement; and (3) the initially large contribution of a recency model diminishes when the trie is trained on more background training data.
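The combination of a character trie with a recency buffer described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class and method names (`CompletionTrie`, `add`, `complete`) and the buffer size are assumptions, and the paper's tries additionally store characters of previous words and predict the next word, which are omitted here.

```python
# Illustrative sketch: frequency-weighted character trie for word
# completion, backed by a recency buffer of recently used words.
from collections import deque


class TrieNode:
    def __init__(self):
        self.children = {}  # character -> TrieNode
        self.count = 0      # how often a word ends at this node


class CompletionTrie:
    def __init__(self, recency_size=100):
        self.root = TrieNode()
        # Recency buffer: the most recently seen words (hypothetical size).
        self.recent = deque(maxlen=recency_size)

    def add(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.count += 1
        self.recent.append(word)

    def complete(self, prefix):
        # Walk down the trie to the node matching the prefix.
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return None
            node = node.children[ch]
        # Recency model: prefer the most recently used word that
        # properly extends the prefix.
        for word in reversed(self.recent):
            if word.startswith(prefix) and len(word) > len(prefix):
                return word
        # Fallback: most frequent completion stored in the trie.
        best, best_count = None, 0
        stack = [(node, prefix)]
        while stack:
            n, s = stack.pop()
            if n.count > best_count:
                best, best_count = s, n.count
            for ch, child in n.children.items():
                stack.append((child, s + ch))
        return best
```

For example, after training on a stream of words, `complete("th")` first consults the recency buffer and only then falls back to trie frequencies, mirroring the finding that the recency model matters most when little background training data is available.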