Using idiolects and sociolects to improve word prediction
Publication year
2014
Publisher
[S.l.] : Association for Computational Linguistics
In
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, pp. 318-327
Annotation
14th Conference of the European Chapter of the Association for Computational Linguistics (EACL-2014), 26 April 2014
Publication type
Article in monograph or in proceedings

Organization
Humanities Lab (through 2018)
Communicatie- en informatiewetenschappen
Languages used
English (eng)
Book title
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics
Page start
p. 318
Page end
p. 327
Subject
ADNEXT (Adaptive Information Extraction over Time); Language & Speech Technology; Language in Society; Nederlab
Abstract
In this paper the word prediction system Soothsayer is described. This system predicts what a user is going to write as he is keying it in. The main innovation of Soothsayer is that it uses as its source of knowledge not only idiolects, the language of one individual person, but also sociolects, the language of the social circle around that person. We use Twitter for data collection and experimentation. The idiolect models are based on individual Twitter feeds; the sociolect models are based on the tweets of a particular person and the tweets of the people he often communicates with. The idea behind this is that people who often communicate start to talk alike; therefore the language of the friends of person x can be helpful in trying to predict what person x is going to say. This approach achieved the best results. For a number of users, more than 50% of the keystrokes could have been saved if they had used Soothsayer.
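The idea in the abstract can be illustrated with a minimal sketch. It is not Soothsayer's actual algorithm, which this record does not describe. The bigram counting, the fallback order (idiolect model first, sociolect model second), the one-keystroke cost for accepting a suggestion, and all function names and example texts are assumptions made for illustration only.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical marker for the start of a message


def train_bigrams(texts):
    """Count, for each word, which words follow it in the given texts."""
    model = defaultdict(Counter)
    for text in texts:
        words = text.lower().split()
        for prev, nxt in zip([START] + words, words):
            model[prev][nxt] += 1
    return model


def predict(models, prev, prefix):
    """Return the most frequent follower of `prev` that starts with `prefix`.

    Models are consulted in order; the first one with any candidate wins.
    Ties are broken alphabetically so the result is deterministic.
    """
    for model in models:
        followers = model.get(prev, {})
        candidates = [w for w in followers if w.startswith(prefix)]
        if candidates:
            return min(candidates, key=lambda w: (-followers[w], w))
    return None


def keystrokes_saved(models, text):
    """Fraction of letter keystrokes saved when typing `text` with suggestions.

    Assumption: accepting a suggestion costs one keystroke; spaces are ignored.
    """
    total = typed = 0
    prev = START
    for word in text.lower().split():
        total += len(word)
        cost = len(word)  # worst case: the whole word is typed by hand
        for k in range(len(word)):
            if predict(models, prev, word[:k]) == word:
                cost = min(k + 1, len(word))  # k letters plus one accept
                break
        typed += cost
        prev = word
    return 1 - typed / total if total else 0.0


# Invented example data: one user's own messages, and those messages
# combined with messages from people the user often talks to.
idiolect = train_bigrams(["see you tomorrow"])
sociolect = train_bigrams(
    ["see you tomorrow", "good morning everyone", "good morning to you"]
)

only_idiolect = keystrokes_saved([idiolect], "good morning")
with_sociolect = keystrokes_saved([idiolect, sociolect], "good morning")
```

In this toy setting the idiolect model alone has never seen "good morning" and saves nothing, while falling back to the sociolect model lets most of the letters be predicted. The numbers have no bearing on the results reported in the paper.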