The perfect solution for detecting sarcasm in tweets #not
Publication year
2013
Publisher
New Brunswick, NJ : ACL
In
Balahur, A.; Goot, E. van der; Montoyo, A. (ed.), Proceedings of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pp. 29-37
Annotation
4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA-2013), 14 June 2013
Publication type
Article in monograph or in proceedings
Editor(s)
Balahur, A.
Goot, E. van der
Montoyo, A.
Organization
Communicatie- en informatiewetenschappen
Former Organization
Bedrijfscommunicatie
Languages used
English (eng)
Book title
Balahur, A.; Goot, E. van der; Montoyo, A. (ed.), Proceedings of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis
Page start
p. 29
Page end
p. 37
Subject
ADNEXT (Adaptive Information Extraction over Time); Language & Speech Technology; Language in Society; Nederlab; Persuasive Communication; Style and Persuasive Power: Language Intensity; The changing dynamics of news (project of: ADNEXT (Adaptive Information Extraction over Time (is project of COMIC))); Stijl en overtuigingskracht: Taalintensiteit
Abstract
To avoid a sarcastic message being understood in its unintended literal meaning, in microtexts such as messages on Twitter.com sarcasm is often explicitly marked with the hashtag ‘#sarcasm’. We collected a training corpus of about 78 thousand Dutch tweets with this hashtag. Assuming that the human labeling is correct (annotation of a sample indicates that about 85% of these tweets are indeed sarcastic), we train a machine learning classifier on the harvested examples, and apply it to a test set of a day’s stream of 3.3 million Dutch tweets. Of the 135 explicitly marked tweets on this day, we detect 101 (75%) when we remove the hashtag. We annotate the top of the ranked list of tweets most likely to be sarcastic that do not have the explicit hashtag. 30% of the top-250 ranked tweets are indeed sarcastic. Analysis shows that sarcasm is often signalled by hyperbole, using intensifiers and exclamations; in contrast, non-hyperbolic sarcastic messages often receive an explicit marker. We hypothesize that explicit markers such as hashtags are the digital extralinguistic equivalent of nonverbal expressions that people employ in live interaction when conveying sarcasm.
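The evaluation step described in the abstract (removing the explicit hashtag, then counting how many marked tweets are still detected) can be illustrated with a minimal sketch. This is not the authors' code: the helper names `strip_marker` and `recall` and the example tweet are hypothetical, and the classifier itself is not reproduced. Only the counts 101 and 135 come from the abstract.

```python
import re

# Hypothetical pattern for the explicit marker named in the abstract.
MARKER = re.compile(r"#sarcasm\b", re.IGNORECASE)


def strip_marker(tweet: str) -> str:
    """Remove the explicit hashtag and collapse leftover whitespace,
    so a classifier cannot rely on the marker itself."""
    return " ".join(MARKER.sub("", tweet).split())


def recall(detected: int, total_marked: int) -> float:
    """Fraction of explicitly marked tweets that were detected."""
    return detected / total_marked


# Invented example tweet, for illustration only.
print(strip_marker("Great, another Monday #sarcasm"))

# Counts reported in the abstract: 101 of 135 marked tweets detected.
print(f"{recall(101, 135):.1%}")
```

With the reported counts the recall comes to roughly 74.8%, which matches the 75% stated in the abstract after rounding.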
This item appears in the following Collection(s)
- Academic publications [244262]
- Electronic publications [131202]
- Faculty of Arts [29768]
- Open Access publications [105228]