Dataset: tweets and events linked to the paper 'Open-domain extraction of future events from Twitter'
Communication and Information Sciences
Key words: Natural Language Processing; Twitter; Event detection; Information extraction
Input data and output of the research described in the paper: F. Kunneman and A. Van den Bosch (2016), Open-domain extraction of future events from Twitter, Natural Language Engineering, doi: 10.1017/S1351324916000036. The paper describes a system that extracts future-referring time expressions and entities from Twitter messages, and subsequently detects events as pairs of a date and an entity that are often mentioned in the same tweet. This dataset contains the ids of a large set of Dutch tweets posted in August 2014, which were used as input to the system, together with the time expression and/or entity extracted from each tweet, if any. It also includes the detected events, each represented as a date, one or more describing terms, the ids of the tweets that refer to it, and the assessment of the event by human annotators.
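The pairing step in the description above can be illustrated with a minimal sketch. It is not the system from the paper: it only counts how many distinct tweets mention the same date together with the same entity and keeps the pairs that reach a threshold. The record layout (`id`, `date`, `entities`), the function name `detect_events`, the `min_tweets` threshold and the sample tweets are all assumptions made for illustration and do not reflect the actual file format of this dataset.

```python
from collections import defaultdict


def detect_events(tweets, min_tweets=2):
    """Return (date, entity) pairs mentioned together in at least min_tweets tweets.

    Each tweet is a dict with the hypothetical keys "id", "date" (ISO string
    or None when no time expression was extracted) and "entities" (a list).
    """
    pair_to_ids = defaultdict(set)
    for tweet in tweets:
        date = tweet.get("date")
        if date is None:
            # No future-referring time expression: the tweet cannot form a pair.
            continue
        for entity in tweet.get("entities", []):
            pair_to_ids[(date, entity)].add(tweet["id"])

    events = [
        {"date": date, "entity": entity, "tweet_ids": sorted(ids)}
        for (date, entity), ids in pair_to_ids.items()
        if len(ids) >= min_tweets
    ]
    # Most frequently mentioned pairs first; ties broken by date, then entity.
    events.sort(key=lambda e: (-len(e["tweet_ids"]), e["date"], e["entity"]))
    return events


# Made-up sample records, for illustration only.
sample_tweets = [
    {"id": "1", "date": "2014-08-23", "entities": ["Lowlands"]},
    {"id": "2", "date": "2014-08-23", "entities": ["Lowlands", "Biddinghuizen"]},
    {"id": "3", "date": "2014-08-30", "entities": ["Ajax"]},
    {"id": "4", "date": None, "entities": ["Lowlands"]},
]

events = detect_events(sample_tweets)
for event in events:
    print(event["date"], event["entity"], event["tweet_ids"])
```

A raw co-occurrence count is used here only because it is the simplest reading of "often mentioned in the same tweet"; a real system would likely need a scoring function that corrects for entities and dates that are frequent on their own.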