Date of Archiving
2018
Archive
DANS EASY
Related publications
Publication type
Dataset

Organization
Nederlandse Taal en Cultuur
Humanities Lab (t/m 2018)
CLST - Centre for Language and Speech Technology
Audience(s)
Communication sciences
Languages used
Dutch
Key words
social media; WhatsApp; corpus linguistics; syntactic analysis
Abstract
WhatsApp data collected for the PhD research of Lieke Verheijen (Radboud University). Informed consent was obtained only from the contributor, not from the conversational partner. Consequently, the subcorpus contains only the submitter's contributions. Metadata per conversation are available in CMDI XML files. Ref: Verheijen, L., & Stoop, W. (2016, September). Collecting Facebook posts and WhatsApp chats. In International Conference on Text, Speech, and Dialogue (pp. 249-258). Springer, Cham.
The corpus has been made available for the CLARIAH-sponsored ACAD project.
See https://www.clariah.nl/projecten/research-pilots/acad/acad and https://cesar.science.ru.nl/.
Cooperators:
Micha Hulsbosch - Radboud University Nijmegen, Faculty of Arts, Humanities Lab, TSG
Wilbert Spooren - Radboud University Nijmegen, Faculty of Arts, Dutch language
Erwin R. Komen - Radboud University Nijmegen, Faculty of Arts, Humanities Lab, TSG
Patrick Sonsma - Radboud University Nijmegen, Faculty of Arts, Dutch language
Original researcher:
Lieke Verheijen - Radboud University Nijmegen, Faculty of Arts, Dutch language
The corpus contains 218 WhatsApp chat sessions collected by Lieke Verheijen in the Netherlands in 2012-2014.
The exact date of each chat is included in the <event> tag attributes in the .folia.xml files.
All participants have indicated that their chats may be used (in anonymized form) for research purposes.
Metadata per chat are available in CMDI XML files.
The file textlist-folia.json contains an overview of all available texts in JSON format.
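
The chat dates mentioned above can be read from the <event> elements with any XML parser. The following is a minimal sketch in Python, not part of the corpus distribution. It matches elements by local name, so it does not depend on a particular namespace, and it returns whatever attributes it finds rather than assuming which one holds the date. The sample document, its namespace, and its attribute names are hypothetical stand-ins for a real .folia.xml file.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}event' down to 'event'."""
    return tag.rsplit("}", 1)[-1]


def event_attributes(xml_data):
    """Return the attribute dictionary of every <event> element, in document order."""
    root = ET.fromstring(xml_data)
    return [dict(el.attrib) for el in root.iter() if local_name(el.tag) == "event"]


# Hypothetical stand-in for one .folia.xml file; the namespace and the
# attribute names are assumptions made for illustration only.
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia">
  <text>
    <event class="message" begindatetime="2013-05-01T12:00:00">
      <s><t>hoi</t></s>
    </event>
  </text>
</FoLiA>"""

for attrs in event_attributes(SAMPLE):
    print(attrs)
```

For a file on disk, the same function can be called with the raw bytes of the file, for example event_attributes(pathlib.Path("chat.folia.xml").read_bytes()), where the file name is a placeholder. Inspecting the printed attributes of one real file shows which attribute carries the date.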
===========================================================
Version:  Date:        Notes:
1.0       10/oct/2018  [First] archiving of this corpus
===========================================================
This item appears in the following Collection(s)
- Datasets [1591]
- Faculty of Arts [28912]