Challenges and experiences in collecting a chat corpus
Source: Journal for Language Technology and Computational Linguistics, 29, 2, (2014), pp. 1-15
Article / Letter to editor
Nederlandse Taal en Cultuur
Journal for Language Technology and Computational Linguistics
Subject: Language in Society; Persuasive Communication
Present-day access to a wealth of electronically available linguistic data creates enormous opportunities for cutting-edge research questions and analyses. Computer-mediated communication (CMC) data are particularly interesting, for example because the multimodal character of new media puts our ideas about discourse issues like coherence to the test. At the same time, CMC data are ephemeral because of rapidly changing technology. That is why we urgently need to collect CMC discourse data before the technology becomes obsolete. This paper describes a number of challenges we encountered when collecting a chat corpus with data from secondary school children in Amsterdam. These challenges are varied in nature: logistical, ethical and technological.