Date of Archiving
2018
Archive
DANS EASY
Publication type
Dataset

Organization
Nederlandse Taal en Cultuur
Humanities Lab (t/m 2018)
CLST - Centre for Language and Speech Technology
Audience(s)
Communication sciences
Languages used
Dutch; English
Key words
newspaper; NRC; POS-tagging; syntactic analysis
Abstract
Newspaper texts taken from printed and digital versions of the NRC newspaper. The texts cover blogs, hard news, background articles, opinion articles on related topics. Metadata per article are available in CMDI XML files.
The 'NRC2011' corpus has been created for the CLARIAH sponsored ACAD project.
See https://www.clariah.nl/projecten/research-pilots/acad/acad and https://cesar.science.ru.nl/.
Cooperators:
Micha Hulsbosch - Radboud University Nijmegen, Faculty of Arts, Humanities Lab, TSG
Wilbert Spooren - Radboud University Nijmegen, Faculty of Arts, Dutch language
Erwin R. Komen - Radboud University Nijmegen, Faculty of Arts, Humanities Lab, TSG
The corpus contains 2225 newspaper texts taken from printed and digital versions of the NRC newspaper (year 2011).
The texts cover blogs, hard news, background articles, opinion articles on related topics. Metadata per article are available in CMDI XML files.
The file textlist-folia.json contains an overview of all available texts in JSON format.
The file NRCLicentieovereenkomst.pdf contains the License Agreement with NRC.
===========================================================
Version: Date: Notes:
1.0 5/jun/2018 [First] archiving of this corpus
===========================================================
This item appears in the following Collection(s)
- Datasets [1269]
- Faculty of Arts [23945]