Managing Textual Corpora in Under-resourced Languages
This workflow is meant to be used together with the “Building Corpus” and “Archiving Corpora” workflows, also available in the SSH Open Marketplace; together they cover the full process of managing corpora for under-resourced languages. The “Corpus Management” workflow sets out the steps for managing your corpora while staying compliant with FAIR and Open Science principles. Each step points to best practices, dedicated tools and references to other workflows.
This workflow was designed during the workshop “Creating, Managing and Archiving Textual Corpora in Under-resourced Languages”. The workshop brought together experts from different fields of study who work with corpora to discuss how to address pressing issues in managing corpora in under-resourced languages. The workshop was conceived by the DARIAH Working Groups Research Data Management and Multilingual DH, financed by the DARIAH-EU Funding Scheme for Working Group Activities 2023-25, and hosted by the University of Hamburg from 28 to 30 August 2024.
Workflow steps (7)
1 Preparing material
2 Documenting your Corpora
3 Creating metadata
4 Using standardised vocabularies
5 Updating your Data Management Plan
6 Assigning (open) license
7 Versioning
The SSH Open Marketplace is maintained and will be further developed by three European Research Infrastructures - DARIAH, CLARIN and CESSDA - and their national partners. It was developed as part of the "Social Sciences and Humanities Open Cloud" SSHOC project, European Union's Horizon 2020 project call H2020-INFRAEOSC-04-2018, grant agreement #823782.