Tool or service

Machine-learning-based interactive tool to identify the content of collective agreements (and other legal texts) in multiple languages

Since 2012, the WageIndicator Foundation has maintained a Collective Agreements Database, in which the texts of 1668 collective agreements (CBAs) from 61 countries and in 29 languages have been uploaded, coded and annotated. Under the SSHOC project and with the support of the CLARIN Research Infrastructure, the agreements have been annotated both manually and automatically on several levels. Within this project, machine learning models are used to identify where in a CBA a specific topic is addressed. A guided notebook applies and compares different models and, for each language and each question, determines which model is best at locating the relevant passage of text. The results feed this interactive tool, which lets users upload new collective agreement texts and find where a specific topic is addressed. As an extra feature, it also provides a worker-friendliness score for the uploaded text.
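
The model comparison described above (one winner per language and per question) can be illustrated with a minimal Python sketch. It is not the notebook's actual code: the model names, question labels and accuracy figures below are made up for illustration, and the function name `best_model_per_task` is hypothetical.

```python
# Hypothetical evaluation results: (language, question, model) -> accuracy.
# All names and numbers are invented for illustration only.
scores = {
    ("en", "sick_leave", "tfidf_logreg"): 0.71,
    ("en", "sick_leave", "sentence_embeddings"): 0.78,
    ("es", "sick_leave", "tfidf_logreg"): 0.69,
    ("es", "sick_leave", "sentence_embeddings"): 0.64,
}


def best_model_per_task(scores):
    """Return {(language, question): (model, score)} for the top-scoring model."""
    best = {}
    for (language, question, model), score in scores.items():
        key = (language, question)
        # Keep the model with the highest score seen so far for this pair.
        if key not in best or score > best[key][1]:
            best[key] = (model, score)
    return best


best = best_model_per_task(scores)
for (language, question), (model, score) in sorted(best.items()):
    print(f"{language} / {question}: {model} ({score:.2f})")
```

Keeping the selection keyed on the (language, question) pair reflects the description above: a model that does well for one language or topic is not assumed to do well for another.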

The SSH Open Marketplace is maintained and will be further developed by three European Research Infrastructures (DARIAH, CLARIN and CESSDA) and their national partners. It was developed as part of the "Social Sciences and Humanities Open Cloud" (SSHOC) project, under the European Union's Horizon 2020 project call H2020-INFRAEOSC-04-2018, grant agreement #823782.