Training material

Jupyter notebooks for Europeana newspaper text resource processing with CLARIN NLP tools


These notebooks are designed to help you get started with processing historical text resources (from Europeana Newspapers) with natural language processing (NLP) tools (from CLARIN) using Jupyter notebooks.

The easiest way to get started is to click the Launch binder badge above. This will guide you through the process of creating your own Jupyter instance, in which you can interactively discover ways of accessing and querying metadata, analysing text resources within notebooks, and using advanced NLP tools on your own selection of newspaper texts.

Of course, you can use these notebooks in any local or remote environment that you have access to. Two alternatives to the Binder-based solution are:

  1. installing Anaconda, which offers a user-friendly interface and makes it possible to set up a local Jupyter instance with a few clicks, and

  2. using jupyter-repo2docker, which offers an easy way to create a Docker image based on this repository for those with access to an environment where Docker is available.
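As a rough illustration of the second alternative, the following Python sketch assembles a jupyter-repo2docker command line. It assumes jupyter-repo2docker is installed and Docker is running. The helper name repo2docker_command, the image name, and the repository URL are hypothetical placeholders and do not come from this repository; substitute the actual repository address.

```python
import shlex


def repo2docker_command(repo, image_name=None, build_only=False):
    """Build the argument list for a jupyter-repo2docker invocation.

    `repo` is a local path or a Git URL (a placeholder in this sketch).
    """
    cmd = ["jupyter-repo2docker"]
    if image_name:
        # Give the resulting Docker image an explicit name.
        cmd += ["--image-name", image_name]
    if build_only:
        # Build the image without starting a container afterwards.
        cmd.append("--no-run")
    cmd.append(repo)
    return cmd


if __name__ == "__main__":
    # Placeholder URL: replace with the address of this repository.
    command = repo2docker_command(
        "https://example.org/your/repository.git",
        image_name="newspaper-notebooks",
        build_only=True,
    )
    # Print the command for inspection; pass `command` to subprocess.run
    # to actually execute it.
    print(shlex.join(command))
```

Printing the command rather than running it keeps the sketch safe to try; without the build-only option, jupyter-repo2docker would also start a container serving the notebooks once the image is built.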

For more information, see start.ipynb in this repository.

These training materials have been developed by Twan Goosen and Michał Gawor (CLARIN ERIC) in the context of the Europeana DSI-4 project.

Thanks to Dieter Van Uytvanck (CLARIN ERIC), Iulianna van der Lek-Ciudin (CLARIN ERIC), and Alba Irollo (Europeana) for their contributions.

The materials in this repository are released under a CC0 1.0 licence.


The SSH Open Marketplace is maintained and will be further developed by three European Research Infrastructures - DARIAH, CLARIN and CESSDA - and their national partners. It was developed as part of the "Social Sciences and Humanities Open Cloud" SSHOC project, European Union's Horizon 2020 project call H2020-INFRAEOSC-04-2018, grant agreement #823782.