Automatic Text Recognition Roadmap
Work-in-progress: this workflow is not finalised yet.
Automatic Text Recognition (ATR) uses Artificial Intelligence (AI), in particular machine learning (ML), to extract text from a scanned image. It encompasses two main techniques: Optical Character Recognition (OCR), which extracts text from printed documents, and Handwritten Text Recognition (HTR), which extracts text from manuscripts.
This roadmap presents the main steps of an ATR workflow and explains how to integrate ATR into your research project.
Workflow steps (9)
2 Resources
3 Integrating ATR in your workflow
4 Image Acquisition
5 Image Pre-Processing
6 Layout Analysis
7 Text Recognition and Model Training
8 Quality Assurance and Metrics
9 Endformat and Reusability
The SSH Open Marketplace is maintained and will be further developed by three European Research Infrastructures - DARIAH, CLARIN and CESSDA - and their national partners. It was developed as part of the "Social Sciences and Humanities Open Cloud" SSHOC project, European Union's Horizon 2020 project call H2020-INFRAEOSC-04-2018, grant agreement #823782.