
Automatic Image Annotation Workflow

Steps to fine-tune a pre-trained image recognition model for domain-specific applications

This workflow outlines a process for fine-tuning a pre-trained image recognition model to enhance its ability to recognize specific object categories that are underrepresented or entirely absent in its original training dataset. The primary goal is to create a lightweight machine learning (ML) model capable of annotating images using terms from domain-specific controlled vocabularies. This facilitates more accurate and consistent image annotation in specialized contexts.

The workflow serves the dual purpose of improving the model's performance on domain-specific data and streamlining the image annotation process. By iteratively combining manual annotation, automated annotation using the fine-tuned model, and model re-training, the workflow supports efficient creation of high-quality annotations, even for large and complex datasets.

Key methods implemented include:

  • Data preparation: harmonizing the image data and dividing it into training and testing datasets.
  • Fine-tuning: Re-training a pre-trained ML model on the manually annotated dataset.
  • Automated annotation: Using the fine-tuned model to annotate unlabeled images.
  • Verification and iteration: Refining the model by verifying and correcting its outputs, followed by additional training rounds.

The workflow follows an iterative cycle: starting with manual annotations, training the model, applying it to unlabeled data, validating its outputs, and re-training to achieve incremental improvements. This approach is applicable to any general-purpose image recognition model and enables its domain-specific adaptation, making it an effective tool for tasks like object classification and metadata generation in specialized fields.
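The iterative cycle described above can be sketched in outline. The `train`, `predict` and `verify` functions below are hypothetical stand-ins for the actual model fine-tuning, automated annotation and manual verification steps; any image recognition model with comparable train/predict operations would fit this loop:

```python
# Hypothetical stand-ins for the real components of the cycle:
# (re-)training, automated annotation, and manual verification.
def train(model, labeled):
    """Fine-tune the model on (image, label) pairs; here a counting stub."""
    return {"rounds": model["rounds"] + 1, "seen": model["seen"] + len(labeled)}

def predict(model, images):
    """Auto-annotate unlabeled images; here a placeholder vocabulary term."""
    return [(img, "proposed-term") for img in images]

def verify(predictions):
    """Manual verification: accept or correct each proposed label (stub accepts all)."""
    return [(img, label) for img, label in predictions]

# The cycle: manual annotations -> train -> predict -> verify -> re-train.
model = {"rounds": 0, "seen": 0}
labeled = [("img_001.jpg", "vessel"), ("img_002.jpg", "trench")]
unlabeled = [f"img_{i:03d}.jpg" for i in range(3, 8)]

for _ in range(2):                      # two improvement rounds
    model = train(model, labeled)       # (re-)train on verified annotations
    verified = verify(predict(model, unlabeled))
    labeled.extend(verified)            # verified outputs grow the training set
    unlabeled = []                      # assume all images were processed this round

print(model["rounds"], len(labeled))    # → 2 7
```

The key point the sketch captures is that verified model outputs flow back into the training set, so each round trains on more data than the last.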

Requirements

  • The resulting model is lightweight and versatile, i.e. it can be used across different platforms, applications and workflows.
  • The annotated dataset can be used to further fine-tune the model as the number of annotated images grows. The ground-truth dataset is in a widely used format and structured according to good practice in the field.
  • Image metadata is enriched with terms from domain-specific controlled vocabularies, enhancing their general FAIRness.
    • It is possible to find images according to their contents described using controlled vocabulary terms.
    • Interoperability is increased by using shared controlled vocabularies and formal mappings between various vocabularies.
    • Re-usability of the data is enhanced because the contents of images are formally annotated.
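As one possible concrete shape for such a ground-truth dataset, a COCO-style JSON record is a widely used option. The `vocab_uri` field below is an illustrative extension, not part of the COCO specification, showing how a category could be linked to a controlled-vocabulary concept (the URI is a placeholder):

```python
import json

# Minimal COCO-style ground-truth record: images, categories, annotations.
# "vocab_uri" is an illustrative, non-standard field linking a category
# to a controlled-vocabulary concept; the URI is a placeholder.
ground_truth = {
    "images": [{"id": 1, "file_name": "find_0001.jpg"}],
    "categories": [
        {"id": 10, "name": "vessel", "vocab_uri": "https://example.org/vocab/vessel"}
    ],
    "annotations": [{"id": 100, "image_id": 1, "category_id": 10}],
}

# Serialise to JSON so the dataset can be shared and re-used.
serialised = json.dumps(ground_truth, indent=2)
restored = json.loads(serialised)
assert restored["annotations"][0]["category_id"] == 10
```

Keeping the vocabulary link in the category record (rather than in each annotation) means every annotation of that category inherits the controlled-vocabulary term, which supports the findability and interoperability goals above.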

Use-case

In the Automatic Image Recognition sub-task of the ATRIUM project, the goal is to annotate images from archaeological archives with terms from domain-specific controlled vocabularies, e.g. the types of artefacts or objects present in the photographs. This greatly improves the usability of archaeological image archives: image metadata is enriched with specific controlled-vocabulary terms, images with specific contents become easier to find, and the metadata description of photographs is simplified and automated.

Two types of images are used in the process:

  1. photographs of (mostly) single finds (artefacts) photographed on (often) standardized backgrounds with a scale,
  2. archival (legacy) photographs with various contents, mostly photographs of fieldwork and archaeological objects (trenches, burials, etc.).

The intended outcome is two-fold: first, to process vast amounts of archival archaeological images and thus improve their metadata descriptions in the Archaeological Map of the Czech Republic (AMCR) repository and discovery service (parts of the Archaeological Information System of the Czech Republic (AIS CR)) and in the ARIADNE Knowledge Base and Portal; and second, to automate the metadata description of photographs submitted to the AMCR repository by metal detectorists through the AMCR-PAS portal.



Workflow steps (7)

  1. Consider the goal and use-case
  2. Gather and harmonize image data
  3. Align controlled vocabulary terms
  4. Annotate images
  5. Split into train, test and val datasets
  6. Re-train model
  7. Apply the ML model
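Step 5, splitting the annotated data into train, test and validation subsets, can be illustrated with a minimal reproducible sketch. The 70/15/15 ratio and the fixed seed are illustrative defaults, not something the workflow mandates:

```python
import random

def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle reproducibly and split into train/val/test subsets.

    The 70/15/15 split is an illustrative default; the fixed seed makes
    the split repeatable across re-training rounds.
    """
    rng = random.Random(seed)
    shuffled = items[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

images = [f"img_{i:04d}.jpg" for i in range(100)]
train, val, test = split_dataset(images)
print(len(train), len(val), len(test))  # → 70 15 15
```

Using a fixed seed matters in the iterative cycle: when the model is re-trained after new annotations are verified, a reproducible split keeps the held-out test images from leaking into training.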
