Annotation Types & How To Prepare Data

This page explains the four fundamental annotation types implemented in the Alegion platform, and how to prepare your data for each.

Video Annotation

[Screenshot: video annotation task]

Alegion's video annotation tools are optimized for:

  • precise object localization
  • object classification
  • attribute assignment
  • complex relationships among objects (including composition and association)
  • frame-by-frame scene classification
  • instance recognition

ML-enhanced labeling

For certain use cases, we integrate deep learning and/or computer vision techniques to pre-label videos and perform automated entity resolution, drastically reducing labeling effort without sacrificing annotation quality.

Handling large videos

Alegion's infrastructure and task interfaces are further optimized to handle very large videos, whether large in frame dimensions, length (frame count), target density, or any combination of these.

Preparing your data

Video assets are hosted as described in the data pipelines documentation. Input records for video primarily consist of the URL to the asset itself.

Videos must be in the WebM format (see the open standard reference). Our customer success team will be happy to help with conversion as needed.
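
As an illustration, source videos can usually be transcoded to WebM with ffmpeg before upload. The sketch below is a minimal Python example, assuming ffmpeg is installed locally; the encoding settings (VP9 video, Opus audio, CRF 30) are common defaults rather than Alegion requirements, so adjust them to your quality and size targets.

    # Minimal sketch: transcode a source video to WebM (VP9 + Opus) using ffmpeg.
    # Assumes ffmpeg is installed and on the PATH; encoding settings are illustrative.
    import subprocess
    from pathlib import Path

    def to_webm(src: Path, dst: Path) -> None:
        """Re-encode src into a WebM file at dst."""
        subprocess.run(
            [
                "ffmpeg", "-y",
                "-i", str(src),
                "-c:v", "libvpx-vp9",       # VP9 video codec (WebM-compatible)
                "-b:v", "0", "-crf", "30",  # constant-quality mode; lower CRF = higher quality
                "-c:a", "libopus",          # Opus audio codec (WebM-compatible)
                str(dst),
            ],
            check=True,
        )

    to_webm(Path("source.mp4"), Path("source.webm"))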

Image Annotation

[Screenshot: image annotation task]

Alegion's image annotation tools are optimized for:

  • scene classification
  • object localization and classification
  • semantic segmentation
  • instance segmentation
  • complex relationships among objects (including composition and association)

Scenes and instances support any number of additional attributes (e.g. "partially occluded").

ML-enhanced labeling

In addition to the pre-labeling techniques supported in video annotation (above), our image annotation interfaces allow labelers to move faster with greater accuracy using:

  • SmartPoly™, which allows labelers to create high-fidelity polygons and masks around complex shapes in four clicks
  • extreme clicking, which drastically accelerates the drawing of accurate bounding boxes

Preparing your data

Images are hosted as described in the data pipelines documentation. Input records for image annotation tasks primarily consist of the URL to the asset itself.
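
For example, once your images are hosted, the input records can be generated as a simple manifest of asset URLs. The sketch below writes one JSON record per line; the "asset_url" field name and the JSON Lines layout are illustrative assumptions, so match them to whatever your task design and data pipeline actually expect.

    # Minimal sketch: build input records for hosted images.
    # The "asset_url" field name and JSON Lines format are assumptions for illustration;
    # use the field names and format agreed with your data pipeline.
    import json

    image_urls = [
        "https://assets.example.com/images/0001.jpg",
        "https://assets.example.com/images/0002.jpg",
    ]

    with open("image_input_records.jsonl", "w", encoding="utf-8") as out:
        for url in image_urls:
            out.write(json.dumps({"asset_url": url}) + "\n")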

NER/NLP Annotation

[Screenshot: NLP named entity recognition task]

Our purpose-built interface for NLP is ideal for:

  • named entity recognition (NER)
  • temporal tagging
  • keyphrase tagging
  • part-of-speech identification
  • fine-grained sentiment identification

Preparing your data

Documents used in this tool are loaded asynchronously, similar to images and videos. The document is stored separately from the input record, which primarily consists of the URL to the document asset.

To prepare your corpus, simply write each document as a .txt file. (All Unicode characters are supported.) Your corpus is securely hosted as described in the data pipelines documentation.
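
As a minimal sketch, the snippet below writes each document of an in-memory corpus to its own UTF-8 encoded .txt file; the document IDs, file names, and directory layout are illustrative assumptions.

    # Minimal sketch: write each corpus document to its own .txt file.
    # Document IDs and directory layout are illustrative assumptions.
    from pathlib import Path

    corpus = {
        "doc-0001": "Acme Corp. opened a new office in Austin on March 3rd.",
        "doc-0002": "The café's reviews were overwhelmingly positive.",
    }

    out_dir = Path("corpus")
    out_dir.mkdir(exist_ok=True)

    for doc_id, text in corpus.items():
        # UTF-8 covers the full range of Unicode characters mentioned above.
        (out_dir / f"{doc_id}.txt").write_text(text, encoding="utf-8")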

Compound/General Tasks

[Screenshot: compound/general annotation task]

Alegion is also well suited for a wide variety of general-purpose tasks, thanks to a composable task design system that supports:

  • exclusive pick lists (e.g. radio buttons)
  • multi-select pick lists (e.g. checkboxes)
  • taxonomies up to seven levels deep, with hundreds of leaf nodes
  • video and audio players
  • free text entry
  • freeform tags ("pill" interface)

Furthermore, our task designs support linking multiple form controls with conditional logic. This allows nuanced business or domain rules to be encoded into the choices presented to the labeler, in turn speeding up complex tasks by simplifying the user interface.
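
Alegion's task design syntax itself is not reproduced here; purely as a hypothetical illustration of the concept, the Python sketch below shows how the value of one control can narrow the options offered by a dependent control.

    # Hypothetical illustration only -- not Alegion's task design syntax.
    # The value of one control ("vehicle_type") narrows the options of another ("body_style").
    BODY_STYLES = {
        "car":   ["sedan", "hatchback", "coupe"],
        "truck": ["pickup", "box", "flatbed"],
    }

    def body_style_options(vehicle_type: str) -> list[str]:
        """Return only the body styles that make sense for the selected vehicle type."""
        return BODY_STYLES.get(vehicle_type, [])

    print(body_style_options("truck"))  # ['pickup', 'box', 'flatbed']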

The layout of compound tasks is flexible without being chaotic. Controls conform to a grid that is responsive to device width, but you can tailor the task design to your specific needs and users.

Finally, these form elements can be combined with the special-purpose tasks (video, image, NER) to capture information beyond what those interfaces include (e.g. free-form text descriptions of a scene).

Preparing your data

Fields in input records for this type of task are determined by the controls used, and field names may differ across the stages of a multi-stage workflow. Simply match your input record field names and data types to those expected by the task design. (Note that field names must be unique within a workflow.)

Unlike images, videos, and text documents, for which only metadata such as the asset's URL is stored in Alegion, input data for compound/general tasks is carried directly in the input record.
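
For example, a task design with a free-text control and a pick list might expect an input record like the one sketched below. The field names and values here are hypothetical; replace them with the names and data types defined in your actual task design and workflow.

    # Minimal sketch of an input record for a compound/general task.
    # Field names ("product_description", "category_hint") are hypothetical;
    # they must match the controls in your task design and remain unique within the workflow.
    import json

    record = {
        "product_description": "Stainless steel 12-cup drip coffee maker",
        "category_hint": "kitchen_appliances",
    }

    print(json.dumps(record))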

In cases where general task types are added to a special-purpose interface (video, image, NER), the digital assets remain separate.