Data pipelines & security

Alegion treats every customer's data as proprietary and sensitive. This page describes a layered set of techniques we've developed over the past several years. These safeguards can be combined and tailored to your specific sensitivities, including data export controls, government regulations, and a variety of limitations on human access.

Digital assets and how they're accessed

For our purposes here, "digital assets" refers to image, video, and audio files, as well as text documents used in our Named Entity Recognition tool. For tasks involving these asset types, the Alegion platform itself does not need to read your digital assets. Because the platform is a web application, the labeler's browser must be able to load the asset, but there are several ways to limit that access, as described below.

We support three approaches to asset hosting: fully managed, self-hosted, and self-hosted with VPN. Assets can be further protected by the human controls described later on this page.

The platform's minimal data footprint

Whichever hosting approach you choose, the platform retains the asset's URL (which might be an expiring URL as described below), as well as the annotations. For instance, a simple image annotation record with one bounding box might look like this:

    url: 'https://someurl/someimage.jpg',
    coords: [55,49,104,78],
    classification: 'Basset Hound'


Naturally, this is not the actual format, just an illustration of metadata that the platform stores.
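The same illustration can be written as a small typed record. This is only a sketch: the class name `BoundingBoxAnnotation` and its field types are assumptions made for this example, not the platform's actual schema, and the meaning of the four `coords` values is left as in the example above.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass(frozen=True)
class BoundingBoxAnnotation:
    """Hypothetical record shape; the platform's real schema is not shown here."""
    url: str             # asset URL (possibly an expiring signed URL)
    coords: List[int]    # four numbers describing the box, as in the example above
    classification: str  # label chosen by the labeler


record = BoundingBoxAnnotation(
    url="https://someurl/someimage.jpg",
    coords=[55, 49, 104, 78],
    classification="Basset Hound",
)

# The record holds a pointer to the asset plus the annotation, never the image bytes.
stored_fields = sorted(asdict(record).keys())
```

The point of the sketch is the footprint: a URL and the labeler's judgments, with no copy of the asset itself in the record.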

Finally, the input record for form-based tasks will contain whatever data you need to display to the labeler to perform the task. In these cases, Alegion will work with you to maximize security of sensitive data, but even in these tasks, digital assets are handled by the same techniques explained here.

Method 1: fully managed security

By default, Alegion hosts digital assets in our secure S3 buckets (AWS) in a US availability zone. Signed URLs grant access only within a limited time-to-live (TTL) window, and we configure that duration to fit your requirements. Once the TTL expires, attempts to load the asset fail.

In this option, we manage copies of your files and generate the signed URLs. Assets can be transferred into our S3 buckets through secure cloud-to-cloud pipelines or through "traditional" means such as SFTP. We have preconfigured tools for ingesting data from AWS, Google Cloud Platform, and ownCloud. Other clouds can be supported on request.
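To make the idea of a signed URL with a TTL concrete, here is a minimal sketch of a generic HMAC-based scheme using only the Python standard library. It is not the S3 signing algorithm and not Alegion's implementation; `SECRET_KEY`, `sign_url`, `is_url_valid`, and the `expires` and `signature` query parameters are all hypothetical names chosen for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"  # hypothetical; a real key would be kept in a secrets store


def _signature(path: str, expires: int) -> str:
    """HMAC over the asset path and expiry, so neither can be altered undetected."""
    message = f"{path}:{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(base_url: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Append an expiry timestamp and a signature to base_url."""
    issued = time.time() if now is None else now
    expires = int(issued) + ttl_seconds
    path = urlparse(base_url).path
    query = urlencode({"expires": expires, "signature": _signature(path, expires)})
    return f"{base_url}?{query}"


def is_url_valid(signed_url: str, now: Optional[float] = None) -> bool:
    """Accept the URL only if it has not expired and the signature matches."""
    current = time.time() if now is None else now
    parsed = urlparse(signed_url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, IndexError, ValueError):
        return False
    if current > expires:
        return False  # TTL has elapsed, so the asset must not load
    return hmac.compare_digest(signature, _signature(parsed.path, expires))


# Usage: a URL issued at t=1000 with a 300-second TTL.
url = sign_url("https://someurl/someimage.jpg", ttl_seconds=300, now=1000)
```

Under this sketch the URL is accepted while the TTL window is open and rejected afterward, and changing the path invalidates the signature, which is the behavior the paragraph above describes.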

Fully managed asset hosting option diagram

Method 2: self-hosted

In this option, Alegion never has a copy of your digital assets, and you manage access restrictions. Signed URLs are recommended but not strictly required. In either case, the Alegion platform only knows the asset's URL. You host the asset on the cloud or datacenter of your choice.

Self-hosted option diagram

Method 3: self-hosted + VPN firewall

Assets can also remain entirely within your firewall. Users who are physically outside that firewall must use a VPN client and authenticate with credentials you provide. While the Alegion web interface is served by our infrastructure, the asset itself is only loaded by the labeler's browser over the VPN connection and therefore doesn't leave your network.

Labelers will log in to Alegion's work portal in order to load tasks; to load assets within those tasks, they will then separately log in to a VPN. Digital assets will not load without that user's authorized access to your virtual private network.

VPN-based asset hosting option diagram

The human side of data stewardship

As a provider of software and data to very large enterprises, we implement the expected physical and infosec access controls for our employees. For a complete description of our security policies, please contact sales or your customer success manager.

Tenancy and labeler groups

Alegion configures workflows such that tasks are only available to labelers in designated groups. Grouping is flexible and supports any criteria you may have, including qualification level, citizenship, and other facets described below.

Multistage workflows can require different groups for different stages. This is useful when higher-skilled or specialized users are needed to review or enhance judgments made by lower-skilled labelers, but it can also be used for security reasons as needed. For instance, if you want to use only internal employees in a quality assurance stage, its labeler group would include only your employees, and these QA tasks would be invisible to any other labeler.
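The group-based visibility described above can be sketched as a simple set intersection. Everything here is an assumption for illustration: the `Stage` and `Labeler` classes, the `visible_stages` helper, and the group names are hypothetical and do not reflect how the platform is actually configured.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Stage:
    name: str
    allowed_groups: Set[str]  # only labelers in one of these groups see the stage


@dataclass
class Labeler:
    name: str
    groups: Set[str] = field(default_factory=set)


def visible_stages(labeler: Labeler, workflow: List[Stage]) -> List[str]:
    """Return the names of stages whose allowed groups overlap the labeler's groups."""
    return [stage.name for stage in workflow if labeler.groups & stage.allowed_groups]


# Hypothetical two-stage workflow: a general pool annotates, customer employees do QA.
workflow = [
    Stage("annotate", {"general-pool"}),
    Stage("qa", {"customer-employees"}),
]
contractor = Labeler("contractor", {"general-pool"})
employee = Labeler("employee", {"customer-employees"})
```

In this sketch the contractor sees only the annotation stage, the employee sees only the QA stage, and a labeler with no matching group sees nothing.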

These group-based restrictions allow us to comply with a wide variety of security regimes, including government regulations in highly regulated industries. See the following sections for examples and descriptions.

NDAs and other attestations

Labelers can be required to sign an NDA or make any other attestation you specify before accessing your work.

Geographic limitations

We can restrict access to your tasks based on the labeler's physical location. Example: for a federal contractor that has to comply with the International Traffic in Arms Regulations (ITAR), we limit access to labelers in the continental United States.

Citizenship requirements

Labelers can also be limited to citizens of any specific country (or, for that matter, of specific states or provinces). Example: we support this as another ITAR restriction for government contractors.
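Geographic and citizenship limits can be thought of as two independent checks that a labeler must both pass. The sketch below assumes hypothetical `LabelerProfile` and `AccessPolicy` types and made-up location codes such as "US-CONTINENTAL"; it illustrates the combination logic only and is not a statement of what any regulation requires.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class LabelerProfile:
    location: str                 # hypothetical region code, e.g. "US-CONTINENTAL"
    citizenships: FrozenSet[str]  # hypothetical country codes, e.g. {"US"}


@dataclass(frozen=True)
class AccessPolicy:
    allowed_locations: Optional[FrozenSet[str]] = None      # None = no restriction
    required_citizenships: Optional[FrozenSet[str]] = None  # None = no restriction


def is_eligible(profile: LabelerProfile, policy: AccessPolicy) -> bool:
    """A labeler must satisfy every restriction the policy defines."""
    if policy.allowed_locations is not None and profile.location not in policy.allowed_locations:
        return False
    if policy.required_citizenships is not None and not (
        profile.citizenships & policy.required_citizenships
    ):
        return False
    return True


# Mirrors the two examples in the text: a location limit plus a citizenship limit.
example_policy = AccessPolicy(
    allowed_locations=frozenset({"US-CONTINENTAL"}),
    required_citizenships=frozenset({"US"}),
)
```

Leaving a field as `None` models a task with no restriction of that kind, so the same check covers unrestricted work as well.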

Other protections against data theft

We can meet the needs of customers with the most demanding levels of sensitivity by adding one or more of the following options.

Other data security options

Virtual desktops

To ensure that data, including videos and images, can't be saved locally, labelers can be required to use Amazon WorkSpaces (the AWS cloud-based virtual desktop).

Badged labelers in secure offices

We have a pool of labelers in our own offices and those of our partners. This approach is available if you require labelers to be supervised in a secure facility.

ISO 27001 & 9001

For work that requires ISO compliance, we use providers who are ISO 27001 and ISO 9001 certified. Details are available upon request.

Mobile device surrendering

When labelers are required to work from Alegion or partner offices, we can require that they surrender cell phones and other mobile devices to ensure that even photos of your proprietary data can't be taken.

"Bring your own labeler" model

Finally, you can bring your own employees to the Alegion work portal. For multistage workflows, you can even combine your own labelers with other labor pools as needed.