Mixed Labeling

Mixed label sets let you combine two labeling types in a single project, for example span-based labeling and document classification.

Datasaur currently supports:

  1. Token-based and document-based types in one interface.

  2. Bounding box-based and document-based types in one interface.

More combinations coming soon!

Span + Document-based project

Let's follow the steps below to create a span and document-based project!

Project creation through Project Creation Wizard (PCW)

As an admin, you can create a mixed label set by selecting multiple labeling types. The steps are as follows:

  • Open team workspace.

  • Create a new project.

  • Upload file(s).

  • In step 3, you can simultaneously select span labeling and document labeling.

  • Continue to the next step and launch your project.

  • Here is the project page view when the mixed label set project is finished.

Interface

Once your mixed label set project has been created, the interface looks like the one below: a label set, and a question set on the right side.

There is no required order for the two types of labeling. However, we recommend that you label the spans first, then classify the document by answering the questions.

If you have multiple documents within a project, submitting the answers takes you directly to the next document.

The sample screenshot above uses the dropdown question type. Please refer to this page for other question types.

Conflict resolution

Conflict resolution works almost the same way as in other projects. Users can see which conflict types are present in a document (labels, sentence editing, or relation), along with a conflict counter for each document.

The difference is that you need to display the Document Labeling extension by scrolling down past the conflict extension and clicking Document Labeling.

The Document Labeling extension then appears, and you can resolve the conflicts the same way you would in row-based or document-based projects.

Project completion

After the review process is done, you can click the Mark as Complete button to indicate that the project is completed.

If there are unresolved conflicts in certain documents, a popup modal shows where the conflicts are. Open the dropdown, then click a document name to go directly to that document.

Please note that marking the project as complete makes it read-only. To make further modifications, you will need to mark the project as in progress again.
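As a rough illustration of what the popup modal reports, the sketch below lists the documents that still have unresolved conflicts. The function name, the dictionary shape, and the document names are assumptions made for this example; they are not part of Datasaur's API.

```python
# Illustrative sketch only: the data shape and names are hypothetical.
def documents_with_unresolved_conflicts(conflict_counts):
    """Return the names of documents whose conflict counter is above zero."""
    return [name for name, count in conflict_counts.items() if count > 0]


# Hypothetical per-document conflict counters for one project.
conflict_counts = {"doc-1.txt": 0, "doc-2.txt": 3, "doc-3.txt": 1}

pending = documents_with_unresolved_conflicts(conflict_counts)
print(pending)
```

An empty result would correspond to a project with no conflicts left to review.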

Access Analytics

Project analytics

To access Project Analytics:

  • Open the Project page.

  • On the project row, press the three dots icon ( ⋮ ) to show options, then select “View Project Details”.

Filter Analytics Based on Project Type

As an Admin, you can filter the project analytics based on project type.

  • For a project with a single labeling type, the dropdown automatically selects that project type.

  • For a project with multiple labeling types, the dropdown selects one of the project types by default, and you can switch the filter to any of the project’s other types.

Interface

  • Project Analytics Interface.

  • Filter by Project Type option.

Download Widget Based on Filter (Project Type and Date)

As an Admin, you can download each widget with the currently applied filters (date and project type).

Behavior Changes on Analytics Widget

These widgets are affected by the project type filter:

  • Number of tasks

    • If filtered by document-based, it will be calculated based on document labels.

    • If filtered by span-based, it will be calculated based on span labels.

  • Total conflict per labeler

    • If filtered by document-based, it will be calculated based on document labels.

    • If filtered by span-based, it will be calculated based on span labels.

  • Conflict resolved by reviewer

    • If filtered by document-based, it will be calculated based on document labels.

    • If filtered by span-based, it will be calculated based on span labels.

The IAA on project analytics page will not be filtered by project type.
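The filter behavior described above can be pictured with a small sketch that counts only the labels of the selected project type. It covers two of the widgets (total conflict per labeler and conflict resolved by reviewer). The `LabelRecord` shape, its field names, and the sample data are hypothetical and do not reflect Datasaur's actual data model.

```python
from collections import Counter
from dataclasses import dataclass


# Hypothetical record shape, for illustration only.
@dataclass(frozen=True)
class LabelRecord:
    labeler: str
    kind: str                   # "span" or "document"
    in_conflict: bool
    resolved_by_reviewer: bool


def conflicts_per_labeler(records, project_type):
    """Count conflicting labels per labeler, using only the selected type."""
    return Counter(
        r.labeler for r in records if r.kind == project_type and r.in_conflict
    )


def conflicts_resolved(records, project_type):
    """Count reviewer-resolved conflicts, using only the selected type."""
    return sum(
        1 for r in records if r.kind == project_type and r.resolved_by_reviewer
    )


records = [
    LabelRecord("alice", "span", True, True),
    LabelRecord("alice", "span", False, False),
    LabelRecord("alice", "document", True, False),
    LabelRecord("bob", "span", True, False),
    LabelRecord("bob", "document", True, True),
    LabelRecord("bob", "span", True, True),
]

span_conflicts = conflicts_per_labeler(records, "span")
document_conflicts = conflicts_per_labeler(records, "document")
print(dict(span_conflicts), dict(document_conflicts))
print(conflicts_resolved(records, "span"), conflicts_resolved(records, "document"))
```

Switching the `project_type` argument plays the role of changing the dropdown filter: the same records produce different widget values.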

Changes in Member Analytics

To access Member Analytics:

  • Open the Members page.

  • On the member row, press the three dots icon ( ⋮ ) to show options.

  • Press “View Member Details”.

For a project with two labeling types, the “labels applied”, “conflicts”, “# of accepted”, and “# of rejected” values are calculated from both project types.
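In other words, each member metric is the sum across the two labeling types. A minimal sketch of that aggregation follows; the metric keys and the numbers are made up for this example.

```python
# Hypothetical per-type counts for one member; keys and values are illustrative.
span_stats = {"labels_applied": 40, "conflicts": 5, "accepted": 30, "rejected": 4}
document_stats = {"labels_applied": 10, "conflicts": 2, "accepted": 7, "rejected": 1}


def combine_member_stats(*per_type_stats):
    """Sum each metric across all project types."""
    totals = {}
    for stats in per_type_stats:
        for metric, value in stats.items():
            totals[metric] = totals.get(metric, 0) + value
    return totals


member_totals = combine_member_stats(span_stats, document_stats)
print(member_totals)
```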

ML Assisted Labeling

  • ML Assisted Labeling is applied automatically to span-based labeling.

Dictionary

  • You can use the Dictionary on a mixed label set project in the same way as in a span-based project.

  • You can use the Search Extension on a mixed label set project in the same way as in a span-based project.

Grammar Checker

  • You can use the Grammar Checker on a mixed label set project in the same way as in a span-based project.

Status bar

The Status Bar provides the information that was previously available in the Dashboard Extension:

  • Total Labels Applied:

    Number of labels applied to the current file.

  • Last Labeled Row:

    The last row that the labeler/reviewer labeled.

  • Total Labeled Row:

    Number of rows in the file that have been labeled by the labeler/reviewer.

  • The status bar only shows analytics for token labels.
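The three Status Bar values can be sketched as simple aggregations over token labels, with document-level labels ignored. The `Label` shape is hypothetical, and the sketch assumes that "Last Labeled Row" means the row of the most recently applied token label, which the text above does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical label shape, for illustration only.
@dataclass(frozen=True)
class Label:
    kind: str            # "token" or "document"
    row: Optional[int]   # row number for token labels, None for document labels


def status_bar(labels_in_order):
    """Compute the Status Bar values from labels listed in the order applied."""
    # Only token labels count; document labels are skipped.
    token_labels = [label for label in labels_in_order if label.kind == "token"]
    return {
        "total_labels_applied": len(token_labels),
        # Assumption: "last" means most recently applied, not the highest row.
        "last_labeled_row": token_labels[-1].row if token_labels else None,
        "total_labeled_rows": len({label.row for label in token_labels}),
    }


labels = [
    Label("token", 1),
    Label("token", 3),
    Label("document", None),
    Label("token", 3),
    Label("token", 2),
]

summary = status_bar(labels)
print(summary)
```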

Bounding Box + Document-based

Project creation through Project Creation Wizard (PCW)

Let's follow the steps below to create a bounding box and document-based project!

As an admin, you can create a mixed label set by selecting multiple labeling types. The steps are as follows:

  • Open team workspace.

  • Create a new project.

  • Upload file(s).

    • When uploading files, you also have the option to upload pre-labeled files.

  • In step 3, you can simultaneously select bounding box labeling and document labeling.

  • Continue to the next step and launch your project.

  • Here is the project page view when the mixed label set project is finished.

Interface

Once your mixed label set project has been created, the interface looks like the one below: a bounding box label set, and a question set on the right side.

There is no required order for the two types of labeling. However, we recommend that you label the bounding boxes first, then classify the document by answering the questions.

If you have multiple documents within a project, submitting the answers takes you directly to the next document.

Extensions

Extensions are the same as in a bounding box labeling project, with the addition of the Document Labeling extension. More information about extensions can be found here.

Conflict Resolutions

As with bounding box labeling projects, conflict resolution is not currently supported. For now, the consensus setting is fixed at one.

Sample Files

  • Datasaur sample - Mixed Label Sets.jpeg (image, 170KB)

  • Datasaur sample - Mixed Label Sets.json (2KB)

  • Datasaur sample - Mixed Label Sets (prelabeled).jpeg (image, 170KB)

  • Datasaur sample - Mixed Label Sets (prelabeled).json (3KB)

Export

Type                                   Format

Bounding Box-based + Document-based    Document-based only

  • Datasaur sample - Mixed Label Sets (export).json (5KB)
