Datasaur
Let's Get Labeling!
Labeling projects fall into three broad categories: token-based, row-based, and document-based. We'll take a look at examples of each below.

Token-based

In token-based labeling, you apply labels to individual tokens or to spans of tokens. Token-based labeling is well-suited for projects such as named entity recognition (NER) and part-of-speech (POS) tagging. Here is what you need to know before labeling your project. (See this YouTube video for a visual guide on token-based labeling.)

Keyboard shortcuts

The label box appears when you click a token. You can click a label with the mouse or press its keyboard shortcut, for example "1", "2", "3", or "4".
Because the keyboard has a limited number of numeral keys, shortcuts are only available for the first 9 labels.
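
The nine-label limit can be pictured with a small sketch. This is only an illustration, not Datasaur's implementation: the function name `shortcut_map` and the label names are hypothetical, and it assumes shortcuts are assigned in the order the labels appear in the label box.

```python
def shortcut_map(labels):
    """Map the digit keys "1" to "9" to the first nine labels, in order."""
    return {str(i): label for i, label in enumerate(labels[:9], start=1)}


# Hypothetical label set with eleven labels; only the first nine get a key.
labels = ["PER", "ORG", "LOC", "GEO", "TIME", "EVT", "ART", "NAT", "MISC", "QTY", "PCT"]
shortcuts = shortcut_map(labels)

for key, label in shortcuts.items():
    print(key, "->", label)
```

Labels beyond the ninth (here `QTY` and `PCT`) have no digit key, so they would be applied by clicking or by searching in the label box.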

Search for labels

You can search for labels in the label box by typing part of the label name. For example, you could type "PRP" followed by "2" to apply the PRP$ label.

Apply overlapping or multiple labels

You can apply overlapping labels, or even multiple labels, to the same token or span.
  • With the mouse: select the token or span, then hold Shift and click the appropriate label.
  • With the keyboard: select the span, use the Up and Down keys to find the right label, then press Shift + Enter.

Edit sentence

You can edit a sentence by double-clicking the row and choosing Edit from the pop-up menu. While you edit, the original sentence is shown for reference. Please note that the sentence is tokenized on the server. To apply your changes, do one of the following:
  • Press Save after editing.
  • Press Shift + Enter if you want to use spaces as the token separator instead of Datasaur's default tokenizer.
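
To see why the choice of tokenizer matters, here is a minimal sketch comparing space-separated tokens with a punctuation-aware split. Datasaur's default tokenizer is not described on this page, so `split_with_punctuation` is a hypothetical stand-in based on a simple regular expression, not the actual server-side behavior.

```python
import re


def split_on_spaces(sentence):
    """Treat every space-separated chunk as one token."""
    return [token for token in sentence.split(" ") if token]


def split_with_punctuation(sentence):
    """Hypothetical stand-in tokenizer: words and punctuation become separate tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


sentence = "Dr. Smith's flight leaves at 9:30."
print(split_on_spaces(sentence))
print(split_with_punctuation(sentence))
```

With space separation, "9:30." stays a single token, which is convenient when you want to label it as one unit. A punctuation-aware tokenizer splits it into smaller pieces, so the same label would have to cover a span of several tokens.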

Font Settings

Datasaur strives to provide a balance of comfort and control to all users. You can adjust the font and size to your liking in the Settings menu.

Insert new lines

You can add new lines by right-clicking on the row then choosing Insert Line Above or Insert Line Below.

Delete lines

You can delete lines by right-clicking on the row then choosing Delete Line.

Delete sentence labels

You can delete all the labels on a given sentence by right-clicking anywhere in the sentence and choosing Delete Sentence Labels.

Draw arrows

Once you have labeled tokens, you can draw arrows between labels.
  • You can even apply labels to the arrows themselves. In order to do so, double-click the arrow and select the appropriate label.
  • You can also reverse arrows, delete arrows, and delete labels by right-clicking on the arrow.

Go Menu

You can move to the desired line via the Go menu.
  • Go to Start will take you to the first line.
  • Go to End will take you to the last line.
  • Go to Line will take you to a specific line.
  • Go to Next Unlabeled Token will take you to the next unlabeled token.
  • Go to Previous Unlabeled Token will take you to the previous unlabeled token.
  • Go to Next Unlabeled Line will take you to the next unlabeled line.
  • Go to Previous Unlabeled Line will take you to the previous unlabeled line.
  • Go to Next File will take you to the next file.
  • Go to Previous File will take you to the previous file.
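
The "next unlabeled" and "previous unlabeled" commands amount to a simple search from the current position. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, each line is represented by its label count, and the search does not wrap around at the end of the file (the page does not say whether Datasaur wraps).

```python
def next_unlabeled_line(label_counts, current):
    """Return the index of the first line after `current` with no labels, or None."""
    for index in range(current + 1, len(label_counts)):
        if label_counts[index] == 0:
            return index
    return None


def previous_unlabeled_line(label_counts, current):
    """Return the index of the closest line before `current` with no labels, or None."""
    for index in range(current - 1, -1, -1):
        if label_counts[index] == 0:
            return index
    return None


# Number of labels on each line of a hypothetical five-line document.
label_counts = [2, 0, 1, 0, 3]
print(next_unlabeled_line(label_counts, 1))
print(previous_unlabeled_line(label_counts, 3))
```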

Delete labels

You can delete a label in two ways:
  • Right-click the label and choose Delete label.
  • Press Delete or Backspace on your keyboard.

Paragraph/Sentence labeling

Paragraph/sentence labeling optimizes the interface for when you are applying labels to longer sentences or entire paragraphs. You will have the option to show the label as an index bar on the left-hand side, and hide the label above the text to avoid clutter.
You can enable this by altering the project settings in token-based projects:
  • Click File on the top left, then click Settings --> General.
  • Check Show index bar for labels.
  • Select Show labels only when token is clicked (optional).

Character-based labeling

Character-based labeling allows you to select and apply labels at the character level, so you don't have to select the entire token. To enable it:
  • Click File on the top left, then click Settings --> Project.
  • Check Allow character-based labeling.
Once it is enabled, you can select characters in two ways:
  • Select the desired characters using your mouse.
  • Select the characters using the keyboard shortcut Shift + Right.
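
One way to think about a character-level label is as a pair of character offsets into the sentence rather than a range of whole tokens. The sketch below is an assumption for illustration only: the `CharSpan` class, its field names, and the `PREFIX` label are hypothetical and do not describe Datasaur's export format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharSpan:
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


def covered_text(sentence, span):
    """Return the characters of `sentence` that `span` covers."""
    return sentence[span.start:span.end]


sentence = "Preheat the oven"
span = CharSpan(start=0, end=3, label="PREFIX")
print(covered_text(sentence, span))
```

Here the label covers only "Pre", the first three characters of the token "Preheat", which a token-level selection could not express.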

Highlight an entire sentence

If you want to label the entire sentence, you can simply click on the line number.

Select multiple lines at once

You can select multiple lines at once by holding Shift and clicking the desired line numbers.

Mark document as complete

Once you have finished labeling, click Mark document as complete, located in the Dashboard extension. This signals to your team that you are done with the document and that it is ready for Review or Export.

Row-based and Document-based

If you choose row-based or document-based labeling as the task type, the goal of labeling is to answer the questions. You can answer the questions in the Document Labeling extension on the right side. (See this YouTube video for visual instructions on how to label row-based projects.)
  • You can navigate to the next question by using your mouse or by pressing Tab on the keyboard.

Go Menu

You can move to the desired row via the Go menu.
  • Go to Start will take you to the first row.
  • Go to End will take you to the last row.
  • Go to Line will take you to a specific row.
  • Go to Next Unlabeled Line will take you to the next unlabeled line.
  • Go to Previous Unlabeled Line will take you to the previous unlabeled line.
  • Go to Next File will take you to the next file.
  • Go to Previous File will take you to the previous file.

Required question

The asterisk (*) next to a question indicates that the question requires an answer. Leaving a required field blank will trigger an error.
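
The required-question rule can be sketched as a simple check over the answers for a row. This is an illustrative assumption rather than Datasaur's validation code: the function name `missing_required`, the dictionary layout, and the treatment of whitespace-only answers as blank are all hypothetical.

```python
def missing_required(questions, answers):
    """Return the names of required questions whose answer is missing or blank."""
    missing = []
    for question in questions:
        value = answers.get(question["name"])
        is_blank = value is None or str(value).strip() == ""
        if question["required"] and is_blank:
            missing.append(question["name"])
    return missing


# Hypothetical question set: two required questions and one optional one.
questions = [
    {"name": "Genre", "required": True},
    {"name": "Summary", "required": True},
    {"name": "Notes", "required": False},
]
answers = {"Genre": "Fiction", "Summary": "   "}
print(missing_required(questions, answers))
```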

Sort and filter column

Columns for Text Field, Text Area, and Date questions can be sorted and filtered.
For Text Field and Text Area columns, you can filter by searching for a keyword.
For Date columns, you can filter by date range.
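
The two kinds of filter behave differently, which the following sketch illustrates. It rests on assumptions not stated on this page: the helper names and sample rows are hypothetical, keyword matching is taken to be case-insensitive, and the date range is taken to include both endpoints.

```python
from datetime import date

# Hypothetical rows with one Text Area column and one Date column.
rows = [
    {"title": "Dune", "review": "A sweeping desert epic", "read_on": date(2021, 5, 2)},
    {"title": "Emma", "review": "Sharp and funny", "read_on": date(2020, 11, 20)},
    {"title": "Solaris", "review": "A quiet, eerie epic", "read_on": date(2021, 1, 9)},
]


def filter_by_keyword(rows, column, keyword):
    """Keep rows whose text column contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [row for row in rows if needle in row[column].lower()]


def filter_by_date_range(rows, column, start, end):
    """Keep rows whose date column falls between start and end, inclusive."""
    return [row for row in rows if start <= row[column] <= end]


epics = filter_by_keyword(rows, "review", "EPIC")
read_in_2021 = filter_by_date_range(rows, "read_on", date(2021, 1, 1), date(2021, 12, 31))
oldest_first = sorted(rows, key=lambda row: row["read_on"])

print([row["title"] for row in epics])
print([row["title"] for row in read_in_2021])
print([row["title"] for row in oldest_first])
```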

Keyboard shortcuts for Dropdown questions

When the answer type is Dropdown, keyboard shortcuts are displayed in the extension. For example, you can press "1" on your keyboard to apply Fiction as the answer.

Filter rows

You can view all rows or only the unlabeled rows by using the View menu. This is helpful if your project has many rows.

Hide and rename the headers

You can hide and rename headers by right-clicking on the header.

Mark document as complete

Once you have finished labeling, click Mark document as complete, located in the Dashboard extension.

Row-based with URL View

You can label multiple images by providing the image URLs in a column of your row-based file.

Prepare a row-based file that contains a URL column

You can store your images in your preferred storage option, as long as the URLs are accessible. You can also add information about each image by putting its attributes in additional columns. This data can't be edited later.
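
As a rough sketch of what such a file could look like, the snippet below writes a small CSV with one URL column and two attribute columns. The column names (`image_url`, `source`, `captured_on`), the example.com URLs, and the helper name are hypothetical placeholders; use whatever column names and storage location fit your project.

```python
import csv
import io

# Hypothetical rows: one URL column plus two attribute columns.
rows = [
    {"image_url": "https://example.com/images/cat-001.jpg", "source": "camera-a", "captured_on": "2021-03-14"},
    {"image_url": "https://example.com/images/dog-002.jpg", "source": "camera-b", "captured_on": "2021-03-15"},
]


def write_image_rows(rows, stream):
    """Write the rows as CSV with a header line to any text stream."""
    writer = csv.DictWriter(stream, fieldnames=["image_url", "source", "captured_on"])
    writer.writeheader()
    writer.writerows(rows)


# Write to an in-memory buffer here; pass an open file to save to disk instead.
buffer = io.StringIO()
write_image_rows(rows, buffer)
print(buffer.getvalue())
```

To save the file, open it with `open("images.csv", "w", newline="", encoding="utf-8")` and pass the file object in place of the buffer. The name of the URL column is what you would later enter as the URL Columns setting.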

Check and set the preview for row-based labeling

You can set how media is previewed on the labeling page. Here are some of the options:
  • Don't expand: does not preview the image from the URL.
  • Thumbnail: previews a smaller version of the image from the URL.
  • Large: previews a larger version of the image from the URL.

Set the Viewer Setting to URL View

Make sure to change the Viewer Setting from Tabular View to URL View. Also, set URL Columns to the name of the column in your row-based file that contains the URLs.

Labeling page of a row-based project with URL View

The labeling page shows the images retrieved from the URLs that you provided in the row-based file. You can now carry out row-based labeling with the help of URL View.

See additional information in the Document Labeling extension

The additional information stored in the other columns of the row-based file can be found in the Document Labeling extension.

Project Settings

If you have already created a project, you can change its configuration by clicking the File menu and choosing Settings.

Token-based Projects

  • The Token Labeling tab allows you to adjust the font type and font size, show the index bar for labels, and change task settings.
  • The Assignment tab allows you to change the assigned labelers. This is only available for projects created in Datasaur Teams.
  • The Administrator tab allows you to change the settings related to your project.

Row-based Projects

  • The Row Labeling tab allows you to change the number of rows displayed per page, configure how media should be expanded, and enable markdown parsing.
  • The Assignment tab allows you to change the assigned labelers. This is only available for projects created in Datasaur Teams.
  • The Administrator tab allows you to change the settings related to your project.

Document-based Projects

Document-based projects only have Assignment and Administrator settings.

Role

Datasaur has 5 team roles. You will find these roles in the team management.

Admin

You will automatically become an admin when you create a team. As an admin, you are allowed to invite team members, create projects, assign team members as labelers or reviewers, and promote team members as admins. You can also access the high-level overview of your team's projects and progress through the Overview page.

Team reviewer

We divide reviewers into team scope and project scope. If the admin assigns you as a team reviewer, you are able to see all projects and review them.

Project reviewer

A project reviewer can be assigned in Step 4 of the project creation wizard. As a project reviewer, you will only see and review the projects that are assigned to you.

Labeler

This is the most common role. Anyone an admin invites to a team is assigned as a labeler. As a labeler, you will only see the projects that are assigned to you.

Labeler + reviewer

If you're a team admin, you can also be a labeler. This role will be selected automatically if you assign yourself in Step 4.