Label Sets / Question Sets
Our project templates require you to upload both an input document and a label set.
For token-based labeling (NER, POS, DEP, COR, and OCR templates), a label set is a single-column .csv or .tsv following the structure below:

Column 1 |
---|
Label 1 |
Label 2 |
Label 3 |
etc... |
Sample file: ner-labelset (2).csv (NER label set)
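As a rough illustration of the single-column structure, the sketch below builds a label set with Python's standard csv module. The label names are hypothetical, the helper `build_label_set` is not part of Datasaur, and the sketch assumes one label per row with no header row, which this page does not confirm. Skipping blank and duplicate labels is a choice made for the example only.

```python
import csv
import io


def build_label_set(labels, delimiter=","):
    """Return single-column label-set text, one label per row (assumed: no header)."""
    cleaned = []
    for label in labels:
        label = label.strip()
        # Skip blanks and duplicates while preserving the original order.
        if label and label not in cleaned:
            cleaned.append(label)
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    for label in cleaned:
        writer.writerow([label])
    return buffer.getvalue()


# Hypothetical labels; use delimiter="\t" for a .tsv file.
csv_text = build_label_set(["PERSON", "LOCATION", "ORGANIZATION", "PERSON"])
tsv_text = build_label_set(["PERSON", "LOCATION"], delimiter="\t")
print(csv_text)
```

Writing `csv_text` to a file ending in .csv (or `tsv_text` to a .tsv) would give a file in the shape shown in the table above.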
We provide twelve colors that you can configure manually from the Labels extension. You can also create a label set with your desired label colors in it. A sample file is provided below.
- Note: label,color is the header. This will always be the first row in the .csv.
Sample file: colored-labelset.csv (Colored label set)
label,color
Annabeth Chase,#df3920
Harry Potter,#ff8000
Hermione Granger,#4db34d
John Watson,#3399cc
Percy Jackson,#cc3399
Sherlock Holmes,#9933cc
Note: colored label sets only work for the .csv format.
Datasaur supports HTML color codes. For your reference, below are the default colors provided by Datasaur for better viewing clarity in your project.
- #df3920
- #ff8000
- #ffc826
- #91b34d
- #4db34d
- #33cc99
- #3399cc
- #3370cc
- #3333cc
- #7033cc
- #9933cc
- #cc3399

Label Color Palette
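The colored label set format can be sketched in Python as follows. The helper `build_colored_label_set` is hypothetical, and the check for six-digit hex codes is an assumption made for this example (the page only says HTML color codes are supported). Labels without an explicit color are assigned the default palette colors listed above, in order.

```python
import csv
import io
import re

# Default colors as listed on this page.
DEFAULT_PALETTE = [
    "#df3920", "#ff8000", "#ffc826", "#91b34d", "#4db34d", "#33cc99",
    "#3399cc", "#3370cc", "#3333cc", "#7033cc", "#9933cc", "#cc3399",
]

# Assumption for this sketch: only six-digit hex codes are accepted.
HEX_COLOR = re.compile(r"^#[0-9a-fA-F]{6}$")


def build_colored_label_set(labels, colors=None):
    """Return .csv text with the label,color header followed by one row per label."""
    colors = dict(colors or {})
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["label", "color"])  # the header is always the first row
    for index, label in enumerate(labels):
        # Fall back to the default palette, cycling if there are more than 12 labels.
        color = colors.get(label, DEFAULT_PALETTE[index % len(DEFAULT_PALETTE)])
        if not HEX_COLOR.match(color):
            raise ValueError(f"{color!r} is not a six-digit hex color code")
        writer.writerow([label, color])
    return buffer.getvalue()


colored_csv = build_colored_label_set(
    ["Annabeth Chase", "Harry Potter", "Hermione Granger"],
    colors={"Hermione Granger": "#4db34d"},
)
print(colored_csv)
```

Because colored label sets only work for the .csv format, the sketch always writes comma-separated output.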
The Text Transcription setting allows the labeler to add corresponding text to a bounding box. When this setting is disabled, the labeler cannot add text.
When the Text Transcription setting is turned on, you can choose whether a specific label must have text by enabling or disabling the Require caption checkbox.
For row-based or document-based projects (DOC template), a label set is a .csv with questions in the first column and answers in subsequent columns:

Column 1 | Column 2 | Column 3 | Column 4 | Column 5 | Column 6 |
---|---|---|---|---|---|
Question 1 | Answer 1 | Answer 2 | Answer 3 | | |
Question 2 | Answer 1 | Answer 2 | | | |
Question 3 | Answer 1 | Answer 2 | Answer 3 | Answer 4 | Answer 5 |
Sample file: bookreview2020-labelset (1).csv (Book Review question set that contains only the dropdown question type)
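To illustrate the question-set layout (question in the first column, answers in the following columns), here is a minimal Python sketch. The questions, answers, and the helper `build_question_set` are hypothetical. Padding shorter rows with empty cells mirrors the empty cells in the table above, and leaving out a header row is an assumption that this page does not confirm.

```python
import csv
import io


def build_question_set(questions):
    """Return .csv text: one row per question, with its answers in later columns."""
    # Width = question column + the longest answer list.
    width = 1 + max((len(answers) for answers in questions.values()), default=0)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for question, answers in questions.items():
        row = [question, *answers]
        # Pad with empty cells so every row has the same number of columns.
        row += [""] * (width - len(row))
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical dropdown questions for a book review project.
question_csv = build_question_set({
    "Genre": ["Fiction", "Non-fiction", "Poetry"],
    "Would you recommend it?": ["Yes", "No"],
})
print(question_csv)
```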
You can also create a .json for a label set that has multiple question types.
Sample file: book-review-sample-question-set.json (Book Review question set)
As mentioned before, label sets for row-based and document-based projects are sets of questions. Let's take a look at the question types available below.
Text Field allows the labeler to answer questions by typing in free-form text, up to a single line at a time.
Users can also add validation by expanding the Advanced Settings.
Text Area allows the labeler to answer questions by typing in free-form text. In contrast to Text Fields, this allows for multiple-line answers.
Dropdown requires labelers to answer questions by picking one of several multiple-choice answers.
- If you have a .csv with a pre-set list of answers, you can upload the .csv as an answer set.

Sample file: bookgenre-answerset.csv (Book Genre answer set)
- You can also allow the labelers to select multiple answers by checking the box for Allow multiple choices.
In Step 3
Hierarchical dropdown allows the labeler to answer questions with hierarchically organized options.
- Just like with the Dropdown type, you can also upload an answer set once you have created the hierarchical question. The format for hierarchical label sets can be found below.
Date allows the labeler to answer the question in two ways:
- Typing the date in manually.
- Clicking on the calendar symbol, then selecting the date.
The key benefit of selecting Date is that this format validates that a valid date has been filled in.
If you want date questions to be pre-filled with the timestamp at which the labeler opens the project, check the Use current date as default value box in Step 3.
Time allows the labeler to answer the question in two ways:
- Typing the time in manually.
- Clicking on the clock symbol, then selecting the time.
The key benefit of selecting Time is that this format validates that a valid time has been filled in.
If you want time questions to be pre-filled with the timestamp at which the labeler opens the project, check the Use current time as default value box in Step 3.
Slider allows the labeler to answer the question by moving a sliding bar (e.g., from 1 to 10).
You can also set the
Grouped Attributes allows the labeler to combine multiple questions that pertain to a single group.
Checkbox allows the labeler to answer the question by checking a box. You can also add a description.
URL allows you to enter URL links and apply validation to them.

It is possible to upload multi-level hierarchical label sets in .csv for token-based, row-based, and document-based projects. Here is a sample of a hierarchical label set, hierarchical-labelset2.csv:
id,label
1,Novel
1.1,Author
1.2,Title
2,Characters
2.1,Antagonist
2.2,Protagonist
id,label is the header. This will always be the first row in the .csv. The first label will have 1 as its id, as in the example above.
The id format is similar to Microsoft Word's numbering format. In the example above, Author is a part of Novel, so its id is 1.1.
- Novel: the root-level label.
- 1: the id of the root-level label.
- Author: the second-level label.
- 1.1: the id of the second-level label.
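The numbering scheme can be sketched in Python: the parent of an id is everything before its last dot, and an id with no dot is a root-level label. The helpers `parse_hierarchy` and `full_path` are hypothetical and not part of Datasaur. Requiring a parent row to appear before its children is an assumption made for this example, not a documented rule.

```python
import csv
import io

# The sample hierarchical label set from this page.
SAMPLE = """id,label
1,Novel
1.1,Author
1.2,Title
2,Characters
2.1,Antagonist
2.2,Protagonist
"""


def parse_hierarchy(csv_text):
    """Map each id to its label and parent id (None for root-level labels)."""
    nodes = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        node_id = row["id"].strip()
        # "1.1" -> parent "1"; "1" has no dot, so it is a root-level label.
        parent_id = node_id.rpartition(".")[0] or None
        # Assumption for this sketch: parents are listed before their children.
        if parent_id is not None and parent_id not in nodes:
            raise ValueError(f"id {node_id!r} appears before its parent {parent_id!r}")
        nodes[node_id] = {"label": row["label"].strip(), "parent": parent_id}
    return nodes


def full_path(nodes, node_id):
    """Return the labels from the root-level label down to node_id."""
    path = []
    while node_id is not None:
        path.append(nodes[node_id]["label"])
        node_id = nodes[node_id]["parent"]
    return path[::-1]


nodes = parse_hierarchy(SAMPLE)
print(full_path(nodes, "1.1"))  # ['Novel', 'Author']
```

A check like this can catch a child id whose parent row is missing before the file is uploaded.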
In token-based projects, the hierarchy will be visible in the Labels extension and in the label dropdown.

- You have to choose Hierarchical dropdown as the question type when creating projects using the Project Custom Wizard.
- Hierarchical label sets in row-based and document-based projects are uploaded as answer sets.
💡 Pro Tip
- Clicking the Home icon will go directly to the top-level label
- You can search leaf nodes globally