Creating a Project

After signing in, you are automatically directed to your personal workspace, where you can see the project shortcuts and the list of projects you are working on. In the following example, we will create a row-based project (text classification). If you would like a tutorial on token-based, audio, OCR, or document-based projects, please watch the corresponding YouTube videos.
You can create a project by clicking the Custom project button or one of the Project Template shortcuts. In this article, we will walk through creating a custom project. The next article describes each of the Project Templates.

Project Creation Wizard

The Project Creation Wizard is a tool for creating custom projects. It has three basic steps: adding data, previewing data, and setting up labeler tasks. A fourth step, available to team admins, lets you assign labelers.

Step 1: Adding data (Video Tutorial)

Datasaur supports a wide variety of file formats.
You can upload multiple files, but all files in a project must use the same file format.
You can upload data by dragging and dropping files or by browsing for files on your hard drive.
If you are interested in creating projects via API, you can find a full article here.
Note: the maximum file size allowed is 50 MB.
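The two upload rules above (one file format per project, 50 MB maximum per file) can be checked locally before you open the wizard. The following is a minimal sketch, not part of Datasaur: the function name `check_upload_batch` is hypothetical, the file extension is used as a stand-in for the file format, and treating 50 MB as 50 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

# 50 MB limit from the note above; counting in binary megabytes is an assumption.
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024


def check_upload_batch(paths, max_bytes=MAX_FILE_SIZE_BYTES):
    """Return a list of problems found; an empty list means the batch looks fine."""
    problems = []
    files = [Path(p) for p in paths]

    # All files in one project must share a format; the extension is used as a proxy.
    extensions = {f.suffix.lower() for f in files}
    if len(extensions) > 1:
        problems.append("mixed file formats: " + ", ".join(sorted(extensions)))

    # Each file must stay within the size limit.
    for f in files:
        size = f.stat().st_size
        if size > max_bytes:
            problems.append(f"{f.name} is {size} bytes, above the limit of {max_bytes} bytes")
    return problems
```

Calling `check_upload_batch(["reviews_1.csv", "reviews_2.csv"])` on two small CSV files would return an empty list, while mixing a `.csv` and a `.txt` file would report the mismatch.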

Step 2: Preview Data (Video Tutorial)

This step allows you to preview what your data will look like.

📝 Table data only

The Number of rows displayed per page setting determines how many rows should be displayed on one page. Choosing All rows will allow infinite scrolling through all the data.
The Expand Media setting allows you to choose the media resolution that will be displayed on the page. Higher resolutions will allow you to view the media in greater detail, but will take longer to load.
The Enable markdown parsing setting lets you render Markdown in a row-based project. We recommend preprocessing your file with Markdown syntax before uploading it to Datasaur.
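As an illustration of preprocessing a file with Markdown syntax before upload, the sketch below turns bare URLs in one column of a CSV file into Markdown links. It is only an example of one possible transformation: the helper names `linkify` and `add_markdown` are hypothetical, and which Markdown features Datasaur renders is not covered here.

```python
import csv
import io
import re

# Matches a bare URL up to the next whitespace character.
# Trailing punctuation directly after a URL would be captured as part of it.
URL_PATTERN = re.compile(r"https?://\S+")


def linkify(text):
    """Wrap each bare URL in Markdown link syntax: url -> [url](url)."""
    return URL_PATTERN.sub(lambda m: f"[{m.group(0)}]({m.group(0)})", text)


def add_markdown(csv_text, column):
    """Return a copy of the CSV text with URLs in `column` turned into Markdown links."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = linkify(row[column])
        writer.writerow(row)
    return out.getvalue()
```

Run this once on the raw file only; applying it to text that already contains Markdown links would wrap the URLs a second time.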
You can also edit the header and hide columns by right-clicking the header.
🧑‍🤝‍🧑 If you are working on a team project, please note the following:
  • If you choose Hide column, this setting is propagated to the labelers as well. Labelers can show the column again later if they want.
  • If you choose Hide column from labeler, the labelers won't be able to show the column at all.
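The difference between the two hiding options can be summarized as a small model. This is purely illustrative: `ColumnVisibility` and the two helper functions are hypothetical names used to restate the rules above, not part of any Datasaur API.

```python
from enum import Enum


class ColumnVisibility(Enum):
    VISIBLE = "visible"
    # "Hide column": hidden for labelers by default, but they may show it again.
    HIDDEN = "hide column"
    # "Hide column from labeler": hidden, and labelers cannot show it.
    HIDDEN_FROM_LABELER = "hide column from labeler"


def hidden_by_default(setting):
    """True if labelers start with the column hidden."""
    return setting is not ColumnVisibility.VISIBLE


def labeler_can_show(setting):
    """True if a labeler is able to see the column, now or after un-hiding it."""
    return setting is not ColumnVisibility.HIDDEN_FROM_LABELER
```

In this model, Hide column is a default that labelers can override, while Hide column from labeler is a restriction they cannot override.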

Step 3: Labeler Tasks (Video Tutorial)

In this last step, you choose the task type, which determines whether labelers will label individual tokens or answer questions about the text.
There are three task types to choose from: token-based, row-based, and document-based.
  • Token-based allows you to label tokens in a text document.
  • Row-based allows you to label data in tables on a row-by-row basis.
  • Document-based allows you to label entire documents at a time.
If you select token-based, you will be asked to create or upload a label set. Label sets contain the label classes labelers will be able to choose from.
If you select row-based or document-based, you will be asked to fill in a set of questions. There are nine question types available, including Text Area.
More information about these task types can be found here.
After you add the questions and create your project, your labeling task will look something like the example below. There are also shortcuts for creating specific project types, which we discuss in detail here.

Create a New Project from File Menu

Finally, if you are already in a project, you can also create new projects from the File Menu. Once you click one of the formats below, you will be directed straight to the new blank project.