Row-Based Projects

This section explains how to create a new row-based labeling project.

Basic Project

By following this sample, we will create a basic project from a csv file.

cURL

Replace access_token, teamId, assignees, and file upload fields as appropriate.

curl --location --request POST 'https://datasaur.ai/graphql' \
--header 'Authorization: Bearer access_token' \
--form 'operations={
"operationName": "LaunchTextProjectMutation",
"variables": {
"input": {
"teamId": "1",
"name": "Best Seller Books",
"documentSettings": {
"kind": "ROW_BASED",
"displayedRows": -1,
"mediaDisplayStrategy": "THUMBNAIL"
},
"assignees": [
{
"email": "[email protected]",
"documentNames": [
"Book Review.csv"
],
"role": "LABELER"
}
],
"projectSettings": {
"consensus": 1,
"enableEditLabelSet": true,
"enableEditSentence": true
},
"tagNames": ["ProjectA"],
"documents": [
{
"name": "Best Seller Books",
"file": null,
"fileName": "Book Review.csv",
"settings": {
"guidelineID": "1",
"questions": [
{
"type": "HIERARCHICAL_DROPDOWN",
"config": {
"multiple": false,
"options": [
{
"label": "Penguin Publisher",
"id": 1
},
{
"label": "First edition",
"id": 2,
"parentId": 1
},
{
"label": "Second edition",
"id": 3,
"parentId": 1
},
{
"label": "J. B. Lippincott & Co.",
"id": 4
},
{
"label": "First edition",
"id": 5,
"parentId": 4
},
{
"label": "Second edition",
"id": 6,
"parentId": 4
}
]
},
"bindToColumn": "Publisher",
"name": "Q1",
"label": "Publisher",
"required": true
},
{
"type": "TEXT",
"config": {},
"bindToColumn": "Author",
"name": "Q2",
"label": "Author",
"required": true
},
{
"type": "DROPDOWN",
"config": {
"multiple": false,
"options": [
{
"id": "1",
"label": "Fiction"
},
{
"id": "2",
"label": "Non-fiction"
}
]
},
"bindToColumn": "Genre",
"name": "Q3",
"label": "Genre",
"required": true
},
{
"type": "NESTED",
"config": {
"multiple": true,
"questions": [
{
"type": "TEXT",
"config": {},
"name": "Q4.1",
"label": "Original",
"required": true
},
{
"type": "TEXT",
"config": {
"multiline": true
},
"name": "Q4.2",
"label": "Translation",
"required": true
}
]
},
"bindToColumn": "Language",
"name": "Q4",
"label": "Language",
"required": true
},
{
"type": "DATE",
"config": {
"format": "DD/MM/YYYY"
},
"bindToColumn": "Publication Date",
"name": "Q5",
"label": "Publication Date",
"required": true
},
{
"type": "TIME",
"config": {
"format": "HH:mm"
},
"bindToColumn": "Best Time Reading",
"name": "Q6",
"label": "Best Time Reading",
"required": true
},
{
"type": "TEXT",
"config": {
"multiline": true
},
"bindToColumn": "Quick Review",
"name": "Q7",
"label": "Quick Review",
"required": true
},
{
"type": "SLIDER",
"config": {
"min": 1,
"max": 10,
"step": 1
},
"bindToColumn": "Rate",
"name": "Q8",
"label": "How would you rate this book?",
"required": true
}
]
},
"docFileOptions": {
"customHeaderColumns": [
"Book Title",
"Publisher",
"Author",
"Genre",
"Language",
"Publication Date",
"Best Time Reading",
"Quick Review",
"Rate"
],
"firstRowAsHeader": true
}
}
]
}
},
"query": "mutation LaunchTextProjectMutation($input: LaunchTextProjectInput!) { launchTextProject(input: $input) { id rootDocumentId settings { consensus enableEditLabelSet enableEditSentence __typename } __typename }}"
}' \
--form 'map={"1":["variables.input.documents.0.file"]}' \
--form '1=@bookcover.csv'
  • operationName: you can use any alphanumeric string as the operationName. Refer to this page for best practices on choosing an operationName.

  • variables

    • teamId: id of the team where we want to create the project.

    • name: Name for the project.

    • documentSettings

      • displayedRows: This determines how many rows are displayed in the editor. Use -1 to show all rows at once.

      • mediaDisplayStrategy: This determines how media is displayed in the editor.

        • NONE. Media will not be displayed or expanded.

        • THUMBNAIL. Media will be displayed as a thumbnail.

        • FULL. Media will be displayed at its original size.

      • kind: Use ROW_BASED as the value here since we want to create a row-based labeling project.

        • TOKEN_BASED

        • ROW_BASED

        • DOCUMENT_BASED

    • projectSettings

      • consensus: peer review / labeler consensus. This determines how many labelers must agree in order for the system to automatically accept the label.

      • enableEditLabelSet: when true, labelers will be able to add or remove labels from the label set while labeling; when false, they are restricted from doing so.

      • enableEditSentence: labelers will be able to edit the original text while labeling.

    • assignees

      • email: this refers to the user's email. Datasaur will throw an error if the email is not found on the team.

      • documentNames: Optional. List of document names. It refers to the field documents.fileName below. If not specified, all documents will be assigned to the team member above.

      • role: Optional. This determines the assignment role.

        • LABELER

        • REVIEWER

    • tagNames: a list of tag names to apply to a project. This parameter is optional and the type is string array.

    • documents: list of documents or files to be attached to this project. Every document must have the fields: name and fileName. There are optional fields, such as settings and fileUrl. See the GraphQL schema for more information.

      • name: Document name.

      • fileName: File name. This can be used in the documentNames field above.

      • file: Use null. The uploaded file is attached to this field through the map form field.

      • settings

        • guidelineID: Put the guideline ID here.

        • questions: Add the list of questions here. Refer to this page for more information about questions.

      • docFileOptions:

        • customHeaderColumns: Optional. Override column headers by using these values.

        • firstRowAsHeader: Optional. If the csv or xlsx file has a header as the first row, Datasaur will use it as the column header.

  • query: Copy this from the cURL example.
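The HIERARCHICAL_DROPDOWN options in the example form a tree through parentId. As a minimal sketch, assuming you want to check the payload locally before sending it, the Python snippet below reports options whose parentId does not match any option id. The helper name find_orphan_options is hypothetical and not part of the Datasaur API.

```python
from __future__ import annotations

from typing import Any


def find_orphan_options(options: list[dict[str, Any]]) -> list[Any]:
    """Return the ids of options whose parentId matches no option id."""
    known_ids = {option["id"] for option in options}
    return [
        option["id"]
        for option in options
        if "parentId" in option and option["parentId"] not in known_ids
    ]


# The first three options from the example request above.
options = [
    {"label": "Penguin Publisher", "id": 1},
    {"label": "First edition", "id": 2, "parentId": 1},
    {"label": "Second edition", "id": 3, "parentId": 1},
]
print(find_orphan_options(options))
```

An empty list means every child option points at an existing parent.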

💡 You can check this page to see how to upload a file in GraphQL.
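The cURL command sends three kinds of form parts: operations, map, and one numbered part per file. The sketch below shows one way the operations and map values could be assembled in Python. It only builds the two strings and does not send a request. The helper name build_form_fields is hypothetical, and the query is shortened to two fields for readability.

```python
from __future__ import annotations

import json
from typing import Any


def build_form_fields(
    query: str, input_data: dict[str, Any], operation_name: str
) -> dict[str, str]:
    """Build the `operations` and `map` form values for the upload request.

    Every document gets "file": null inside `operations`; `map` then links
    form part "1", "2", ... to variables.input.documents.<index>.file.
    """
    documents = input_data.get("documents", [])
    prepared = dict(input_data)
    prepared["documents"] = [{**doc, "file": None} for doc in documents]
    operations = {
        "operationName": operation_name,
        "variables": {"input": prepared},
        "query": query,
    }
    file_map = {
        str(index + 1): [f"variables.input.documents.{index}.file"]
        for index in range(len(documents))
    }
    return {"operations": json.dumps(operations), "map": json.dumps(file_map)}


query = (
    "mutation LaunchTextProjectMutation($input: LaunchTextProjectInput!) "
    "{ launchTextProject(input: $input) { id rootDocumentId } }"
)
fields = build_form_fields(
    query,
    {
        "teamId": "1",
        "name": "Best Seller Books",
        "documents": [{"name": "Best Seller Books", "fileName": "Book Review.csv"}],
    },
    "LaunchTextProjectMutation",
)
print(fields["map"])
```

The numbered keys in map must match the names of the file parts, such as the `1=@...` part in the cURL command.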

Sample File

We provide this file as a sample input for the cURL command above.

Response

Here is the response you can expect after issuing the cURL command.

{
"data": {
"launchTextProject": {
"id": "525",
"rootDocumentId": "caa436bd-848c-4151-b5d4-ca57530c47b5",
"settings": {
"consensus": 0,
"enableEditLabelSet": false,
"enableEditSentence": false,
"__typename": "ProjectSettings"
},
"__typename": "Project"
}
},
"extensions": {}
}
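To use the response in a script, you usually only need the project id. The following Python sketch reads it from the response body. It assumes that failures are reported in a top-level errors array, as is conventional for GraphQL servers; this page does not show an error response, so treat that part as an assumption. The helper name extract_project_id is hypothetical.

```python
from __future__ import annotations

import json


def extract_project_id(response_text: str, field: str = "launchTextProject") -> str:
    """Return the project id, or raise if the response reports GraphQL errors."""
    payload = json.loads(response_text)
    if payload.get("errors"):
        # Assumption: each error is an object with a "message" key.
        messages = "; ".join(
            str(error.get("message", "unknown error")) for error in payload["errors"]
        )
        raise RuntimeError(f"GraphQL request failed: {messages}")
    return payload["data"][field]["id"]


sample = (
    '{"data": {"launchTextProject": {"id": "525", '
    '"rootDocumentId": "caa436bd-848c-4151-b5d4-ca57530c47b5"}}, "extensions": {}}'
)
print(extract_project_id(sample))
```

If the query aliases the mutation, as in `project: launchTextProject(...)`, pass the alias as the field argument.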

Project Referencing External Image Files

Some projects may need to load data from other URLs. For example, some projects may include images that are stored on Amazon S3.

You can upload a csv file that contains all the image URLs. In this example, we will create a Book Cover Review project whose images are loaded from an Amazon S3 bucket.
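As a rough sketch of what such a csv file could look like, the Python snippet below writes one image URL per row under a single Book Cover column, the column name used in customHeaderColumns in the request below. The URLs are hypothetical placeholders, and the layout of the actual sample file may differ.

```python
import csv
import io

# Hypothetical image URLs; replace them with the locations of your own files.
image_urls = [
    "https://example-bucket.s3.amazonaws.com/covers/book-1.jpg",
    "https://example-bucket.s3.amazonaws.com/covers/book-2.jpg",
]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["Book Cover"])  # header row, matching firstRowAsHeader: true
writer.writerows([url] for url in image_urls)

csv_text = buffer.getvalue()
print(csv_text)
```

Save the text as book-cover-review.csv to use it as the file part of the request.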

cURL

Replace access_token, teamId, assignees, and file upload fields as appropriate.

curl --location --request POST 'https://datasaur.ai/graphql' \
--header 'Authorization: Bearer access_token' \
--form 'operations={
"operationName": "LaunchTextProjectMutation",
"variables": {
"input": {
"teamId": "1",
"name": "Book Cover Review",
"documentSettings": {
"kind": "ROW_BASED",
"displayedRows": -1,
"mediaDisplayStrategy": "THUMBNAIL"
},
"assignees": [
{
"email": "[email protected]"
}
],
"projectSettings": {
"consensus": 1,
"enableEditLabelSet": true,
"enableEditSentence": true
},
"documents": [
{
"name": "book-cover-review.csv",
"file": null,
"fileName": "book-cover-review.csv",
"settings": {
"guidelineID": "1",
"questions": [
{
"type": "DROPDOWN",
"config": {
"multiple": true,
"options": [
{
"id": "1",
"label": "Good"
},
{
"id": "2",
"label": "Bad"
},
{
"id": "3",
"label": "Neutral"
}
]
},
"name": "Impression",
"label": "What is your impression about this cover?",
"required": true
}
]
},
"docFileOptions": {
"customHeaderColumns": [
"Book Cover"
],
"firstRowAsHeader": true
}
}
]
}
},
"query": "mutation LaunchTextProjectMutation($input: LaunchTextProjectInput!) { launchTextProject(input: $input) { id rootDocumentId settings { consensus enableEditLabelSet enableEditSentence __typename } __typename }}"
}' \
--form 'map={"1":["variables.input.documents.0.file"]}' \
--form '1=@book-cover-review.csv'
  • operationName: you can use any alphanumeric string as the operationName. LaunchTextProjectMutation is fine as a default. Refer to this page for best practices on choosing an operationName.

  • variables

    • teamId: ID of the team where we want to create the project.

    • name: Project name.

    • documentSettings

      • displayedRows: This determines how many rows are displayed in the editor. Use -1 to show all rows at once.

      • mediaDisplayStrategy: This determines how media is displayed in the editor.

        • NONE. Media will not be displayed or expanded.

        • THUMBNAIL. Media will be displayed as a thumbnail.

        • FULL. Media will be displayed at its original size.

      • kind: Use ROW_BASED as the value here since we want to create a row-based labeling project.

        • TOKEN_BASED

        • ROW_BASED

        • DOCUMENT_BASED

    • projectSettings

      • consensus: peer review / labeler consensus. This determines how many labelers must agree in order for the system to automatically accept the label.

      • enableEditLabelSet: when true, labelers will be able to add or remove labels from the label set while labeling; when false, they are restricted from doing so.

      • enableEditSentence: labelers will be able to edit the original text while labeling.

    • assignees

      • email: this refers to the user's email. Datasaur will throw an error if the email is not found on the team.

      • documentNames: Optional. List of document names. It refers to the documents field below. If not specified, all documents will be assigned to the team member.

    • documents: list of documents or files to be attached to this project. Every document must have the fields: name and fileName. There are optional fields, such as settings and fileUrl. See the GraphQL schema for more information.

      • name: Document name. It can be referenced in the documentNames field above.

      • fileName: File name.

      • file: Use null. The uploaded file is attached to this field through the map form field.

      • settings

        • questions: Add the list of questions here. Refer to this page for more information about questions.

      • docFileOptions:

        • customHeaderColumns: Optional. Override column headers by using these values.

        • firstRowAsHeader: Optional. If the csv or xlsx file has a header as the first row, Datasaur will use it as the column header.

  • query: Copy this from the cURL example.

Sample File

We provide this file as a sample input for the cURL command above.

Response

Here is the response you can expect after issuing the cURL command.

{
"data": {
"launchTextProject": {
"id": "525",
"rootDocumentId": "caa436bd-848c-4151-b5d4-ca57530c47b5",
"settings": {
"consensus": 0,
"enableEditLabelSet": false,
"enableEditSentence": false,
"__typename": "ProjectSettings"
},
"__typename": "Project"
}
},
"extensions": {}
}

Project By Uploading Multiple Files

In this example, we will create a new project named Book Review with two different csv files.

cURL

Replace the access_token, teamId, assignees, and file upload fields as appropriate.

curl --location --request POST 'https://datasaur.ai/graphql' \
--header 'Authorization: Bearer access_token' \
--form 'operations={
"operationName": "CreateProject",
"variables": {
"input": {
"teamId": "1",
"name": "Book Review",
"documentSettings": {
"kind": "ROW_BASED",
"displayedRows": -1,
"mediaDisplayStrategy": "FULL"
},
"projectSettings": {
"consensus": 1,
"enableEditLabelSet": true,
"enableEditSentence": true
},
"assignees": [
{
"email": "[email protected]",
"documentNames": [
"book-review-1.csv",
"book-review-2.csv"
]
},
{
"email": "[email protected]",
"documentNames": [
"book-review-2.csv"
]
}
],
"documents": [
{
"fileName": "book-review-1.csv",
"file": null,
"settings": {
"guidelineID": "1",
"questions": [
{
"type": "TEXT",
"config": {
"multiple": false
},
"name": "Author",
"label": "Author",
"required": true
},
{
"type": "TEXT",
"config": {
"multiline": true,
"multiple": false
},
"name": "Review",
"label": "Review",
"required": true
},
{
"type": "DROPDOWN",
"config": {
"multiple": true,
"options": [
{
"id": "1",
"label": "Positive"
},
{
"id": "2",
"label": "Negative"
}
]
},
"name": "Polarity",
"label": "Polarity",
"required": true
}
]
},
"docFileOptions": {
"customHeaderColumns": ["Book Title"],
"firstRowAsHeader": true
}
},
{
"fileName": "book-review-2.csv",
"file": null,
"docFileOptions": {
"customHeaderColumns": ["Book Title"],
"firstRowAsHeader": true
}
}
]
}
},
"query": "mutation CreateProject($input: LaunchTextProjectInput!) {project: launchTextProject(input: $input) { id __typename }}"
}' \
--form 'map={"1":["variables.input.documents.0.file"], "2":["variables.input.documents.1.file"]}' \
--form '1=@book-review-1.csv' \
--form '2=@book-review-2.csv'
  • operationName: you can use any alphanumeric string as the operationName, as long as it matches the operation name in query. This example uses CreateProject. Refer to this page for best practices on choosing an operationName.

  • variables

    • teamId: ID of the team where we want to create the project.

    • name: Project name.

    • documentSettings

      • displayedRows: This determines how many rows are displayed in the editor. Use -1 to show all rows at once.

      • mediaDisplayStrategy: This determines how media is displayed in the editor.

        • NONE. Media will not be displayed or expanded.

        • THUMBNAIL. Media will be displayed as a thumbnail.

        • FULL. Media will be displayed at its original size.

      • kind: Use ROW_BASED as the value here since we want to create a row-based labeling project.

        • TOKEN_BASED

        • ROW_BASED

        • DOCUMENT_BASED

    • projectSettings

      • consensus: peer review / labeler consensus. This determines how many labelers must agree in order for the system to automatically accept the label.

      • enableEditLabelSet: when true, labelers will be able to add or remove labels from the label set while labeling; when false, they are restricted from doing so.

      • enableEditSentence: labelers will be able to edit the original text while labeling.

    • assignees

      • email: this refers to the user's email. Datasaur will throw an error if the email is not found on the team.

      • documentNames: Optional. List of document names. It refers to the documents field below. If not specified, all documents will be assigned to the team member.

    • documents: list of documents or files to be attached to this project. Every document must have the fields: name and fileName. There are optional fields, such as settings and fileUrl. See the GraphQL schema for more information.

      • name: Document name. It can be referenced in the documentNames field above.

      • fileName: File name.

      • file: Use null. The uploaded file is attached to this field through the map form field.

      • settings

        • questions: Add the list of questions here. Refer to this page for more information about questions.

      • docFileOptions:

        • customHeaderColumns: Optional. Override column headers by using these values.

        • firstRowAsHeader: Optional. If the csv or xlsx file has a header as the first row, Datasaur will use it as the column header.

  • query: Copy this from the cURL example.
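With several files and several assignees, it is easy to mistype an entry in documentNames. The Python sketch below flags such entries before the request is sent. It follows the first example's note that documentNames refers to documents.fileName. The helper name unknown_document_names and the email addresses are hypothetical.

```python
from __future__ import annotations

from typing import Any


def unknown_document_names(input_data: dict[str, Any]) -> dict[str, list[str]]:
    """Map each assignee email to documentNames that match no documents.fileName."""
    file_names = {doc["fileName"] for doc in input_data.get("documents", [])}
    problems: dict[str, list[str]] = {}
    for assignee in input_data.get("assignees", []):
        missing = [
            name
            for name in assignee.get("documentNames", [])
            if name not in file_names
        ]
        if missing:
            problems[assignee["email"]] = missing
    return problems


# Hypothetical assignees; the second one references a file that is not uploaded.
input_data = {
    "assignees": [
        {
            "email": "labeler-1@example.com",
            "documentNames": ["book-review-1.csv", "book-review-2.csv"],
        },
        {"email": "labeler-2@example.com", "documentNames": ["book-review-3.csv"]},
    ],
    "documents": [
        {"fileName": "book-review-1.csv"},
        {"fileName": "book-review-2.csv"},
    ],
}
print(unknown_document_names(input_data))
```

An assignee without documentNames is never flagged, since omitting the field assigns all documents.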

Sample Files

We provide these files as sample inputs for the cURL command above.

Response

Here is the response you can expect after issuing the cURL command.

{
"data": {
"project": {
"id": "575",
"__typename": "Project"
}
},
"extensions": {}
}

💡 You can check this page on our GraphQL Schema to see what parameters are available.