Programmatic API Labeling

This feature allows you to use an API to apply labels to multiple projects at once.

The Programmatic API Labeling feature automates ML-assisted labeling. It is best suited for users who want to compare their model's output with a human labeler's, or to compare the output of two models.

After successfully creating the project, you can follow these steps:

Example

In this example, we will create a token-based project with 2 documents and 1 labeler. We will then run the auto-labeling process against this project and add labels to the labeler's documents.

Sample documents

Create the project

Detailed guidelines can be found here.

{
 "operationName": "launchTextProjectAsync",
 "variables": {
   "input": {
     "name": "Stories",
     "kinds": ["TOKEN_BASED"],
     "documents": ["little-prince.txt", "tragedy-of-hamlet.txt"],
     "documentSettings": {
       "viewer": "TOKEN"
       // ... other document settings
     }
     // ... other project configurations
   }
 },
 "query": "mutation ..."
}

The API request above returns a response containing the project ID, "PROJECT_ID_1", which is used in the API requests that follow.

Initiate Backend to Perform Programmatic API Labeling

This operation asks our backend to perform the auto-labeling task. The backend sends the files in chunks. For example, if you have 500 files and 5 files are sent per request, 100 API calls are required.

Note: the number of files that can be sent per request depends on your internal server.
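
The arithmetic above is a ceiling division of the file count by the number of files per request. A minimal Python sketch of that calculation follows; the helper names `request_count` and `batches` are illustrative only and are not part of the Datasaur API.

```python
import math
from typing import Iterator, List, Sequence


def request_count(total_files: int, files_per_request: int) -> int:
    """Number of API calls needed to cover every file."""
    if files_per_request <= 0:
        raise ValueError("files_per_request must be positive")
    return math.ceil(total_files / files_per_request)


def batches(files: Sequence[str], files_per_request: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most files_per_request files."""
    if files_per_request <= 0:
        raise ValueError("files_per_request must be positive")
    for start in range(0, len(files), files_per_request):
        yield list(files[start:start + files_per_request])


print(request_count(500, 5))  # 100
```

When the file count is not an exact multiple of the chunk size, the last request simply carries fewer files.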

Initiate auto-labeling with the Velociraptor API for labeler@datasaur.ai

{
 "operationName": "AutoLabelTokenBasedProject",
 "variables": {
   "input": {
     "projectId": "PROJECT_ID_1",
     "labelerEmail": "labeler@datasaur.ai",
     "targetAPI": {
       "endpoint": "https://velociraptor.api/...",
       "secretKey": "raQa9of3jDj9Ksde6dLDdycr"
     },
     "options": {
       "numberOfFilesPerRequest": 2,
       // Optional
       "serviceProvider": "CUSTOM"
     },
     "role": "REVIEWER"
   }
 },
 "query": "mutation ..."
}
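
The request body must be strict JSON when it is actually sent, so the explanatory comments in the sample have to be dropped. One way to guarantee that is to build the body programmatically. The sketch below is an assumption-laden illustration: the function name `build_auto_label_request` is hypothetical, the field names are copied from the sample above, and the `query` value stays as the same elided placeholder because the full mutation text is not shown here.

```python
import json
from typing import Any, Dict


def build_auto_label_request(
    project_id: str,
    labeler_email: str,
    endpoint: str,
    secret_key: str,
    files_per_request: int,
) -> Dict[str, Any]:
    """Assemble the request body using the fields from the sample above."""
    return {
        "operationName": "AutoLabelTokenBasedProject",
        "variables": {
            "input": {
                "projectId": project_id,
                "labelerEmail": labeler_email,
                "targetAPI": {"endpoint": endpoint, "secretKey": secret_key},
                "options": {
                    "numberOfFilesPerRequest": files_per_request,
                    "serviceProvider": "CUSTOM",  # optional in the sample
                },
                "role": "REVIEWER",
            }
        },
        # Placeholder: the mutation text is elided in the sample as well.
        "query": "mutation ...",
    }


body = json.dumps(
    build_auto_label_request(
        "PROJECT_ID_1",
        "labeler@datasaur.ai",
        "https://velociraptor.api/...",
        "raQa9of3jDj9Ksde6dLDdycr",
        2,
    )
)
```

In practice the secret key would be read from configuration rather than written into source code.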

Receive Programmatic API Labeling API calls

Our backend will make several API requests to your target API based on the configuration provided in the request above. With the sample configuration, each request is sent to https://velociraptor.api/...

Sample API request

{
 "id": "PROJECT_ID_1",
 "name": "Stories",
 "documents": [
   {
       "id": "DOCUMENT_ID_1",
       "fileName": "little-prince.txt",
       "sentences": [
         { "id": 0, "text": "The Little Prince is a novella by French aristocrat, writer, and aviator Antoine de Saint-Exupéry." },
         { "id": 1, "text": "It was first published in English and French in the US by Reynal & Hitchcock in April 1943, and posthumously in France following the liberation of France as Saint-Exupéry's works had been banned by the Vichy Regime." }
       ]
   },
   {
       "id": "DOCUMENT_ID_2",
       "fileName": "tragedy-of-hamlet.txt",
       "sentences": [
         { "id": 0, "text": "The Tragedy of Hamlet, Prince of Denmark, often shortened to Hamlet (/ˈhæmlɪt/), is a tragedy written by William Shakespeare sometime between 1599 and 1601." },
         { "id": 1, "text": "It is Shakespeare's longest play with 30,557 words." }
       ]
   }
 ]
}

Sample API response

The sample response below will apply two labels to the first document and three labels to the second document.

{
 "id": "PROJECT_ID_1",
 "documents": [
   {
     "id": "DOCUMENT_ID_1",
     "labels": [
       {
         "id": 0,
         "entities": [
           { "label": "TITLE", "start_char": 0, "end_char": 16 }
         ]
       },
       {
         "id": 1,
         "entities": [
           { "label": "YEAR", "start_char": 86, "end_char": 89 }
         ]
       }
     ]
   },
   {
     "id": "DOCUMENT_ID_2",
     "labels": [
       {
         "id": 0,
         "entities": [
           { "label": "TITLE", "start_char": 0, "end_char": 20 },
           { "label": "YEAR", "start_char": 142, "end_char": 145 },
           { "label": "YEAR", "start_char": 151, "end_char": 154 }
         ]
       }
     ]
   }
 ]
}
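
To show how the sample request maps to the sample response, here is a minimal Python sketch of the labeling logic a target API might run. It rests on assumptions inferred from the samples rather than confirmed behavior: each `labels[].id` is taken to match a sentence `id`, offsets are taken to be relative to the start of each sentence, and `end_char` is taken to be inclusive ("1943" spans 86 to 89). The function names and the year-matching rule are purely illustrative; a real service would call its own model here.

```python
import re
from typing import Any, Dict, List

# Illustrative rule only: treat any standalone four-digit number as a YEAR.
YEAR_PATTERN = re.compile(r"\b\d{4}\b")


def label_sentence(text: str) -> List[Dict[str, Any]]:
    """Return one YEAR entity per match, using inclusive end offsets."""
    return [
        {"label": "YEAR", "start_char": m.start(), "end_char": m.end() - 1}
        for m in YEAR_PATTERN.finditer(text)
    ]


def handle_request(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Build a response in the shape of the sample response above."""
    documents = []
    for document in payload["documents"]:
        labels = []
        for sentence in document["sentences"]:
            entities = label_sentence(sentence["text"])
            if entities:  # sentences without entities are left out
                labels.append({"id": sentence["id"], "entities": entities})
        documents.append({"id": document["id"], "labels": labels})
    return {"id": payload["id"], "documents": documents}
```

Under the inclusive-offset assumption, the labeled text can be recovered from a sentence with `text[start_char:end_char + 1]`.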

Results

Once the auto-labeling process is complete, the labeler's document will look like the screenshots below:

Layer Auto Labeling

Layer Auto Labeling allows you to specify a layer for each label.

In the example below, we will auto-label TITLE on Layer 1 and YEAR on Layer 2.

The instructions are the same as above, except that each entity in the API response carries an additional "layer" key.

{
 "id": "PROJECT_ID_1",
 "documents": [
   {
     "id": "DOCUMENT_ID_1",
     "labels": [
       {
         "id": 0,
         "entities": [
           { "label": "TITLE", "start_char": 0, "end_char": 16, "layer": 1 }
         ]
       },
       {
         "id": 1,
         "entities": [
           { "label": "YEAR", "start_char": 86, "end_char": 89, "layer": 2 }
         ]
       }
     ]
    }
    // ... other documents
  ]
}
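
If your service already produces responses without layers, the "layer" key can be added in a post-processing step. The Python sketch below is a hypothetical helper, not part of the Datasaur API: `assign_layers` and the `layer_by_label` mapping are names introduced here for illustration, and the mapping mirrors this example (TITLE on Layer 1, YEAR on Layer 2).

```python
import copy
from typing import Any, Dict


def assign_layers(
    response: Dict[str, Any], layer_by_label: Dict[str, int]
) -> Dict[str, Any]:
    """Return a copy of the response with a "layer" key on each known label."""
    result = copy.deepcopy(response)  # leave the caller's data untouched
    for document in result["documents"]:
        for sentence_labels in document["labels"]:
            for entity in sentence_labels["entities"]:
                layer = layer_by_label.get(entity["label"])
                if layer is not None:
                    entity["layer"] = layer
    return result


layered = assign_layers(
    {
        "id": "PROJECT_ID_1",
        "documents": [
            {
                "id": "DOCUMENT_ID_1",
                "labels": [
                    {
                        "id": 0,
                        "entities": [
                            {"label": "TITLE", "start_char": 0, "end_char": 16}
                        ],
                    }
                ],
            }
        ],
    },
    {"TITLE": 1, "YEAR": 2},
)
```

Entities whose label is missing from the mapping are returned unchanged, without a "layer" key.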

The labeler's document will look like the screenshots below:
