Custom Text Extraction API
The custom text extraction API is a Datasaur feature that lets you create a custom OCR project backed by your own text extraction API.

Request from Datasaur

Request headers
Accept: application/json, text/plain

Form Data Parameters
upload: Your document file (e.g., receipt.jpg)
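
To test an extraction endpoint locally, it can help to reproduce the request described above. The following is a minimal sketch using only the Python standard library. It builds, but does not send, a multipart/form-data request carrying the `upload` field and the `Accept` header listed above. The POST method, the URL, the `application/octet-stream` part type, and the function name `build_extraction_request` are assumptions for illustration; the page does not specify them.

```python
import uuid
import urllib.request


def build_extraction_request(url: str, filename: str, file_bytes: bytes) -> urllib.request.Request:
    """Build a request shaped like the one described above (assumed to be a POST)."""
    boundary = uuid.uuid4().hex
    # A single multipart part named "upload" that carries the document file.
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="upload"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",  # assumed part type
        file_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "Accept": "application/json, text/plain",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical local endpoint; pass the result to urllib.request.urlopen to send it.
request = build_extraction_request("http://localhost:8000/extract", "receipt.jpg", b"fake image bytes")
```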

Expected API Response

Datasaur can process the response differently based on the Content-Type header returned from the API response.

Text response (Content-Type: text/plain)

SHIHLIN TAIWAN
STREET SNACKS
Grand Galaxy Park
DATE 26/02/20 15:53
CASHIER: Reny
No. Customer: 1
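
On the API side, a plain-text response like the one above is the extracted lines joined with newlines and returned with a `text/plain` Content-Type. A minimal sketch follows; the helper name `build_text_response` and the UTF-8 encoding are assumptions, not part of Datasaur's documentation.

```python
def build_text_response(lines: list[str]) -> tuple[dict[str, str], bytes]:
    """Return (headers, body) for a text/plain extraction response."""
    # One extracted line per output line; UTF-8 is an assumed encoding.
    body = "\n".join(lines).encode("utf-8")
    headers = {
        "Content-Type": "text/plain",
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_text_response(["SHIHLIN TAIWAN", "STREET SNACKS", "Grand Galaxy Park"])
```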

JSON response (Content-Type: application/json)

Datasaur parses a JSON response using the Importable format.
{
  "cells": [
    {
      "content": "SHIHLIN TAIWAN",
      "index": 0,
      "line": 0,
      "metadata": [],
      "tokens": [
        "SHIHLIN",
        "TAIWAN"
      ]
    },
    {
      "content": "STREET SNACKS",
      "index": 0,
      "line": 1,
      "metadata": [],
      "tokens": [
        "STREET",
        "SNACKS"
      ]
    }
  ],
  "labelSets": [],
  "labels": [
    {
      "startCellLine": 0,
      "startCellIndex": 0,
      "startTokenIndex": 0,
      "startCharIndex": 0,
      "endCellLine": 0,
      "endCellIndex": 0,
      "endTokenIndex": 0,
      "endCharIndex": 6,
      "layer": 0,
      "counter": 0,
      "pageIndex": 0,
      "type": "BOUNDING_BOX",
      "nodeCount": 4,
      "x0": 130,
      "y0": 154,
      "x1": 255,
      "y1": 154,
      "x2": 255,
      "y2": 186,
      "x3": 130,
      "y3": 186
    },
    {
      "startCellLine": 0,
      "startCellIndex": 0,
      "startTokenIndex": 1,
      "startCharIndex": 0,
      "endCellLine": 0,
      "endCellIndex": 0,
      "endTokenIndex": 1,
      "endCharIndex": 5,
      "layer": 0,
      "counter": 0,
      "pageIndex": 0,
      "type": "BOUNDING_BOX",
      "nodeCount": 4,
      "x0": 261,
      "y0": 154,
      "x1": 375,
      "y1": 154,
      "x2": 375,
      "y2": 186,
      "x3": 261,
      "y3": 186
    }
  ],
  "name": "receipt.jpg",
  "pages": [
    {
      "pageIndex": 0,
      "pageHeight": 619,
      "pageWidth": 551
    }
  ],
  "type": "BOUNDING_BOX"
}
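
A payload of this shape can be assembled from per-word OCR output. The sketch below is inferred from the sample response only, not from a formal schema. It assumes one cell per text line with `index` 0, one `BOUNDING_BOX` label per token, an inclusive `endCharIndex`, and corner points ordered top-left, top-right, bottom-right, bottom-left. The function name `build_importable` and its input layout are hypothetical, and the sketch covers a single page.

```python
import json


def build_importable(name, page_width, page_height, lines):
    """Build a single-page payload shaped like the sample above.

    lines: list of text lines, each a list of (token, (left, top, right, bottom)).
    """
    cells, labels = [], []
    for line_no, words in enumerate(lines):
        tokens = [token for token, _ in words]
        cells.append({
            "content": " ".join(tokens),
            "index": 0,
            "line": line_no,
            "metadata": [],
            "tokens": tokens,
        })
        for token_index, (token, (left, top, right, bottom)) in enumerate(words):
            labels.append({
                "startCellLine": line_no, "startCellIndex": 0,
                "startTokenIndex": token_index, "startCharIndex": 0,
                "endCellLine": line_no, "endCellIndex": 0,
                "endTokenIndex": token_index,
                "endCharIndex": len(token) - 1,  # inclusive, as in the sample
                "layer": 0, "counter": 0, "pageIndex": 0,
                "type": "BOUNDING_BOX", "nodeCount": 4,
                # Corner order inferred from the sample: clockwise from top-left.
                "x0": left, "y0": top, "x1": right, "y1": top,
                "x2": right, "y2": bottom, "x3": left, "y3": bottom,
            })
    return {
        "cells": cells,
        "labelSets": [],
        "labels": labels,
        "name": name,
        "pages": [{"pageIndex": 0, "pageHeight": page_height, "pageWidth": page_width}],
        "type": "BOUNDING_BOX",
    }


payload = build_importable(
    "receipt.jpg", 551, 619,
    [[("SHIHLIN", (130, 154, 255, 186)), ("TAIWAN", (261, 154, 375, 186))]],
)
response_body = json.dumps(payload)  # return with Content-Type: application/json
```

Called with the two words of the first receipt line, this reproduces the first cell and both labels of the sample response.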
Last modified 3mo ago