Datasaur
Import Data
Datasaur supports a wide variety of import formats. The formats available depend on the task type, as listed in the table below. Click any format to see a detailed explanation of the file structure Datasaur expects.

Available Formats

Task Type | Import Formats
Token-based | .txt, .tsv, .json
Token-based with arrows | .txt, .tsv, .json, .conllu
Row-based | .tsv, .csv, .xls, .xlsx, .txt
Document-based* | .md, .pdf, .jpeg, .jpg, .png, .gif, .svg, .bmp, .tiff, .tif, .webp
OCR** | Media: .pdf, .jpeg, .jpg, .png; Transcription: .txt, .json, .tsv
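
As a pre-upload sanity check, the table above can be encoded as a simple lookup. The following is a minimal sketch and not part of Datasaur: the dictionary keys and the function name `is_supported` are hypothetical names chosen for illustration, and treating extensions as case-insensitive is an assumption. Only the extension lists come from the table.

```python
from pathlib import Path

# Extensions copied from the table above. The keys are illustrative labels,
# not identifiers used by Datasaur.
IMPORT_FORMATS = {
    "token-based": {".txt", ".tsv", ".json"},
    "token-based with arrows": {".txt", ".tsv", ".json", ".conllu"},
    "row-based": {".tsv", ".csv", ".xls", ".xlsx", ".txt"},
    "document-based": {
        ".md", ".pdf", ".jpeg", ".jpg", ".png", ".gif",
        ".svg", ".bmp", ".tiff", ".tif", ".webp",
    },
    "ocr media": {".pdf", ".jpeg", ".jpg", ".png"},
    "ocr transcription": {".txt", ".json", ".tsv"},
}


def is_supported(filename: str, task_type: str) -> bool:
    """Return True if the file's extension is listed for the given task type.

    Assumption: extensions are compared case-insensitively.
    """
    return Path(filename).suffix.lower() in IMPORT_FORMATS[task_type]
```

For instance, `is_supported("reviews.csv", "row-based")` is true under this sketch, while `is_supported("reviews.csv", "token-based")` is false.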

Important Notes

*Document-based formats only work if you import the files through the Project Creation Wizard.
**When uploading pairs of OCR documents, make sure each image file and its corresponding transcription file have the same file name, apart from the extension (for example, unicef.jpg and unicef.txt).
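
The OCR naming rule can be checked locally before uploading. The sketch below is an illustration under stated assumptions, not Datasaur's own validation: the function name `find_unpaired` is hypothetical, files are matched on the name without its extension, and that match is assumed to be case-sensitive. The extension sets are taken from the OCR row of the table.

```python
from pathlib import Path

# OCR extensions as listed in the Available Formats table.
MEDIA_EXTS = {".pdf", ".jpeg", ".jpg", ".png"}
TRANSCRIPTION_EXTS = {".txt", ".json", ".tsv"}


def find_unpaired(filenames):
    """Return (media_without_transcription, transcription_without_media).

    Files are paired on their name without the extension, so
    "unicef.jpg" pairs with "unicef.txt". Files with other
    extensions are ignored.
    """
    media = {}
    transcriptions = {}
    for name in filenames:
        path = Path(name)
        ext = path.suffix.lower()
        if ext in MEDIA_EXTS:
            media[path.stem] = name
        elif ext in TRANSCRIPTION_EXTS:
            transcriptions[path.stem] = name

    # Set difference on the stems finds files that lack a partner.
    missing_transcription = sorted(
        media[stem] for stem in media.keys() - transcriptions.keys()
    )
    missing_media = sorted(
        transcriptions[stem] for stem in transcriptions.keys() - media.keys()
    )
    return missing_transcription, missing_media
```

Calling `find_unpaired(["unicef.jpg", "unicef.txt", "who.png"])` returns `(["who.png"], [])`, flagging `who.png` as an image with no matching transcription.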