Common Terminology

A glossary of words we frequently use

Natural Language Processing (NLP): the field of artificial intelligence concerned with text and human language.

Labeler: the person doing the labeling, sometimes also referred to as an annotator or tagger.

Reviewer: someone assigned to review the labels applied by a colleague.

Project: at Datasaur, every labeling task starts with a project. A project can have multiple files, and each file will likely contain many labels.

Token: typically the atomic unit in a document. A token can be a single word, but it can also be a punctuation mark such as '.'.

Entity: a person, object, or location mentioned in a document. An entity is often the token or span of tokens to be labeled in a named entity recognition (NER) project.
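
To make the relationship between tokens and entities concrete, here is a minimal Python sketch. It is an illustration only, not Datasaur's tokenizer or data format: the regex-based `tokenize` function, the `Entity` class, and the `PERSON` and `LOCATION` labels are all hypothetical names chosen for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class Entity:
    """A labeled span of tokens (hypothetical structure for illustration)."""
    start: int  # index of the first token in the span
    end: int    # index one past the last token in the span
    label: str


def tokenize(text):
    # Simplified assumption: each run of word characters is one token,
    # and each punctuation mark is its own token.
    return re.findall(r"\w+|[^\w\s]", text)


def entity_text(tokens, entity):
    # Join the tokens covered by the entity span back into readable text.
    return " ".join(tokens[entity.start:entity.end])


tokens = tokenize("Ada Lovelace lived in London.")
entities = [
    Entity(start=0, end=2, label="PERSON"),    # two-token span
    Entity(start=4, end=5, label="LOCATION"),  # single-token span
]

for entity in entities:
    print(entity.label, "->", entity_text(tokens, entity))
```

In this sketch the final '.' is a token in its own right, and an entity can cover either one token ("London") or a span of several ("Ada Lovelace").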