Analytics

Datasaur offers various methods to efficiently view and analyze your data. The Analytics pages are exclusively accessible to Administrators. For convenient access, you can also export the data, and the export will be delivered to your email.

Charts

Here are some tips to help you interpret the charts below.

  1. Higher values on the charts indicate better performance. This means that your team is consistently improving their speed, accuracy, and overall efficiency.

  2. It's important to note that the total labels shown on the Throughput chart may not always match the total labels on the Quality chart. A higher value on the Throughput chart indicates that there are labels manually applied in Reviewer Mode, as the Quality chart only calculates labels from Labeler Mode. This could also indicate potential issues if there is a significant difference, as it may suggest that the Reviewer had to manually label a large amount of data.

  3. In some cases, you may observe high Efficiency values while both Throughput and Quality values are low. This typically occurs when doing projects with lots of pre-labeled data, as Throughput and Quality calculations only consider manually applied labels and exclude pre-labeled data.

Overall Projects

Displays the current distribution of your projects by status. Note that this is a snapshot statistic, not time-series data.

Remaining Files

Displays the files remaining in uncompleted projects, broken down by project status. Note that this is also a snapshot statistic, not time-series data.

Throughput

It demonstrates the speed at which your team can produce annotations (labels and answers). It is calculated by summing the following factors (see the sketch after this list):

  1. Total labels applied by each labeler.

  2. Total labels applied in Reviewer Mode. However, this count excludes labels that are automatically accepted through consensus; it focuses solely on labels applied manually by reviewers, including those involved in conflict resolution.
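
As a minimal sketch of this sum, with purely hypothetical labeler names and counts (not Datasaur's internal implementation):

```python
# Hypothetical daily throughput calculation; all names and numbers are illustrative.
labels_per_labeler = {"labeler_a": 120, "labeler_b": 95}  # labels applied by each labeler
manual_reviewer_labels = 30  # labels applied by hand in Reviewer Mode (consensus auto-accepts excluded)

throughput = sum(labels_per_labeler.values()) + manual_reviewer_labels
print(throughput)  # 245
```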

Efficiency

It illustrates the effectiveness of your labeling process in generating accepted labels per minute.

It is calculated by dividing the total number of accepted labels from Reviewer Mode (including labels resolved manually, applied directly, accepted through consensus, and pre-labeled data) by the total time spent by all team members, broken down on a daily basis.
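
As a minimal sketch of this formula for a single day, with purely hypothetical totals:

```python
# Hypothetical efficiency calculation for one day; all numbers are illustrative.
accepted_labels = 300      # accepted in Reviewer Mode: manual, direct, consensus, and pre-labeled
total_minutes_spent = 600  # total time spent that day, summed across all team members

efficiency = accepted_labels / total_minutes_spent  # accepted labels per minute
print(efficiency)  # 0.5
```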

Quality

It provides a breakdown of conflicts that occurred among labelers and how they were handled by the reviewers. This metric can also give you insight into the level of agreement among labelers (especially when combined with IAA) and into how effectively reviewers resolve conflicts. Note that conflicts do not occur for Bounding Box Labeling and LLM projects. The data is categorized as follows (see the sketch after this list):

  1. Total accepted labels: the sum of all labels applied by labelers that have been accepted, through either consensus or manual review.

  2. Total rejected labels: the sum of all labels applied by labelers that have been rejected, through either consensus or manual review.

  3. Total unresolved conflicts: the sum of all labels applied by labelers that have not yet been accepted (either manually or through consensus) or rejected.
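
A minimal sketch of how labels could be grouped into these three buckets, assuming a hypothetical review-status field on each label (not Datasaur's actual data model):

```python
# Hypothetical grouping of labeler-applied labels by review outcome; illustrative only.
labels = [
    {"id": 1, "status": "accepted_by_consensus"},
    {"id": 2, "status": "accepted_manually"},
    {"id": 3, "status": "rejected_manually"},
    {"id": 4, "status": "unresolved"},
]

accepted = sum(1 for label in labels if label["status"].startswith("accepted"))
rejected = sum(1 for label in labels if label["status"].startswith("rejected"))
unresolved = sum(1 for label in labels if label["status"] == "unresolved")
print(accepted, rejected, unresolved)  # 2 1 1
```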

Cumulative Time Spent

It gives the total time spent each day, categorized by labeler and reviewer role. This metric is computed only while team members have a project open in their active browser tab. If they switch to another tab while working on the project, that time is not factored into the calculation.
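
A minimal sketch of this daily aggregation, assuming hypothetical active-tab sessions recorded in minutes (not Datasaur's actual tracking mechanism):

```python
# Hypothetical time aggregation per day and role; all values are illustrative.
# Each session represents time spent with the project in the active browser tab.
sessions = [
    {"date": "2024-05-01", "role": "labeler", "minutes": 90},
    {"date": "2024-05-01", "role": "labeler", "minutes": 45},
    {"date": "2024-05-01", "role": "reviewer", "minutes": 60},
]

totals = {}
for session in sessions:
    key = (session["date"], session["role"])
    totals[key] = totals.get(key, 0) + session["minutes"]

print(totals)  # {('2024-05-01', 'labeler'): 135, ('2024-05-01', 'reviewer'): 60}
```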

Multiple Ways to View the Data

Overview

Gain a high-level understanding of your data within the Workspace. This page calculates metrics from all your projects and provides comprehensive insights, including the Inter-Annotator Agreement. You can access the Overview page by selecting Analytics on the sidebar.

Project

Quickly access essential project information through the Name column, including statistics like total files, total time spent, token count (Token Labeling), and row count (Row Labeling). Gain additional insights by hovering over the avatar icon in the Labelers column or over the Status column (only for the In Review and Labeling in Progress statuses) to get a snapshot of the labeling progress.

Delve into detailed information about a specific project, allowing for in-depth analysis. You can access this page by clicking the triple dot on a specific project and selecting View Project Details. It consists of the same charts above, plus the IAA, already filtered specifically for that project.

Team Member

Explore a specific team member's data and insights for a comprehensive view of their contributions and performance based on their role. Once again, we use the same charts above but filtered specifically for that team member. You can also see the overview information for each project they are assigned to.

The table displays the latest statistics and does not depend on the selected date range. Note that the data in the table is specific to either Labeler or Reviewer Mode, which can be toggled using the tab at the top of the page.

Access this report by going to your Members page, clicking on the three dots corresponding to your teammate, and then choosing View Member Details.

Custom Report Builder

You can also build your own custom report to get exactly the Analytics data you want. For more detailed information, you can access this page.

Evaluation Metrics

To assist you in evaluating the labeling, Datasaur calculates evaluation metrics, which are available for each completed project. For more detailed information, you can access this page.
