Let's start an audio project: Creating an audio project in Datasaur is simple, and the steps are the same as for a token-based project. From your project home page, open "Create a Custom Project." To begin, upload both your audio file and its transcription on this page:
Accepted audio file types: .mp3, .flac, and .wav
Accepted transcription file types: .srt and .vtt. Here is an example:
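As a rough illustration of what a transcription file contains, an SRT file is a sequence of numbered cues, each with a `start --> end` time line followed by the spoken text. The sketch below parses a made-up two-cue sample (the text and timings are invented for illustration and are not taken from Datasaur); `parse_srt` and `to_seconds` are hypothetical helper names, and the parser assumes well-formed cues separated by blank lines.

```python
import re

# Invented sample transcript in SRT layout: index, time range, text.
SAMPLE_SRT = """\
1
00:00:00,000 --> 00:00:02,500
Hello, and welcome.

2
00:00:02,500 --> 00:00:05,000
Let's get started.
"""

# HH:MM:SS,mmm (SRT uses a comma before milliseconds; a period is also tolerated here).
CUE_TIME = re.compile(r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(stamp):
    """Convert an 'HH:MM:SS,mmm' timestamp to seconds as a float."""
    h, m, s, ms = map(int, CUE_TIME.fullmatch(stamp.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_srt(text):
    """Return a list of (start_seconds, end_seconds, text) tuples, one per cue."""
    cues = []
    for block in text.strip().split("\n\n"):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip anything that is not index + time line + text
        start, end = lines[1].split("-->")
        cues.append((to_seconds(start), to_seconds(end), " ".join(lines[2:])))
    return cues


cues = parse_srt(SAMPLE_SRT)
```

Each tuple pairs a time range in the audio with the text spoken during it, which is the same relationship the audio interface shows between timestamps and transcript tokens.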
Before moving on, make sure the transcription file has the same name as the audio file, for example SampleFile.srt and SampleFile.flac. Because the two files share a name, Datasaur recognizes them as corresponding files.
The upload window should now look like this:
The rest of the project creation process follows the same steps as a token-based project: Preview, Labeler's Tasks, Assignment, and Project Settings.
The Preview step displays only the transcription. The Labeler's Tasks step allows only token-based labeling.
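The naming rule above can be checked locally before uploading. The following is a minimal sketch, not part of Datasaur: `pair_by_stem` is a hypothetical helper, the extension sets are copied from the accepted formats listed earlier, and the comparison is assumed to be an exact, case-sensitive match on the file name without its extension.

```python
from pathlib import Path

# Extensions taken from the accepted formats listed above.
AUDIO_EXTS = {".mp3", ".flac", ".wav"}
TRANSCRIPT_EXTS = {".srt", ".vtt"}


def pair_by_stem(filenames):
    """Pair audio and transcription files that share a base name.

    Returns (paired, unpaired): a dict mapping base name to
    (audio_file, transcript_file), and a list of files with no partner.
    """
    audio, transcripts = {}, {}
    for name in filenames:
        path = Path(name)
        ext = path.suffix.lower()
        if ext in AUDIO_EXTS:
            audio[path.stem] = name
        elif ext in TRANSCRIPT_EXTS:
            transcripts[path.stem] = name

    paired = {
        stem: (audio[stem], transcripts[stem])
        for stem in audio
        if stem in transcripts
    }
    unpaired = sorted(
        [audio[s] for s in audio if s not in transcripts]
        + [transcripts[s] for s in transcripts if s not in audio]
    )
    return paired, unpaired


paired, unpaired = pair_by_stem(["SampleFile.flac", "SampleFile.srt", "Other.wav"])
```

Any file that ends up in `unpaired` would need to be renamed, or given a matching partner, before upload.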
Audio Interface Legend
Labeling an Audio Project
The layout is the same as the text tool: you can label any token in the transcript with the label sets you have created or uploaded. At the top of the interface is an audio player showing the timestamps for your audio, and between the audio timestamps and the text interface is a control panel.
The buttons on this control panel enable you to: Rewind, Play/Pause, Fast-Forward, Volume, Control Audio Settings (which toggles audio speed and auto-scroll), Create a Timestamp Label, and Zoom Out/In of the Audio Timestamps. [Please watch this brief video for a visual guide].
The Control Panel
You can also create a new timestamp and bind it to its corresponding text. Click the 'Create a Timestamp Label' button on the control panel, highlight a portion of the audio interface, and then highlight the span of tokens that corresponds to that portion. The new timestamp is now linked to that span of tokens.