Robosaur

Robosaur is a TypeScript command-line scripting tool that helps automate your workflows on Datasaur.

Robosaur supports multiple commands. Each command needs a JSON file that acts as its configuration. For example, project creation needs information about what kind of projects will be created.

Robosaur also manages a state file that keeps track of previously run commands, so that it can continue the process without unwanted duplication. For example, on the next run Robosaur will not reprocess projects that were already created successfully. It will, however, process the failed ones and the new ones.
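The skip-or-retry behavior described above can be sketched as follows. This is a minimal illustration, not Robosaur's actual implementation: the StateEntry shape, the status values, and the selectProjectsToProcess function are all hypothetical names introduced here, and the real state file format may differ.

```typescript
// Hypothetical shapes for illustration; Robosaur's real state file may differ.
type JobStatus = "success" | "failed";

interface StateEntry {
  projectName: string;
  status: JobStatus;
}

// Return the projects that still need work:
// new ones, plus ones that failed on a previous run.
function selectProjectsToProcess(
  requested: string[],
  previousState: StateEntry[],
): string[] {
  const succeeded = new Set(
    previousState
      .filter((entry) => entry.status === "success")
      .map((entry) => entry.projectName),
  );
  return requested.filter((name) => !succeeded.has(name));
}

// Made-up example state from an earlier run.
const previousState: StateEntry[] = [
  { projectName: "project-a", status: "success" },
  { projectName: "project-b", status: "failed" },
];

const toProcess = selectProjectsToProcess(
  ["project-a", "project-b", "project-c"],
  previousState,
);

// project-a is skipped; project-b (failed earlier) and project-c (new) remain.
console.log(toProcess);
```

Keying the lookup on successful entries only means that failed and never-seen projects are treated the same way, which matches the behavior described above.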

Installation

The source code for Robosaur is publicly available as an open-source GitHub project. Use the following commands to clone the source code and install its dependencies.

git clone https://github.com/datasaur-ai/robosaur.git
cd robosaur
nvm use
npm ci

Robosaur is developed using TypeScript and Node.js. We highly recommend using nvm to manage the Node.js versions.

  • Node.js v16.13

  • NPM v8 (should be bundled with Node.js)

First Time Configuration

Before running any Robosaur commands, get familiar with our project types. Then configure two things: the OAuth credentials and the team ID.

  1. Open /quickstart/{preferred-project-type}/config/config.json.

  2. Generate the OAuth credentials and replace the <DATASAUR_CLIENT_ID> and <DATASAUR_CLIENT_SECRET> config values.

  3. Open https://app.datasaur.ai/projects. Click your profile in the top right corner and select the team that you want to use. Grab the team ID from the URL (https://app.datasaur.ai/teams/{team-id}/projects) and replace the <TEAM_ID> values in your configuration.
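If you prefer to extract the team ID from the URL programmatically, a small helper along these lines could do it. This is only a sketch: extractTeamId is a hypothetical function that is not part of Robosaur, and the team ID "1234" below is a made-up placeholder. It assumes the URL follows the https://app.datasaur.ai/teams/{team-id}/projects pattern shown in step 3.

```typescript
// Hypothetical helper, not part of Robosaur.
// Extracts {team-id} from https://app.datasaur.ai/teams/{team-id}/projects.
function extractTeamId(projectsUrl: string): string | null {
  const match = projectsUrl.match(
    /^https:\/\/app\.datasaur\.ai\/teams\/([^/?#]+)\/projects/,
  );
  // Return null when the URL does not contain a team segment.
  return match?.[1] ?? null;
}

// "1234" is a placeholder team ID used only for this example.
const teamId = extractTeamId("https://app.datasaur.ai/teams/1234/projects");
console.log(teamId);

// The personal projects page has no team segment, so this yields null.
const noTeam = extractTeamId("https://app.datasaur.ai/projects");
console.log(noTeam);
```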

Supported Commands

Create Multiple Projects

npm run start -- create-projects <path-to-configuration-file>

Export Multiple Projects

npm run start -- export-projects <path-to-configuration-file>

Apply Tags

npm run start -- apply-tags <path-to-configuration-file>

Split Document

npm run start -- split-document <path-to-configuration-file>

Please see the next section to learn how to configure and customize each command in more depth.

Configuration File

As mentioned above, Robosaur needs a configuration file to determine the correct behavior when running a specific command. The following attributes are required regardless of which command you run.

  • datasaur specifies the host, clientId, and clientSecret used to authenticate each call.

  • projectState specifies where to store the state files. If Robosaur is used by multiple users, make sure to save the state in cloud object storage to keep it properly synced.

For each command, you need to fill the attributes accordingly.

  • create specifies the project settings to be created.

  • export specifies the behavior for exporting projects.

  • applyTags specifies the behavior for applying tags.

  • splitDocument specifies the behavior for splitting a file.

An in-depth breakdown of each attribute is available as a TypeScript file in src/config/interfaces.ts.
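To make the attribute layout above more concrete, here is a rough sketch of a pre-flight check for a configuration object. It is based only on the attribute names listed in this section. The SketchConfig type, the findMissingAttributes function, the command-to-attribute mapping, and the host value are assumptions made for illustration; the authoritative definitions remain those in src/config/interfaces.ts.

```typescript
// Hypothetical, simplified shape built from the attribute names listed above.
// Nested settings are left as `unknown` because their structure is not shown here.
interface SketchConfig {
  datasaur?: { host?: string; clientId?: string; clientSecret?: string };
  projectState?: unknown;
  create?: unknown;
  export?: unknown;
  applyTags?: unknown;
  splitDocument?: unknown;
}

type CommandName =
  | "create-projects"
  | "export-projects"
  | "apply-tags"
  | "split-document";

// Assumed mapping from each command to the attribute it reads.
const commandAttribute: Record<CommandName, keyof SketchConfig> = {
  "create-projects": "create",
  "export-projects": "export",
  "apply-tags": "applyTags",
  "split-document": "splitDocument",
};

// List the attributes that are missing for the given command.
function findMissingAttributes(
  config: SketchConfig,
  command: CommandName,
): string[] {
  const missing: string[] = [];
  if (!config.datasaur?.host) missing.push("datasaur.host");
  if (!config.datasaur?.clientId) missing.push("datasaur.clientId");
  if (!config.datasaur?.clientSecret) missing.push("datasaur.clientSecret");
  if (config.projectState === undefined) missing.push("projectState");
  const attribute = commandAttribute[command];
  if (config[attribute] === undefined) missing.push(attribute);
  return missing;
}

// Example configuration; the host value is an assumption and the
// credentials are the placeholders from the quickstart config.
const sample: SketchConfig = {
  datasaur: {
    host: "https://app.datasaur.ai",
    clientId: "<DATASAUR_CLIENT_ID>",
    clientSecret: "<DATASAUR_CLIENT_SECRET>",
  },
  projectState: {},
  create: {},
};

const missingForCreate = findMissingAttributes(sample, "create-projects");
const missingForExport = findMissingAttributes(sample, "export-projects");
console.log(missingForCreate); // nothing missing for create-projects
console.log(missingForExport); // the export attribute is missing
```

A check like this only confirms that the top-level attributes are present. It does not validate their contents.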

Object Storage

Each source attribute can be configured to use local storage, AWS S3, Google Cloud Storage, or Azure Blob Storage. The detailed guide can be found here.
