Vector Store

A vector store holds and organizes vector embeddings of your language data. Because these embeddings capture semantic information, content can be searched and analyzed by meaning rather than by exact wording.

With Vector Stores you can store, retrieve, and manage these vectors, and create, update, or delete them as your project changes.

Access Vector Stores

In the LLM menu, select 'Vector Stores'.

Vector Store Creation

New Vector Store: Click “Create new vector store”.

Vector Store Provider

Internal (Datasaur)

Configuring Internal Vector Store: Enter the store name, select Datasaur as the provider, choose the embedding model, set chunk size and overlap, then click “Next”.
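To make the chunk size and overlap settings more concrete, here is a minimal sketch of overlapping chunking. It is illustrative only: the function name is hypothetical, and it measures size in characters, which is an assumption (this page does not say whether Datasaur counts characters or tokens).

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters.

    Each window repeats the last `overlap` characters of the previous one,
    so a sentence cut at a boundary still appears whole in one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

A larger overlap keeps more shared context between neighboring chunks but produces more chunks to embed and store.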

External Provider

Setting Up External Provider: For external providers, input the store name, base URL, and choose the authentication method. Then click “Next”.

File Properties

File Properties is currently not available to all users by default. To gain access, kindly reach out to support@datasaur.ai with your request, and we will gladly enable it for you.

Providing information about your files is an optional step when creating a vector store. If you have specific details about the files you're adding, you can enter them here. This additional information can help you categorize and retrieve vectors more efficiently later. File properties are explained in more detail in the File Properties section below.

To proceed without adding file properties, simply click "Create vector store".

Knowledge Base

Our Knowledge Base enhances vectors by associating them with valuable information, making them more than just numerical representations. This integration allows for a deeper understanding of the language elements stored in Vector Stores.

  1. Building Your Knowledge Base: Add files to your vector store by clicking “Add file”. Supported formats are .urls, .txt, .pdf, .docx (temporarily unavailable). Click “Update vector store” after adding files.

  2. Monitoring and Updating: Observe the processing status of each file. Update the store as needed, considering file processing times.

  3. Testing Your Knowledge Base: Use the search function to validate the effectiveness of your knowledge base in providing context.

You can see the similarity score and the source data for each answer the model generates. Clicking the source link takes you to the answer's source (e.g., page 1 or page 2 of the PDF file).
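For readers unfamiliar with similarity scores, the sketch below shows one common way such a score can be computed and used to rank chunks. This page does not state which measure Datasaur uses, so cosine similarity is an assumption here, and the vectors, source names, and function names are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunks):
    """Return (score, source) pairs, highest score first.

    `chunks` is a list of (source, embedding) tuples.
    """
    scored = [(cosine_similarity(query_vec, vec), source) for source, vec in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy 2-dimensional embeddings; real embeddings have many more dimensions.
chunks = [
    ("guide.pdf, page 1", [0.9, 0.1]),
    ("guide.pdf, page 2", [0.1, 0.9]),
]
for score, source in rank_chunks([1.0, 0.0], chunks):
    print(f"{score:.3f}  {source}")
```

In this toy data the page 1 chunk ranks first because its vector points in nearly the same direction as the query vector.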


Activity

Activity logs the actions taken on your vector store, making changes easier to track.

File Properties

As mentioned earlier, file properties improve your data organization by letting you record information about each file.

Add File Properties

There are two ways to add file properties during vector store creation:

A. Upload a file that defines the file properties.

Drag and drop the file into the drop area. Supported formats are .csv and .json. For JSON files, please find below a sample of our accepted format.

B. Create the file properties from scratch.

File Properties Types

The available file property types are described below.

1. Text Field

Text Field allows the user to give information by typing in free-form text, up to a single line at a time.

You can also add validation by expanding Advanced Settings.

  • You can also allow selecting multiple answers by checking the box to Allow multiple answers.

2. Text Area

Text Area allows the user to give information by typing in free-form text. In contrast to Text Fields, this allows for multiple-line answers.

  • You can also allow selecting multiple answers by checking the box to Allow multiple answers.

3. Dropdown

Dropdown requires the user to give information by picking one of several predefined answers.

  • If you have a .csv with a pre-set list of answers, you can upload the .csv as an option set.

  • You can also allow selecting multiple answers by checking the box to Allow multiple answers.

4. Hierarchical Dropdown

Hierarchical dropdown allows the user to give information with hierarchically organized options.

  • Just like with the Dropdown type, you can also upload an option set once you have created the hierarchical question. The sample format for hierarchical option sets can be found here.

💡Let's break down the components of this file

1. The header

id,label is the header and is always the first row of the .csv. The first label has 1 as its id, as in the example above.

2. id format

The id format is similar to Microsoft Word's numbering format. In the example above, Characters is a child of Novel name, so its id is 1.1.

  • Novel name: the root-level label.

  • 1: the id of the root-level label.

  • Characters: the second-level label.

  • 1.1: the id of the second-level label.
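The rules above (an id,label header, a root id of 1, and child ids such as 1.1) can be sketched as a small parser. The sample rows are limited to the two labels named on this page. The function names and the check that a parent row comes before its children are assumptions for illustration, not documented Datasaur behavior.

```python
import csv
import io

# Only the rows described on this page: a root label and one child.
SAMPLE = "id,label\n1,Novel name\n1.1,Characters\n"


def parse_option_set(csv_text):
    """Read an id,label CSV into a dict that maps id -> label."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != ["id", "label"]:
        raise ValueError("first row must be the header: id,label")

    labels = {}
    for row in reader:
        option_id = row["id"].strip()
        # "1.1" -> parent "1"; a root id such as "1" has no parent ("").
        parent_id = option_id.rpartition(".")[0]
        if parent_id and parent_id not in labels:
            # Assumption: parents are listed before their children.
            raise ValueError(f"option {option_id} appears before its parent {parent_id}")
        labels[option_id] = row["label"].strip()
    return labels


def path_of(option_id, labels):
    """Return the labels from the root down to option_id."""
    parts = option_id.split(".")
    return [labels[".".join(parts[:i])] for i in range(1, len(parts) + 1)]


labels = parse_option_set(SAMPLE)
print(path_of("1.1", labels))
# ['Novel name', 'Characters']
```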

5. Date

Date allows the user to give information in two ways. The key benefit of selecting Date is that this format validates that a correct date has been filled in.

  • Typing the date in manually.

  • Clicking on the calendar symbol, then selecting the date.

To pre-fill date questions with the current date at the moment a user opens the file in the knowledge base, check Use current date as default value.
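As a rough illustration of what date validation and a current-date default involve, the sketch below accepts a typed date, rejects impossible ones, and falls back to today's date when the default is enabled. The function name and the ISO YYYY-MM-DD input format are assumptions; this page does not specify the format the product expects.

```python
from datetime import date
from typing import Optional


def resolve_date_answer(raw: Optional[str], use_current_as_default: bool = False) -> Optional[date]:
    """Validate a typed date, or fall back to today's date if allowed.

    Raises ValueError when the text is not a real calendar date.
    """
    if raw is None or not raw.strip():
        # Nothing typed: apply the default only when the option is enabled.
        return date.today() if use_current_as_default else None
    return date.fromisoformat(raw.strip())


print(resolve_date_answer("2024-02-29"))  # a valid leap day
try:
    resolve_date_answer("2024-02-30")  # February has no 30th day
except ValueError as err:
    print("rejected:", err)
```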

6. Time

Time allows the user to give information in two ways. The key benefit of selecting Time is that this format validates that a correct time has been filled in.

  • Typing it manually.

  • Clicking on the clock symbol, then selecting the time.

To pre-fill time questions with the current time at the moment a user opens the file in the knowledge base, check Use current time as default value.

7. Slider

Slider allows the user to give information by moving the sliding bar (ex: from 1 to 10).

You can customize the slider color. The default color for “Start at” and “End to” is blue, and 11 other preset colors are available.

You can also specify a color by hex code, color name, or RGB value. In that case, the dropdown is labeled “Custom”.

To preview the color, drag the slider thumb on the Preview.

Note that slider values must be numbers.
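The three custom color inputs can be told apart with a few simple checks, sketched below. The exact syntax the product accepts is not stated on this page, so the CSS-style notation (#RRGGBB, rgb(r, g, b)) and the short list of color names are assumptions, and the function name is hypothetical.

```python
import re
from typing import Optional

HEX_RE = re.compile(r"#(?:[0-9a-f]{3}|[0-9a-f]{6})")
RGB_RE = re.compile(r"rgb\(\s*(\d{1,3})\s*,\s*(\d{1,3})\s*,\s*(\d{1,3})\s*\)")
# Small illustrative subset, not the product's actual list of names.
NAMED_COLORS = {"red", "green", "blue", "orange", "purple", "black", "white"}


def classify_color(value: str) -> Optional[str]:
    """Return "hex", "rgb", or "name", or None if the input is unrecognized."""
    value = value.strip().lower()
    if HEX_RE.fullmatch(value):
        return "hex"
    match = RGB_RE.fullmatch(value)
    if match and all(int(channel) <= 255 for channel in match.groups()):
        return "rgb"
    if value in NAMED_COLORS:
        return "name"
    return None


for sample in ["#1E90FF", "rgb(30, 144, 255)", "Blue", "rgb(300, 0, 0)"]:
    print(sample, "->", classify_color(sample))
```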

8. Checkbox

Checkbox allows the user to give information by checking a box. You can also add a description.

9. URL

URL allows the user to enter a link, and lets you apply validation to it.
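A basic form of URL validation can be sketched with Python's standard library. The rules below (an http or https scheme plus a host) are assumptions chosen for illustration; this page does not list the validation rules the product applies.

```python
from urllib.parse import urlparse


def is_valid_url(value: str, allowed_schemes=("http", "https")) -> bool:
    """Accept only URLs that have an allowed scheme and a host part."""
    try:
        parsed = urlparse(value.strip())
    except ValueError:
        return False  # urlparse rejects some malformed inputs outright
    return parsed.scheme in allowed_schemes and bool(parsed.netloc)


for sample in ["https://datasaur.ai", "datasaur.ai", "ftp://example.com/file"]:
    print(sample, "->", is_valid_url(sample))
```

A check like this confirms the shape of the link only; it does not confirm that the page exists.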

10. Radio Button

Radio Button allows the user to give information by selecting one answer.

You can also add a hint that describes the Radio Button.

11. Grouped Attributes

Grouped Attributes allows the user to combine multiple questions that pertain to a single group.

  • You can also allow selecting multiple answers by checking the box to Allow multiple answers.

Save File Properties

After the vector store is created, the file properties appear on each file in the knowledge base. Once you have filled in the file properties, click “Submit answers”.

Connecting Vector Store to Playground

  1. Linking Vector Store to Application: In LLM Labs, connect your vector store to your application.

  2. Comparison and Deployment: Compare results with and without the vector store. Redeploy your application with the updated prompt template and vector store.

  1. Select the prompt template and click “Choose”.
