External Object Storage

Overview

The External Object Storage integration lets you connect your own object storage repositories directly to Vector Store. You can then import data from those repositories and use it in the RAG process within the platform.

Available Providers

We currently support three External Object Storage providers:

  1. AWS S3

  2. Google Cloud Storage

  3. Azure Blob Storage

Connect the External Object Storage

  1. Connect your External Object Storage via the Workspace Settings by clicking the Add External Object Storage button.

  2. Choose the Object Storage provider and fill in the form with your credentials.

For more detailed guides on connecting to each External Object Storage, you can refer to this link:

Connecting External Object Storage to Vector Store

  1. Make sure you have already connected the External Object Storage in the Workspace Settings.

  2. Open your Vector Store, click the triple-dots button and choose your desired object storage.

  3. After you click your desired external object storage, a dialog appears. Use rules to select which files to import to the knowledge base. More information about How to Write Rules can be found here.

  4. Click the connect object storage button, and the selected files will be imported to the knowledge base.

  5. Click Update vector store to update the embeddings. A dialog shows the list of files that you have imported.

  6. Click the Update button, and your files are updated in the vector store.


Rules in External Object Storage

The Rules feature streamlines the process of importing data into Vector Store from external object storage repositories. By incorporating Rules with Glob patterns, users can define specific criteria for importing files, providing a more tailored and automated approach to data integration.

Users can also use Glob patterns in the rules to include or exclude specific files.

How to write rules

Writing Rules for File Import

When setting up rules for importing files from your External Object Storage into the Vector Store, understanding how to effectively use Glob patterns is crucial. Here are steps and tips to guide you through the process:

Step 1: Identify File Patterns

First, determine the common patterns in the names of files you wish to import or exclude. For example, if you want to import all .pdf files, your pattern would be *.pdf.

Step 2: Utilize Wildcards

  • * (asterisk) matches zero or more characters. For instance, *.pdf matches all files ending in .pdf.

  • ? (question mark) matches exactly one character. For example, ?.pdf matches a.pdf but not ab.pdf.
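To see how the two wildcards behave, the following minimal sketch uses Python's standard fnmatch module as a stand-in for the platform's matcher. The file names are hypothetical, and the platform's own Glob implementation may differ in details.

```python
from fnmatch import fnmatchcase

# Hypothetical object names, used only for illustration.
names = ["a.pdf", "ab.pdf", "report.pdf", "notes.txt"]

# "*" matches zero or more characters, so every name ending in .pdf matches.
star_matches = [n for n in names if fnmatchcase(n, "*.pdf")]

# "?" matches exactly one character, so only single-character base names match.
question_matches = [n for n in names if fnmatchcase(n, "?.pdf")]

print(star_matches)      # ['a.pdf', 'ab.pdf', 'report.pdf']
print(question_matches)  # ['a.pdf']
```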

Step 3: Specifying Directories

If you want to specify files in a particular directory, include the directory name in your pattern. For example, myfolder/*.pdf matches all .pdf files in the myfolder directory.
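As a rough illustration of a directory pattern, the sketch below again uses Python's fnmatch module with hypothetical object keys. Note one assumption: in fnmatch, * also matches the / separator, so files in subfolders of myfolder match as well. Many Glob implementations stop * at /, and this page does not state which behavior the platform uses, so it may be worth trying a small rule first.

```python
from fnmatch import fnmatchcase

# Hypothetical object keys, used only for illustration.
keys = ["myfolder/a.pdf", "myfolder/sub/b.pdf", "other/c.pdf", "d.pdf"]

# Keep only keys that start with "myfolder/" and end with ".pdf".
# With fnmatch, "*" can span "/" so the nested file is included too.
in_myfolder = [k for k in keys if fnmatchcase(k, "myfolder/*.pdf")]

print(in_myfolder)  # ['myfolder/a.pdf', 'myfolder/sub/b.pdf']
```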

Step 4: Excluding Files

To exclude files, you can use the negation pattern !. For example, if you want to import all .pdf files except those starting with temp, your rules would include *.pdf and !temp*.pdf.

Step 5: Combining Patterns

You can combine multiple patterns to fine-tune your selection. For example, to import .pdf and .docx files but exclude those in the drafts folder, use *.pdf, *.docx, and !drafts/*.

Example Rules

  1. Import all .pdf files: *.pdf

  2. Import all files except those in the temp directory: *, !temp/*

  3. Import all .docx and .txt files in the data directory: data/*.docx, data/*.txt

Remember, the order of rules matters. Patterns defined later can override those defined earlier, so plan your rules accordingly to ensure the correct files are imported.
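The effect of rule order can be sketched as follows. This is not the platform's implementation: the is_imported helper is hypothetical, and it assumes a "last matching rule wins" evaluation in which a rule starting with ! excludes and any other rule includes. The file names are made up for illustration.

```python
from fnmatch import fnmatchcase

def is_imported(key: str, rules: list[str]) -> bool:
    """Return True if the last rule that matches `key` is an include rule."""
    imported = False  # assumption: nothing is imported unless a rule includes it
    for rule in rules:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        if fnmatchcase(key, pattern):
            # A later matching rule overrides the decision of earlier ones.
            imported = not negated
    return imported

# Hypothetical object names, used only for illustration.
keys = ["report.pdf", "temp_draft.pdf", "notes.txt"]

# Include all PDFs, then exclude those starting with "temp".
selected = [k for k in keys if is_imported(k, ["*.pdf", "!temp*.pdf"])]
print(selected)  # ['report.pdf']

# Same rules in reverse order: the later "*.pdf" re-includes the temp file.
reversed_selected = [k for k in keys if is_imported(k, ["!temp*.pdf", "*.pdf"])]
print(reversed_selected)  # ['report.pdf', 'temp_draft.pdf']
```

Under this assumption, placing exclusion rules after the inclusion rules they refine gives the intended result.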
