Overview
We provide the ability to create and independently manage datasets for multi-tenant use cases.
Creating a Dataset
You should have one dataset per tenant or unique knowledge base. To create a dataset, use the create dataset route. These datasets are kept isolated from each other and can be configured independently, making them perfect to represent each tenant within your application. Each dataset can have its own configurations, tags, and crawl options.
Important parameters
tracking_id: A unique, optional tracking ID for the dataset that reflects the ID of the tenant within your system. You can use this tracking ID in the TR-Dataset header to specify the dataset for a request instead of the dataset ID.
crawl_options: Options for setting up crawling to populate your dataset (e.g., include/exclude paths, tags, and more).
dataset_name: The name of the dataset. This must be unique within the organization.
server_configuration: The server configuration for the dataset, such as the RAG and system prompts, stop tokens, embedding models, and more.
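As a rough illustration of the parameters above, the following Python sketch builds (but does not send) a create dataset request for one tenant. The base URL, the route path, and the Authorization and TR-Organization header names are assumptions made for this example and should be checked against the API reference; only the body parameter names and the TR-Dataset header come from the text above.

```python
import json
import urllib.request

# Assumptions for illustration only: confirm these against the API reference.
BASE_URL = "https://api.trieve.ai"    # assumed base URL
CREATE_DATASET_PATH = "/api/dataset"  # assumed path of the create dataset route


def build_create_dataset_request(api_key, organization_id, tenant_id, dataset_name):
    """Build a POST request that creates one dataset for one tenant."""
    payload = {
        "dataset_name": dataset_name,  # must be unique within the organization
        "tracking_id": tenant_id,      # the tenant's ID in your own system
        "server_configuration": {},    # left empty here; valid keys are not listed in this guide
    }
    return urllib.request.Request(
        BASE_URL + CREATE_DATASET_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": api_key,            # assumed auth header
            "TR-Organization": organization_id,  # assumed header name
        },
        method="POST",
    )


def dataset_headers(api_key, tenant_id):
    """Headers for a later dataset-scoped request, addressed by tracking_id."""
    return {"Authorization": api_key, "TR-Dataset": tenant_id}


request = build_create_dataset_request(
    "API_KEY", "ORG_ID", "tenant-42", "Tenant 42 knowledge base"
)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

Using the tenant's own ID as the tracking_id means later requests can pass it in the TR-Dataset header without storing a separate dataset ID per tenant.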
Update configuration across datasets
You can also manage all of your dataset configurations at once using the update all dataset configurations route. Use the server_configuration parameter to pass in a new configuration for all datasets in the organization.
Example of updating all dataset configurations in an organization:
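A minimal Python sketch of such a request, assuming a base URL, route path, and header names that are not given in this guide; the SYSTEM_PROMPT key is likewise only an illustrative configuration key. The request is built but not sent, and only the keys placed in server_configuration are included in the body.

```python
import json
import urllib.request

# Assumptions for illustration only: confirm these against the API reference.
BASE_URL = "https://api.trieve.ai"                          # assumed base URL
UPDATE_ALL_PATH = "/api/organization/update_dataset_configs"  # assumed route path


def build_update_all_request(api_key, organization_id, config_changes):
    """Build a POST request carrying only the configuration keys to change."""
    payload = {"server_configuration": config_changes}
    return urllib.request.Request(
        BASE_URL + UPDATE_ALL_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": api_key,            # assumed auth header
            "TR-Organization": organization_id,  # assumed header name
        },
        method="POST",
    )


# Hypothetical key name; send only the keys that should change everywhere.
update_request = build_update_all_request(
    "API_KEY", "ORG_ID", {"SYSTEM_PROMPT": "You are a helpful assistant."}
)
# urllib.request.urlopen(update_request) would send it; omitted so the sketch runs offline.
```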
Only the specified keys in the server_configuration object will be updated for each dataset, keeping the unique values for other fields unchanged.
Creating an Organization
You will rarely need to create an organization through the API, but it is possible and explained below. The main route that exposes this functionality is the create organization route. Use the name parameter to pass an arbitrary, unique name that identifies the organization. We recommend that you create separate organizations for your main application and for a staging environment used for testing.
Example of creating an organization on demand through the API:
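The sketch below builds (without sending) one create organization request for a main application and one for staging, following the recommendation above. The base URL, route path, and Authorization header are assumptions for this example, and the organization names are placeholders; only the name parameter comes from the text.

```python
import json
import urllib.request

# Assumptions for illustration only: confirm these against the API reference.
BASE_URL = "https://api.trieve.ai"       # assumed base URL
CREATE_ORG_PATH = "/api/organization"    # assumed path of the create organization route


def build_create_organization_request(api_key, name):
    """Build a POST request that creates an organization with a unique name."""
    return urllib.request.Request(
        BASE_URL + CREATE_ORG_PATH,
        data=json.dumps({"name": name}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": api_key,  # assumed auth header
        },
        method="POST",
    )


# One organization per environment; the names are placeholders.
org_requests = [
    build_create_organization_request("API_KEY", org_name)
    for org_name in ("acme-production", "acme-staging")
]
# Each request could be sent with urllib.request.urlopen(...); omitted so the sketch runs offline.
```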