Create Organization Api Key
Create a new api key for the organization. A successful response contains the newly created api key.
Authorizations
Headers
The organization id to use for the request.
Body
The name which will be assigned to the new api key.
The role which will be assigned to the new api key. Either 0 (Read), 1 (Admin), or 2 (Owner). The authenticated user must have a role greater than or equal to the role being assigned.
The dataset ids which the api key will have access to. If not provided or empty, the api key will have access to all datasets in the organization.
The default parameters which will be forcibly used when the api key is given on a request. If not provided, the api key will not have default parameters.
ChunkFilter is a JSON object which can be used to filter chunks. This is useful for when you want to filter chunks by arbitrary metadata. Unlike with tag filtering, there is a performance hit for filtering on metadata.
All of these field conditions have to match for the chunk to be included in the result set.
Boolean is a true/false value for a field. This only works for boolean fields. You can specify this if you want values to be true or false.
DateRange is a JSON object which can be used to filter chunks by a range of dates. This leverages the time_stamp field on chunks in your dataset. You can specify this if you want values in a certain range. You must provide ISO 8601 combined date and time without timezone.
Field is the name of the field to filter on. Commonly used fields are timestamp, link, tag_set, location, num_value, group_ids, and group_tracking_ids. The field value will be used to check for an exact substring match on the metadata values for each existing chunk. This is useful for when you want to filter chunks by arbitrary metadata. To access fields inside of the metadata that you provide with the chunk, prefix the field name with "metadata." (including the trailing dot).
Match all lets you pass in an array of values that will return results if all of the items match. The match value will be used to check for an exact substring match on the metadata values for each existing chunk. If both match_all and match_any are provided, the match_any condition will be used.
Match any lets you pass in an array of values that will return results if any of the items match. The match value will be used to check for an exact substring match on the metadata values for each existing chunk. If both match_all and match_any are provided, the match_any condition will be used.
None of these field conditions can match for the chunk to be included in the result set.
Boolean is a true/false value for a field. This only works for boolean fields. You can specify this if you want values to be true or false.
DateRange is a JSON object which can be used to filter chunks by a range of dates. This leverages the time_stamp field on chunks in your dataset. You can specify this if you want values in a certain range. You must provide ISO 8601 combined date and time without timezone.
Field is the name of the field to filter on. Commonly used fields are timestamp, link, tag_set, location, num_value, group_ids, and group_tracking_ids. The field value will be used to check for an exact substring match on the metadata values for each existing chunk. This is useful for when you want to filter chunks by arbitrary metadata. To access fields inside of the metadata that you provide with the chunk, prefix the field name with "metadata." (including the trailing dot).
Match all lets you pass in an array of values that will return results if all of the items match. The match value will be used to check for an exact substring match on the metadata values for each existing chunk. If both match_all and match_any are provided, the match_any condition will be used.
Match any lets you pass in an array of values that will return results if any of the items match. The match value will be used to check for an exact substring match on the metadata values for each existing chunk. If both match_all and match_any are provided, the match_any condition will be used.
Only one of these field conditions has to match for the chunk to be included in the result set.
Boolean is a true/false value for a field. This only works for boolean fields. You can specify this if you want values to be true or false.
DateRange is a JSON object which can be used to filter chunks by a range of dates. This leverages the time_stamp field on chunks in your dataset. You can specify this if you want values in a certain range. You must provide ISO 8601 combined date and time without timezone.
Field is the name of the field to filter on. Commonly used fields are timestamp, link, tag_set, location, num_value, group_ids, and group_tracking_ids. The field value will be used to check for an exact substring match on the metadata values for each existing chunk. This is useful for when you want to filter chunks by arbitrary metadata. To access fields inside of the metadata that you provide with the chunk, prefix the field name with "metadata." (including the trailing dot).
Match all lets you pass in an array of values that will return results if all of the items match. The match value will be used to check for an exact substring match on the metadata values for each existing chunk. If both match_all and match_any are provided, the match_any condition will be used.
Match any lets you pass in an array of values that will return results if any of the items match. The match value will be used to check for an exact substring match on the metadata values for each existing chunk. If both match_all and match_any are provided, the match_any condition will be used.
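The three condition lists described above combine as "all must match", "none may match", and "at least one must match". The following is a minimal local sketch of those semantics, not the service's implementation. The key names must, must_not, and should, and the flat dictionary used for a chunk, are assumptions made for illustration; only field, match_all, and match_any come from the descriptions above.

```python
from typing import Any


def condition_matches(chunk: dict[str, Any], cond: dict[str, Any]) -> bool:
    """Check one field condition against a chunk using substring matching."""
    value = str(chunk.get(cond["field"], ""))
    # Per the description above, match_any takes precedence when both are given.
    if cond.get("match_any"):
        return any(str(m) in value for m in cond["match_any"])
    if cond.get("match_all"):
        return all(str(m) in value for m in cond["match_all"])
    return False


def chunk_passes(chunk: dict[str, Any], chunk_filter: dict[str, Any]) -> bool:
    """Apply the three condition lists (key names are assumed, see lead-in)."""
    must = chunk_filter.get("must", [])
    must_not = chunk_filter.get("must_not", [])
    should = chunk_filter.get("should", [])
    return (
        all(condition_matches(chunk, c) for c in must)
        and not any(condition_matches(chunk, c) for c in must_not)
        and (not should or any(condition_matches(chunk, c) for c in should))
    )


# Hypothetical chunk and filter, for illustration only.
example_chunk = {"link": "https://example.com/docs/api", "tag_set": "docs,api"}
example_filter = {
    "must": [{"field": "tag_set", "match_all": ["docs", "api"]}],
    "must_not": [{"field": "link", "match_any": ["/blog/"]}],
}
print(chunk_passes(example_chunk, example_filter))
```

An empty or missing list is treated as "no constraint" in this sketch, which is a design choice of the example rather than documented behavior.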
Highlight Options lets you specify different methods to highlight the chunks in the result set. If not specified, this defaults to the score of the chunks.
Set highlight_delimiters to a list of strings to use as delimiters for highlighting. If not specified, this defaults to ["?", ",", ".", "!"]. These are the characters that will be used to split the chunk_html into splits for highlighting.
Set highlight_max_length to control the maximum number of tokens (typically whitespace separated strings, but sometimes also word stems) which can be present within a single highlight. If not specified, this defaults to 8. This is useful to shorten large splits which may have low scores due to length compared to the query. Set to something very large like 100 to highlight entire splits.
Required range: x > 0
Set highlight_max_num to control the maximum number of highlights per chunk. If not specified, this defaults to 3. It may be less than 3 if no snippets score above the highlight_threshold.
Required range: x > 0
Set highlight_results to false for a slight latency improvement (1-10ms). If not specified, this defaults to true. This will add <mark><b> tags to the chunk_html of the chunks to highlight matching splits and return the highlights on each scored chunk in the response.
Available options: exactmatch, v1
Set highlight_threshold to a lower or higher value to adjust the sensitivity of the highlights applied to the chunk html. If not specified, this defaults to 0.8. The range is 0.0 to 1.0.
Set highlight_window to a number to control the number of words that are returned around the matched phrases. If not specified, this defaults to 0. This is useful for when you want to show more context around the matched words. When specified, window/2 whitespace-separated words are added before and after each highlight in the response's highlights array. If an extended highlight overlaps with another highlight, the overlapping words are only included once. This parameter can be overridden to respect the highlight_max_length parameter.
Required range: x > 0
Custom html tag which should appear after highlights. If not specified, this defaults to '</mark></b>'.
Custom html tag which should appear before highlights. If not specified, this defaults to '<mark><b>'.
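To make the highlight_window behavior concrete, here is a small sketch of how a highlight could be widened by window/2 words on each side. It is an illustration of the description above, not the service's code. The function name, the word-index representation of a highlight, and the use of integer division for odd windows are all assumptions.

```python
def extend_highlight(words: list[str], start: int, end: int, window: int) -> str:
    """Return words[start:end] widened by window // 2 words on each side.

    start and end are word indices (end exclusive). The result is clamped
    to the bounds of the word list.
    """
    half = window // 2  # assumption: odd windows round down
    lo = max(0, start - half)
    hi = min(len(words), end + half)
    return " ".join(words[lo:hi])


words = "the quick brown fox jumps over the lazy dog".split()

# The highlight covers "fox jumps" (word indices 3 and 4).
print(extend_highlight(words, 3, 5, 0))  # fox jumps
print(extend_highlight(words, 3, 5, 4))  # quick brown fox jumps over the
```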
Options for handling the response that the LLM returns when no results are found.
Page size is the number of chunks to fetch. This can be used to fetch more than 10 chunks at a time.
Required range: x > 0
If true, stop words will be removed. Queries that are entirely stop words will be preserved.
Set score_threshold to a float to filter out chunks with a score below the threshold.
Available options: fulltext, semantic, hybrid, bm25
Set slim_chunks to true to avoid returning the content and chunk_html of the chunks.
Typo Options lets you specify different methods to correct typos in the query. If not specified, typos will not be corrected.
Set correct_typos to true to correct typos in the query. If not specified, this defaults to false.
Words that should not be corrected. If not specified, this defaults to an empty list.
The TypoRange struct specifies the range of character lengths within which the query will be corrected if it has a typo.
The minimum number of characters for which the query will be corrected if it has a typo. If not specified, this defaults to 5.
Required range: x > 0
The maximum number of characters for which the query will be corrected if it has a typo. If not specified, this defaults to 8.
Required range: x > 0
Auto-require non-English words present in the dataset to exist in each result's chunk_html text. If not specified, this defaults to true.
The TypoRange struct specifies the range of character lengths within which the query will be corrected if it has a typo.
The minimum number of characters for which the query will be corrected if it has a typo. If not specified, this defaults to 5.
Required range: x > 0
The maximum number of characters for which the query will be corrected if it has a typo. If not specified, this defaults to 8.
Required range: x > 0
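One way to read the typo options above is as a per-word gate: a word is a candidate for correction only if its length falls inside the range and it is not on the list of words that should not be corrected. The sketch below encodes that reading. The class and function names, the inclusive bounds, and the per-word interpretation are assumptions, not confirmed behavior.

```python
from dataclasses import dataclass


@dataclass
class TypoRange:
    # Defaults taken from the descriptions above.
    min: int = 5
    max: int = 8


def is_correction_candidate(
    word: str, typo_range: TypoRange, protected_words: list[str]
) -> bool:
    """Return True if the word may be typo-corrected under this reading."""
    if word in protected_words:
        return False
    # Assumption: both bounds are inclusive and measured in characters.
    return typo_range.min <= len(word) <= typo_range.max


default_range = TypoRange()
print(is_correction_candidate("chunk", default_range, []))
print(is_correction_candidate("api", default_range, []))
print(is_correction_candidate("dataset", default_range, ["dataset"]))
```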
If true, quoted words and words prefixed with "-" will be parsed from the queries and used as required and negated words respectively.
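As a rough illustration of the parsing described above, the sketch below pulls double-quoted phrases out as required terms and "-" prefixed tokens out as negated terms. The function name and the exact tokenization rules are assumptions. The service's own parser may treat edge cases differently.

```python
import re


def parse_query(query: str) -> tuple[list[str], list[str], list[str]]:
    """Split a query into (required, negated, remaining) terms."""
    # Double-quoted phrases become required terms.
    required = re.findall(r'"([^"]+)"', query)
    # Remove the quoted phrases before tokenizing the rest on whitespace.
    remainder = re.sub(r'"[^"]*"', " ", query)
    tokens = remainder.split()
    # Tokens with a leading "-" become negated terms (a bare "-" is dropped).
    negated = [t[1:] for t in tokens if t.startswith("-") and len(t) > 1]
    remaining = [t for t in tokens if not t.startswith("-")]
    return required, negated, remaining


print(parse_query('"api key" organization -stripe'))
```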
The expiration date of the api key. If not provided, the api key will not expire. This should be provided in UTC time.
The routes which the api key will have access to. If not provided or empty, the api key will have access to all routes. Specify the routes as a list of strings. For example, ["GET /api/dataset", "POST /api/dataset"].
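The body fields above can be assembled as in the sketch below. The JSON key names (name, role, dataset_ids, scopes, expires_at) and the timestamp format are assumptions, since this page lists only the descriptions. Check them against the request schema before use.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


def build_api_key_body(
    name: str,
    role: int,
    dataset_ids: Optional[list[str]] = None,
    scopes: Optional[list[str]] = None,
    expires_in_days: Optional[int] = None,
) -> dict[str, Any]:
    """Build a request body for creating an organization api key."""
    if role not in (0, 1, 2):
        raise ValueError("role must be 0 (Read), 1 (Admin), or 2 (Owner)")
    body: dict[str, Any] = {"name": name, "role": role}
    # Omitting these keys leaves access unrestricted, per the descriptions above.
    if dataset_ids:
        body["dataset_ids"] = list(dataset_ids)
    if scopes:
        body["scopes"] = list(scopes)
    if expires_in_days is not None:
        # The expiration should be given in UTC; the string format is assumed.
        expires = datetime.now(timezone.utc) + timedelta(days=expires_in_days)
        body["expires_at"] = expires.strftime("%Y-%m-%d %H:%M:%S")
    return body


body = build_api_key_body(
    "read-only key",
    role=0,
    scopes=["GET /api/dataset", "POST /api/dataset"],
)
print(json.dumps(body, indent=2))
```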
Response
The api key which was created. This is the value which should be used in the Authorization header.
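A request for this endpoint could be prepared as sketched below, using only the Python standard library. The URL path, the base URL, and the TR-Organization header name are assumptions that do not appear on this page. Only the Authorization header is named in the text above. The request is built but not sent.

```python
import json
import urllib.request
from typing import Any


def build_create_key_request(
    base_url: str, organization_id: str, auth_key: str, body: dict[str, Any]
) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to create an api key."""
    return urllib.request.Request(
        url=f"{base_url}/api/organization/api_key",  # assumed path
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": auth_key,
            "TR-Organization": organization_id,  # assumed header name
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_key_request(
    "https://api.example.com",  # hypothetical base URL
    "org-123",
    "existing-key",
    {"name": "ci key", "role": 0},
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to urllib.request.urlopen would send it. The api key returned in the response is then the value to place in the Authorization header of later requests.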