Count chunks above threshold
This route returns the number of chunks that match a search query, taking the score threshold and filters into account. Latency may be high for large limits. The dataset configuration imposes a maximum limit value (default 10,000) to prevent DDoS attacks. The authenticated user or API key must have an admin or owner role for the specified dataset's organization.
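As a rough illustration of how a client might assemble a call to this route, the sketch below builds (but does not send) a POST request. The base URL, route path, header names, and the `search_type` body key are assumptions made for the example and are not stated in this section; `build_count_request` is a hypothetical helper, not part of any official client.

```python
import json
import urllib.request

# Assumed placeholders: neither the host, the path, nor the header names
# are given in this section. Replace them with the values for your deployment.
BASE_URL = "https://api.example.com"
COUNT_ROUTE = "/api/chunk/count"


def build_count_request(api_key, dataset, query, search_type,
                        score_threshold=0.0, limit=None):
    """Build (but do not send) a POST request for the count route."""
    body = {
        "query": query,
        "search_type": search_type,  # assumed key name
        "score_threshold": score_threshold,
    }
    if limit is not None:
        # The limit must be greater than 0.
        if limit <= 0:
            raise ValueError("limit must be greater than 0")
        body["limit"] = limit
    return urllib.request.Request(
        BASE_URL + COUNT_ROUTE,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": api_key,   # assumed header name
            "TR-Dataset": dataset,      # assumed header name (id or tracking_id)
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so the request can be inspected first.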
Authorizations
Headers
The dataset ID or tracking_id to use for the request. If the value is a valid UUID, it is treated as an ID.
Body
Query is the search query. This can be any string. The query will be used to create an embedding vector and/or SPLADE vector which will be used to find the result set. You can either provide one query, or multiple with weights. Multi-query only works with Semantic Search and is not compatible with cross encoder re-ranking or highlights.
Available search types: fulltext, semantic, bm25
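The two query forms described above (one query string, or several queries with weights) can be sketched as follows. The `query` and `weight` key names for the multi-query form are assumptions, and `weighted_queries` and `check_query` are hypothetical helpers. The check only mirrors the stated rule that multi-query works with semantic search alone.

```python
SEARCH_TYPES = {"fulltext", "semantic", "bm25"}


def weighted_queries(pairs):
    """Turn (text, weight) pairs into a multi-query list (assumed shape)."""
    return [{"query": text, "weight": float(weight)} for text, weight in pairs]


def check_query(query, search_type):
    """Return the query unchanged if it is valid for the given search type."""
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"unknown search type: {search_type}")
    # A list means the multi-query form, which is semantic-only.
    if isinstance(query, list) and search_type != "semantic":
        raise ValueError("multi-query only works with semantic search")
    return query
```

For example, `check_query(weighted_queries([("deploy", 0.7), ("rollback", 0.3)]), "semantic")` passes, while the same list with `"bm25"` raises a `ValueError`.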
ChunkFilter is a JSON object which can be used to filter chunks. This is useful for when you want to filter chunks by arbitrary metadata. Unlike with tag filtering, there is a performance hit for filtering on metadata.
All of these field conditions have to match for the chunk to be included in the result set.
Boolean is a true/false value for a field. This only works for boolean fields. Specify it when you want to match values that are true or false.
DateRange is a JSON object which can be used to filter chunks by a range of dates. This leverages the time_stamp field on chunks in your dataset. You can specify this if you want values in a certain range. You must provide ISO 8601 combined date and time without timezone.
Field is the name of the field to filter on. Commonly used fields are timestamp, link, tag_set, location, num_value, group_ids, and group_tracking_ids. The field value will be used to check for an exact substring match on the metadata values for each existing chunk. This is useful when you want to filter chunks by arbitrary metadata. To access fields inside the metadata that you provide with the chunk, prefix the field name with metadata. (including the trailing dot).
Match all lets you pass in an array of values that will return results if all of the items match. The match value will be used to check for an exact substring match on the metadata values for each existing chunk. If both match_all and match_any are provided, the match_any condition will be used.
Match any lets you pass in an array of values that will return results if any of the items match. The match value will be used to check for an exact substring match on the metadata values for each existing chunk. If both match_all and match_any are provided, the match_any condition will be used.
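Since DateRange expects ISO 8601 combined date and time without a timezone, a small formatting helper can avoid malformed values. This is a minimal sketch: `to_filter_timestamp` is a hypothetical helper, and converting timezone-aware values to UTC before dropping the offset is an assumption about how the dataset's time_stamp values are stored.

```python
from datetime import datetime, timedelta, timezone


def to_filter_timestamp(dt):
    """Format a datetime as ISO 8601 date and time with no timezone suffix."""
    if dt.tzinfo is not None:
        # Assumption: stored timestamps are UTC, so normalize before
        # stripping the offset.
        dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
    return dt.isoformat(timespec="seconds")


naive = to_filter_timestamp(datetime(2024, 1, 15, 8, 30))
aware = to_filter_timestamp(
    datetime(2024, 1, 15, 8, 30, tzinfo=timezone(timedelta(hours=2)))
)
print(naive)  # 2024-01-15T08:30:00
print(aware)  # 2024-01-15T06:30:00
```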
None of these field conditions can match for the chunk to be included in the result set.
Boolean is a true/false value for a field. This only works for boolean fields. Specify it when you want to match values that are true or false.
DateRange is a JSON object which can be used to filter chunks by a range of dates. This leverages the time_stamp field on chunks in your dataset. You can specify this if you want values in a certain range. You must provide ISO 8601 combined date and time without timezone.
Field is the name of the field to filter on. Commonly used fields are timestamp, link, tag_set, location, num_value, group_ids, and group_tracking_ids. The field value will be used to check for an exact substring match on the metadata values for each existing chunk. This is useful when you want to filter chunks by arbitrary metadata. To access fields inside the metadata that you provide with the chunk, prefix the field name with metadata. (including the trailing dot).
Match all lets you pass in an array of values that will return results if all of the items match. The match value will be used to check for an exact substring match on the metadata values for each existing chunk. If both match_all and match_any are provided, the match_any condition will be used.
Match any lets you pass in an array of values that will return results if any of the items match. The match value will be used to check for an exact substring match on the metadata values for each existing chunk. If both match_all and match_any are provided, the match_any condition will be used.
At least one of these field conditions has to match for the chunk to be included in the result set.
Boolean is a true/false value for a field. This only works for boolean fields. Specify it when you want to match values that are true or false.
DateRange is a JSON object which can be used to filter chunks by a range of dates. This leverages the time_stamp field on chunks in your dataset. You can specify this if you want values in a certain range. You must provide ISO 8601 combined date and time without timezone.
Field is the name of the field to filter on. Commonly used fields are timestamp, link, tag_set, location, num_value, group_ids, and group_tracking_ids. The field value will be used to check for an exact substring match on the metadata values for each existing chunk. This is useful when you want to filter chunks by arbitrary metadata. To access fields inside the metadata that you provide with the chunk, prefix the field name with metadata. (including the trailing dot).
Match all lets you pass in an array of values that will return results if all of the items match. The match value will be used to check for an exact substring match on the metadata values for each existing chunk. If both match_all and match_any are provided, the match_any condition will be used.
Match any lets you pass in an array of values that will return results if any of the items match. The match value will be used to check for an exact substring match on the metadata values for each existing chunk. If both match_all and match_any are provided, the match_any condition will be used.
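The three condition lists above combine as "all", "none", and "at least one". The following client-side sketch mimics that logic on plain dictionaries to make the semantics concrete. It is not the server's implementation: the `must`, `must_not`, and `should` key names are assumptions, only the match_any and match_all condition types are covered, and treating an empty `should` list as "no restriction" is also an assumption.

```python
def lookup(chunk, field):
    """Read a field, descending into chunk['metadata'] for 'metadata.' names."""
    prefix = "metadata."
    if field.startswith(prefix):
        return chunk.get("metadata", {}).get(field[len(prefix):])
    return chunk.get(field)


def condition_matches(chunk, condition):
    value = lookup(chunk, condition["field"])
    if value is None:
        return False
    text = str(value)
    # If both are provided, match_any takes precedence over match_all.
    if "match_any" in condition:
        return any(str(m) in text for m in condition["match_any"])
    if "match_all" in condition:
        return all(str(m) in text for m in condition["match_all"])
    return False


def chunk_passes(chunk, chunk_filter):
    must = chunk_filter.get("must", [])
    must_not = chunk_filter.get("must_not", [])
    should = chunk_filter.get("should", [])
    return (
        all(condition_matches(chunk, c) for c in must)
        and not any(condition_matches(chunk, c) for c in must_not)
        and (not should or any(condition_matches(chunk, c) for c in should))
    )
```

Matching uses Python's `in` operator on strings, which corresponds to the exact substring match described for field values.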
Set limit to restrict the maximum number of chunks to count. This is useful when you want to reduce the latency of the count operation. By default, the limit is the number of chunks in the dataset. The value must be greater than 0 (x > 0).
Set score_threshold to a float to filter out chunks with a score below the threshold. This threshold applies before weight and bias modifications. If not specified, this defaults to 0.0.
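To make the interaction between limit and score_threshold concrete, here is a minimal local sketch that counts scores at or above a threshold and caps the result. `count_above_threshold` is a hypothetical helper operating on a plain list of scores. Treating limit as a cap on the returned count (rather than on the number of chunks examined) is an assumption, as is the inclusive comparison.

```python
def count_above_threshold(scores, score_threshold=0.0, limit=None):
    """Count scores that are not below the threshold, capped at limit."""
    if limit is not None and limit <= 0:
        raise ValueError("limit must be greater than 0")
    # With no limit given, every chunk may be counted.
    cap = len(scores) if limit is None else limit
    matching = sum(1 for s in scores if s >= score_threshold)
    return min(cap, matching)


scores = [0.9, 0.4, 0.7, 0.1]
print(count_above_threshold(scores, score_threshold=0.5))           # 2
print(count_above_threshold(scores, score_threshold=0.5, limit=1))  # 1
print(count_above_threshold(scores))                                # 4
```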
If true, quoted words and words prefixed with - will be parsed from the queries and used as required and negated words, respectively. Default is false.
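The kind of parsing this option describes can be approximated as below. This is only an illustrative sketch, assuming double quotes mark required words and a leading hyphen marks negated ones; `split_query` is a hypothetical helper and the server's actual parsing rules may differ.

```python
import re


def split_query(query):
    """Return (required, negated) words found in a query string."""
    # Text inside double quotes is treated as required.
    required = re.findall(r'"([^"]+)"', query)
    # Remove the quoted parts, then look for words with a leading hyphen.
    remainder = re.sub(r'"[^"]*"', " ", query)
    negated = [w[1:] for w in remainder.split() if w.startswith("-") and len(w) > 1]
    return required, negated


required, negated = split_query('"rust" async -tokio runtime')
print(required)  # ['rust']
print(negated)   # ['tokio']
```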
Response
x > 0