Scroll Chunks
Get paginated chunks from your dataset with filters and custom sorting. If sort_by is not specified, the results are sorted by chunk id in ascending order. sort_by and offset_chunk_id cannot be used together; to scroll with a sort_by, use a must_not filter containing the ids you have already seen. A must_not filter can hold at most 1000 ids at a time.
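The sorted-scroll workaround described above can be sketched in Python. This is a minimal sketch under stated assumptions, not Trieve's client code: `build_sorted_scroll_body` is a hypothetical helper, and the `{"ids": [...]}` shape used for the id condition is an assumption, since this page does not show how ids are expressed inside must_not. Only `sort_by`, `page_size`, `filters`, and the 1000-id limit come from the text.

```python
from typing import Any

MAX_MUST_NOT_IDS = 1000  # limit stated in the endpoint description


def build_sorted_scroll_body(
    sort_field: str,
    seen_ids: list[str],
    page_size: int = 10,
    direction: str = "desc",
) -> dict[str, Any]:
    """Build a scroll request body that sorts and excludes already-seen chunks."""
    if len(seen_ids) > MAX_MUST_NOT_IDS:
        raise ValueError(
            f"must_not can exclude at most {MAX_MUST_NOT_IDS} ids, got {len(seen_ids)}"
        )
    body: dict[str, Any] = {
        "page_size": page_size,
        "sort_by": {"field": sort_field, "direction": direction},
        # offset_chunk_id is deliberately absent: it cannot be combined with sort_by.
    }
    if seen_ids:
        # ASSUMPTION: seen ids are excluded with an {"ids": [...]} condition.
        body["filters"] = {"must_not": [{"ids": list(seen_ids)}]}
    return body


first_page = build_sorted_scroll_body("num_value", seen_ids=[])
next_page = build_sorted_scroll_body("num_value", seen_ids=["id-1", "id-2"])
```

After each page, the caller would append the returned chunk ids to `seen_ids` and request the next page; once more than 1000 ids have been seen, this approach no longer fits in a single must_not filter.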
curl --request POST \
--url \
--header 'Authorization: <api-key>' \
--header 'Content-Type: application/json' \
--header 'TR-Dataset: <tr-dataset>' \
--data '{
  "filters": {
    "must": [
      {
        "field": "tag_set",
        "match_all": ["<string>"]
      },
      {
        "field": "num_value",
        "range": {
          "gte": 10,
          "lte": 25
        }
      }
    ]
  },
  "offset_chunk_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "page_size": 1,
  "sort_by": {
    "direction": "desc",
    "field": "<string>",
    "prefetch_amount": 1
  }
}'
Example response:
{
  "chunks": [
    {
      "chunk_html": "<p>Hello, world!</p>",
      "created_at": "2021-01-01 00:00:00.000",
      "dataset_id": "e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3",
      "id": "e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3",
      "link": "",
      "metadata": {
        "key": "value"
      },
      "tag_set": "[tag1,tag2]",
      "time_stamp": "2021-01-01 00:00:00.000",
      "tracking_id": "e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3",
      "updated_at": "2021-01-01 00:00:00.000",
      "weight": 0.5
    }
  ]
}
TR-Dataset (header): The dataset id or tracking_id to use for the request. An id is assumed if the value is a valid uuid.
filters: ChunkFilter is a JSON object used to filter chunks by arbitrary metadata. Unlike tag filtering, filtering on metadata carries a performance cost.
filters.must: All of these field conditions have to match for the chunk to be included in the result set.
  field: The name of the field to filter on. Commonly used fields are timestamp, link, tag_set, location, num_value, group_ids, and group_tracking_ids. The field value is checked for an exact substring match against the metadata values of each existing chunk. To access fields inside the metadata that you provide with the chunk, prefix the field name with "metadata.".
  boolean: A true or false value to match. This only works for boolean fields.
  date_range: A JSON object used to filter chunks by a range of dates. It uses the time_stamp field on chunks in your dataset. Dates must be given as ISO 8601 combined date and time without timezone.
  match_all: An array of values; a chunk is returned only if all of the items match. Each value is checked for an exact substring match against the metadata values of each existing chunk. If both match_all and match_any are provided, the match_any condition is used.
  match_any: An array of values; a chunk is returned if any of the items match. Each value is checked for an exact substring match against the metadata values of each existing chunk. If both match_all and match_any are provided, the match_any condition is used.
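To make the field conditions above concrete, here is a minimal Python sketch that assembles a `must` filter like the one in the request example. The helper names are hypothetical. The `gte`/`lte` keys inside the date range are an assumption carried over from the `range` example, since this page does not list them, and converting timezone-aware values to UTC before dropping the offset is a choice made for this sketch only.

```python
from datetime import datetime, timezone
from typing import Any


def iso_without_timezone(moment: datetime) -> str:
    """Format a datetime as ISO 8601 date and time with no timezone suffix."""
    if moment.tzinfo is not None:
        # Sketch choice: normalise to UTC, then drop the offset.
        moment = moment.astimezone(timezone.utc).replace(tzinfo=None)
    return moment.isoformat(timespec="seconds")


def metadata_field(key: str) -> str:
    """Address a key inside the chunk's metadata object."""
    return f"metadata.{key}"


# ASSUMPTION: date_range accepts gte/lte bounds, mirroring the range example.
date_range = {
    "gte": iso_without_timezone(datetime(2021, 1, 1)),
    "lte": iso_without_timezone(datetime(2021, 12, 31, 23, 59, 59)),
}

filters: dict[str, Any] = {
    "must": [
        {"field": "tag_set", "match_all": ["tag1", "tag2"]},
        {"field": "num_value", "range": {"gte": 10, "lte": 25}},
        {"field": metadata_field("key"), "match_any": ["value"]},
    ]
}
```

The resulting `filters` dictionary can be serialised with `json.dumps` and sent as the `filters` member of the request body.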
filters.must_not: None of these field conditions can match for the chunk to be included in the result set.
  field: The name of the field to filter on. Commonly used fields are timestamp, link, tag_set, location, num_value, group_ids, and group_tracking_ids. The field value is checked for an exact substring match against the metadata values of each existing chunk. To access fields inside the metadata that you provide with the chunk, prefix the field name with "metadata.".
  boolean: A true or false value to match. This only works for boolean fields.
  date_range: A JSON object used to filter chunks by a range of dates. It uses the time_stamp field on chunks in your dataset. Dates must be given as ISO 8601 combined date and time without timezone.
  match_all: An array of values; the condition matches only if all of the items match. Each value is checked for an exact substring match against the metadata values of each existing chunk. If both match_all and match_any are provided, the match_any condition is used.
  match_any: An array of values; the condition matches if any of the items match. Each value is checked for an exact substring match against the metadata values of each existing chunk. If both match_all and match_any are provided, the match_any condition is used.
filters.should: Only one of these field conditions has to match for the chunk to be included in the result set.
  field: The name of the field to filter on. Commonly used fields are timestamp, link, tag_set, location, num_value, group_ids, and group_tracking_ids. The field value is checked for an exact substring match against the metadata values of each existing chunk. To access fields inside the metadata that you provide with the chunk, prefix the field name with "metadata.".
  boolean: A true or false value to match. This only works for boolean fields.
  date_range: A JSON object used to filter chunks by a range of dates. It uses the time_stamp field on chunks in your dataset. Dates must be given as ISO 8601 combined date and time without timezone.
  match_all: An array of values; the condition matches only if all of the items match. Each value is checked for an exact substring match against the metadata values of each existing chunk. If both match_all and match_any are provided, the match_any condition is used.
  match_any: An array of values; the condition matches if any of the items match. Each value is checked for an exact substring match against the metadata values of each existing chunk. If both match_all and match_any are provided, the match_any condition is used.
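How the three clauses combine can be illustrated with a small local model. This is only a sketch of the semantics described above, not Trieve's implementation: `matches` and `chunk_passes` are hypothetical names, values are compared with plain substring checks on stringified fields, only match_any and match_all are modelled, and treating an absent clause as satisfied is an assumption.

```python
from typing import Any


def matches(condition: dict[str, Any], chunk: dict[str, Any]) -> bool:
    """Check one field condition against a chunk dict (match_any / match_all only)."""
    value = str(chunk.get(condition["field"], ""))
    if "match_any" in condition:
        # When both keys are present, match_any is the one that is used.
        return any(str(item) in value for item in condition["match_any"])
    if "match_all" in condition:
        return all(str(item) in value for item in condition["match_all"])
    return False


def chunk_passes(filters: dict[str, Any], chunk: dict[str, Any]) -> bool:
    """must: every condition matches; must_not: none match; should: at least one matches."""
    must = filters.get("must", [])
    must_not = filters.get("must_not", [])
    should = filters.get("should", [])
    return (
        all(matches(c, chunk) for c in must)
        and not any(matches(c, chunk) for c in must_not)
        and (not should or any(matches(c, chunk) for c in should))
    )


chunk = {"tag_set": "tag1,tag2", "link": "https://example.com/a"}
filters = {
    "must": [{"field": "tag_set", "match_all": ["tag1", "tag2"]}],
    "must_not": [{"field": "link", "match_any": ["/b"]}],
    "should": [{"field": "tag_set", "match_any": ["tag2", "tag9"]}],
}
passes = chunk_passes(filters, chunk)
```

Here the chunk carries both required tags, its link does not contain the excluded fragment, and one of the `should` values is present, so `passes` is true.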
offset_chunk_id: The id of the chunk to start the page from. If not specified, this defaults to the first chunk in the dataset sorted by id ascending.
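A full walk over a dataset using offset_chunk_id might look like the following Python sketch. It is written under assumptions: `scroll_all` and `fetch_page` are hypothetical names, `fetch_page` stands for whatever function sends the request body and returns the `chunks` list, and the page does not say whether the offset chunk itself is included in the page. The loop therefore skips ids it has already yielded, so it works either way, and it requires a page size of at least 2 so that an inclusive offset still makes progress.

```python
from typing import Any, Callable, Iterator, Optional

FetchPage = Callable[[dict[str, Any]], list[dict[str, Any]]]


def scroll_all(fetch_page: FetchPage, page_size: int = 100) -> Iterator[dict[str, Any]]:
    """Yield every chunk by walking id-ordered pages with offset_chunk_id."""
    if page_size < 2:
        # With an inclusive offset, a page of 1 would only return the offset chunk again.
        raise ValueError("page_size must be at least 2 for this sketch")
    seen: set[str] = set()
    offset: Optional[str] = None
    while True:
        body: dict[str, Any] = {"page_size": page_size}
        if offset is not None:
            body["offset_chunk_id"] = offset
        page = fetch_page(body)
        fresh = [chunk for chunk in page if chunk["id"] not in seen]
        if not fresh:
            return  # nothing new came back, so the dataset is exhausted
        for chunk in fresh:
            seen.add(chunk["id"])
            yield chunk
        offset = page[-1]["id"]  # pages are sorted by id ascending


def fake_fetch(body: dict[str, Any]) -> list[dict[str, Any]]:
    """In-memory stand-in for the HTTP call, using an inclusive offset."""
    ids = ["a", "b", "c", "d", "e"]
    start = body.get("offset_chunk_id")
    index = ids.index(start) if start is not None else 0
    return [{"id": chunk_id} for chunk_id in ids[index:index + body["page_size"]]]


all_ids = [chunk["id"] for chunk in scroll_all(fake_fetch, page_size=2)]
```

The `seen` set grows with the dataset; for very large datasets, keeping only the ids of the previous page would be enough to detect the overlap.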
page_size: The number of chunks to fetch. This can be used to fetch more than 10 chunks at a time. Must be greater than 0.
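sort_by.field: Field to sort by. This has to be a numeric field with a Qdrant Range index on it, i.e. num_value and timestamp.
sort_by.direction: Sort direction. Available options: desc, asc.
sort_by.prefetch_amount: How many results to pull in before the sort. Must be greater than 0.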
chunks[].created_at: Timestamp of the creation of the chunk.
chunks[].dataset_id: ID of the dataset which the chunk belongs to.
chunks[].id: Unique identifier of the chunk, an auto-generated uuid created by Trieve.
chunks[].updated_at: Timestamp of the last update of the chunk.
chunks[].weight: Weight of the chunk, can be any float. Used as a multiplier on a chunk's relevance score for ranking purposes.
chunks[].chunk_html: HTML content of the chunk, can also be an arbitrary string which is not HTML.
chunks[].image_urls: Image URLs of the chunk, can be any list of strings. Used for image search and RAG.
chunks[].link: Link to the chunk, should be a URL.
chunks[].metadata: Metadata of the chunk, can be any JSON object.
chunks[].num_value: Numeric value of the chunk, can be any float. Can represent the most relevant numeric value of the chunk, such as a price, quantity in stock, rating, etc.
chunks[].tag_set: Tag set of the chunk, can be any list of strings. Used for tag-filtered searches.
chunks[].time_stamp: Timestamp of the chunk, can be any timestamp. Specified by the user.
chunks[].tracking_id: Tracking ID of the chunk, can be any string, determined by the user. Tracking IDs are unique identifiers for chunks within a dataset. They are designed to match the unique identifier of the chunk in the user's system.
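The response fields above can be read into a typed structure. The following Python sketch parses the example response from this page; the `Chunk` dataclass and `parse_chunk` are hypothetical names, the split between required and optional fields is an assumption, and the timestamp format is inferred from the example value "2021-01-01 00:00:00.000" rather than from a stated specification.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Optional

# ASSUMPTION: timestamps follow the shape of the example, "2021-01-01 00:00:00.000".
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


@dataclass
class Chunk:
    id: str
    dataset_id: str
    created_at: datetime
    updated_at: datetime
    weight: float
    chunk_html: Optional[str] = None
    link: Optional[str] = None
    tracking_id: Optional[str] = None
    metadata: dict[str, Any] = field(default_factory=dict)


def parse_chunk(raw: dict[str, Any]) -> Chunk:
    """Convert one item of the "chunks" array into a Chunk."""
    return Chunk(
        id=raw["id"],
        dataset_id=raw["dataset_id"],
        created_at=datetime.strptime(raw["created_at"], TIMESTAMP_FORMAT),
        updated_at=datetime.strptime(raw["updated_at"], TIMESTAMP_FORMAT),
        weight=float(raw["weight"]),
        chunk_html=raw.get("chunk_html"),
        link=raw.get("link"),
        tracking_id=raw.get("tracking_id"),
        metadata=raw.get("metadata") or {},
    )


response_text = """
{"chunks": [{
  "chunk_html": "<p>Hello, world!</p>",
  "created_at": "2021-01-01 00:00:00.000",
  "dataset_id": "e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3",
  "id": "e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3",
  "link": "",
  "metadata": {"key": "value"},
  "tracking_id": "e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3",
  "updated_at": "2021-01-01 00:00:00.000",
  "weight": 0.5
}]}
"""
chunks = [parse_chunk(item) for item in json.loads(response_text)["chunks"]]
```

Fields not modelled here, such as tag_set and time_stamp, can be added to the dataclass in the same way once their wire format is confirmed.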