POST
/
api
/
file

Authorizations

Authorization
string
headerrequired

Headers

TR-Dataset
string
required

The dataset id or tracking_id to use for the request. We assume you intend to use an id if the value is a valid uuid.

Body

application/json
base64_file
string
required

Base64 encoded file. This is the standard base64url encoding.

file_name
string
required

Name of the file being uploaded, including the extension.

create_chunks
boolean | null

Create chunks is a boolean which determines whether or not to create chunks from the file. If false, you can manually chunk the file and send the chunks to the create_chunk endpoint with the file_id to associate chunks with the file. Meant mostly for advanced users.

description
string | null

Description is an optional convience field so you do not have to remember what the file contains or is about. It will be included on the group resulting from the file which will hold its chunk.

group_tracking_id
string | null

Group tracking id is an optional field which allows you to specify the tracking id of the group that is created from the file. Chunks created will be created with the tracking id of group_tracking_id|<index of chunk>

link
string | null

Link to the file. This can also be any string. This can be used to filter when searching for the file's resulting chunks. The link value will not affect embedding creation.

metadata
any | null

Metadata is a JSON object which can be used to filter chunks. This is useful for when you want to filter chunks by arbitrary metadata. Unlike with tag filtering, there is a performance hit for filtering on metadata. Will be passed down to the file's chunks.

rebalance_chunks
boolean | null

Rebalance chunks is an optional field which allows you to specify whether or not to rebalance the chunks created from the file. If not specified, the default true is used. If true, Trieve will evenly distribute remainder splits across chunks such that 66 splits with a target_splits_per_chunk of 20 will result in 3 chunks with 22 splits each.

split_delimiters
string[] | null

Split delimiters is an optional field which allows you to specify the delimiters to use when splitting the file before chunking the text. If not specified, the default [.!?\n] are used to split into sentences. However, you may want to use spaces or other delimiters.

tag_set
string[] | null

Tag set is a comma separated list of tags which will be passed down to the chunks made from the file. Tags are used to filter chunks when searching. HNSW indices are created for each tag such that there is no performance loss when filtering on them.

target_splits_per_chunk
integer | null

Target splits per chunk. This is an optional field which allows you to specify the number of splits you want per chunk. If not specified, the default 20 is used. However, you may want to use a different number.

Required range: x > 0
time_stamp
string | null

Time stamp should be an ISO 8601 combined date and time without timezone. Time_stamp is used for time window filtering and recency-biasing search results. Will be passed down to the file's chunks.

use_pdf2md_ocr
boolean | null

Parameter to use pdf2md_ocr. If true, the file will be converted to markdown using gpt-4o. Default is false.

Response

200 - application/json
file_metadata
object
required