Learn how to upload your files to Trieve
Use `base64url` encoding for the `base64_file` field.

- `base64_file`: To allow users to pass metadata with their file uploads, we require the `base64url` encoding specifically. Convert `+` to `-`, `/` to `_`, and remove the ending `=` if present.
- `file_name`: The name of the file being uploaded, including the extension. This will become the name of the resulting group.
- `group_tracking_id`: This field lets you assign an arbitrary ID to the group, which helps with coordination with your database system. You can search for this group using this ID.
- `link`, `tag_set`, and `time_stamp`: These fields are indexed to enable fast filtering of groups based on these attributes.
- `target_splits_per_chunk`: An optional field that specifies the number of splits you want per chunk. If not specified, the default of 20 is used.
- `metadata`: This field lets you include arbitrary metadata with the group, in the form of a JSON object.
- `pdf2md_options`: This lets you use a vision LLM to convert the files to Markdown.
  - `use_pdf2md_ocr`: If true, the file will be converted to Markdown using a vision LLM. You can test `pdf2md` performance at pdf2md.trieve.ai.

Prefer the `link`, `tag_set`, and `time_stamp` fields for filtering, as there are dedicated indexes for these. The `metadata` field has an index built for match queries but is not optimized for range queries.

Send the `TR-Dataset` header with your dataset ID and the `Authorization` header with your API key.

Converting a file with `pdf2md` better preserves document context, readability, and structure, and is especially useful for documents with complex layouts (tables, lists, code blocks, etc.).
After the uploaded file is converted into structured Markdown, chunks are created based on the semantic structure of the document, which improves semantic coherence.
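
The `base64url` conversion required for `base64_file` can be sketched in a few lines of Python. This is a minimal illustration, not Trieve's own client code; the helper name `to_base64url` is hypothetical. It relies on the standard library's `base64.urlsafe_b64encode`, which already substitutes `-` for `+` and `_` for `/`, so only the trailing `=` padding needs to be removed.

```python
import base64


def to_base64url(data: bytes) -> str:
    """Encode bytes as base64url with the trailing '=' padding removed."""
    # urlsafe_b64encode uses '-' instead of '+' and '_' instead of '/'.
    encoded = base64.urlsafe_b64encode(data).decode("ascii")
    # Drop the ending '=' padding, if present.
    return encoded.rstrip("=")


if __name__ == "__main__":
    # Standard base64 of these two bytes is "+/8=", so both the character
    # substitution and the padding removal are exercised.
    print(to_base64url(b"\xfb\xff"))
```

In practice you would pass the raw bytes of the file, for example the result of `open(path, "rb").read()`, and place the returned string in the `base64_file` field.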
The `file_name` field is used to specify the name of the resulting group. Once uploaded, documents can be queried using the `file_name` field, allowing you to retrieve and perform operations on all chunks created from the file.
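
To show how the fields and headers described on this page fit together, here is a hedged sketch that assembles an upload request with Python's standard library. Several details are assumptions and should be checked against the Trieve API reference: the endpoint URL in `UPLOAD_URL`, the JSON value shapes (for example `tag_set` as a list of strings), and the helper name `build_upload_request`, which is hypothetical. The request is only built here, not sent.

```python
import base64
import json
import urllib.request

# Assumed endpoint; confirm the actual upload route in the Trieve API reference.
UPLOAD_URL = "https://api.trieve.ai/api/file"


def build_upload_request(file_bytes, file_name, dataset_id, api_key, **optional_fields):
    """Build (but do not send) a POST request carrying a base64url-encoded file."""
    # base64url: '+' -> '-', '/' -> '_', trailing '=' removed.
    encoded = base64.urlsafe_b64encode(file_bytes).decode("ascii").rstrip("=")
    payload = {"base64_file": encoded, "file_name": file_name, **optional_fields}
    return urllib.request.Request(
        UPLOAD_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "TR-Dataset": dataset_id,      # your dataset ID
            "Authorization": api_key,      # your API key
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder credentials and illustrative optional fields.
request = build_upload_request(
    b"# Hello\n",
    "hello.md",
    "my-dataset-id",
    "my-api-key",
    group_tracking_id="doc-123",
    tag_set=["docs"],
    pdf2md_options={"use_pdf2md_ocr": False},
)
# To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the payload and headers before any network call is made.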