curl --request POST \
  --url https://api.trieve.ai/api/file \
  --header 'Authorization: <api-key>' \
  --header 'Content-Type: application/json' \
  --header 'TR-Dataset: <tr-dataset>' \
  --data '{
  "base64_file": "<base64_encoded_file>",
  "create_chunks": true,
  "description": "This is an example file",
  "file_name": "example.pdf",
  "link": "https://example.com",
  "metadata": {
    "key1": "value1",
    "key2": "value2"
  },
  "split_delimiters": [
    ",",
    ".",
    "\n"
  ],
  "tag_set": [
    "tag1",
    "tag2"
  ],
  "target_splits_per_chunk": 20,
  "time_stamp": "2021-01-01 00:00:00.000Z",
  "use_pdf2md_ocr": false
}'
{
  "file_metadata": {
    "created_at": "2021-01-01 00:00:00.000",
    "dataset_id": "e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3",
    "file_name": "file.txt",
    "id": "e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3",
    "link": "https://trieve.ai",
    "metadata": {
      "key": "value"
    },
    "size": 1000,
    "tag_set": "tag1,tag2",
    "time_stamp": "2021-01-01 00:00:00.000",
    "updated_at": "2021-01-01 00:00:00.000"
  }
}
Upload a file to the S3 bucket attached to your dataset. You can choose between two chunking strategies: a naive strategy, where the text is extracted with Apache Tika and split into segments with a target number of segments per chunk, or a vision LLM strategy, where the file is converted to markdown and chunks are created per page. The file must be encoded as base64url, not standard base64. The authenticated user must be an admin or owner of the dataset’s organization to upload a file.
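Because the base64url requirement is easy to miss, here is a minimal Python sketch of encoding a file and assembling a request body. The helper names `encode_file_base64url` and `build_upload_payload` are hypothetical, the field names are copied from the example request, and which fields are required (and whether the server expects `=` padding) is not stated on this page, so treat those choices as assumptions.

```python
import base64
import json


def encode_file_base64url(data: bytes) -> str:
    """Encode raw bytes with the URL-safe base64 alphabet ('-' and '_' replace '+' and '/')."""
    return base64.urlsafe_b64encode(data).decode("ascii")


def build_upload_payload(file_name: str, data: bytes, use_pdf2md_ocr: bool = False) -> dict:
    """Assemble a request body using field names from the example request.

    Assumption: this subset of fields is sufficient; the page does not say
    which fields are mandatory.
    """
    return {
        "base64_file": encode_file_base64url(data),
        "file_name": file_name,
        "create_chunks": True,
        "target_splits_per_chunk": 20,
        "use_pdf2md_ocr": use_pdf2md_ocr,
    }


# Placeholder bytes stand in for a real file read with open(path, "rb").read().
payload = build_upload_payload("example.pdf", b"\xfb\xff sample bytes")
body = json.dumps(payload)  # string to send as the POST body
```

Standard `base64.b64encode` would emit `+` and `/` for the same input, which is the mismatch the description warns about.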
The dataset id or tracking_id to use for the request, sent in the TR-Dataset header. If the value is a valid UUID, it is treated as an id.
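The id-versus-tracking_id rule can be previewed client-side. The sketch below is an approximation only: `looks_like_dataset_id` and `build_headers` are hypothetical helpers, and Python's `uuid.UUID` accepts a few forms (such as 32 hex digits without hyphens) that the server's own UUID check may or may not accept.

```python
import uuid


def looks_like_dataset_id(value: str) -> bool:
    """Return True if value parses as a UUID, i.e. it would likely be read as an id."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def build_headers(api_key: str, dataset: str) -> dict:
    """Build the three headers shown in the example request."""
    return {
        "Authorization": api_key,
        "Content-Type": "application/json",
        "TR-Dataset": dataset,  # either a dataset id or a tracking_id
    }


headers = build_headers("<api-key>", "e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3")
```

A tracking_id that happens to be formatted as a UUID would presumably be read as an id, so it may be safer to avoid UUID-shaped tracking_ids.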
JSON request payload to upload a file
The body is of type object.
Confirmation that the file is uploading
The response is of type object.
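As a rough illustration of reading the confirmation object, the following Python sketch parses a response shaped like the example above. The helper `parse_file_metadata` is hypothetical, and the handling of `tag_set` as a comma-separated string is inferred from the sample response only, not from a documented schema.

```python
import json

# Trimmed copy of the example response shown on this page.
sample_response = """
{
  "file_metadata": {
    "dataset_id": "e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3",
    "file_name": "file.txt",
    "id": "e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3",
    "size": 1000,
    "tag_set": "tag1,tag2"
  }
}
"""


def parse_file_metadata(body: str) -> dict:
    """Extract a few fields from the response, splitting tag_set into a list."""
    meta = json.loads(body)["file_metadata"]
    # Assumption: tag_set may be missing or null, so fall back to an empty string.
    tags = [tag for tag in (meta.get("tag_set") or "").split(",") if tag]
    return {"id": meta["id"], "file_name": meta["file_name"], "tags": tags}


parsed = parse_file_metadata(sample_response)
```

Note that the request sends `tag_set` as an array while the sample response returns it as a single string, so client code may need to handle both shapes.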