POST /api/chunk/generate
curl --request POST \
  --url https://api.trieve.ai/api/chunk/generate \
  --header 'Authorization: <api-key>' \
  --header 'Content-Type: application/json' \
  --header 'TR-Dataset: <tr-dataset>' \
  --data '{
  "chunk_ids": [
    "d290f1ee-6c54-4b01-90e6-d701748f0851"
  ],
  "prev_messages": [
    {
      "content": "How do I setup RAG with Trieve?",
      "role": "user"
    }
  ],
  "prompt": "Respond to the instruction and include the doc numbers that you used in square brackets at the end of the sentences that you used the docs for:",
  "stream_response": true
}'
"<string>"

Authorizations

Authorization
string
header
required

Headers

TR-Dataset
string
required

The dataset id or tracking_id to use for the request. We assume you intend to use an id if the value is a valid UUID.
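The id-vs-tracking_id rule above can be mirrored client-side: a TR-Dataset value that parses as a valid UUID is treated as a dataset id, anything else as a tracking_id. A small sketch (the helper name is illustrative, not part of the API):

```python
import uuid

def classify_tr_dataset(value: str) -> str:
    """Return 'id' if the value is a valid UUID, else 'tracking_id'."""
    try:
        uuid.UUID(value)
        return "id"
    except ValueError:
        return "tracking_id"
```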

Body

application/json
JSON request payload to perform RAG over the specified chunks.
chunk_ids
string[]
required

The ids of the chunks to be retrieved and injected into the context window for RAG.

prev_messages
object[]
required

The previous messages to be placed into the chat history. There must be at least one previous message.

audio_input
string | null

Audio input to be used in the chat. This will be used to generate the audio tokens for the model. The default is None.

context_options
object

Context options to use for the completion. If not specified, all options will default to false.

frequency_penalty
number | null

Frequency penalty is a number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. Default is 0.7.

highlight_results
boolean | null

Set highlight_results to false for a slight latency improvement (1-10ms). If not specified, this defaults to true. When enabled, <mark><b> tags are added to the chunk_html of the chunks to highlight matching splits.
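When highlighting is left on, chunk_html comes back with <mark><b>…</b></mark> wrappers around matching splits. If you want the raw text client-side, the tags can be stripped; this regex-based helper is an illustrative sketch, not an official utility:

```python
import re

# Matches the opening and closing <mark> and <b> tags Trieve inserts.
HIGHLIGHT_RE = re.compile(r"</?(?:mark|b)>")

def strip_highlights(chunk_html: str) -> str:
    """Remove highlight markup from a chunk_html string."""
    return HIGHLIGHT_RE.sub("", chunk_html)
```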

image_config
object

Configuration for sending images to the LLM.

image_urls
string[] | null

Image URLs to be used in the chat. These will be used to generate the image tokens for the model. The default is None.

max_tokens
integer | null

The maximum number of tokens to generate in the chat completion. Default is None.

Required range: x > 0
presence_penalty
number | null

Presence penalty is a number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. Default is 0.7.

prompt
string | null

Prompt will be used to tell the model what to generate in the next message in the chat. The default is 'Respond to the previous instruction and include the doc numbers that you used in square brackets at the end of the sentences that you used the docs for:'. You can also specify an empty string to leave the final message alone such that your user's final message can be used as the prompt. See docs.trieve.ai or contact us for more information.

stop_tokens
string[] | null

Stop tokens are up to 4 sequences where the API will stop generating further tokens. Default is None.

stream_response
boolean | null

Whether or not to stream the response. If this is set to true or not included, the response will be a stream. If this is set to false, the response will be a normal JSON response. Default is true.
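When stream_response is true (the default), the completion arrives as a stream of text chunks rather than a single JSON string. A sketch of collecting such a stream, assuming the `requests` library (whose `iter_content` yields raw byte chunks):

```python
def collect_stream(chunks) -> str:
    """Concatenate an iterable of byte/str chunks into the full completion."""
    parts = []
    for chunk in chunks:
        if isinstance(chunk, bytes):
            chunk = chunk.decode("utf-8")
        parts.append(chunk)
    return "".join(parts)

# Against a live request it would be used roughly like:
# resp = requests.post(url, headers=headers, json=payload, stream=True)
# completion = collect_stream(resp.iter_content(chunk_size=None))
```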

temperature
number | null

What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. Default is 0.5.

user_id
string | null

User ID is the id of the user who is making the request. This is used to track user interactions with the RAG results.

Response

200
text/plain
A JSON response of a string containing the LLM's generated inference. Returned when not streaming.

The response is of type string.