POST /api/chunk_group/group_oriented_search

Authorizations

Authorization
string
header
required

Headers

TR-Dataset
string
required

The dataset id or tracking_id to use for the request. If the value is a valid UUID, it is treated as an id.

X-API-Version
enum<string>

The API version to use for this request. Defaults to V2 for orgs created after July 12, 2024 and V1 otherwise.

Available options: V1, V2
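A minimal sketch of how the method, path, and headers above fit together, using only the Python standard library. The host in `BASE_URL` and the helper name `build_request` are illustrative assumptions, and the API key is passed through to `Authorization` unchanged because this page does not specify a value format. The request is built but not sent.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute the base URL of your deployment.
BASE_URL = "https://api.example.com"
SEARCH_PATH = "/api/chunk_group/group_oriented_search"


def build_request(api_key, dataset, body, api_version=None):
    """Build (but do not send) a POST request for group_oriented_search."""
    headers = {
        "Authorization": api_key,   # required
        "TR-Dataset": dataset,      # required: dataset id or tracking_id
        "Content-Type": "application/json",
    }
    # X-API-Version is optional; when omitted the server picks its default.
    if api_version is not None:
        if api_version not in ("V1", "V2"):
            raise ValueError("X-API-Version must be 'V1' or 'V2'")
        headers["X-API-Version"] = api_version
    return urllib.request.Request(
        url=BASE_URL + SEARCH_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without network access.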

Body

application/json
query
required

Query is the search query. This can be any string. The query will be used to create an embedding vector and/or SPLADE vector which will be used to find the result set. You can either provide one query, or multiple with weights. Multi-query only works with Semantic Search and is not compatible with cross encoder re-ranking or highlights.

search_type
enum<string>
required
Available options: fulltext, semantic, hybrid, bm25
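The two required body fields can be assembled as in the sketch below. The helper name `minimal_body` is hypothetical, and only the single-string form of `query` is shown; the multi-query form with weights is not covered because its exact shape is not given on this page.

```python
# The four search_type values listed above.
SEARCH_TYPES = ("fulltext", "semantic", "hybrid", "bm25")


def minimal_body(query, search_type):
    """Return a request body containing only the required fields."""
    if not isinstance(query, str):
        raise TypeError("this sketch only handles a single string query")
    if search_type not in SEARCH_TYPES:
        raise ValueError("search_type must be one of: " + ", ".join(SEARCH_TYPES))
    return {"query": query, "search_type": search_type}
```

Optional fields such as `filters` or `page` can then be added to the returned dict before it is serialized.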
filters
object

ChunkFilter is a JSON object used to filter chunks by arbitrary metadata. Unlike tag filtering, filtering on metadata incurs a performance hit.

get_total_pages
boolean | null

Get total page count for the query accounting for the applied filters. Defaults to false, but can be set to true when the latency penalty is acceptable (typically 50-200ms).

group_size
integer | null

Group_size is the number of chunks to fetch for each group. The default is 3. If a group has fewer than group_size chunks, all of its chunks are returned. If group_size is set to a large number, we recommend setting slim_chunks to true. This omits the content and chunk_html of the chunks and reduces the time spent on content download and serialization.

Required range: x > 0
highlight_options
object

Highlight Options lets you specify different methods to highlight the chunks in the result set. If not specified, this defaults to the score of the chunks.

page
integer | null

Page of group results to fetch. Page is 1-indexed.

Required range: x > 0
page_size
integer | null

Page size is the number of group results to fetch. The default is 10.

Required range: x > 0
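The range rules for `page`, `page_size`, and `group_size` can be checked client-side before sending, as sketched below. The function names are hypothetical. The defaults of 10 groups per page and 3 chunks per group come from the field descriptions above, which puts the upper bound at 30 chunks per page when neither field is set.

```python
def paging_fields(page=None, page_size=None, group_size=None):
    """Validate the paging fields and return only those that were set."""
    fields = {}
    for name, value in (("page", page), ("page_size", page_size), ("group_size", group_size)):
        if value is None:
            continue  # omitted fields fall back to the server defaults
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(name + " must be an integer greater than 0")
        fields[name] = value
    return fields


def max_chunks_per_page(page_size=None, group_size=None):
    """Upper bound on chunks in one page: groups per page times chunks per group."""
    groups = 10 if page_size is None else page_size    # documented default
    per_group = 3 if group_size is None else group_size  # documented default
    return groups * per_group
```

For example, `page_size=5` with `group_size=20` can return up to 100 chunks in one response, which is the kind of request where `slim_chunks` is worth considering.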
remove_stop_words
boolean | null

If true, stop words (specified in server/src/stop-words.txt in the git repo) will be removed. Queries that are entirely stop words will be preserved.
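The rule described above, including the exception for queries made up entirely of stop words, can be illustrated with the sketch below. It is not the server's implementation: the stop-word set is a tiny hypothetical sample rather than the contents of stop-words.txt, and splitting on whitespace is an assumption.

```python
# Assumption: a small sample list for illustration only.
SAMPLE_STOP_WORDS = {"the", "a", "of", "is", "to"}


def remove_stop_words(query, stop_words=SAMPLE_STOP_WORDS):
    """Drop stop words, but keep the query unchanged if nothing would remain."""
    kept = [w for w in query.split() if w.lower() not in stop_words]
    return " ".join(kept) if kept else query
```

With the sample list, "the history of rome" becomes "history rome", while "to the" is left as-is because every word in it is a stop word.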

score_threshold
number | null

Set score_threshold to a float to filter out chunks with a score below the threshold. This threshold applies before weight and bias modifications. If not specified, this defaults to 0.0.

slim_chunks
boolean | null

Set slim_chunks to true to avoid returning the content and chunk_html of the chunks. This reduces the amount of data sent over the wire and improves latency (typically by 10-50ms). Default is false.

sort_options
object

Sort Options lets you specify different methods to rerank the chunks in the result set. If not specified, this defaults to the score of the chunks.

typo_options
object

Typo Options lets you specify different methods to correct typos in the query. If not specified, typos will not be corrected.

use_quote_negated_terms
boolean | null

If true, quoted words and words prefixed with - will be parsed from the query and used as required and negated words, respectively. Default is false.
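To make the parsing rule concrete, the sketch below extracts quoted terms and `-` prefixed terms from a query string. It is an approximation for illustration only; the server's actual tokenization may differ, and the function name is hypothetical.

```python
import re


def parse_quote_negated(query):
    """Return (required, negated) terms found in the query string."""
    # Terms wrapped in double quotes are treated as required.
    required = re.findall(r'"([^"]+)"', query)
    # Remove quoted spans so a dash inside quotes is not read as a negation.
    remainder = re.sub(r'"[^"]+"', " ", query)
    # Whitespace-separated tokens starting with "-" are treated as negated.
    negated = [t[1:] for t in remainder.split() if t.startswith("-") and len(t) > 1]
    return required, negated
```

For the query `"solar panel" cost -residential`, this yields `solar panel` as a required term and `residential` as a negated term.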

user_id
string | null

The user_id is the id of the user who is making the request. This is used to track user interactions with the search results.

Response

200 - application/json
id
string
required
results
object[]
required
total_pages
integer
required
corrected_query
string | null
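A 200 response can be parsed as sketched below. The dataclass and function names are hypothetical, and the sample payload in the usage note is invented for illustration. Items in `results` are left as plain dicts because their structure is not described on this page.

```python
import json
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class GroupSearchResponse:
    id: str
    results: List[Any]
    total_pages: int
    corrected_query: Optional[str] = None  # nullable field


def parse_response(raw):
    """Parse a JSON response body and check that the required fields exist."""
    data = json.loads(raw)
    missing = [k for k in ("id", "results", "total_pages") if k not in data]
    if missing:
        raise ValueError("response is missing required fields: " + ", ".join(missing))
    return GroupSearchResponse(
        id=data["id"],
        results=data["results"],
        total_pages=data["total_pages"],
        corrected_query=data.get("corrected_query"),
    )
```

Calling `parse_response('{"id": "abc", "results": [], "total_pages": 0}')` returns an object whose `corrected_query` is `None`.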