Searching with Trieve
Learn how to search over your data with Trieve
Overview
Trieve lets you search your data with low latency. Three search paradigms are available, each exposed through its own route: the search over chunks route, the search within groups route, and the search over groups route.
Different Search Paradigms
We offer three different search strategies for you to choose from:
- Search over chunks: This strategy allows you to search all of your chunks independently. This is useful when your chunks are independent and do not need to be grouped together.
- Search within groups: This strategy lets you constrain your results to within a selected group. This is useful for searching distinct groups within your dataset independently.
- Search over groups: This strategy allows you to search over the groups of chunks within your dataset. This returns the groups and the top chunks within each group that matched your query, providing better search quality for datasets with highly related chunks within groups.
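To make the search over groups strategy more concrete, here is a minimal in-memory sketch of group-oriented ranking. It is not Trieve's implementation: the tuple layout, the function name `search_over_groups`, and the choice to rank each group by its best-scoring chunk are all assumptions made for illustration.

```python
from collections import defaultdict


def search_over_groups(scored_chunks, groups_per_page=2, chunks_per_group=2):
    """Return the best groups, each with its top-scoring chunks.

    scored_chunks: iterable of (group_id, chunk_id, score) tuples, where a
    higher score means a better match. This layout is hypothetical.
    """
    by_group = defaultdict(list)
    for group_id, chunk_id, score in scored_chunks:
        by_group[group_id].append((score, chunk_id))

    ranked = []
    for group_id, members in by_group.items():
        members.sort(reverse=True)  # best-scoring chunk first
        best_score = members[0][0]
        ranked.append((best_score, group_id, members[:chunks_per_group]))

    # Assumption: a group is only as relevant as its best chunk.
    ranked.sort(key=lambda item: item[0], reverse=True)
    return [
        {"group_id": group_id, "chunks": [chunk_id for _, chunk_id in top]}
        for _, group_id, top in ranked[:groups_per_page]
    ]


hits = [
    ("faq", "c1", 0.91),
    ("faq", "c2", 0.40),
    ("blog", "c3", 0.75),
    ("blog", "c4", 0.72),
    ("blog", "c5", 0.10),
    ("docs", "c6", 0.30),
]
results = search_over_groups(hits)
```

With the sample data, the "faq" and "blog" groups come back, each with its two strongest chunks, while the weaker "docs" group falls off the first page.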
You can use the search UI at search.trieve.ai to A/B test which search method works best for you.
Important Parameters
- query: The user query that is embedded and searched against the dataset.
- search_type: Can be semantic, fulltext, or hybrid.
  - Semantic: Uses cosine distance to determine the most relevant results.
  - Fulltext: Uses a SPLADE model to find the most relevant results.
  - Hybrid: Uses a reranker model that pulls one page of results from both fulltext and semantic searches to find the most relevant results.
- page: The page of chunks to fetch. Pages are 1-indexed.
- page_size: Tunes the number of results that are returned.
- highlight_results: Enables subsentence highlighting of relevant portions of the text.
- slim_chunks: Excludes chunk_html from the returned results to reduce network bandwidth. Useful for large chunks.
- recency_bias: A value from 0-1 that tunes how much the recency of chunks (based on the timestamp field) affects the ranking.
- filters: Apply filters to get exactly the results you want.
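The parameters above can be collected into a single request body. The sketch below assembles one as a plain dictionary and checks the constraints this page states (allowed search types, 1-indexed pages, recency bias between 0 and 1). The helper name `build_search_payload` and every default value are assumptions for illustration, not Trieve's documented defaults.

```python
import json

VALID_SEARCH_TYPES = {"semantic", "fulltext", "hybrid"}


def build_search_payload(query, search_type="hybrid", page=1, page_size=10,
                         highlight_results=True, slim_chunks=False,
                         recency_bias=0.0, filters=None):
    """Build a search request body from the parameters described above.

    All default values here are hypothetical.
    """
    if search_type not in VALID_SEARCH_TYPES:
        raise ValueError(f"search_type must be one of {sorted(VALID_SEARCH_TYPES)}")
    if page < 1:
        raise ValueError("page is 1-indexed, so it must be >= 1")
    if not 0.0 <= recency_bias <= 1.0:
        raise ValueError("recency_bias must be between 0 and 1")

    payload = {
        "query": query,
        "search_type": search_type,
        "page": page,
        "page_size": page_size,
        "highlight_results": highlight_results,
        "slim_chunks": slim_chunks,
        "recency_bias": recency_bias,
    }
    if filters is not None:
        payload["filters"] = filters
    return payload


payload = build_search_payload("how do refunds work", search_type="semantic",
                               page_size=5, recency_bias=0.2)
body = json.dumps(payload)  # JSON string, ready to send as a request body
```

Validating before sending keeps mistakes such as requesting page 0 from turning into a round trip to the server.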
To optimize for the lowest latency, set highlight_results and get_total_pages to false and set slim_chunks to true. If you are willing to sacrifice some search quality for speed, use the fulltext search mode.
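As a sketch of the latency advice above, the helper below applies those settings on top of an existing request body. The function name `low_latency` and the `prefer_speed_over_quality` flag are hypothetical. Only the parameter names and values come from this page.

```python
def low_latency(payload, prefer_speed_over_quality=False):
    """Return a copy of a search request body tuned for low latency."""
    tuned = dict(payload)  # shallow copy so the caller's dict is left untouched
    tuned.update(
        highlight_results=False,
        get_total_pages=False,
        slim_chunks=True,
    )
    if prefer_speed_over_quality:
        # Trades some search quality for speed, as described above.
        tuned["search_type"] = "fulltext"
    return tuned


base = {"query": "reset password", "search_type": "hybrid", "highlight_results": True}
fast = low_latency(base)
fastest = low_latency(base, prefer_speed_over_quality=True)
```

Here `fast` keeps the hybrid search type but disables highlighting and page counting, while `fastest` also switches to fulltext search.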
Filtering
Filters let you restrict which chunks a search returns.
The filters are structured around three clauses:
- must: All filters within this clause must be matched to return the chunks.
  - Example: get chunks with both “CO” and “321” in their tag_set.
  - Example: get chunks with either “CO” or “321” in their tag_set.
- must_not: All filters in this clause must not be matched to return the chunks.
  - Example: get chunks with neither “CO” nor “321” in their tag_set.
  - Example: get chunks that either don’t have “CO” in their tag_set or don’t have “321” in their tag_set.
- should: Any of these conditions can be matched to return a chunk.
  - Example: get chunks that either have “CO” in their tag_set or “http://example.com” in their link.
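The clause semantics above can be sketched as a small in-memory evaluator. This is an illustration of the logic, not Trieve's implementation. The condition shape (`field` plus `match_any` or `match_all`) and the rule that a non-empty should clause requires at least one match are assumptions; check the API reference for the exact filter schema.

```python
def condition_matches(chunk, condition):
    """Check one condition against a chunk (hypothetical condition shape)."""
    value = chunk.get(condition["field"])
    values = set(value) if isinstance(value, (list, tuple, set)) else {value}
    if "match_all" in condition:
        return set(condition["match_all"]) <= values  # every listed value present
    return bool(set(condition["match_any"]) & values)  # at least one value present


def chunk_passes(chunk, filters):
    """Apply the must, must_not, and should clauses to a single chunk."""
    if not all(condition_matches(chunk, c) for c in filters.get("must", [])):
        return False
    if any(condition_matches(chunk, c) for c in filters.get("must_not", [])):
        return False
    should = filters.get("should", [])
    if should and not any(condition_matches(chunk, c) for c in should):
        return False
    return True


# The five cases described in the list above.
both_tags = {"must": [{"field": "tag_set", "match_all": ["CO", "321"]}]}
either_tag = {"must": [{"field": "tag_set", "match_any": ["CO", "321"]}]}
neither_tag = {"must_not": [{"field": "tag_set", "match_any": ["CO", "321"]}]}
not_both_tags = {"must_not": [{"field": "tag_set", "match_all": ["CO", "321"]}]}
tag_or_link = {"should": [
    {"field": "tag_set", "match_any": ["CO"]},
    {"field": "link", "match_any": ["http://example.com"]},
]}

chunks = {
    "a": {"tag_set": ["CO", "321"], "link": "http://example.com"},
    "b": {"tag_set": ["CO"], "link": "http://other.example"},
    "c": {"tag_set": ["NY"], "link": "http://example.com"},
    "d": {"tag_set": ["NY"], "link": "http://other.example"},
}


def matching_ids(filters):
    return sorted(cid for cid, chunk in chunks.items() if chunk_passes(chunk, filters))
```

For example, `matching_ids(both_tags)` keeps only chunk "a", while `matching_ids(not_both_tags)` keeps every chunk except "a".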