# Create Dataset

> The dataset is created in the organization specified by the TR-Organization header. The authenticated user must be an owner of that organization to create a dataset.


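As a rough illustration of the request described above, the following Python sketch builds a `POST /api/dataset` call using only the standard library. The path, the `TR-Organization` header, and the payload fields come from the specification on this page. The helper names `build_create_dataset_request` and `create_dataset` are hypothetical, and sending the API key in an `Authorization` header is an assumption, since the `ApiKey` security scheme definition is not shown here.

```python
import json
import urllib.request

API_BASE = "https://api.trieve.ai"  # production server listed in the spec


def build_create_dataset_request(api_key, organization_id, dataset_name,
                                 tracking_id=None, server_configuration=None):
    """Build (but do not send) a POST /api/dataset request."""
    payload = {"dataset_name": dataset_name}  # the only required field
    if tracking_id is not None:
        payload["tracking_id"] = tracking_id
    if server_configuration is not None:
        payload["server_configuration"] = server_configuration

    return urllib.request.Request(
        url=f"{API_BASE}/api/dataset",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Organization that will own the new dataset (a UUID string).
            "TR-Organization": organization_id,
            # Assumption: the ApiKey scheme reads the key from Authorization.
            "Authorization": api_key,
        },
    )


def create_dataset(request):
    """Send the request and return the decoded Dataset object."""
    # urlopen raises urllib.error.HTTPError for a 400 response.
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `create_dataset(build_create_dataset_request(...))` with an owner-level key should return a `Dataset` object whose `id`, `organization_id`, and `server_configuration` fields match the response schema below. Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.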

## OpenAPI

````yaml post /api/dataset
openapi: 3.0.3
info:
  title: Trieve API
  description: >-
    Trieve OpenAPI Specification. This document describes all of the operations
    available through the Trieve API.
  contact:
    name: Trieve Team
    url: https://trieve.ai
    email: developers@trieve.ai
  license:
    name: BSL
    url: https://github.com/devflowinc/trieve/blob/main/LICENSE.txt
  version: 0.13.0
servers:
  - url: https://api.trieve.ai
    description: Production server
  - url: http://localhost:8090
    description: Local development server
security: []
tags:
  - name: Invitation
    description: Invitation endpoint. Exists to invite users to an organization.
  - name: Auth
    description: Authentication endpoint. Serves to register and authenticate users.
  - name: User
    description: User endpoint. Enables you to modify user roles and information.
  - name: Organization
    description: >-
      Organization endpoint. Enables you to modify organization roles and
      information.
  - name: Dataset
    description: >-
      Dataset endpoint. Datasets belong to organizations and hold configuration
      information for both client and server. Datasets contain chunks and chunk
      groups.
  - name: Chunk
    description: >-
      Chunk endpoint. Think of chunks as individual searchable units of
      information. The majority of your integration will likely be with the
      Chunk endpoint.
  - name: Chunk Group
    description: >-
      Chunk groups endpoint. Think of a chunk_group as a bookmark folder within
      the dataset.
  - name: Crawl
    description: Crawl endpoint. Used to create and manage crawls for datasets.
  - name: File
    description: >-
      File endpoint. When files are uploaded, they are stored in S3 and broken
      up into chunks with text extraction from Apache Tika. You can upload files
      of pretty much any type up to 1GB in size. See chunking algorithm details
      at `docs.trieve.ai` for more information on how chunking works. Improved
      default chunking is on our roadmap.
  - name: Events
    description: >-
      Notifications endpoint. Files are uploaded asynchronously and events are
      sent to the user when the upload is complete.
  - name: Topic
    description: >-
      Topic chat endpoint. Think of topics as the storage system for gen-ai chat
      memory. Gen AI messages belong to topics.
  - name: Message
    description: >-
      Message chat endpoint. Messages are units belonging to a topic in the
      context of a chat with an LLM. There are system, user, and assistant
      messages.
  - name: Stripe
    description: >-
      Stripe endpoint. Used for the managed SaaS version of this app. Eventually
      this will become a micro-service. Reach out to the team using contact info
      found at `docs.trieve.ai` for more information.
  - name: Health
    description: Health check endpoint. Used to check if the server is up and running.
  - name: Metrics
    description: Metrics endpoint. Used to get information for monitoring.
  - name: Analytics
    description: Analytics endpoint. Used to get information for search and RAG analytics.
  - name: Experiment
    description: Experiment endpoint. Used to create and manage experiments.
paths:
  /api/dataset:
    post:
      tags:
        - Dataset
      summary: Create Dataset
      description: >-
        The dataset is created in the organization specified by the
        TR-Organization header. The authenticated user must be an owner of
        that organization to create a dataset.
      operationId: create_dataset
      parameters:
        - name: TR-Organization
          in: header
          description: The organization id to use for the request
          required: true
          schema:
            type: string
            format: uuid
      requestBody:
        description: JSON request payload to create a new dataset
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CreateDatasetReqPayload'
        required: true
      responses:
        '200':
          description: Dataset created successfully
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Dataset'
        '400':
          description: Service error relating to creating the dataset
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponseBody'
      security:
        - ApiKey:
            - owner
components:
  schemas:
    CreateDatasetReqPayload:
      type: object
      required:
        - dataset_name
      properties:
        dataset_name:
          type: string
          description: Name of the dataset.
        server_configuration:
          allOf:
            - $ref: '#/components/schemas/DatasetConfigurationDTO'
          nullable: true
        tracking_id:
          type: string
          description: >-
            Optional tracking ID for the dataset. Can be used to track the
            dataset in external systems. Must be unique within the organization.
            Using a valid UUID as the value is strongly discouraged because it
            will not work with the TR-Dataset header.
          nullable: true
      example:
        dataset_name: My Dataset
        organization_id: 00000000-0000-0000-0000-000000000000
        server_configuration:
          AIMON_RERANKER_TASK_DEFINITION: >-
            Your task is to grade the relevance of context document(s) against
            the specified user query.
          BM25_AVG_LEN: 256
          BM25_B: 0.75
          BM25_ENABLED: true
          BM25_K: 0.75
          DISTANCE_METRIC: cosine
          EMBEDDING_BASE_URL: https://api.openai.com/v1
          EMBEDDING_MODEL_NAME: text-embedding-3-small
          EMBEDDING_QUERY_PREFIX: ''
          EMBEDDING_SIZE: 1536
          FREQUENCY_PENALTY: 0
          FULLTEXT_ENABLED: true
          INDEXED_ONLY: false
          LLM_BASE_URL: https://api.openai.com/v1
          LLM_DEFAULT_MODEL: gpt-3.5-turbo-1106
          LOCKED: false
          MAX_LIMIT: 10000
          MESSAGE_TO_QUERY_PROMPT: >+
            Write a 1-2 sentence semantic search query along the lines of a
            hypothetical response to: 

          N_RETRIEVALS_TO_INCLUDE: 8
          PRESENCE_PENALTY: 0
          QDRANT_ONLY: false
          RAG_PROMPT: >-
            Use the following retrieved documents to respond briefly and
            accurately:
          SEMANTIC_ENABLED: true
          STOP_TOKENS:
            - |+


            - |+

          SYSTEM_PROMPT: You are a helpful assistant
          TEMPERATURE: 0.5
          USE_MESSAGE_TO_QUERY_PROMPT: false
    Dataset:
      type: object
      required:
        - id
        - name
        - created_at
        - updated_at
        - organization_id
        - server_configuration
        - deleted
      properties:
        created_at:
          type: string
          format: date-time
          description: Timestamp of the creation of the dataset
        deleted:
          type: integer
          format: int32
          description: >-
            Flag to indicate if the dataset has been deleted. Deletes are
            handled async after the flag is set so as to avoid expensive search
            index compaction.
        id:
          type: string
          format: uuid
          description: >-
            Unique identifier of the dataset, auto-generated uuid created by
            Trieve
        name:
          type: string
          description: Name of the dataset
        organization_id:
          type: string
          format: uuid
          description: Unique identifier of the organization that owns the dataset
        server_configuration:
          description: Configuration of the dataset for RAG, embeddings, BM25, etc.
        tracking_id:
          type: string
          description: >-
            Tracking ID of the dataset, can be any string, determined by the
            user. Tracking IDs are unique identifiers for datasets within an
            organization. They are designed to match the unique identifier of
            the dataset in the user's system.
          nullable: true
        updated_at:
          type: string
          format: date-time
          description: Timestamp of the last update of the dataset
      example:
        created_at: '2021-01-01 00:00:00.000'
        id: e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3
        name: Trieve
        organization_id: e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3
        server_configuration:
          AIMON_RERANKER_TASK_DEFINITION: >-
            Your task is to grade the relevance of context document(s) against
            the specified user query.
          BM25_AVG_LEN: 256
          BM25_B: 0.75
          BM25_ENABLED: true
          BM25_K: 0.75
          DISTANCE_METRIC: cosine
          EMBEDDING_BASE_URL: https://embedding.trieve.ai
          EMBEDDING_MODEL_NAME: jina-base-en
          EMBEDDING_QUERY_PREFIX: ''
          EMBEDDING_SIZE: 768
          FREQUENCY_PENALTY: 0
          FULLTEXT_ENABLED: true
          INDEXED_ONLY: false
          LLM_BASE_URL: https://api.openai.com/v1
          LLM_DEFAULT_MODEL: gpt-4o
          LOCKED: false
          MAX_LIMIT: 10000
          MESSAGE_TO_QUERY_PROMPT: >+
            Write a 1-2 sentence semantic search query along the lines of a
            hypothetical response to: 

          N_RETRIEVALS_TO_INCLUDE: 8
          PRESENCE_PENALTY: 0
          QDRANT_ONLY: false
          RAG_PROMPT: >-
            Use the following retrieved documents to respond briefly and
            accurately:
          SEMANTIC_ENABLED: true
          STOP_TOKENS:
            - |+


            - |+

          SYSTEM_PROMPT: You are a helpful assistant
          TEMPERATURE: 0.5
          USE_MESSAGE_TO_QUERY_PROMPT: false
        tracking_id: foobar-dataset
        updated_at: '2021-01-01 00:00:00.000'
    ErrorResponseBody:
      type: object
      required:
        - message
      properties:
        message:
          type: string
      example:
        message: Bad Request
    DatasetConfigurationDTO:
      type: object
      description: Lets you specify the configuration for a dataset
      properties:
        AIMON_RERANKER_TASK_DEFINITION:
          type: string
          nullable: true
        BM25_AVG_LEN:
          type: number
          format: float
          description: The average length of the chunks in the index for BM25
          nullable: true
        BM25_B:
          type: number
          format: float
          description: The BM25 B parameter
          nullable: true
        BM25_ENABLED:
          type: boolean
          description: Whether to use BM25
          nullable: true
        BM25_K:
          type: number
          format: float
          description: The BM25 K parameter
          nullable: true
        DISABLE_ANALYTICS:
          type: boolean
          description: Whether to disable analytics
          nullable: true
        DISTANCE_METRIC:
          allOf:
            - $ref: '#/components/schemas/DistanceMetric'
          nullable: true
        EMBEDDING_BASE_URL:
          type: string
          description: The base URL for the embedding API
          nullable: true
        EMBEDDING_MODEL_NAME:
          type: string
          description: The name of the embedding model to use
          nullable: true
        EMBEDDING_QUERY_PREFIX:
          type: string
          description: The prefix to use for the embedding query
          nullable: true
        EMBEDDING_SIZE:
          type: integer
          description: The size of the embeddings
          nullable: true
          minimum: 0
        FREQUENCY_PENALTY:
          type: number
          format: double
          description: The frequency penalty to use
          nullable: true
        FULLTEXT_ENABLED:
          type: boolean
          description: Whether to use fulltext search
          nullable: true
        INDEXED_ONLY:
          type: boolean
          description: Whether to only use indexed chunks
          nullable: true
        LLM_API_VERSION:
          type: string
          description: The API version for the LLM API
          nullable: true
        LLM_BASE_URL:
          type: string
          description: The base URL for the LLM API
          nullable: true
        LLM_DEFAULT_MODEL:
          type: string
          description: The default model to use for the LLM
          nullable: true
        LOCKED:
          type: boolean
          description: Whether the dataset is locked to prevent changes or deletion
          nullable: true
        MAX_LIMIT:
          type: integer
          format: int64
          description: The maximum limit for the number of chunks for counting
          nullable: true
          minimum: 0
        MAX_TOKENS:
          type: integer
          format: int64
          description: The maximum number of tokens to use in LLM Response
          nullable: true
          minimum: 0
        MESSAGE_TO_QUERY_PROMPT:
          type: string
          description: The prompt to use for converting a message to a query
          nullable: true
        N_RETRIEVALS_TO_INCLUDE:
          type: integer
          description: The number of retrievals to include with the RAG model
          nullable: true
          minimum: 0
        PAGEFIND_ENABLED:
          type: boolean
          description: Whether to enable pagefind indexing
          nullable: true
        PRESENCE_PENALTY:
          type: number
          format: double
          description: The presence penalty to use
          nullable: true
        PUBLIC_DATASET:
          allOf:
            - $ref: '#/components/schemas/PublicDatasetOptions'
          nullable: true
        QDRANT_ONLY:
          type: boolean
          description: Whether or not to insert chunks into Postgres
          nullable: true
        RAG_PROMPT:
          type: string
          description: The prompt to use for the RAG model
          nullable: true
        RERANKER_BASE_URL:
          type: string
          description: The base URL for the reranker API
          nullable: true
        RERANKER_MODEL_NAME:
          type: string
          description: The model name for the Reranker API
          nullable: true
        SEMANTIC_ENABLED:
          type: boolean
          description: Whether to use semantic search
          nullable: true
        STOP_TOKENS:
          type: array
          items:
            type: string
          description: The stop tokens to use
          nullable: true
        SYSTEM_PROMPT:
          type: string
          description: The system prompt to use for the LLM
          nullable: true
        TEMPERATURE:
          type: number
          format: double
          description: The temperature to use
          nullable: true
        TOOL_CONFIGURATION:
          allOf:
            - $ref: '#/components/schemas/ToolConfiguration'
          nullable: true
        USE_MESSAGE_TO_QUERY_PROMPT:
          type: boolean
          description: Whether to use the message to query prompt
          nullable: true
      example:
        AIMON_RERANKER_TASK_DEFINITION: >-
          Your task is to grade the relevance of context document(s) against the
          specified user query.
        BM25_AVG_LEN: 256
        BM25_B: 0.75
        BM25_ENABLED: true
        BM25_K: 0.75
        DISTANCE_METRIC: cosine
        EMBEDDING_BASE_URL: https://embedding.trieve.ai
        EMBEDDING_MODEL_NAME: jina-base-en
        EMBEDDING_QUERY_PREFIX: ''
        EMBEDDING_SIZE: 768
        FREQUENCY_PENALTY: 0
        FULLTEXT_ENABLED: true
        INDEXED_ONLY: false
        LLM_BASE_URL: https://api.openai.com/v1
        LLM_DEFAULT_MODEL: gpt-4o
        LOCKED: false
        MAX_LIMIT: 10000
        MESSAGE_TO_QUERY_PROMPT: >+
          Write a 1-2 sentence semantic search query along the lines of a
          hypothetical response to: 

        N_RETRIEVALS_TO_INCLUDE: 8
        PRESENCE_PENALTY: 0
        QDRANT_ONLY: false
        RAG_PROMPT: >-
          Use the following retrieved documents to respond briefly and
          accurately:
        SEMANTIC_ENABLED: true
        STOP_TOKENS:
          - |+


          - |+

        SYSTEM_PROMPT: You are a helpful assistant
        TEMPERATURE: 0.5
        USE_MESSAGE_TO_QUERY_PROMPT: false
    DistanceMetric:
      type: string
      enum:
        - euclidean
        - cosine
        - manhattan
        - dot
    PublicDatasetOptions:
      type: object
      required:
        - enabled
      properties:
        enabled:
          type: boolean
        extra_params:
          allOf:
            - $ref: '#/components/schemas/PublicPageParameters'
          nullable: true
    ToolConfiguration:
      type: object
      properties:
        query_tool_options:
          allOf:
            - $ref: '#/components/schemas/QueryToolOptions'
          nullable: true
    PublicPageParameters:
      type: object
      properties:
        allowSwitchingModes:
          type: boolean
          nullable: true
        analytics:
          type: boolean
          nullable: true
        apiKey:
          type: string
          nullable: true
        baseUrl:
          type: string
          nullable: true
        brandColor:
          type: string
          nullable: true
        brandFontFamily:
          type: string
          nullable: true
        brandLogoImgSrcUrl:
          type: string
          nullable: true
        brandName:
          type: string
          nullable: true
        buttonTriggers:
          type: array
          items:
            $ref: '#/components/schemas/ButtonTrigger'
          nullable: true
        chat:
          type: boolean
          nullable: true
        creatorLinkedInUrl:
          type: string
          nullable: true
        creatorName:
          type: string
          nullable: true
        currencyPosition:
          type: string
          nullable: true
        datasetId:
          type: string
          format: uuid
          nullable: true
        debounceMs:
          type: integer
          format: int32
          nullable: true
        defaultAiQuestions:
          type: array
          items:
            $ref: '#/components/schemas/DefaultSearchQueryType'
          nullable: true
        defaultCurrency:
          type: string
          nullable: true
        defaultImageQuestion:
          type: string
          nullable: true
        defaultSearchMode:
          type: string
          nullable: true
        defaultSearchQueries:
          type: array
          items:
            $ref: '#/components/schemas/DefaultSearchQueryType'
          nullable: true
        defaultSearchQuery:
          type: string
          nullable: true
        floatingButtonPosition:
          type: string
          nullable: true
        floatingButtonVersion:
          type: string
          nullable: true
        floatingSearchIconPosition:
          type: string
          nullable: true
        followupQuestions:
          type: boolean
          nullable: true
        forBrandName:
          type: string
          nullable: true
        headingPrefix:
          type: string
          nullable: true
        heroPattern:
          allOf:
            - $ref: '#/components/schemas/HeroPattern'
          nullable: true
        hideDrawnText:
          type: boolean
          nullable: true
        imageStarterText:
          type: string
          nullable: true
        inline:
          type: boolean
          nullable: true
        inlineHeader:
          type: string
          nullable: true
        isTestMode:
          type: boolean
          nullable: true
        navLogoImgSrcUrl:
          type: string
          nullable: true
        notFilterToolCallOptions:
          allOf:
            - $ref: '#/components/schemas/NotFilterToolCallOptions'
          nullable: true
        numberOfSuggestions:
          type: integer
          nullable: true
          minimum: 0
        openGraphMetadata:
          allOf:
            - $ref: '#/components/schemas/OpenGraphMetadata'
          nullable: true
        openLinksInNewTab:
          type: boolean
          nullable: true
        placeholder:
          type: string
          nullable: true
        priceToolCallOptions:
          allOf:
            - $ref: '#/components/schemas/PriceToolCallOptions'
          nullable: true
        problemLink:
          type: string
          nullable: true
        relevanceToolCallOptions:
          allOf:
            - $ref: '#/components/schemas/RelevanceToolCallOptions'
          nullable: true
        responsive:
          type: boolean
          nullable: true
        searchBar:
          type: boolean
          nullable: true
        searchOptions:
          allOf:
            - $ref: '#/components/schemas/PublicPageSearchOptions'
          nullable: true
        searchPageProps:
          allOf:
            - $ref: '#/components/schemas/SearchPageProps'
          nullable: true
        searchToolCallOptions:
          allOf:
            - $ref: '#/components/schemas/SearchToolCallOptions'
          nullable: true
        showFloatingButton:
          type: boolean
          nullable: true
        showFloatingInput:
          type: boolean
          nullable: true
        showFloatingSearchIcon:
          type: boolean
          nullable: true
        showResultHighlights:
          type: boolean
          nullable: true
        singleProductOptions:
          allOf:
            - $ref: '#/components/schemas/SingleProductOptions'
          nullable: true
        suggestedQueries:
          type: boolean
          nullable: true
        tabMessages:
          type: array
          items:
            $ref: '#/components/schemas/PublicPageTabMessage'
          nullable: true
        tags:
          type: array
          items:
            $ref: '#/components/schemas/PublicPageTag'
          nullable: true
        theme:
          allOf:
            - $ref: '#/components/schemas/PublicPageTheme'
          nullable: true
        type:
          type: string
          nullable: true
        useGroupSearch:
          type: boolean
          nullable: true
        useLocal:
          type: boolean
          nullable: true
        usePagefind:
          type: boolean
          nullable: true
        videoLink:
          type: string
          nullable: true
        videoPosition:
          type: string
          nullable: true
        zIndex:
          type: integer
          format: int32
          nullable: true
    QueryToolOptions:
      type: object
      properties:
        max_price_option_description:
          type: string
          nullable: true
        min_price_option_description:
          type: string
          nullable: true
        price_filter_description:
          type: string
          nullable: true
        query_parameter_description:
          type: string
          nullable: true
        tool_description:
          type: string
          nullable: true
    ButtonTrigger:
      type: object
      required:
        - selector
        - mode
      properties:
        mode:
          type: string
        removeTriggers:
          type: boolean
          nullable: true
        selector:
          type: string
    DefaultSearchQueryType:
      oneOf:
        - type: string
        - $ref: '#/components/schemas/DefaultSearchQuery'
    HeroPattern:
      type: object
      properties:
        backgroundColor:
          type: string
          nullable: true
        foregroundColor:
          type: string
          nullable: true
        foregroundOpacity:
          type: number
          format: float
          nullable: true
        heroPatternName:
          type: string
          nullable: true
        heroPatternSvg:
          type: string
          nullable: true
    NotFilterToolCallOptions:
      type: object
      properties:
        toolDescription:
          type: string
          nullable: true
        userMessageTextPrefix:
          type: string
          nullable: true
    OpenGraphMetadata:
      type: object
      properties:
        description:
          type: string
          nullable: true
        image:
          type: string
          nullable: true
        title:
          type: string
          nullable: true
    PriceToolCallOptions:
      type: object
      required:
        - toolDescription
      properties:
        maxPriceDescription:
          type: string
          nullable: true
        minPriceDescription:
          type: string
          nullable: true
        toolDescription:
          type: string
    RelevanceToolCallOptions:
      type: object
      required:
        - toolDescription
      properties:
        highDescription:
          type: string
          nullable: true
        includeImages:
          type: boolean
          nullable: true
        lowDescription:
          type: string
          nullable: true
        mediumDescription:
          type: string
          nullable: true
        toolDescription:
          type: string
        userMessageTextPrefix:
          type: string
          nullable: true
    PublicPageSearchOptions:
      type: object
      properties:
        content_only:
          type: boolean
          description: >-
            Set content_only to true to return only the chunk_html of the
            chunks. This is useful when you want to reduce the amount of data
            over the wire for latency improvement (typically 10-50ms). Default
            is false.
          nullable: true
        filters:
          allOf:
            - $ref: '#/components/schemas/ChunkFilter'
          nullable: true
        get_total_pages:
          type: boolean
          description: >-
            Get total page count for the query accounting for the applied
            filters. Defaults to false, but can be set to true when the latency
            penalty is acceptable (typically 50-200ms).
          nullable: true
        page:
          type: integer
          format: int64
          description: Page of chunks to fetch. Page is 1-indexed.
          nullable: true
          minimum: 0
        page_size:
          type: integer
          format: int64
          description: >-
            Page size is the number of chunks to fetch. This can be used to
            fetch more than 10 chunks at a time.
          nullable: true
          minimum: 0
        remove_stop_words:
          type: boolean
          description: >-
            If true, stop words (specified in server/src/stop-words.txt in the
            git repo) will be removed. Queries that are entirely stop words will
            be preserved.
          nullable: true
        score_threshold:
          type: number
          format: float
          description: >-
            Set score_threshold to a float to filter out chunks with a score
            below the threshold for cosine distance metric. For Manhattan
            Distance, Euclidean Distance, and Dot Product, it will filter out
            scores above the threshold distance. This threshold applies before
            weight and bias modifications. If not specified, this defaults to no
            threshold. A threshold of 0 will default to no threshold.
          nullable: true
        scoring_options:
          allOf:
            - $ref: '#/components/schemas/ScoringOptions'
          nullable: true
        search_type:
          allOf:
            - $ref: '#/components/schemas/SearchMethod'
          nullable: true
        slim_chunks:
          type: boolean
          description: >-
            Set slim_chunks to true to avoid returning the content and
            chunk_html of the chunks. This is useful when you want to reduce
            the amount of data over the wire for latency improvement (typically
            10-50ms). Default is false.
          nullable: true
        sort_options:
          allOf:
            - $ref: '#/components/schemas/SortOptions'
          nullable: true
        typo_options:
          allOf:
            - $ref: '#/components/schemas/TypoOptions'
          nullable: true
        use_autocomplete:
          type: boolean
          description: Enables autocomplete on the search modal.
          nullable: true
        use_quote_negated_terms:
          type: boolean
          description: >-
            If true, quoted and - prefixed words will be parsed from the queries
            and used as required and negated words respectively. Default is
            false.
          nullable: true
        user_id:
          type: string
          description: >-
            User ID is the id of the user who is making the request. This is
            used to track user interactions with the search results.
          nullable: true
      example:
        filters:
          must:
            - field: num_value
              range:
                gt: 0
                gte: 0
                lt: 1
                lte: 1
          must_not:
            - field: metadata.key3
              match:
                - value5
                - value6
          should:
            - field: metadata.key1
              match:
                - value1
                - value2
        score_threshold: 0.5
        search_type: semantic
    SearchPageProps:
      type: object
      properties:
        display:
          type: boolean
          nullable: true
        filterSidebarProps:
          allOf:
            - $ref: '#/components/schemas/SidebarFilters'
          nullable: true
    SearchToolCallOptions:
      type: object
      properties:
        noSearchRagContext:
          type: string
          nullable: true
        toolDescription:
          type: string
          nullable: true
        userMessageTextPrefix:
          type: string
          nullable: true
    SingleProductOptions:
      type: object
      properties:
        enabled:
          type: boolean
          nullable: true
        groupTrackingId:
          type: string
          nullable: true
        pdpPrompt:
          type: string
          nullable: true
        productDescriptionHtml:
          type: string
          nullable: true
        productName:
          type: string
          nullable: true
        productPrimaryImageUrl:
          type: string
          nullable: true
        productQuestions:
          type: array
          items:
            $ref: '#/components/schemas/DefaultSearchQuery'
          nullable: true
        productTrackingId:
          type: string
          nullable: true
        recSearchQuery:
          type: string
          nullable: true
    PublicPageTabMessage:
      type: object
      required:
        - title
        - tabInnerHtml
        - showComponentCode
      properties:
        showComponentCode:
          type: boolean
        tabInnerHtml:
          type: string
        title:
          type: string
    PublicPageTag:
      type: object
      required:
        - tag
      properties:
        description:
          type: string
          nullable: true
        iconClassName:
          type: string
          nullable: true
        label:
          type: string
          nullable: true
        selected:
          type: boolean
          nullable: true
        tag:
          type: string
    PublicPageTheme:
      type: string
      enum:
        - light
        - dark
    DefaultSearchQuery:
      type: object
      properties:
        imageUrl:
          type: string
          nullable: true
        query:
          type: string
          nullable: true
    ChunkFilter:
      type: object
      description: >-
        ChunkFilter is a JSON object which can be used to filter chunks. This is
        useful for when you want to filter chunks by arbitrary metadata. Unlike
        with tag filtering, there is a performance hit for filtering on
        metadata.
      properties:
        must:
          type: array
          items:
            $ref: '#/components/schemas/ConditionType'
          description: >-
            All of these field conditions have to match for the chunk to be
            included in the result set.
          nullable: true
        must_not:
          type: array
          items:
            $ref: '#/components/schemas/ConditionType'
          description: >-
            None of these field conditions can match for the chunk to be
            included in the result set.
          nullable: true
        should:
          type: array
          items:
            $ref: '#/components/schemas/ConditionType'
          description: >-
            At least one of these field conditions has to match for the chunk to
            be included in the result set.
          nullable: true
      example:
        must:
          - field: tag_set
            match_all:
              - A
              - B
          - field: num_value
            range:
              gte: 10
              lte: 25
    ScoringOptions:
      type: object
      description: >-
        Scoring options provides ways to modify the sparse or dense vector
        created for the query in order to change how potential matches are
        scored. If not specified, this defaults to no modifications.
      properties:
        fulltext_boost:
          allOf:
            - $ref: '#/components/schemas/FullTextBoost'
          nullable: true
        semantic_boost:
          allOf:
            - $ref: '#/components/schemas/SemanticBoost'
          nullable: true
    SearchMethod:
      type: string
      enum:
        - fulltext
        - semantic
        - hybrid
        - bm25
    SortOptions:
      type: object
      description: >-
        Sort Options lets you specify different methods to rerank the chunks in
        the result set. If not specified, this defaults to the score of the
        chunks.
      properties:
        location_bias:
          allOf:
            - $ref: '#/components/schemas/GeoInfoWithBias'
          nullable: true
        mmr:
          allOf:
            - $ref: '#/components/schemas/MmrOptions'
          nullable: true
        recency_bias:
          type: number
          format: float
          description: >-
            Recency Bias lets you determine how much of an effect the recency of
            chunks will have on the search results. If not specified, this
            defaults to 0.0. We recommend setting this to 1.0 for a gentle
            reranking of the results, >3.0 for a strong reranking of the
            results.
          nullable: true
        sort_by:
          allOf:
            - $ref: '#/components/schemas/QdrantSortBy'
          nullable: true
        tag_weights:
          type: object
          description: >-
            Tag weights is a JSON object which can be used to boost the ranking
            of chunks with certain tags. This is useful for when you want to be
            able to bias towards chunks with a certain tag on the fly. The keys
            are the tag names and the values are the weights.
          additionalProperties:
            type: number
            format: float
          nullable: true
        use_weights:
          type: boolean
          description: >-
            Set use_weights to true to use the weights of the chunks in the
            result set in order to sort them. If not specified, this defaults to
            true.
          nullable: true
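      # Illustrative example only. The recency_bias of 1.0 follows the "gentle
      # reranking" recommendation in the description above; the tag name "A"
      # and its weight are hypothetical values, not taken from the Trieve docs.
      example:
        recency_bias: 1.0
        tag_weights:
          A: 1.5
        use_weights: true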
    TypoOptions:
      type: object
      description: >-
        Typo Options lets you specify different methods to correct typos in the
        query. If not specified, typos will not be corrected.
      properties:
        correct_typos:
          type: boolean
          description: >-
            Set correct_typos to true to correct typos in the query. If not
            specified, this defaults to false.
          nullable: true
        disable_on_word:
          type: array
          items:
            type: string
          description: >-
            Words that should not be corrected. If not specified, this defaults
            to an empty list.
          nullable: true
        one_typo_word_range:
          allOf:
            - $ref: '#/components/schemas/TypoRange'
          nullable: true
        prioritize_domain_specifc_words:
          type: boolean
          description: >-
            Auto-require non-English words present in the dataset to exist in
            each result's chunk_html text. If not specified, this defaults to
            true.
          nullable: true
        two_typo_word_range:
          allOf:
            - $ref: '#/components/schemas/TypoRange'
          nullable: true
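      # Illustrative example only. The one_typo_word_range values mirror the
      # TypoRange defaults stated in this spec (min 5, max 8); the
      # two_typo_word_range values and the word in disable_on_word are
      # hypothetical.
      example:
        correct_typos: true
        one_typo_word_range:
          min: 5
          max: 8
        two_typo_word_range:
          min: 9
          max: 12
        disable_on_word:
          - trieve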
    SidebarFilters:
      type: object
      properties:
        sections:
          type: array
          items:
            $ref: '#/components/schemas/FilterSidebarSection'
          nullable: true
    ConditionType:
      oneOf:
        - $ref: '#/components/schemas/FieldCondition'
        - $ref: '#/components/schemas/HasChunkIDCondition'
      description: >-
        Filters can be constructed using fields on the chunk objects, ids or
        tracking ids of chunks, or ids or tracking ids of groups.
    FullTextBoost:
      type: object
      description: >-
        Boost the presence of certain tokens for fulltext (SPLADE) and keyword
        (BM25) search. For example, boost title phrases to prioritize title
        matches, or make sure that the listing for AirBNB itself ranks higher
        than companies who make software for AirBNB hosts by boosting the
        in-document-frequency of the AirBNB token (AKA word) for its official
        listing. Conceptually, it multiplies the in-document-importance (the
        second value in the tuples of the SPLADE or BM25 sparse vector of the
        chunk_html innerText) for all tokens present in the boost phrase by the
        boost factor, like so: (token, in-document-importance) -> (token,
        in-document-importance*boost_factor).
      required:
        - phrase
        - boost_factor
      properties:
        boost_factor:
          type: number
          format: double
          description: >-
            Amount to multiplicatively increase the frequency of the tokens in
            the phrase by
        phrase:
          type: string
          description: The phrase to boost in the fulltext document frequency index
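      # Illustrative example only; the boost_factor value is an assumption.
      # Read against the description above, a token from the phrase with an
      # in-document-importance of 0.4 would conceptually become
      # (token, 0.4 * 2.0) = (token, 0.8).
      example:
        phrase: AirBNB
        boost_factor: 2.0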
    SemanticBoost:
      type: object
      description: >-
        Semantic boosting moves the dense vector of the chunk in the direction
        of the distance phrase for semantic search. For example, you can force a
        cluster by moving every chunk for a PDF closer to its title, or push a
        chunk with a chunk_html of "iphone" 25% closer to the term "flagship" by
        using the distance phrase "flagship" and a distance factor of 0.25.
        Conceptually, it draws a line (euclidean/L2 distance) between the vector
        for the innerText of the chunk_html and the distance_phrase, then moves
        the vector of the chunk_html distance_factor*L2Distance closer to or
        away from the distance_phrase point along the line between the two
        points.
      required:
        - phrase
        - distance_factor
      properties:
        distance_factor:
          type: number
          format: float
          description: >-
            Arbitrary float (positive or negative) specifying the multiplicative
            factor to apply before summing the phrase vector with the chunk_html
            embedding vector
        phrase:
          type: string
          description: >-
            Terms to embed in order to create the vector which is combined in a
            weighted sum with the chunk_html embedding vector
    GeoInfoWithBias:
      type: object
      description: >-
        Location bias lets you rank your results by distance from a location. If
        not specified, this has no effect. Bias allows you to determine how much
        of an effect the location of chunks will have on the search results. If
        not specified, this defaults to 0.0. We recommend setting this to 1.0
        for a gentle reranking of the results, >3.0 for a strong reranking of
        the results.
      required:
        - location
        - bias
      properties:
        bias:
          type: number
          format: double
          description: >-
            Bias lets you specify how much of an effect the location of chunks
            will have on the search results. If not specified, this defaults to
            0.0. We recommend setting this to 1.0 for a gentle reranking of the
            results, >3.0 for a strong reranking of the results.
        location:
          $ref: '#/components/schemas/GeoInfo'
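      # Illustrative example only. The coordinates are arbitrary placeholders;
      # the bias of 1.0 follows the "gentle reranking" recommendation above.
      example:
        bias: 1.0
        location:
          lat: 40.7128
          lon: -74.006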
    MmrOptions:
      type: object
      description: >-
        MMR Options lets you specify different methods to rerank the chunks in
        the result set using Maximal Marginal Relevance. If not specified, this
        defaults to the score of the chunks.
      required:
        - use_mmr
      properties:
        mmr_lambda:
          type: number
          format: float
          description: >-
            Set mmr_lambda to a value between 0.0 and 1.0 to control the
            tradeoff between relevance and diversity. Closer to 1.0 will give
            more diverse results, closer to 0.0 will give more relevant results.
            If not specified, this defaults to 0.5.
          nullable: true
        use_mmr:
          type: boolean
          description: >-
            Set use_mmr to true to use the Maximal Marginal Relevance algorithm
            to rerank the results.
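      # Illustrative example only; mmr_lambda is set to the default value
      # stated in the description above.
      example:
        use_mmr: true
        mmr_lambda: 0.5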
    QdrantSortBy:
      oneOf:
        - $ref: '#/components/schemas/SortByField'
        - $ref: '#/components/schemas/SortBySearchType'
      description: >-
        Sort by lets you specify a method to sort the results by. If not
        specified, this defaults to the score of the chunks. If specified, this
        can be any key in the chunk metadata. This key must be a numeric value
        within the payload.
    TypoRange:
      type: object
      description: >-
        The TypoRange struct specifies the range of word lengths, in characters,
        within which a query word will be corrected if it has a typo.
      required:
        - min
      properties:
        max:
          type: integer
          format: int32
          description: >-
            The maximum length, in characters, of a query word that will be
            corrected if it has a typo. If not specified, this defaults to 8.
          nullable: true
          minimum: 0
        min:
          type: integer
          format: int32
          description: >-
            The minimum length, in characters, of a query word that will be
            corrected if it has a typo. If not specified, this defaults to 5.
          minimum: 0
    FilterSidebarSection:
      type: object
      required:
        - key
        - filterKey
        - title
        - selectionType
        - filterType
        - options
      properties:
        filterKey:
          type: string
        filterType:
          type: string
        key:
          type: string
        options:
          type: array
          items:
            $ref: '#/components/schemas/TagProp'
        selectionType:
          type: string
        title:
          type: string
    FieldCondition:
      type: object
      description: >-
        FieldCondition is a JSON object which can be used to filter chunks by a
        field. This is useful for when you want to filter chunks by arbitrary
        metadata. To access fields inside of the metadata that you provide with
        the chunk, prefix the field name with `metadata.`.
      required:
        - field
      properties:
        boolean:
          type: boolean
          description: >-
            Boolean is a true or false value for a field. This only works for
            boolean fields. Specify this to match chunks whose field value is
            true or false.
          nullable: true
        date_range:
          allOf:
            - $ref: '#/components/schemas/DateRange'
          nullable: true
        field:
          type: string
          description: >-
            Field is the name of the field to filter on. Commonly used fields
            are `timestamp`, `link`, `tag_set`, `location`, `num_value`,
            `group_ids`, and `group_tracking_ids`. The field value will be used
            to check for an exact substring match on the metadata values for
            each existing chunk. This is useful for when you want to filter
            chunks by arbitrary metadata. To access fields inside of the
            metadata that you provide with the chunk, prefix the field name with
            `metadata.`.
        geo_bounding_box:
          allOf:
            - $ref: '#/components/schemas/LocationBoundingBox'
          nullable: true
        geo_polygon:
          allOf:
            - $ref: '#/components/schemas/LocationPolygon'
          nullable: true
        geo_radius:
          allOf:
            - $ref: '#/components/schemas/LocationRadius'
          nullable: true
        match_all:
          type: array
          items:
            $ref: '#/components/schemas/MatchCondition'
          description: >-
            Match all lets you pass in an array of values that will return
            results if all of the items match. The match value will be used to
            check for an exact substring match on the metadata values for each
            existing chunk. If both match_all and match_any are provided, the
            match_any condition will be used.
          nullable: true
        match_any:
          type: array
          items:
            $ref: '#/components/schemas/MatchCondition'
          description: >-
            Match any lets you pass in an array of values that will return
            results if any of the items match. The match value will be used to
            check for an exact substring match on the metadata values for each
            existing chunk. If both match_all and match_any are provided, the
            match_any condition will be used.
          nullable: true
        range:
          allOf:
            - $ref: '#/components/schemas/Range'
          nullable: true
      example:
        field: metadata.key1
        match_any:
          - value1
          - value2
        range:
          gt: 0
          gte: 0
          lt: 1
          lte: 1
    HasChunkIDCondition:
      type: object
      description: >-
        HasChunkIDCondition is a JSON object which can be used to filter chunks
        by their ids or tracking ids. This is useful for when you want to filter
        chunks by their ids or tracking ids.
      properties:
        ids:
          type: array
          items:
            type: string
            format: uuid
          description: >-
            Ids of the chunks to apply a match_any condition with. Only chunks
            with one of these ids will be returned.
          nullable: true
        tracking_ids:
          type: array
          items:
            type: string
          description: >-
            Tracking ids of the chunks to apply a match_any condition with. Only
            chunks with one of these tracking ids will be returned.
          nullable: true
    GeoInfo:
      type: object
      description: Location that you want to use as the center of the search.
      required:
        - lat
        - lon
      properties:
        lat:
          $ref: '#/components/schemas/GeoTypes'
        lon:
          $ref: '#/components/schemas/GeoTypes'
    SortByField:
      type: object
      required:
        - field
      properties:
        direction:
          allOf:
            - $ref: '#/components/schemas/SortOrder'
          nullable: true
        field:
          type: string
          description: >-
            Field to sort by. This has to be a numeric field with a Qdrant
            `Range` index on it, e.g. num_value and timestamp
        prefetch_amount:
          type: integer
          format: int64
          description: How many results to pull in before the sort
          nullable: true
          minimum: 0
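      # Illustrative example only. num_value is one of the fields named in the
      # description above; the prefetch_amount is a hypothetical value.
      example:
        field: num_value
        direction: desc
        prefetch_amount: 100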
    SortBySearchType:
      type: object
      required:
        - rerank_type
      properties:
        prefetch_amount:
          type: integer
          format: int64
          description: How many results to pull in before the rerank
          nullable: true
          minimum: 0
        rerank_query:
          type: string
          description: Query to use for prefetching; defaults to the search query
          nullable: true
        rerank_type:
          $ref: '#/components/schemas/ReRankOptions'
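      # Illustrative example only. cross_encoder is one of the ReRankOptions
      # enum values; the prefetch_amount and rerank_query are hypothetical.
      example:
        rerank_type: cross_encoder
        prefetch_amount: 100
        rerank_query: waterproof hiking boots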
    TagProp:
      type: object
      properties:
        description:
          type: string
          nullable: true
        label:
          type: string
          nullable: true
        range:
          allOf:
            - $ref: '#/components/schemas/RangeSliderConfig'
          nullable: true
        tag:
          type: string
          nullable: true
    DateRange:
      type: object
      description: >-
        DateRange is a JSON object which can be used to filter chunks by a range
        of dates. This leverages the time_stamp field on chunks in your dataset.
        You can specify this if you want values in a certain range. You must
        provide ISO 8601 combined date and time without timezone.
      properties:
        gt:
          type: string
          nullable: true
        gte:
          type: string
          nullable: true
        lt:
          type: string
          nullable: true
        lte:
          type: string
          nullable: true
      example:
        gt: '2021-01-01 00:00:00.000'
        gte: '2021-01-01 00:00:00.000'
        lt: '2021-01-01 00:00:00.000'
        lte: '2021-01-01 00:00:00.000'
    LocationBoundingBox:
      type: object
      required:
        - top_left
        - bottom_right
      properties:
        bottom_right:
          $ref: '#/components/schemas/GeoInfo'
        top_left:
          $ref: '#/components/schemas/GeoInfo'
    LocationPolygon:
      type: object
      required:
        - exterior
      properties:
        exterior:
          type: array
          items:
            $ref: '#/components/schemas/GeoInfo'
        interior:
          type: array
          items:
            type: array
            items:
              $ref: '#/components/schemas/GeoInfo'
          nullable: true
    LocationRadius:
      type: object
      required:
        - center
        - radius
      properties:
        center:
          $ref: '#/components/schemas/GeoInfo'
        radius:
          type: number
          format: double
    MatchCondition:
      oneOf:
        - type: string
        - type: integer
          format: int64
        - type: number
          format: double
    Range:
      type: object
      properties:
        gt:
          allOf:
            - $ref: '#/components/schemas/RangeCondition'
          nullable: true
        gte:
          allOf:
            - $ref: '#/components/schemas/RangeCondition'
          nullable: true
        lt:
          allOf:
            - $ref: '#/components/schemas/RangeCondition'
          nullable: true
        lte:
          allOf:
            - $ref: '#/components/schemas/RangeCondition'
          nullable: true
      example:
        gt: 0
        gte: 0
        lt: 1
        lte: 1
    GeoTypes:
      oneOf:
        - type: integer
          format: int64
        - type: number
          format: double
    SortOrder:
      type: string
      enum:
        - desc
        - asc
    ReRankOptions:
      type: string
      enum:
        - semantic
        - fulltext
        - bm25
        - cross_encoder
    RangeSliderConfig:
      type: object
      properties:
        max:
          type: number
          format: double
          nullable: true
        min:
          type: number
          format: double
          nullable: true
    RangeCondition:
      oneOf:
        - type: number
          format: double
        - type: integer
          format: int64
  securitySchemes:
    ApiKey:
      type: apiKey
      in: header
      name: Authorization

````