# Update Chunk

> Update a chunk. The request fails if you try to change the chunk's tracking_id to one that another chunk already uses. The authenticated user or API key must have an admin or owner role for the specified dataset's organization.
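
As a minimal sketch of how such a request might be assembled, the Python below builds (but does not send) a `PUT /api/chunk` request using only the standard library. The helper name `build_update_chunk_request` and the placeholder API key and dataset id are assumptions made for illustration and are not part of any Trieve SDK; the `Authorization` and `TR-Dataset` header names and the payload fields come from the specification on this page.

```python
import json
import urllib.request

API_BASE = "https://api.trieve.ai"  # production server listed in the spec


def build_update_chunk_request(api_key, dataset_id, payload):
    """Build a PUT /api/chunk request without sending it.

    Hypothetical helper for illustration; not part of any Trieve SDK.
    """
    if not (payload.get("chunk_id") or payload.get("tracking_id")):
        raise ValueError("provide chunk_id or tracking_id to identify the chunk")
    # Drop None values so those fields are simply omitted from the body;
    # per the spec, omitted fields keep their existing values.
    body = {key: value for key, value in payload.items() if value is not None}
    return urllib.request.Request(
        url=f"{API_BASE}/api/chunk",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": api_key,
            "TR-Dataset": dataset_id,
            "Content-Type": "application/json",
        },
    )


request = build_update_chunk_request(
    api_key="YOUR_API_KEY",  # placeholder
    dataset_id="YOUR_DATASET_ID",  # placeholder
    payload={
        "chunk_id": "d290f1ee-6c54-4b01-90e6-d701748f0851",
        "link": "https://example.com",
        "weight": 0.5,
        "tag_set": None,  # left out of the body, so the existing tag_set is kept
    },
)
```

Passing `request` to `urllib.request.urlopen` would send it. According to the specification, a `204` status indicates the update succeeded, while a `400` (raised by `urllib` as `HTTPError`) most likely indicates a conflicting tracking_id.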



## OpenAPI

````yaml put /api/chunk
openapi: 3.0.3
info:
  title: Trieve API
  description: >-
    Trieve OpenAPI Specification. This document describes all of the operations
    available through the Trieve API.
  contact:
    name: Trieve Team
    url: https://trieve.ai
    email: developers@trieve.ai
  license:
    name: BSL
    url: https://github.com/devflowinc/trieve/blob/main/LICENSE.txt
  version: 0.13.0
servers:
  - url: https://api.trieve.ai
    description: Production server
  - url: http://localhost:8090
    description: Local development server
security: []
tags:
  - name: Invitation
    description: Invitation endpoint. Exists to invite users to an organization.
  - name: Auth
    description: Authentication endpoint. Serves to register and authenticate users.
  - name: User
    description: User endpoint. Enables you to modify user roles and information.
  - name: Organization
    description: >-
      Organization endpoint. Enables you to modify organization roles and
      information.
  - name: Dataset
    description: >-
      Dataset endpoint. Datasets belong to organizations and hold configuration
      information for both client and server. Datasets contain chunks and chunk
      groups.
  - name: Chunk
    description: >-
      Chunk endpoint. Think of chunks as individual searchable units of
      information. The majority of your integration will likely be with the
      Chunk endpoint.
  - name: Chunk Group
    description: >-
      Chunk groups endpoint. Think of a chunk_group as a bookmark folder within
      the dataset.
  - name: Crawl
    description: Crawl endpoint. Used to create and manage crawls for datasets.
  - name: File
    description: >-
      File endpoint. When files are uploaded, they are stored in S3 and broken
      up into chunks with text extraction from Apache Tika. You can upload files
      of pretty much any type up to 1GB in size. See chunking algorithm details
      at `docs.trieve.ai` for more information on how chunking works. Improved
      default chunking is on our roadmap.
  - name: Events
    description: >-
      Notifications endpoint. Files are uploaded asynchronously and events are
      sent to the user when the upload is complete.
  - name: Topic
    description: >-
      Topic chat endpoint. Think of topics as the storage system for gen-ai chat
      memory. Gen AI messages belong to topics.
  - name: Message
    description: >-
      Message chat endpoint. Messages are units belonging to a topic in the
      context of a chat with an LLM. There are system, user, and assistant
      messages.
  - name: Stripe
    description: >-
      Stripe endpoint. Used for the managed SaaS version of this app. Eventually
      this will become a micro-service. Reach out to the team using contact info
      found at `docs.trieve.ai` for more information.
  - name: Health
    description: Health check endpoint. Used to check if the server is up and running.
  - name: Metrics
    description: Metrics endpoint. Used to get information for monitoring.
  - name: Analytics
    description: Analytics endpoint. Used to get information for search and RAG analytics.
  - name: Experiment
    description: Experiment endpoint. Used to create and manage experiments.
paths:
  /api/chunk:
    put:
      tags:
        - Chunk
      summary: Update Chunk
      description: >-
        Update a chunk. The request fails if you try to change the chunk's
        tracking_id to one that another chunk already uses. The authenticated
        user or API key must have an admin or owner role for the specified
        dataset's organization.
      operationId: update_chunk
      parameters:
        - name: TR-Dataset
          in: header
          description: >-
            The dataset id or tracking_id to use for the request. We assume you
            intend to use an id if the value is a valid uuid.
          required: true
          schema:
            type: string
            format: uuid
      requestBody:
        description: JSON request payload to update a chunk (chunk)
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/UpdateChunkReqPayload'
        required: true
      responses:
        '204':
          description: No Content response confirming that the chunk was updated as requested
        '400':
          description: >-
            Service error related to updating the chunk, likely due to a
            conflicting tracking_id
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponseBody'
      security:
        - ApiKey:
            - admin
components:
  schemas:
    UpdateChunkReqPayload:
      type: object
      properties:
        chunk_html:
          type: string
          description: >-
            HTML content of the chunk you want to update. This can also be
            plaintext. The innerText of the HTML will be used to create the
            embedding vector. HTML is accepted for convenience, since some
            applications have users submit HTML content. If no
            chunk_html is provided, the existing chunk_html will be used.
          nullable: true
        chunk_id:
          type: string
          format: uuid
          description: >-
            Id of the chunk you want to update. You can provide either the
            chunk_id or the tracking_id. If both are provided, the chunk_id will
            be used.
          nullable: true
        convert_html_to_text:
          type: boolean
          description: >-
            Convert HTML to raw text before processing to avoid adding noise to
            the vector embeddings. By default this is true. If you are using
            HTML content that you want to be included in the vector embeddings,
            set this to false.
          nullable: true
        fulltext_boost:
          allOf:
            - $ref: '#/components/schemas/FullTextBoost'
          nullable: true
        group_ids:
          type: array
          items:
            type: string
            format: uuid
          description: >-
            Group ids are the ids of the groups that the chunk should be placed
            into. This is useful for when you want to update a chunk and add it
            to a group or multiple groups in one request.
          nullable: true
        group_tracking_ids:
          type: array
          items:
            type: string
          description: >-
            Group tracking_ids are the tracking_ids of the groups that the chunk
            should be placed into. This is useful for when you want to update a
            chunk and add it to a group or multiple groups in one request.
          nullable: true
        image_urls:
          type: array
          items:
            type: string
          description: >-
            Image urls are a list of urls to images that are associated with the
            chunk. This is useful for when you want to associate images with a
            chunk. If no image_urls are provided, the existing image_urls will
            be used.
          nullable: true
        link:
          type: string
          description: >-
            Link of the chunk you want to update. This can also be any string.
            Frequently, this is a link to the source of the chunk. The link
            value will not affect the embedding creation. If no link is
            provided, the existing link will be used.
          nullable: true
        location:
          allOf:
            - $ref: '#/components/schemas/GeoInfo'
          nullable: true
        metadata:
          description: >-
            The metadata is a JSON object which can be used to filter chunks.
            This is useful for when you want to filter chunks by arbitrary
            metadata. Unlike with tag filtering, there is a performance hit for
            filtering on metadata. If no metadata is provided, the existing
            metadata will be used.
          nullable: true
        num_value:
          type: number
          format: double
          description: >-
            Num value is an arbitrary numerical value that can be used to filter
            chunks. This is useful for when you want to filter chunks by
            numerical value. If no num_value is provided, the existing num_value
            will be used.
          nullable: true
        semantic_boost:
          allOf:
            - $ref: '#/components/schemas/SemanticBoost'
          nullable: true
        tag_set:
          type: array
          items:
            type: string
          description: >-
            Tag set is a list of tags. This can be used to filter chunks by tag.
            Unlike with metadata filtering, HNSW indices will exist for each tag
            such that there is not a performance hit for filtering on them. If
            no tag_set is provided, the existing tag_set will be used.
          nullable: true
        time_stamp:
          type: string
          description: >-
            Time_stamp should be an ISO 8601 combined date and time without
            timezone. It is used for time window filtering and recency-biasing
            search results. If no time_stamp is provided, the existing
            time_stamp will be used.
          nullable: true
        tracking_id:
          type: string
          description: >-
            Tracking_id of the chunk you want to update. This is required to
            match an existing chunk.
          nullable: true
        weight:
          type: number
          format: double
          description: >-
            Weight is a float which can be used to bias search results. This is
            useful for when you want to bias search results for a chunk. The
            magnitude only matters relative to other chunks in the chunk's
            dataset. If no weight is provided, the existing weight will be
            used.
          nullable: true
      example:
        chunk_html: <p>Some HTML content</p>
        chunk_id: d290f1ee-6c54-4b01-90e6-d701748f0851
        group_ids:
          - d290f1ee-6c54-4b01-90e6-d701748f0851
        link: https://example.com
        metadata:
          key1: value1
          key2: value2
        time_stamp: '2021-01-01 00:00:00.000'
        weight: 0.5
    ErrorResponseBody:
      type: object
      required:
        - message
      properties:
        message:
          type: string
      example:
        message: Bad Request
    FullTextBoost:
      type: object
      description: >-
        Boost the presence of certain tokens for fulltext (SPLADE) and keyword
        (BM25) search. For example, boost title phrases to prioritize title
        matches, or make sure that the listing for AirBNB itself ranks higher
        than companies who make software for AirBNB hosts by boosting the
        in-document-frequency of the AirBNB token (AKA word) for its official
        listing. Conceptually, it multiplies the in-document-importance (the
        second value in each tuple of the SPLADE or BM25 sparse vector of the
        chunk_html innerText) by the boost factor for all tokens present in the
        boost phrase, like so: (token, in-document-importance) -> (token,
        in-document-importance*boost_factor).
      required:
        - phrase
        - boost_factor
      properties:
        boost_factor:
          type: number
          format: double
          description: >-
            Amount to multiplicatively increase the frequency of the tokens in
            the phrase by
        phrase:
          type: string
          description: The phrase to boost in the fulltext document frequency index
    GeoInfo:
      type: object
      description: Location that you want to use as the center of the search.
      required:
        - lat
        - lon
      properties:
        lat:
          $ref: '#/components/schemas/GeoTypes'
        lon:
          $ref: '#/components/schemas/GeoTypes'
    SemanticBoost:
      type: object
      description: >-
        Semantic boosting moves the dense vector of the chunk in the direction
        of the distance phrase for semantic search. For example, you can force
        a cluster by moving every chunk for a PDF closer to its title, or push
        a chunk with a chunk_html of "iphone" 25% closer to the term "flagship"
        by using the distance phrase "flagship" and a distance factor of 0.25.
        Conceptually, it draws a line (euclidean/L2 distance) between the
        vector for the innerText of the chunk_html and the distance_phrase,
        then moves the vector of the chunk_html distance_factor*L2Distance
        closer to or away from the distance_phrase point along that line.
      required:
        - phrase
        - distance_factor
      properties:
        distance_factor:
          type: number
          format: float
          description: >-
            Arbitrary float (positive or negative) specifying the multiplicative
            factor to apply before summing the phrase vector with the chunk_html
            embedding vector
        phrase:
          type: string
          description: >-
            Terms to embed in order to create the vector which is combined in a
            weighted sum with the chunk_html embedding vector
    GeoTypes:
      oneOf:
        - type: integer
          format: int64
        - type: number
          format: double
  securitySchemes:
    ApiKey:
      type: apiKey
      in: header
      name: Authorization

````
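
The `FullTextBoost` and `SemanticBoost` descriptions in the schema are conceptual. The Python sketch below works through that arithmetic on toy data. The function names, the whitespace tokenization, and the tiny vectors are assumptions made for illustration; they do not reflect how the server tokenizes or embeds text. The semantic example follows the "move along the line" reading of the `SemanticBoost` description, and the server's actual computation may differ.

```python
def apply_fulltext_boost(sparse_vector, phrase, boost_factor):
    """Multiply the importance of every token that appears in the phrase.

    sparse_vector maps token -> in-document importance. Whitespace
    tokenization is an assumption made for this sketch.
    """
    boosted_tokens = set(phrase.lower().split())
    return {
        token: importance * boost_factor if token in boosted_tokens else importance
        for token, importance in sparse_vector.items()
    }


def apply_semantic_boost(chunk_vector, phrase_vector, distance_factor):
    """Move the chunk vector along the line toward the phrase vector.

    A positive distance_factor moves it closer; a negative one moves it away.
    """
    return [
        c + distance_factor * (p - c)
        for c, p in zip(chunk_vector, phrase_vector)
    ]


# Toy sparse vector: only the "airbnb" token is boosted.
sparse = {"airbnb": 1.0, "host": 0.5, "software": 0.25}
boosted = apply_fulltext_boost(sparse, "AirBNB", 2.0)
# boosted == {"airbnb": 2.0, "host": 0.5, "software": 0.25}

# Toy dense vectors: move the chunk 25% of the way toward the phrase.
moved = apply_semantic_boost([0.0, 4.0], [8.0, 0.0], 0.25)
# moved == [2.0, 3.0]
```

With a `distance_factor` of 0.25, the chunk vector ends up a quarter of the way along the segment joining the two points, which matches the "25% closer" wording in the schema description.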