> ## Documentation Index
> Fetch the complete documentation index at: https://docs.trieve.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Scroll Chunks

> Get paginated chunks from your dataset with filters and custom sorting. If sort by is not specified, the results will sort by the id's of the chunks in ascending order. Sort by and offset_chunk_id cannot be used together; if you want to scroll with a sort by then you need to use a must_not filter with the ids you have already seen. There is a limit of 1000 id's in a must_not filter at a time.



## OpenAPI

````yaml post /api/chunks/scroll
openapi: 3.0.3
info:
  title: Trieve API
  description: >-
    Trieve OpenAPI Specification. This document describes all of the operations
    available through the Trieve API.
  contact:
    name: Trieve Team
    url: https://trieve.ai
    email: developers@trieve.ai
  license:
    name: BSL
    url: https://github.com/devflowinc/trieve/blob/main/LICENSE.txt
  version: 0.13.0
servers:
  - url: https://api.trieve.ai
    description: Production server
  - url: http://localhost:8090
    description: Local development server
security: []
tags:
  - name: Invitation
    description: Invitation endpoint. Exists to invite users to an organization.
  - name: Auth
    description: Authentication endpoint. Serves to register and authenticate users.
  - name: User
    description: User endpoint. Enables you to modify user roles and information.
  - name: Organization
    description: >-
      Organization endpoint. Enables you to modify organization roles and
      information.
  - name: Dataset
    description: >-
      Dataset endpoint. Datasets belong to organizations and hold configuration
      information for both client and server. Datasets contain chunks and chunk
      groups.
  - name: Chunk
    description: >-
      Chunk endpoint. Think of chunks as individual searchable units of
      information. The majority of your integration will likely be with the
      Chunk endpoint.
  - name: Chunk Group
    description: >-
      Chunk groups endpoint. Think of a chunk_group as a bookmark folder within
      the dataset.
  - name: Crawl
    description: Crawl endpoint. Used to create and manage crawls for datasets.
  - name: File
    description: >-
      File endpoint. When files are uploaded, they are stored in S3 and broken
      up into chunks with text extraction from Apache Tika. You can upload files
      of pretty much any type up to 1GB in size. See chunking algorithm details
      at `docs.trieve.ai` for more information on how chunking works. Improved
      default chunking is on our roadmap.
  - name: Events
    description: >-
      Notifications endpoint. Files are uploaded asynchronously and events are
      sent to the user when the upload is complete.
  - name: Topic
    description: >-
      Topic chat endpoint. Think of topics as the storage system for gen-ai chat
      memory. Gen AI messages belong to topics.
  - name: Message
    description: >-
      Message chat endpoint. Messages are units belonging to a topic in the
      context of a chat with a LLM. There are system, user, and assistant
      messages.
  - name: Stripe
    description: >-
      Stripe endpoint. Used for the managed SaaS version of this app. Eventually
      this will become a micro-service. Reach out to the team using contact info
      found at `docs.trieve.ai` for more information.
  - name: Health
    description: Health check endpoint. Used to check if the server is up and running.
  - name: Metrics
    description: Metrics endpoint. Used to get information for monitoring
  - name: Analytics
    description: Analytics endpoint. Used to get information for search and RAG analytics
  - name: Experiment
    description: Experiment endpoint. Used to create and manage experiments
paths:
  /api/chunks/scroll:
    post:
      tags:
        - Chunk
      summary: Scroll Chunks
      description: >-
        Get paginated chunks from your dataset with filters and custom sorting.
        If sort by is not specified, the results will sort by the id's of the
        chunks in ascending order. Sort by and offset_chunk_id cannot be used
        together; if you want to scroll with a sort by then you need to use a
        must_not filter with the ids you have already seen. There is a limit of
        1000 id's in a must_not filter at a time.
      operationId: scroll_dataset_chunks
      parameters:
        - name: TR-Dataset
          in: header
          description: >-
            The dataset id or tracking_id to use for the request. We assume you
            intend to use an id if the value is a valid uuid.
          required: true
          schema:
            type: string
            format: uuid
      requestBody:
        description: JSON request payload to scroll through chunks (chunks)
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ScrollChunksReqPayload'
        required: true
      responses:
        '200':
          description: >-
            Number of chunks equivalent to page_size starting from
            offset_chunk_id
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ScrollChunksResponseBody'
        '400':
          description: Service error relating to scrolling chunks
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponseBody'
      security:
        - ApiKey:
            - readonly
components:
  schemas:
    ScrollChunksReqPayload:
      type: object
      properties:
        filters:
          allOf:
            - $ref: '#/components/schemas/ChunkFilter'
          nullable: true
        offset_chunk_id:
          type: string
          format: uuid
          description: >-
            Offset chunk id is the id of the chunk to start the page from. If
            not specified, this defaults to the first chunk in the dataset
            sorted by id ascending.
          nullable: true
        page_size:
          type: integer
          format: int64
          description: >-
            Page size is the number of chunks to fetch. This can be used to
            fetch more than 10 chunks at a time.
          nullable: true
          minimum: 0
        sort_by:
          allOf:
            - $ref: '#/components/schemas/SortByField'
          nullable: true
    ScrollChunksResponseBody:
      type: object
      required:
        - chunks
      properties:
        chunks:
          type: array
          items:
            $ref: '#/components/schemas/ChunkMetadata'
    ErrorResponseBody:
      type: object
      required:
        - message
      properties:
        message:
          type: string
      example:
        message: Bad Request
    ChunkFilter:
      type: object
      description: >-
        ChunkFilter is a JSON object which can be used to filter chunks. This is
        useful for when you want to filter chunks by arbitrary metadata. Unlike
        with tag filtering, there is a performance hit for filtering on
        metadata.
      properties:
        must:
          type: array
          items:
            $ref: '#/components/schemas/ConditionType'
          description: >-
            All of these field conditions have to match for the chunk to be
            included in the result set.
          nullable: true
        must_not:
          type: array
          items:
            $ref: '#/components/schemas/ConditionType'
          description: >-
            None of these field conditions can match for the chunk to be
            included in the result set.
          nullable: true
        should:
          type: array
          items:
            $ref: '#/components/schemas/ConditionType'
          description: >-
            Only one of these field conditions has to match for the chunk to be
            included in the result set.
          nullable: true
      example:
        must:
          - field: tag_set
            match_all:
              - A
              - B
          - field: num_value
            range:
              gte: 10
              lte: 25
    SortByField:
      type: object
      required:
        - field
      properties:
        direction:
          allOf:
            - $ref: '#/components/schemas/SortOrder'
          nullable: true
        field:
          type: string
          description: >-
            Field to sort by. This has to be a numeric field with a Qdrant
            `Range` index on it. i.e. num_value and timestamp
        prefetch_amount:
          type: integer
          format: int64
          description: How many results to pull in before the sort
          nullable: true
          minimum: 0
    ChunkMetadata:
      type: object
      title: V2
      required:
        - id
        - created_at
        - updated_at
        - dataset_id
        - weight
      properties:
        chunk_html:
          type: string
          description: >-
            HTML content of the chunk, can also be an arbitrary string which is
            not HTML
          nullable: true
        created_at:
          type: string
          format: date-time
          description: Timestamp of the creation of the chunk
        dataset_id:
          type: string
          format: uuid
          description: ID of the dataset which the chunk belongs to
        id:
          type: string
          format: uuid
          description: >-
            Unique identifier of the chunk, auto-generated uuid created by
            Trieve
        image_urls:
          type: array
          items:
            type: string
            nullable: true
          description: >-
            Image URLs of the chunk, can be any list of strings. Used for image
            search and RAG.
          nullable: true
        link:
          type: string
          description: Link to the chunk, should be a URL
          nullable: true
        location:
          allOf:
            - $ref: '#/components/schemas/GeoInfo'
          nullable: true
        metadata:
          description: Metadata of the chunk, can be any JSON object
          nullable: true
        num_value:
          type: number
          format: double
          description: >-
            Numeric value of the chunk, can be any float. Can represent the most
            relevant numeric value of the chunk, such as a price, quantity in
            stock, rating, etc.
          nullable: true
        tag_set:
          type: array
          items:
            type: string
            nullable: true
          description: >-
            Tag set of the chunk, can be any list of strings. Used for
            tag-filtered searches.
          nullable: true
        time_stamp:
          type: string
          format: date-time
          description: Timestamp of the chunk, can be any timestamp. Specified by the user.
          nullable: true
        tracking_id:
          type: string
          description: >-
            Tracking ID of the chunk, can be any string, determined by the user.
            Tracking ID's are unique identifiers for chunks within a dataset.
            They are designed to match the unique identifier of the chunk in the
            user's system.
          nullable: true
        updated_at:
          type: string
          format: date-time
          description: Timestamp of the last update of the chunk
        weight:
          type: number
          format: double
          description: >-
            Weight of the chunk, can be any float. Used as a multiplier on a
            chunk's relevance score for ranking purposes.
      example:
        chunk_html: <p>Hello, world!</p>
        created_at: '2021-01-01 00:00:00.000'
        dataset_id: e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3
        id: e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3
        link: https://trieve.ai
        metadata:
          key: value
        tag_set: '[tag1,tag2]'
        time_stamp: '2021-01-01 00:00:00.000'
        tracking_id: e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3
        updated_at: '2021-01-01 00:00:00.000'
        weight: 0.5
    ConditionType:
      oneOf:
        - $ref: '#/components/schemas/FieldCondition'
        - $ref: '#/components/schemas/HasChunkIDCondition'
      description: >-
        Filters can be constructed using either fields on the chunk objects, ids
        or tracking ids of chunks, and finally ids or tracking ids of groups.
    SortOrder:
      type: string
      enum:
        - desc
        - asc
    GeoInfo:
      type: object
      description: Location that you want to use as the center of the search.
      required:
        - lat
        - lon
      properties:
        lat:
          $ref: '#/components/schemas/GeoTypes'
        lon:
          $ref: '#/components/schemas/GeoTypes'
    FieldCondition:
      type: object
      description: >-
        FieldCondition is a JSON object which can be used to filter chunks by a
        field. This is useful for when you want to filter chunks by arbitrary
        metadata. To access fields inside of the metadata that you provide with
        the card, prefix the field name with `metadata.`.
      required:
        - field
      properties:
        boolean:
          type: boolean
          description: >-
            Boolean is a true false value for a field. This only works for
            boolean fields. You can specify this if you want values to be true
            or false.
          nullable: true
        date_range:
          allOf:
            - $ref: '#/components/schemas/DateRange'
          nullable: true
        field:
          type: string
          description: >-
            Field is the name of the field to filter on. Commonly used fields
            are `timestamp`, `link`, `tag_set`, `location`, `num_value`,
            `group_ids`, and `group_tracking_ids`. The field value will be used
            to check for an exact substring match on the metadata values for
            each existing chunk. This is useful for when you want to filter
            chunks by arbitrary metadata. To access fields inside of the
            metadata that you provide with the card, prefix the field name with
            `metadata.`.
        geo_bounding_box:
          allOf:
            - $ref: '#/components/schemas/LocationBoundingBox'
          nullable: true
        geo_polygon:
          allOf:
            - $ref: '#/components/schemas/LocationPolygon'
          nullable: true
        geo_radius:
          allOf:
            - $ref: '#/components/schemas/LocationRadius'
          nullable: true
        match_all:
          type: array
          items:
            $ref: '#/components/schemas/MatchCondition'
          description: >-
            Match all lets you pass in an array of values that will return
            results if all of the items match. The match value will be used to
            check for an exact substring match on the metadata values for each
            existing chunk. If both match_all and match_any are provided, the
            match_any condition will be used.
          nullable: true
        match_any:
          type: array
          items:
            $ref: '#/components/schemas/MatchCondition'
          description: >-
            Match any lets you pass in an array of values that will return
            results if any of the items match. The match value will be used to
            check for an exact substring match on the metadata values for each
            existing chunk. If both match_all and match_any are provided, the
            match_any condition will be used.
          nullable: true
        range:
          allOf:
            - $ref: '#/components/schemas/Range'
          nullable: true
      example:
        field: metadata.key1
        match:
          - value1
          - value2
        range:
          gt: 0
          gte: 0
          lt: 1
          lte: 1
    HasChunkIDCondition:
      type: object
      description: >-
        HasChunkIDCondition is a JSON object which can be used to filter chunks
        by their ids or tracking ids. This is useful for when you want to filter
        chunks by their ids or tracking ids.
      properties:
        ids:
          type: array
          items:
            type: string
            format: uuid
          description: >-
            Ids of the chunks to apply a match_any condition with. Only chunks
            with one of these ids will be returned.
          nullable: true
        tracking_ids:
          type: array
          items:
            type: string
          description: >-
            Tracking ids of the chunks to apply a match_any condition with. Only
            chunks with one of these tracking ids will be returned.
          nullable: true
    GeoTypes:
      oneOf:
        - type: integer
          format: int64
        - type: number
          format: double
    DateRange:
      type: object
      description: >-
        DateRange is a JSON object which can be used to filter chunks by a range
        of dates. This leverages the time_stamp field on chunks in your dataset.
        You can specify this if you want values in a certain range. You must
        provide ISO 8601 combined date and time without timezone.
      properties:
        gt:
          type: string
          nullable: true
        gte:
          type: string
          nullable: true
        lt:
          type: string
          nullable: true
        lte:
          type: string
          nullable: true
      example:
        gt: '2021-01-01 00:00:00.000'
        gte: '2021-01-01 00:00:00.000'
        lt: '2021-01-01 00:00:00.000'
        lte: '2021-01-01 00:00:00.000'
    LocationBoundingBox:
      type: object
      required:
        - top_left
        - bottom_right
      properties:
        bottom_right:
          $ref: '#/components/schemas/GeoInfo'
        top_left:
          $ref: '#/components/schemas/GeoInfo'
    LocationPolygon:
      type: object
      required:
        - exterior
      properties:
        exterior:
          type: array
          items:
            $ref: '#/components/schemas/GeoInfo'
        interior:
          type: array
          items:
            type: array
            items:
              $ref: '#/components/schemas/GeoInfo'
          nullable: true
    LocationRadius:
      type: object
      required:
        - center
        - radius
      properties:
        center:
          $ref: '#/components/schemas/GeoInfo'
        radius:
          type: number
          format: double
    MatchCondition:
      oneOf:
        - type: string
        - type: integer
          format: int64
        - type: number
          format: double
    Range:
      type: object
      properties:
        gt:
          allOf:
            - $ref: '#/components/schemas/RangeCondition'
          nullable: true
        gte:
          allOf:
            - $ref: '#/components/schemas/RangeCondition'
          nullable: true
        lt:
          allOf:
            - $ref: '#/components/schemas/RangeCondition'
          nullable: true
        lte:
          allOf:
            - $ref: '#/components/schemas/RangeCondition'
          nullable: true
      example:
        gt: 0
        gte: 0
        lt: 1
        lte: 1
    RangeCondition:
      oneOf:
        - type: number
          format: double
        - type: integer
          format: int64
  securitySchemes:
    ApiKey:
      type: apiKey
      in: header
      name: Authorization

````