We recommend GKE for production deployments, EKS is not able to have the embedding servers exist within the kubernetes cluster due to issues with the fractional GPU usage driver.

AWS EKS

Things you need

  • Domain name
  • An allowance for at least 8vCPU for G and VT instances
  • helm cli
  • aws cli
  • kubectl
  • k9s (optional)
aws configure

Provision Terraform

aws should be configured with your IAM credentails chosen. Run the following commands to create the EKS cluster

cd terraform/aws
terraform init
terraform apply

This should provision an eks cluster with elb and ebs drivers

2.5 Setup embedding servers

Due to many issues with the NVIDIA k8s-device-plugin, we have not figured out how to do fractional GPU usage for pods within kubernetes, meaning its not economically reasonable to have the GPU embedding server within Kubernetes. For this reason we have a docker-compose.yml that is recommended to be ran on the ec2 box provisioned within the Terraform. (Note it uses ~/.ssh/id_ed25519 as a default key). The user to login as is dev.

3 Create values.yaml

export SENTRY_CHAT_DSN=https://********************************@sentry.trieve.ai/6
export ENVIRONMENT=aws
export DOMAIN=example.com # Only used for local
export EXTERNAL_DOMAIN=example.com
export DASHBOARD_URL=https://dashboard.example.com
export SALT=goodsaltisveryyummy
export SECRET_KEY=1234512345123451234512345123451234512345123451234512345123451234512345123451234h
export ADMIN_API_KEY=asdasdasdasdasd
export OIDC_CLIENT_SECRET=YllmLDTy67MbsUBrUAWvQ7z9aMq0QcKx
export ISSUER_URL=https://oidc.example.com
export AUTH_REDIRECT_URL=https://oidc.example.com/realms/trieve/protocol/openid-connect/auth
export REDIRECT_URL=https://oidc.example.com/realms/trieve/protocol/openid-connect/auth
export SMTP_RELAY=smtp.gmail.com
export SMTP_USERNAME=trieve@gmail.com
export SMTP_PASSWORD=pass************
export SMTP_EMAIL_ADDRESS=triever@gmail.com
export LLM_API_KEY=sk-or-v1-**************************************************************** # Open Router API KEY
export OPENAI_API_KEY=sk-************************************************ # OPENAI API KEY
export OPENAI_BASE_URL=https://api.openai.com/v1
export S3_ENDPOINT=https://<bucket>.s3.amazonaws.com
export S3_ACCESS_KEY=ZaaZZaaZZaaZZaaZZaaZ
export S3_SECRET_KEY=ssssssssssssssssssssTTTTTTTTTTTTTTTTTTTT
export S3_BUCKET=trieve
export AWS_REGION=us-east-1
export STRIPE_API_KEY=sk_test_***************************************************************************************************
export STRIPE_WEBHOOK_SECRET=sk_test_***************************************************************************************************

helm/from-env.sh

This step generates a file in helm/values.yaml. It alllows you to modify the environment variables

Additionally, you will need to modify values.yaml in two ways,

First you will need to change all the embedding server origins to point to the embedding server url as follows.

config:
---
trieve:
---
sparseServerDocOrigin: http://<ip>:5000
sparseServerQueryOrigin: http://<ip>:6000
embeddingServerOrigin: http://<ip>:7000
embeddingServerOriginBGEM3: http://<ip>:8000
rerankerServerOrigin: http://<ip>:9000

Since the embbedding servers are not included in the kubernetes cluster, remove all items in the embeddings list below and leave it empty as follows


---
embeddings:

SubChart usage

Postgres, Redis, and qdrant are installed within this helm chart via a subchart. You can opt out of using the helm chart installation and using a managed service can be toggled via. the useSubChart parameters and setting the uri to a managed service. We reccomend placing at least Postgres and Redis out side of the helm chart and keeping Qdrant in the helm chart for a production usecase.

4. Deploy the helm chart

aws eks update-kubeconfig --region us-west-1 --name trieve-cluster
helm install -f helm/values.yaml trieve helm/

Ensure everything has been deployed with

kubectl get pods

6. Set DNS records

First get the ingress addresses using

kubectl get ingress

You will get output that looks like this

NAME                CLASS   HOSTS   ADDRESS                                                                  PORTS   AGE
ingress-chat        alb     *       k8s-default-ingressc-033d55ca4e-253753726.us-west-1.elb.amazonaws.com    80      9s
ingress-dashboard   alb     *       k8s-default-ingressd-50d3c72d7f-2039058451.us-west-1.elb.amazonaws.com   80      9s
ingress-search      alb     *       k8s-default-ingresss-2007ac265d-1873944939.us-west-1.elb.amazonaws.com   80      9s
ingress-server      alb     *       k8s-default-ingresss-cb909f388e-797764884.us-west-1.elb.amazonaws.com    80      9s

Set CNAME’s accordingly, we recommend using Cloudflare to CNAME and provision the SSL Cert

ingress-chat            chat.domain.
ingress-dashboard       dashboard.domain.
ingress-search          search.domain.
ingress-server          api.domain.

Once you set the ingress rules properly, the server should be able to properly deploy.

7. Setup/OIDC provider and Authorized Redirect URL’S

The last step is to setup an OIDC compliant server like keycloak for authentication and get an issuerUrl and clientSecret. This is how you do it within Keycloak.

A) Create a new realm called trieve B) Go into Clients and create a new client called trieve.

Enable client authentication and set the following allowed redirect url’s

You will get the client secret in the Credentials tab.

You will need to set the following values in the helm/values.yaml file, it should be prefilled already with default values

config:
  oidc:
    clientSecret: $OIDC_CLIENT_SECRET
    clientId: trieve
    issuerUrl: https://auth.domain.com/realms/trieve
    authRedirectUrl: https://auth.domain.com/realms/trieve/protocol/openid-connect/auth

Testing

The fastest way to test is using the Trieve CLI

trieve login # Make sure to set the api url to https://api.domain.com
trieve dataset example

And there you have it. Your very own Trieve stack. Happy hacking 🚀