AWS Self Hosting
Learn how to self-host Trieve on AWS
AWS EKS
Things you need
- Domain name
- An allowance for at least 8vCPU for G and VT instances
- helm cli
- aws cli
- kubectl
- k9s (optional)
aws configure
Provision Terraform
aws should be configured with your IAM credentails chosen. Run the following commands to create the EKS cluster
cd terraform/aws
terraform init
terraform apply
This should provision an eks cluster with elb and ebs drivers
2.5 Setup embedding servers
Due to many issues with the NVIDIA k8s-device-plugin, we have not figured out how to do fractional GPU usage for pods within kubernetes, meaning its not economically reasonable to have the GPU embedding server within Kubernetes.For this reason we have a docker-compose.yml
that is recommended to be ran on the ec2 box provisioned within the Terraform. (Note it uses ~/.ssh/id_ed25519
as a default key). The user to login as is dev
.
3 Create values.yaml
export SENTRY_CHAT_DSN=https://********************************@sentry.trieve.ai/6
export ENVIRONMENT=aws
export DOMAIN=example.com # Only used for local
export EXTERNAL_DOMAIN=example.com
export DASHBOARD_URL=https://dashboard.example.com
export SALT=goodsaltisveryyummy
export SECRET_KEY=1234512345123451234512345123451234512345123451234512345123451234512345123451234h
export ADMIN_API_KEY=asdasdasdasdasd
export OIDC_CLIENT_SECRET=YllmLDTy67MbsUBrUAWvQ7z9aMq0QcKx
export ISSUER_URL=https://oidc.example.com
export AUTH_REDIRECT_URL=https://oidc.example.com/realms/trieve/protocol/openid-connect/auth
export REDIRECT_URL=https://oidc.example.com/realms/trieve/protocol/openid-connect/auth
export SMTP_RELAY=smtp.gmail.com
export SMTP_USERNAME=trieve@gmail.com
export SMTP_PASSWORD=pass************
export SMTP_EMAIL_ADDRESS=triever@gmail.com
export LLM_API_KEY=sk-or-v1-**************************************************************** # Open Router API KEY
export OPENAI_API_KEY=sk-************************************************ # OPENAI API KEY
export OPENAI_BASE_URL=https://api.openai.com/v1
export S3_ENDPOINT=https://<bucket>.s3.amazonaws.com
export S3_ACCESS_KEY=ZaaZZaaZZaaZZaaZZaaZ
export S3_SECRET_KEY=ssssssssssssssssssssTTTTTTTTTTTTTTTTTTTT
export S3_BUCKET=trieve
export AWS_REGION=us-east-1
export STRIPE_API_KEY=sk_test_***************************************************************************************************
export STRIPE_WEBHOOK_SECRET=sk_test_***************************************************************************************************
helm/from-env.sh
This step generates a file in helm/values.yaml
. It alllows you to modify the environment variables
Additionally, you will need to modify values.yaml in two ways,
First you will need to change all the embedding server origins to point to the embedding server url as follows.
config:
---
trieve:
---
sparseServerDocOrigin: http://<ip>:5000
sparseServerQueryOrigin: http://<ip>:6000
embeddingServerOrigin: http://<ip>:7000
embeddingServerOriginBGEM3: http://<ip>:8000
rerankerServerOrigin: http://<ip>:9000
Since the embbedding servers are not included in the kubernetes cluster, remove all items in the embeddings list below and leave it empty as follows
---
embeddings:
SubChart usage
Postgres, Redis, and qdrant are installed within this helm chart via a subchart. You can opt out of using the helm chart installation and using a managed service can be toggled via. the useSubChart
parameters and setting the uri
to a managed service. We reccomend placing at least Postgres and Redis out side of the helm chart and keeping Qdrant in the helm chart for a production usecase.
4. Deploy the helm chart
aws eks update-kubeconfig --region us-west-1 --name trieve-cluster
helm install -f helm/values.yaml trieve helm/
Ensure everything has been deployed with
kubectl get pods
6. Set DNS records
First get the ingress addresses using
kubectl get ingress
You will get output that looks like this
NAME CLASS HOSTS ADDRESS PORTS AGE
ingress-chat alb * k8s-default-ingressc-033d55ca4e-253753726.us-west-1.elb.amazonaws.com 80 9s
ingress-dashboard alb * k8s-default-ingressd-50d3c72d7f-2039058451.us-west-1.elb.amazonaws.com 80 9s
ingress-search alb * k8s-default-ingresss-2007ac265d-1873944939.us-west-1.elb.amazonaws.com 80 9s
ingress-server alb * k8s-default-ingresss-cb909f388e-797764884.us-west-1.elb.amazonaws.com 80 9s
Set CNAME’s accordingly, we recommend using Cloudflare to CNAME and provision the SSL Cert
ingress-chat chat.domain.
ingress-dashboard dashboard.domain.
ingress-search search.domain.
ingress-server api.domain.
Once you set the ingress rules properly, the server should be able to properly deploy.
7. Setup/OIDC provider and Authorized Redirect URL’S
The last step is to setup an OIDC compliant server like keycloak
for authentication and get an issuerUrl
and clientSecret
. This is how you do it within Keycloak.
A) Create a new realm called trieve
B) Go into Clients and create a new client called trieve
.
Enable client authentication and set the following allowed redirect url’s
- https://api.domain.com/*
- https://search.domain.com/*
- https://chat.domain.com/*
- https://dashboard.domain.com/*
You will get the client secret in the Credentials
tab.
You will need to set the following values in the helm/values.yaml
file, it should be prefilled already with default values
config:
oidc:
clientSecret: $OIDC_CLIENT_SECRET
clientId: trieve
issuerUrl: https://auth.domain.com/realms/trieve
authRedirectUrl: https://auth.domain.com/realms/trieve/protocol/openid-connect/auth
Testing
The fastest way to test is using the Trieve CLI
trieve login # Make sure to set the api url to https://api.domain.com
trieve dataset example
And there you have it. Your very own Trieve stack. Happy hacking 🚀