Install Trieve Vector Inference in your own AWS account
Prerequisites:

- `eksctl` >= 0.171 (eksctl installation guide)
- `aws` >= 2.15 (aws installation guide)
- `kubectl` >= 1.28 (kubectl installation guide)
- `helm` >= 3.14 (helm installation guide)
- IAM Policy Minimum Requirements
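Before proceeding, it can help to confirm that each tool meets the minimum version. A quick check, assuming the CLIs are already on your `PATH`:

```bash
# Print the installed versions of the required tools
eksctl version
aws --version
kubectl version --client
helm version --short
```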
The EKS cluster is created with the `eksctl` CLI. The most up-to-date guide is located here. You are able to use the root account; however, AWS does not recommend doing this.

We recommend `g4dn.xlarge` for the GPU nodes, as it is the cheapest GPU instance on AWS. A single small node is also needed for extra utility. The instance types are controlled by the `GPU_INSTANCE` (and related) variables that are chosen before running the script.

The `bootstrap-eks.sh` script will create the EKS cluster, install the AWS Load Balancer Controller, and install the NVIDIA Device Plugin. It will also manage any IAM permissions that are needed for the plugins to work.
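As a sketch of how the instance choice might be passed in, assuming `bootstrap-eks.sh` reads its settings from environment variables (only `GPU_INSTANCE` is named above; the other variable names and values are assumptions):

```bash
# Hypothetical variable names -- check bootstrap-eks.sh for the ones it actually reads
export AWS_REGION=us-east-2        # assumption: region for the cluster
export CPU_INSTANCE=t3.small       # assumption: the single small utility node
export GPU_INSTANCE=g4dn.xlarge    # recommended cheapest GPU instance (from this guide)
```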
Download the `bootstrap-eks.sh` script
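For example (the script's download URL is not reproduced here; substitute the one given in this guide):

```bash
# Replace the placeholder with the bootstrap-eks.sh URL from this guide
curl -fsSL -o bootstrap-eks.sh "<bootstrap-eks.sh URL>"
chmod +x bootstrap-eks.sh
```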
Run `bootstrap-eks.sh` with `bash`
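A minimal sketch of running the script and then checking what it installed (the pod and namespace names below are the usual defaults for these components, not something this guide specifies):

```bash
# Create the cluster and install the add-ons; this can take a while
bash bootstrap-eks.sh

# Confirm the nodes joined the cluster
kubectl get nodes

# Confirm the AWS Load Balancer Controller and NVIDIA Device Plugin are running
# (both are typically deployed into kube-system)
kubectl get pods -n kube-system | grep -E 'aws-load-balancer-controller|nvidia-device-plugin'
```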
Create `embedding_models.yaml`. This defines all the models that you will want to use:
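The exact keys are defined by the TVI Helm chart, so the structure below is an illustrative assumption only (the field names and model IDs are placeholders, not the chart's documented schema):

```bash
# Hypothetical example -- consult the chart's documentation for the real schema
cat > embedding_models.yaml <<'EOF'
models:
  # assumption: a dense embedding model served from Hugging Face
  dense-example:
    modelId: "BAAI/bge-m3"
  # assumption: a reranker model
  rerank-example:
    modelId: "BAAI/bge-reranker-large"
EOF
```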
Log in to the AWS ECR repository
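A sketch of the login, assuming the chart is pulled from the AWS Marketplace registry in `us-east-1` (the registry address shown is an assumption; use the one from your Marketplace subscription):

```bash
# Authenticate helm against the Marketplace ECR registry (registry ID is an assumption)
aws ecr get-login-password --region us-east-1 | \
  helm registry login --username AWS --password-stdin 709825985650.dkr.ecr.us-east-1.amazonaws.com
```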
Install the helm chart from the Marketplace ECR repository
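A minimal sketch of the install, assuming an OCI chart reference (the chart path below is a placeholder; use the exact `helm install` command shown on your Marketplace subscription page) and the `embedding_models.yaml` created above:

```bash
# Placeholder chart reference -- copy the real oci:// path from the Marketplace listing
helm install vector-inference \
  oci://709825985650.dkr.ecr.us-east-1.amazonaws.com/<marketplace-chart-path> \
  -f embedding_models.yaml
```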
The `Address` field is the endpoint where you can make dense embedding, sparse embedding, or reranker calls, depending on the models you chose.
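To find each model's `Address`, you can list the ingresses the chart created, for example:

```bash
# The ADDRESS column holds the Application Load Balancer hostname for each model
kubectl get ingress -A
```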
Each `ingress` point uses its own Application Load Balancer within AWS. The `Address` provided is that model's endpoint, where you can make dense embedding, sparse embedding, or reranker calls depending on the models you chose.
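As an illustration of calling an endpoint once you have its `Address` (the `/embed` route and JSON body follow Hugging Face's text-embeddings-inference convention, which is an assumption here, as is the hostname):

```bash
# Hypothetical hostname; substitute the Address of a dense embedding model's ingress
curl -X POST "http://<model-address>/embed" \
  -H "Content-Type: application/json" \
  -d '{"inputs": "The quick brown fox jumps over the lazy dog"}'
```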
Check out the guides for more information on configuration.