@arun-gupta
Last active September 5, 2024 01:24
OPEA on Amazon EKS

  • Create an EKS cluster as explained at https://www.eksworkshop.com/docs/introduction/setup/your-account/ using the VSCode option. Both m7i.4xlarge and m5.4xlarge are causing this issue. Use this command instead to create the cluster:
    export EKS_CLUSTER_NAME=eks-workshop
    #curl -fsSL https://raw.githubusercontent.com/aws-samples/eks-workshop-v2/stable/cluster/eksctl/cluster.yaml | sed -e 's/m5.large/m7i.4xlarge/g' -e 's/: 3/: 1/g' | \
    curl -fsSL https://raw.githubusercontent.com/aws-samples/eks-workshop-v2/stable/cluster/eksctl/cluster.yaml | sed -e 's/m5.large/m5.4xlarge/g' | \
    envsubst | eksctl create cluster -f -
    
    This will create a three-node EKS cluster using m5.4xlarge instead of the default m5.large instance type. Here is the final output:
    ec2-user:~/environment:$ 
    export EKS_CLUSTER_NAME=eks-workshop
    curl -fsSL https://raw.githubusercontent.com/aws-samples/eks-workshop-v2/stable/cluster/eksctl/cluster.yaml | \
    envsubst | eksctl create cluster -f -
    2024-09-04 00:47:25 [ℹ]  eksctl version 0.188.0
    2024-09-04 00:47:25 [ℹ]  using region us-west-2
    2024-09-04 00:47:25 [ℹ]  subnets for us-west-2a - public:10.42.0.0/19 private:10.42.96.0/19
    2024-09-04 00:47:25 [ℹ]  subnets for us-west-2b - public:10.42.32.0/19 private:10.42.128.0/19
    2024-09-04 00:47:25 [ℹ]  subnets for us-west-2c - public:10.42.64.0/19 private:10.42.160.0/19
    2024-09-04 00:47:25 [ℹ]  nodegroup "default" will use "" [AmazonLinux2023/1.30]
    2024-09-04 00:47:25 [ℹ]  using Kubernetes version 1.30
    2024-09-04 00:47:25 [ℹ]  creating EKS cluster "eks-workshop" in "us-west-2" region with managed nodes
    2024-09-04 00:47:25 [ℹ]  1 nodegroup (default) was included (based on the include/exclude rules)
    2024-09-04 00:47:25 [ℹ]  will create a CloudFormation stack for cluster itself and 0 nodegroup stack(s)
    2024-09-04 00:47:25 [ℹ]  will create a CloudFormation stack for cluster itself and 1 managed nodegroup stack(s)
    2024-09-04 00:47:25 [ℹ]  if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=us-west-2 --cluster=eks-workshop'
    2024-09-04 00:47:25 [ℹ]  Kubernetes API endpoint access will use provided values {publicAccess=true, privateAccess=true} for cluster "eks-workshop" in "us-west-2"
    2024-09-04 00:47:25 [ℹ]  CloudWatch logging will not be enabled for cluster "eks-workshop" in "us-west-2"
    2024-09-04 00:47:25 [ℹ]  you can enable it with 'eksctl utils update-cluster-logging --enable-types={SPECIFY-YOUR-LOG-TYPES-HERE (e.g. all)} --region=us-west-2 --cluster=eks-workshop'
    2024-09-04 00:47:25 [ℹ]  default addons kube-proxy, coredns were not specified, will install them as EKS addons
    2024-09-04 00:47:25 [ℹ]  
    2 sequential tasks: { create cluster control plane "eks-workshop", 
        2 sequential sub-tasks: { 
            5 sequential sub-tasks: { 
                1 task: { create addons },
                wait for control plane to become ready,
                associate IAM OIDC provider,
                no tasks,
                update VPC CNI to use IRSA if required,
            },
            create managed nodegroup "default",
        } 
    }
    2024-09-04 00:47:25 [ℹ]  building cluster stack "eksctl-eks-workshop-cluster"
    2024-09-04 00:47:26 [ℹ]  deploying stack "eksctl-eks-workshop-cluster"
    2024-09-04 00:47:56 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-cluster"
    2024-09-04 00:48:26 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-cluster"
    2024-09-04 00:49:26 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-cluster"
    2024-09-04 00:50:26 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-cluster"
    2024-09-04 00:51:26 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-cluster"
    2024-09-04 00:52:26 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-cluster"
    2024-09-04 00:53:26 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-cluster"
    2024-09-04 00:54:26 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-cluster"
    2024-09-04 00:55:26 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-cluster"
    2024-09-04 00:56:26 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-cluster"
    2024-09-04 00:57:26 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-cluster"
    2024-09-04 00:57:27 [!]  recommended policies were found for "vpc-cni" addon, but since OIDC is disabled on the cluster, eksctl cannot configure the requested permissions; the recommended way to provide IAM permissions for "vpc-cni" addon is via pod identity associations; after addon creation is completed, add all recommended policies to the config file, under `addon.PodIdentityAssociations`, and run `eksctl update addon`
    2024-09-04 00:57:27 [ℹ]  creating addon
    2024-09-04 00:57:28 [ℹ]  successfully created addon
    2024-09-04 00:57:28 [ℹ]  creating addon
    2024-09-04 00:57:28 [ℹ]  successfully created addon
    2024-09-04 00:57:29 [ℹ]  creating addon
    2024-09-04 00:57:29 [ℹ]  successfully created addon
    2024-09-04 00:59:30 [ℹ]  deploying stack "eksctl-eks-workshop-addon-vpc-cni"
    2024-09-04 00:59:30 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-addon-vpc-cni"
    2024-09-04 01:00:00 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-addon-vpc-cni"
    2024-09-04 01:00:00 [ℹ]  updating addon
    2024-09-04 01:00:11 [ℹ]  addon "vpc-cni" active
    2024-09-04 01:00:11 [ℹ]  building managed nodegroup stack "eksctl-eks-workshop-nodegroup-default"
    2024-09-04 01:00:11 [ℹ]  deploying stack "eksctl-eks-workshop-nodegroup-default"
    2024-09-04 01:00:11 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-nodegroup-default"
    2024-09-04 01:00:41 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-nodegroup-default"
    2024-09-04 01:01:38 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-nodegroup-default"
    2024-09-04 01:02:33 [ℹ]  waiting for CloudFormation stack "eksctl-eks-workshop-nodegroup-default"
    2024-09-04 01:02:33 [ℹ]  waiting for the control plane to become ready
    2024-09-04 01:02:34 [✔]  saved kubeconfig as "/home/ec2-user/.kube/config"
    2024-09-04 01:02:34 [ℹ]  no tasks
    2024-09-04 01:02:34 [✔]  all EKS cluster resources for "eks-workshop" have been created
    2024-09-04 01:02:34 [✔]  created 0 nodegroup(s) in cluster "eks-workshop"
    2024-09-04 01:02:34 [ℹ]  nodegroup "default" has 3 node(s)
    2024-09-04 01:02:34 [ℹ]  node "ip-10-42-110-222.us-west-2.compute.internal" is ready
    2024-09-04 01:02:34 [ℹ]  node "ip-10-42-151-87.us-west-2.compute.internal" is ready
    2024-09-04 01:02:34 [ℹ]  node "ip-10-42-173-37.us-west-2.compute.internal" is ready
    2024-09-04 01:02:34 [ℹ]  waiting for at least 3 node(s) to become ready in "default"
    2024-09-04 01:02:34 [ℹ]  nodegroup "default" has 3 node(s)
    2024-09-04 01:02:34 [ℹ]  node "ip-10-42-110-222.us-west-2.compute.internal" is ready
    2024-09-04 01:02:34 [ℹ]  node "ip-10-42-151-87.us-west-2.compute.internal" is ready
    2024-09-04 01:02:34 [ℹ]  node "ip-10-42-173-37.us-west-2.compute.internal" is ready
    2024-09-04 01:02:34 [✔]  created 1 managed nodegroup(s) in cluster "eks-workshop"
    2024-09-04 01:02:36 [ℹ]  kubectl command should work with "/home/ec2-user/.kube/config", try 'kubectl get nodes'
    2024-09-04 01:02:36 [✔]  EKS cluster "eks-workshop" in "us-west-2" region is ready
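
The instance-type swap in the create command is a plain `sed` substitution over the downloaded config. As a minimal sketch of what that step does, the snippet below runs the same substitution over a small hypothetical YAML fragment (the `sample_config` text and the `swap_instance_type` helper are illustrative assumptions, not the contents of the real `cluster.yaml`). It also escapes the dot, since an unescaped `.` in a `sed` pattern matches any character.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the nodegroup section of cluster.yaml;
# the real file is fetched with curl in the command above.
sample_config='managedNodeGroups:
  - name: default
    desiredCapacity: 3
    instanceType: m5.large'

# Same substitution as in the create command, with the dot escaped so the
# pattern only matches a literal "m5.large".
swap_instance_type() {
  sed -e 's/m5\.large/m5.4xlarge/g'
}

# Print the rewritten fragment; in the real pipeline this stream would
# continue into envsubst and eksctl.
printf '%s\n' "$sample_config" | swap_instance_type
```

Previewing the rewritten config this way, before piping it into `eksctl create cluster -f -`, makes it easier to confirm that the intended instance type is the one being requested.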
    
  • Set up the Hugging Face token:
    export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
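
Because the manifest step substitutes whatever value this variable holds, it may help to fail fast when the variable is empty or still the placeholder. The sketch below assumes a hypothetical helper named `check_hf_token`; it only inspects the variable and does not validate the token against Hugging Face.

```shell
#!/usr/bin/env bash
# Hypothetical guard: returns 0 when HUGGINGFACEHUB_API_TOKEN holds some
# value other than the placeholder, and 1 otherwise.
check_hf_token() {
  case "${HUGGINGFACEHUB_API_TOKEN:-}" in
    ""|"YourOwnToken")
      echo "HUGGINGFACEHUB_API_TOKEN is unset or still the placeholder" >&2
      return 1
      ;;
  esac
  return 0
}
```

A possible usage is to run `check_hf_token` first and only continue to the `kubectl apply` step when it succeeds.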
    
  • Apply manifest:
    curl -fsSL https://raw.githubusercontent.com/opea-project/GenAIExamples/main/ChatQnA/kubernetes/manifests/xeon/chatqna.yaml | sed -e 's/insert-your-huggingface-token-here/$HUGGINGFACEHUB_API_TOKEN/g' -e 's/mnt\/opea-models/tmp/g' | envsubst | kubectl apply -f -
    configmap/chatqna-data-prep-config created
    configmap/chatqna-embedding-usvc-config created
    configmap/chatqna-llm-uservice-config created
    configmap/chatqna-reranking-usvc-config created
    configmap/chatqna-retriever-usvc-config created
    configmap/chatqna-tei-config created
    configmap/chatqna-teirerank-config created
    configmap/chatqna-tgi-config created
    service/chatqna-data-prep created
    service/chatqna-embedding-usvc created
    service/chatqna-llm-uservice created
    service/chatqna-redis-vector-db created
    service/chatqna-reranking-usvc created
    service/chatqna-retriever-usvc created
    service/chatqna-tei created
    service/chatqna-teirerank created
    service/chatqna-tgi created
    service/chatqna created
    deployment.apps/chatqna-data-prep created
    deployment.apps/chatqna-embedding-usvc created
    deployment.apps/chatqna-llm-uservice created
    deployment.apps/chatqna-redis-vector-db created
    deployment.apps/chatqna-reranking-usvc created
    deployment.apps/chatqna-retriever-usvc created
    deployment.apps/chatqna-tei created
    deployment.apps/chatqna-teirerank created
    deployment.apps/chatqna-tgi created
    deployment.apps/chatqna created
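
The apply command relies on a two-stage substitution: `sed` writes the literal text `$HUGGINGFACEHUB_API_TOKEN` into the manifest (the single quotes stop the shell from expanding it), and `envsubst` then replaces that text with the exported value. The sketch below shows only the `sed` stage on a hypothetical two-line fragment; `sample_manifest` and `rewrite_manifest` are illustrative names, and the fragment is not the real `chatqna.yaml`.

```shell
#!/usr/bin/env bash
# Hypothetical fragment standing in for the relevant lines of chatqna.yaml.
sample_manifest='  HUGGINGFACEHUB_API_TOKEN: "insert-your-huggingface-token-here"
  path: /mnt/opea-models'

rewrite_manifest() {
  # Single quotes keep the shell from expanding the variable, so sed emits
  # the literal text $HUGGINGFACEHUB_API_TOKEN for envsubst to fill in later.
  # Using | as the delimiter avoids escaping the slash in the path.
  sed -e 's|insert-your-huggingface-token-here|$HUGGINGFACEHUB_API_TOKEN|g' \
      -e 's|mnt/opea-models|tmp|g'
}

# Print the intermediate stream that envsubst would receive.
printf '%s\n' "$sample_manifest" | rewrite_manifest
```

Since `envsubst` without arguments expands every `$VAR` reference it finds in the stream, passing `'$HUGGINGFACEHUB_API_TOKEN'` as its argument limits the substitution to that one variable, which may be the safer choice if the manifest contains other dollar-sign expressions.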
    
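  • Here is the output of kubectl get all and kubectl logs. The tei, teirerank, and tgi pods remain in ContainerCreating, so no container logs are available yet: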
    ec2-user:~/environment:$ kubectl get all 
    NAME                                           READY   STATUS              RESTARTS   AGE
    pod/chatqna-79d8c5ffff-m2fb9                   1/1     Running             0          8m29s
    pod/chatqna-data-prep-77dcc665f4-gjj7t         1/1     Running             0          8m30s
    pod/chatqna-embedding-usvc-55d4dc8f67-6qrln    1/1     Running             0          8m30s
    pod/chatqna-llm-uservice-66cc67785-vkpc9       1/1     Running             0          8m30s
    pod/chatqna-redis-vector-db-5dcd98f579-x7k9q   1/1     Running             0          8m30s
    pod/chatqna-reranking-usvc-759bf96c5c-fl6f8    1/1     Running             0          8m30s
    pod/chatqna-retriever-usvc-86f8dfbfb6-pfktk    1/1     Running             0          8m30s
    pod/chatqna-tei-565488dd9-p4cj7                0/1     ContainerCreating   0          8m30s
    pod/chatqna-teirerank-6c9854cfdf-mmgqh         0/1     ContainerCreating   0          8m30s
    pod/chatqna-tgi-587b54f5ff-fcfqn               0/1     ContainerCreating   0          8m29s
    
    NAME                              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
    service/chatqna                   ClusterIP   172.20.167.162   <none>        8888/TCP            8m30s
    service/chatqna-data-prep         ClusterIP   172.20.240.173   <none>        6007/TCP            8m30s
    service/chatqna-embedding-usvc    ClusterIP   172.20.194.245   <none>        6000/TCP            8m30s
    service/chatqna-llm-uservice      ClusterIP   172.20.70.157    <none>        9000/TCP            8m30s
    service/chatqna-redis-vector-db   ClusterIP   172.20.165.213   <none>        6379/TCP,8001/TCP   8m30s
    service/chatqna-reranking-usvc    ClusterIP   172.20.112.188   <none>        8000/TCP            8m30s
    service/chatqna-retriever-usvc    ClusterIP   172.20.204.167   <none>        7000/TCP            8m30s
    service/chatqna-tei               ClusterIP   172.20.116.54    <none>        80/TCP              8m30s
    service/chatqna-teirerank         ClusterIP   172.20.22.103    <none>        80/TCP              8m30s
    service/chatqna-tgi               ClusterIP   172.20.10.36     <none>        80/TCP              8m30s
    service/kubernetes                ClusterIP   172.20.0.1       <none>        443/TCP             24m
    
    NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/chatqna                   1/1     1            1           8m30s
    deployment.apps/chatqna-data-prep         1/1     1            1           8m30s
    deployment.apps/chatqna-embedding-usvc    1/1     1            1           8m30s
    deployment.apps/chatqna-llm-uservice      1/1     1            1           8m30s
    deployment.apps/chatqna-redis-vector-db   1/1     1            1           8m30s
    deployment.apps/chatqna-reranking-usvc    1/1     1            1           8m30s
    deployment.apps/chatqna-retriever-usvc    1/1     1            1           8m30s
    deployment.apps/chatqna-tei               0/1     1            0           8m30s
    deployment.apps/chatqna-teirerank         0/1     1            0           8m30s
    deployment.apps/chatqna-tgi               0/1     1            0           8m30s
    
    NAME                                                 DESIRED   CURRENT   READY   AGE
    replicaset.apps/chatqna-79d8c5ffff                   1         1         1       8m29s
    replicaset.apps/chatqna-data-prep-77dcc665f4         1         1         1       8m30s
    replicaset.apps/chatqna-embedding-usvc-55d4dc8f67    1         1         1       8m30s
    replicaset.apps/chatqna-llm-uservice-66cc67785       1         1         1       8m30s
    replicaset.apps/chatqna-redis-vector-db-5dcd98f579   1         1         1       8m30s
    replicaset.apps/chatqna-reranking-usvc-759bf96c5c    1         1         1       8m30s
    replicaset.apps/chatqna-retriever-usvc-86f8dfbfb6    1         1         1       8m30s
    replicaset.apps/chatqna-tei-565488dd9                1         1         0       8m30s
    replicaset.apps/chatqna-teirerank-6c9854cfdf         1         1         0       8m30s
    replicaset.apps/chatqna-tgi-587b54f5ff               1         1         0       8m29s
    ec2-user:~/environment:$ kubectl logs pod/chatqna-tei-565488dd9-p4cj7
    Error from server (BadRequest): container "tei" in pod "chatqna-tei-565488dd9-p4cj7" is waiting to start: ContainerCreating
    ec2-user:~/environment:$ kubectl logs pod/chatqna-teirerank-6c9854cfdf-mmgqh
    Error from server (BadRequest): container "teirerank" in pod "chatqna-teirerank-6c9854cfdf-mmgqh" is waiting to start: ContainerCreating
    ec2-user:~/environment:$ kubectl logs pod/chatqna-tgi-587b54f5ff-fcfqn
    Error from server (BadRequest): container "tgi" in pod "chatqna-tgi-587b54f5ff-fcfqn" is waiting to start: ContainerCreating
    
    Blocked on opea-project/GenAIExamples#725
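
While the pods stay in ContainerCreating, `kubectl logs` has nothing to return, so the pod events are the more useful place to look. As a small sketch, the hypothetical `not_running` filter below lists pods whose STATUS column is anything other than Running; the two sample rows are adapted from the output above, with the `pod/` prefix dropped to match the row format of `kubectl get pods --no-headers`.

```shell
#!/usr/bin/env bash
# Reads rows in the format of `kubectl get pods --no-headers` on stdin and
# prints "name status" for every pod whose STATUS column is not Running.
not_running() {
  awk '$3 != "Running" { print $1, $3 }'
}

# Sample rows adapted from the kubectl output above.
printf '%s\n' \
  'chatqna-79d8c5ffff-m2fb9      1/1   Running             0   8m29s' \
  'chatqna-tei-565488dd9-p4cj7   0/1   ContainerCreating   0   8m30s' \
  | not_running
```

Against a live cluster this would be `kubectl get pods --no-headers | not_running`, followed by `kubectl describe pod` on each listed pod; the Events section at the end of that output usually states what the container is waiting on.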