@arun-gupta · Last active September 18, 2024 17:21
More OPEA Examples using Docker Compose

CodeGen

  • Pull the Docker image:
    sudo docker pull opea/codegen:latest
    
  • Replace the HuggingFace API token and the host's private IP address below, then save the contents to a file named .env:
    export host_ip="172.31.50.223"
    export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
    export LLM_MODEL_ID="deepseek-ai/deepseek-coder-6.7b-instruct"
    export TGI_LLM_ENDPOINT="http://${host_ip}:8028"
    export MEGA_SERVICE_HOST_IP=${host_ip}
    export LLM_SERVICE_HOST_IP=${host_ip}
    export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:7778/v1/codegen"
    
  • Download Docker Compose file:
    curl -O https://raw.githubusercontent.com/opea-project/GenAIExamples/main/CodeGen/docker_compose/intel/cpu/xeon/compose.yaml
    
  • Start the application:
    sudo docker compose -f compose.yaml up -d
    
  • Verify the list of containers:
    ubuntu@ip-172-31-50-223:~$ sudo docker container ls
    CONTAINER ID   IMAGE                    COMMAND                  CREATED         STATUS         PORTS                                       NAMES
    ba99bf66e45b   opea/codegen-ui:latest   "docker-entrypoint.s…"   7 minutes ago   Up 7 minutes   0.0.0.0:5173->5173/tcp, :::5173->5173/tcp   codegen-xeon-ui-server
    31a19966946b   opea/codegen:latest      "python codegen.py"      7 minutes ago   Up 7 minutes   0.0.0.0:7778->7778/tcp, :::7778->7778/tcp   codegen-xeon-backend-server
    1c1649d31187   opea/llm-tgi:latest      "bash entrypoint.sh"     7 minutes ago   Up 7 minutes   0.0.0.0:9000->9000/tcp, :::9000->9000/tcp   llm-tgi-server
    
  • Access the service using cURL:
    curl http://${host_ip}:7778/v1/codegen \
      -H "Content-Type: application/json" \
      -d '{"messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."}'
    
    This request currently fails; see opea-project/GenAIExamples#824.
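
    Before retrying, it can help to confirm that the underlying TGI service has finished loading the model. A minimal sketch, assuming the .env above is in the current directory and that the TGI container exposes the standard text-generation-inference /health route on port 8028:

    # Load the variables from .env so ${host_ip} resolves in this shell,
    # then poll TGI's /health route until the model shard is ready.
    source .env
    until curl -sf "http://${host_ip}:8028/health" > /dev/null; do
      echo "Waiting for TGI on port 8028..."
      sleep 10
    done
    echo "TGI is ready; retry the codegen request above."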

CodeTrans

  • Pull the Docker image:
    sudo docker pull opea/codetrans:latest
    
  • Replace the HuggingFace API token and the host IP address below, then save the contents to a file named .env:
    export host_ip="External_Public_IP"
    export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
    # Example: NGINX_PORT=80
    export NGINX_PORT=${your_nginx_port}
    export LLM_MODEL_ID="HuggingFaceH4/mistral-7b-grok"
    export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
    export MEGA_SERVICE_HOST_IP=${host_ip}
    export LLM_SERVICE_HOST_IP=${host_ip}
    export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:7777/v1/codetrans"
    export FRONTEND_SERVICE_IP=${host_ip}
    export FRONTEND_SERVICE_PORT=5173
    export BACKEND_SERVICE_NAME=codetrans
    export BACKEND_SERVICE_IP=${host_ip}
    export BACKEND_SERVICE_PORT=7777
    
  • Download Docker Compose file:
    curl -O https://raw.githubusercontent.com/opea-project/GenAIExamples/main/CodeTrans/docker_compose/intel/cpu/xeon/compose.yaml
    
  • Start the application:
    sudo docker compose -f compose.yaml up -d
    
    Startup currently fails; see opea-project/GenAIExamples#830.
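
    When docker compose up fails like this, a quick first step is to check which service is unhealthy and read its recent output, using standard Compose subcommands:

    # Show the state of each service defined in compose.yaml,
    # then print the last lines of output from every container.
    sudo docker compose -f compose.yaml ps
    sudo docker compose -f compose.yaml logs --tail=50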

DocSum

  • Pull the Docker image:
    sudo docker pull opea/docsum:latest
    
  • Replace the HuggingFace API token and the host's private IP address below, then save the contents to a file named .env:
    export host_ip="172.31.54.128"
    export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
    export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
    export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
    export MEGA_SERVICE_HOST_IP=${host_ip}
    export LLM_SERVICE_HOST_IP=${host_ip}
    export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/docsum"
    
  • Download Docker Compose file:
    curl -O https://raw.githubusercontent.com/opea-project/GenAIExamples/main/DocSum/docker_compose/intel/cpu/xeon/compose.yaml
    
  • Start the application:
    sudo docker compose -f compose.yaml up -d
    
  • Verify the list of containers:
    ubuntu@ip-172-31-54-128:~$ sudo docker container ls
    CONTAINER ID   IMAGE                                                                 COMMAND                  CREATED              STATUS              PORTS                                       NAMES
    68ca3c32ecdd   opea/docsum-ui:latest                                                 "docker-entrypoint.s…"   About a minute ago   Up About a minute   0.0.0.0:5173->5173/tcp, :::5173->5173/tcp   docsum-xeon-ui-server
    26b0d896b3c7   opea/docsum:latest                                                    "python docsum.py"       About a minute ago   Up About a minute   0.0.0.0:8888->8888/tcp, :::8888->8888/tcp   docsum-xeon-backend-server
    bd0606afb0fd   opea/llm-docsum-tgi:latest                                            "bash entrypoint.sh"     About a minute ago   Up About a minute   0.0.0.0:9000->9000/tcp, :::9000->9000/tcp   llm-docsum-server
    06d4446bd9b1   ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu   "text-generation-lau…"   2 minutes ago        Up About a minute   0.0.0.0:8008->80/tcp, [::]:8008->80/tcp     tgi-service
    
  • Access the service using cURL:
    curl http://${host_ip}:8888/v1/docsum \
      -H "Content-Type: application/json" \
      -d '{"messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}'
    
    This request currently fails; see opea-project/GenAIExamples#835.
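
    One way to isolate the failure is to bypass the DocSum megaservice and exercise the TGI container directly; its /generate route is part of the standard text-generation-inference API. A sketch, assuming the 8008->80 port mapping shown in the container list above:

    # Call TGI directly on the host port mapped by compose.yaml.
    curl http://${host_ip}:8008/generate \
      -H "Content-Type: application/json" \
      -d '{"inputs": "What is deep learning?", "parameters": {"max_new_tokens": 32}}'

    If this returns generated text, the model-serving layer is healthy and the problem lies in the DocSum pipeline itself.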

Translation

  • Pull the Docker image:
    sudo docker pull opea/translation:latest
    
    The instructions should be updated to use the pre-built images; see opea-project/GenAIExamples#836.
  • Replace the HuggingFace API token and the host's private IP address below, then save the contents to a file named .env:
    export host_ip="172.31.49.59" # private IP address
    export LLM_MODEL_ID="haoranxu/ALMA-13B"
    export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
    export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
    export MEGA_SERVICE_HOST_IP=${host_ip}
    export LLM_SERVICE_HOST_IP=${host_ip}
    export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/translation"
    
  • Download Docker Compose file:
    curl -O https://raw.githubusercontent.com/opea-project/GenAIExamples/main/Translation/docker_compose/intel/cpu/xeon/compose.yaml
    
  • Start the application:
    sudo docker compose -f compose.yaml up -d
    
  • Verify the list of containers:
    ubuntu@ip-172-31-54-128:~$ sudo docker container ls
    
    Check the TGI logs; the model shard takes several minutes to load (about seven minutes here):
    2024-09-18T17:04:48.079544Z  INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
    2024-09-18T17:04:58.095135Z  INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
    2024-09-18T17:05:08.110327Z  INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
    2024-09-18T17:05:10.114867Z  INFO text_generation_launcher: Server started at unix:///tmp/text-generation-server-0
    2024-09-18T17:05:10.213727Z  INFO shard-manager: text_generation_launcher: Shard ready in 422.780717591s rank=0
    2024-09-18T17:05:10.298830Z  INFO text_generation_launcher: Starting Webserver
    2024-09-18T17:05:10.440064Z  INFO text_generation_router_v3: backends/v3/src/lib.rs:90: Warming up model
    2024-09-18T17:05:39.868187Z  INFO text_generation_launcher: Cuda Graphs are disabled (CUDA_GRAPHS=None).
    2024-09-18T17:05:39.868785Z  INFO text_generation_router_v3: backends/v3/src/lib.rs:102: Setting max batch total tokens to 45136
    2024-09-18T17:05:39.869908Z  INFO text_generation_router_v3: backends/v3/src/lib.rs:126: Using backend V3
    2024-09-18T17:05:39.876810Z  INFO text_generation_router::server: router/src/server.rs:1651: Using the Hugging Face API
    2024-09-18T17:05:39.895544Z  INFO hf_hub: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/root/.cache/huggingface/token"    
    2024-09-18T17:05:40.482470Z  INFO text_generation_router::server: router/src/server.rs:2349: Serving revision 822086d00e1e61b0c3f99bea3577a916b4360001 of model haoranxu/ALMA-13B
    2024-09-18T17:05:40.483950Z  INFO text_generation_router::server: router/src/server.rs:1781: Using config Some(Llama)
    2024-09-18T17:05:40.483965Z  WARN text_generation_router::server: router/src/server.rs:1783: Could not find a fast tokenizer implementation for haoranxu/ALMA-13B
    2024-09-18T17:05:40.483967Z  WARN text_generation_router::server: router/src/server.rs:1784: Rust input length validation and truncation is disabled
    2024-09-18T17:05:40.483989Z  WARN text_generation_router::server: router/src/server.rs:1928: Invalid hostname, defaulting to 0.0.0.0
    2024-09-18T17:05:40.490701Z  INFO text_generation_router::server: router/src/server.rs:2311: Connected
    
  • Access the service using cURL command:
    ubuntu@ip-172-31-49-59:~$ curl http://${host_ip}:8888/v1/translation -H "Content-Type: application/json" -d '{
       "language_from": "Hindi","language_to": "English","source_language": "आप कैसे हो  "}'
    data: b' How'
    
    data: b' are'
    
    data: b' you'
    
    data: b'?'
    
    data: b'</s>'
    
    data: [DONE]
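
    The response is streamed as server-sent events whose payloads are Python-style byte literals. A small pipeline can reassemble the tokens into plain text; a sketch, assuming the data: b'...' framing shown above:

    # Strip the SSE framing, drop the end-of-sequence marker, and join the tokens.
    curl -sN http://${host_ip}:8888/v1/translation \
      -H "Content-Type: application/json" \
      -d '{"language_from": "Hindi", "language_to": "English", "source_language": "आप कैसे हो"}' \
      | sed -n "s/^data: b'\(.*\)'$/\1/p" \
      | grep -v '</s>' \
      | tr -d '\n'; echo
    # Prints: " How are you?"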
    