- Spin up an Ubuntu 24.04 VM and install Docker following the instructions at https://gist.github.com/arun-gupta/7e9f080feff664fbab878b26d13d83d7
- Pull the Docker image:
sudo docker pull opea/codegen:latest
- Replace the HuggingFace API token and the private IP address of the host below, and copy the contents into a file named .env:

export host_ip="172.31.50.223"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
export LLM_MODEL_ID="deepseek-ai/deepseek-coder-6.7b-instruct"
export TGI_LLM_ENDPOINT="http://${host_ip}:8028"
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:7778/v1/codegen"
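Since the file uses plain export statements, it can be sourced into the shell so the variables are available when Docker Compose runs in the steps below; a quick sanity check (the echo line is only illustrative):

# Load the variables into the current shell and confirm substitution
source .env
echo "${BACKEND_SERVICE_ENDPOINT}"   # expect http://172.31.50.223:7778/v1/codegen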
- Download Docker Compose file:
curl -O https://raw.githubusercontent.com/opea-project/GenAIExamples/main/CodeGen/docker_compose/intel/cpu/xeon/compose.yaml
- Start the application:
sudo docker compose -f compose.yaml up -d
- Verify the list of containers:
ubuntu@ip-172-31-50-223:~$ sudo docker container ls
CONTAINER ID   IMAGE                    COMMAND                  CREATED         STATUS         PORTS                                       NAMES
ba99bf66e45b   opea/codegen-ui:latest   "docker-entrypoint.s…"   7 minutes ago   Up 7 minutes   0.0.0.0:5173->5173/tcp, :::5173->5173/tcp   codegen-xeon-ui-server
31a19966946b   opea/codegen:latest      "python codegen.py"      7 minutes ago   Up 7 minutes   0.0.0.0:7778->7778/tcp, :::7778->7778/tcp   codegen-xeon-backend-server
1c1649d31187   opea/llm-tgi:latest      "bash entrypoint.sh"     7 minutes ago   Up 7 minutes   0.0.0.0:9000->9000/tcp, :::9000->9000/tcp   llm-tgi-server
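Before calling the backend, the underlying TGI endpoint can be probed directly. This is a sketch that assumes the TGI service defined in compose.yaml listens on port 8028, per TGI_LLM_ENDPOINT in .env; /health and /generate are TGI's standard routes:

# Returns 200 once the model weights are loaded
curl http://${host_ip}:8028/health
# Minimal request against TGI's native generate API
curl http://${host_ip}:8028/generate \
  -H "Content-Type: application/json" \
  -d '{"inputs": "def fibonacci(n):", "parameters": {"max_new_tokens": 32}}'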
- Access the service using the cURL command. This is currently causing opea-project/GenAIExamples#824:

curl http://${host_ip}:7778/v1/codegen \
  -H "Content-Type: application/json" \
  -d '{"messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."}'
- Pull the Docker image:
sudo docker pull opea/codetrans:latest
- Replace the HuggingFace API token and the IP address of the host below, and copy the contents into a file named .env:

export host_ip="External_Public_IP"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
# Example: NGINX_PORT=80
export NGINX_PORT=${your_nginx_port}
export LLM_MODEL_ID="HuggingFaceH4/mistral-7b-grok"
export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:7777/v1/codetrans"
export FRONTEND_SERVICE_IP=${host_ip}
export FRONTEND_SERVICE_PORT=5173
export BACKEND_SERVICE_NAME=codetrans
export BACKEND_SERVICE_IP=${host_ip}
export BACKEND_SERVICE_PORT=7777
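NGINX_PORT is left as a placeholder above; one way to pin a concrete value and source the file before continuing (the value shown is a hypothetical example, following the comment in the file):

# Hypothetical example value; replace with your own before sourcing
export your_nginx_port=80
source .env
echo "${NGINX_PORT} ${BACKEND_SERVICE_ENDPOINT}"   # quick sanity check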
- Download Docker Compose file:
curl -O https://raw.githubusercontent.com/opea-project/GenAIExamples/main/CodeTrans/docker_compose/intel/cpu/xeon/compose.yaml
- Start the application. This is causing opea-project/GenAIExamples#830:

sudo docker compose -f compose.yaml up -d
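The CodeTrans walkthrough stops here because of the issue above. Once it is resolved, the service should answer a request of the following shape; the field names follow the upstream CodeTrans example and the payload is illustrative:

curl http://${host_ip}:7777/v1/codetrans \
  -H "Content-Type: application/json" \
  -d '{"language_from": "Golang", "language_to": "Python", "source_code": "package main\n\nimport \"fmt\"\n\nfunc main() {\n    fmt.Println(\"Hello, World!\")\n}"}'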
- Pull the Docker image:
sudo docker pull opea/docsum:latest
- Replace the HuggingFace API token and the private IP address of the host below, and copy the contents into a file named .env:

export host_ip="172.31.54.128"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/docsum"
- Download Docker Compose file:
curl -O https://raw.githubusercontent.com/opea-project/GenAIExamples/main/DocSum/docker_compose/intel/cpu/xeon/compose.yaml
- Start the application:
sudo docker compose -f compose.yaml up -d
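The first start downloads the Intel/neural-chat-7b-v3-3 weights, so the TGI container can take a few minutes to become ready; progress can be followed with (tgi-service is the container name shown in the listing below):

sudo docker logs -f tgi-service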
- Verify the list of containers:
ubuntu@ip-172-31-54-128:~$ sudo docker container ls
CONTAINER ID   IMAGE                                                                  COMMAND                  CREATED              STATUS              PORTS                                       NAMES
68ca3c32ecdd   opea/docsum-ui:latest                                                  "docker-entrypoint.s…"   About a minute ago   Up About a minute   0.0.0.0:5173->5173/tcp, :::5173->5173/tcp   docsum-xeon-ui-server
26b0d896b3c7   opea/docsum:latest                                                     "python docsum.py"       About a minute ago   Up About a minute   0.0.0.0:8888->8888/tcp, :::8888->8888/tcp   docsum-xeon-backend-server
bd0606afb0fd   opea/llm-docsum-tgi:latest                                             "bash entrypoint.sh"     About a minute ago   Up About a minute   0.0.0.0:9000->9000/tcp, :::9000->9000/tcp   llm-docsum-server
06d4446bd9b1   ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu   "text-generation-lau…"   2 minutes ago        Up About a minute   0.0.0.0:8008->80/tcp, [::]:8008->80/tcp     tgi-service
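Note that tgi-service maps host port 8008 to the container's port 80. TGI exposes a /health route that returns 200 once the model is loaded, which makes a convenient readiness check before calling the backend:

curl -i http://${host_ip}:8008/health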
- Access the service using the cURL command. This is currently causing opea-project/GenAIExamples#835:

curl http://${host_ip}:8888/v1/docsum \
  -H "Content-Type: application/json" \
  -d '{"messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}'
- Pull the Docker image (instructions should be updated to use pre-built images: opea-project/GenAIExamples#836):

sudo docker pull opea/translation:latest
- Replace the HuggingFace API token and the private IP address of the host below, and copy the contents into a file named .env:

export host_ip="172.31.49.59"   # private IP address
export LLM_MODEL_ID="haoranxu/ALMA-13B"
export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/translation"
- Download Docker Compose file:
curl -O https://raw.githubusercontent.com/opea-project/GenAIExamples/main/Translation/docker_compose/intel/cpu/xeon/compose.yaml
- Start the application:
sudo docker compose -f compose.yaml up -d
- Verify the list of containers:

ubuntu@ip-172-31-49-59:~$ sudo docker container ls

- Check the logs:

2024-09-18T17:04:48.079544Z  INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
2024-09-18T17:04:58.095135Z  INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
2024-09-18T17:05:08.110327Z  INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
2024-09-18T17:05:10.114867Z  INFO text_generation_launcher: Server started at unix:///tmp/text-generation-server-0
2024-09-18T17:05:10.213727Z  INFO shard-manager: text_generation_launcher: Shard ready in 422.780717591s rank=0
2024-09-18T17:05:10.298830Z  INFO text_generation_launcher: Starting Webserver
2024-09-18T17:05:10.440064Z  INFO text_generation_router_v3: backends/v3/src/lib.rs:90: Warming up model
2024-09-18T17:05:39.868187Z  INFO text_generation_launcher: Cuda Graphs are disabled (CUDA_GRAPHS=None).
2024-09-18T17:05:39.868785Z  INFO text_generation_router_v3: backends/v3/src/lib.rs:102: Setting max batch total tokens to 45136
2024-09-18T17:05:39.869908Z  INFO text_generation_router_v3: backends/v3/src/lib.rs:126: Using backend V3
2024-09-18T17:05:39.876810Z  INFO text_generation_router::server: router/src/server.rs:1651: Using the Hugging Face API
2024-09-18T17:05:39.895544Z  INFO hf_hub: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/root/.cache/huggingface/token"
2024-09-18T17:05:40.482470Z  INFO text_generation_router::server: router/src/server.rs:2349: Serving revision 822086d00e1e61b0c3f99bea3577a916b4360001 of model haoranxu/ALMA-13B
2024-09-18T17:05:40.483950Z  INFO text_generation_router::server: router/src/server.rs:1781: Using config Some(Llama)
2024-09-18T17:05:40.483965Z  WARN text_generation_router::server: router/src/server.rs:1783: Could not find a fast tokenizer implementation for haoranxu/ALMA-13B
2024-09-18T17:05:40.483967Z  WARN text_generation_router::server: router/src/server.rs:1784: Rust input length validation and truncation is disabled
2024-09-18T17:05:40.483989Z  WARN text_generation_router::server: router/src/server.rs:1928: Invalid hostname, defaulting to 0.0.0.0
2024-09-18T17:05:40.490701Z  INFO text_generation_router::server: router/src/server.rs:2311: Connected
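The "Shard ready in 422.78s" line above shows the model load can take around seven minutes, so requests sent earlier will fail. A small sketch that blocks until TGI reports healthy (assumes the 8008 port mapping from the compose file):

# Poll TGI's /health route every 10 seconds until it returns success
until curl -sf http://${host_ip}:8008/health > /dev/null; do
  echo "waiting for TGI to become ready..."
  sleep 10
done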
- Access the service using the cURL command:

ubuntu@ip-172-31-49-59:~$ curl http://${host_ip}:8888/v1/translation \
  -H "Content-Type: application/json" \
  -d '{ "language_from": "Hindi","language_to": "English","source_language": "आप कैसे हो "}'
data: b' How'
data: b' are'
data: b' you'
data: b'?'
data: b'</s>'
data: [DONE]
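The response is streamed as server-sent events, one token per data: line. To collapse the stream into plain text, the framing can be stripped; this sed pattern is a sketch tied to the exact data: b'...' format shown in the transcript above:

# Stream the response and join the tokens into a single line
curl -sN http://${host_ip}:8888/v1/translation \
  -H "Content-Type: application/json" \
  -d '{ "language_from": "Hindi","language_to": "English","source_language": "आप कैसे हो "}' \
  | sed -n "s/^data: b'\(.*\)'$/\1/p" | tr -d '\n'
# yields: How are you?</s>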