
@yuekaizhang
Last active March 9, 2025 11:11

UI-TARS Local Deployment

Environment

# First set up an NGC API key; log in with username $oauthtoken and the key as the password
docker login nvcr.io
docker pull nvcr.io/nvidia/tritonserver:25.02-vllm-python-py3
docker run -it --gpus all --name "ui-tars" --net host nvcr.io/nvidia/tritonserver:25.02-vllm-python-py3

Server

huggingface-cli download bytedance-research/UI-TARS-7B-DPO --local-dir UI-TARS-7B-DPO
python3 -m vllm.entrypoints.openai.api_server --served-model-name ui-tars \
                                               --model UI-TARS-7B-DPO \
                                               --limit-mm-per-prompt image=5
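Once the server is up, it serves an OpenAI-compatible API on port 8000 (vLLM's default). A minimal smoke test, sketched below, builds a chat-completion payload and validates it before sending; the file path `/tmp/uitars_req.json` and the prompt text are arbitrary choices for illustration:

```shell
# Build a minimal chat-completion request body; "ui-tars" must match
# the --served-model-name passed to vLLM above.
cat > /tmp/uitars_req.json <<'EOF'
{
  "model": "ui-tars",
  "messages": [{"role": "user", "content": "Describe this screenshot."}]
}
EOF

# Validate the JSON before sending it to the server.
python3 -m json.tool /tmp/uitars_req.json > /dev/null && echo "payload ok"

# Send it once the server is running (uncomment to use):
# curl http://localhost:8000/v1/chat/completions \
#   -H "Content-Type: application/json" \
#   -d @/tmp/uitars_req.json
```

You can also hit `GET http://localhost:8000/v1/models` to confirm the served model name before wiring up the client.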

# In another terminal, attach to the running container:
# docker exec -it ui-tars bash
pip install gradio-tunneling 
# Expose the service port to the public internet
gradio-tun 8000 

Client

Install the Midscene Chrome extension and configure it with:

OPENAI_BASE_URL="https://d3bd34ef6463c5e82e.gradio.live/v1"
OPENAI_API_KEY="empty" 
MIDSCENE_MODEL_NAME="ui-tars"
MIDSCENE_USE_VLM_UI_TARS=1
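The `gradio.live` URL above is the public address printed by gradio-tun; yours will differ. A quick sanity check (a sketch, not part of Midscene) is to confirm the base URL ends in `/v1`, since the OpenAI-compatible API is served under that prefix:

```shell
# Replace with the URL gradio-tun printed for your tunnel.
OPENAI_BASE_URL="https://d3bd34ef6463c5e82e.gradio.live/v1"

# The OpenAI-compatible routes live under /v1; warn if the suffix is missing.
case "$OPENAI_BASE_URL" in
  */v1) echo "base url ok" ;;
  *)    echo "base url should end with /v1" ;;
esac
```

With the config in place, Midscene routes its UI-TARS requests through the tunnel to the local vLLM server.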