@GOROman
Last active January 15, 2025 02:47
Have a VLM describe an image using MLX + MLX_VLM + Qwen2-VL-2B-Instruct-4bit
# /// script
# requires-python = "==3.12"
# dependencies = ["mlx==0.21.0", "mlx_vlm"]
# ///
import mlx.core as mx
import numpy as np
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config
# Load the model
model_path = "mlx-community/Qwen2-VL-2B-Instruct-4bit"
model, processor = load(model_path)
config = load_config(model_path)
# Prepare input
image = ["yellow-hage.jpg"]
prompt = "Describe this image."
# Apply chat template
formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=len(image)
)
# Generate output
output = generate(model, processor, formatted_prompt, image, verbose=True, dtype=np.float32)
print(output)
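For quick experiments, the same calls can be parameterized from the command line. A minimal sketch (the file name mlx_vlm_cli.py and the argument handling are illustrative, not from the gist):

# /// script
# requires-python = "==3.12"
# dependencies = ["mlx==0.21.0", "mlx_vlm"]
# ///
import sys
import numpy as np
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Same model as the gist
model_path = "mlx-community/Qwen2-VL-2B-Instruct-4bit"
model, processor = load(model_path)
config = load_config(model_path)

# Usage: uv run mlx_vlm_cli.py <image> [prompt]
image = [sys.argv[1]]
prompt = sys.argv[2] if len(sys.argv) > 2 else "Describe this image."

formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=len(image)
)
output = generate(model, processor, formatted_prompt, image, verbose=True, dtype=np.float32)
print(output)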
kinneko commented Jan 14, 2025

When called repeatedly, it tends to reply in Chinese. Forcing the language in the prompt got it to output Japanese:

prompt = "この画像を詳細に説明してください。日本語で応答してください。"

GOROman commented Jan 15, 2025

# /// script
# requires-python = "==3.12"
# dependencies = ["mlx_vlm"]
# ///

I added this to the top of the script, so it can now be run with uv run mlx_vlm_test.py.

GOROman commented Jan 15, 2025

640x640 resized version (makes generation faster):

yellow-hage.jpg (resized image attachment)
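The resize step itself isn't shown in the gist; a minimal sketch, assuming Pillow is installed:

from PIL import Image

# Downscale the input; fewer pixels means fewer vision tokens, so generation is faster.
img = Image.open("yellow-hage.jpg")
img.resize((640, 640)).save("yellow-hage-640.jpg")

Then point the script's image list at the resized file, e.g. image = ["yellow-hage-640.jpg"].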

GOROman commented Jan 15, 2025

With

dependencies = ["mlx==0.21.0", "mlx_vlm"]

it works.
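Putting the two metadata comments together, the complete inline header at the top of the script (matching the code above) is:

# /// script
# requires-python = "==3.12"
# dependencies = ["mlx==0.21.0", "mlx_vlm"]
# ///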
