Created March 14, 2025 18:28
A multimodal twist on Simon Willison’s “[Pelican on a Bicycle](https://github.com/simonw/pelican-bicycle)” LLM benchmark: after an LLM generates an SVG of a pelican on a bicycle, we convert the SVG to a PNG image and ask the same LLM to describe what it sees. The ideal result is that the LLM recognizes a pelican on a bicycle.
import llm
import re
import cairosvg

model = llm.get_model("gemma3:27b")  # <- Swap in whatever model you want to test here
response = model.prompt("Generate an SVG of a pelican riding a bicycle")

# Extract the SVG code from the response
svg_match = re.search(r'<svg.*?</svg>', response.text(), re.DOTALL)
if svg_match:
    svg_code = svg_match.group(0)
    print("Extracted valid SVG code...")
    with open("pelican_bicycle.svg", "w") as f:
        f.write(svg_code)
else:
    print("No SVG code found in the response.")
    exit()

# Convert to PNG
cairosvg.svg2png(url="pelican_bicycle.svg", write_to="pelican_bicycle.png")

# Generate the image description
response = model.prompt(
    "Describe this image in one sentence.",
    attachments=[
        llm.Attachment(path="pelican_bicycle.png")
    ]
)
print(response)
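
The script above prints the description and leaves it to the reader to judge whether the model recognized its own drawing. As a rough sketch of how that check could be automated, the helper below does a simple keyword match on the description. The function name `recognizes_pelican_on_bicycle` and the keyword list are assumptions made for illustration and are not part of the original gist; a keyword match will also miss paraphrases such as "a large-billed seabird on a two-wheeler".

```python
import re


def recognizes_pelican_on_bicycle(description: str) -> bool:
    """Return True if the description mentions both a pelican and a bicycle.

    Hypothetical helper: a plain keyword check, not a semantic comparison.
    """
    text = description.lower()
    has_pelican = re.search(r"\bpelicans?\b", text) is not None
    # Accept "bike" as a synonym for "bicycle" (an assumption).
    has_bicycle = re.search(r"\b(bicycles?|bikes?)\b", text) is not None
    return has_pelican and has_bicycle
```

In the script, this could be called as `recognizes_pelican_on_bicycle(response.text())` after the final prompt to turn the run into a pass/fail result.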