@dbreunig
Created March 14, 2025 18:28
A multimodal twist on Simon Willison’s “[Pelican on a Bicycle](https://github.com/simonw/pelican-bicycle)” LLM benchmark: after an LLM generates an SVG of a pelican on a bicycle, we convert the SVG to a PNG and ask the same LLM to describe what it sees. The ideal result is that the LLM recognizes a pelican on a bicycle.
import llm
import re
import cairosvg
model = llm.get_model("gemma3:27b") # <- Swap in whatever model you want to test here
response = model.prompt("Generate an SVG of a pelican riding a bicycle")
# Extract the SVG code from the response
svg_match = re.search(r'<svg.*?</svg>', response.text(), re.DOTALL)
if svg_match:
    svg_code = svg_match.group(0)
    print("Extracted SVG code...")
    with open("pelican_bicycle.svg", "w") as f:
        f.write(svg_code)
else:
    # Stop here with a non-zero exit status; there is nothing to render
    raise SystemExit("No SVG code found in the response.")
# Convert to PNG
cairosvg.svg2png(url="pelican_bicycle.svg", write_to="pelican_bicycle.png")
# Generate the image description
response = model.prompt(
    "Describe this image in one sentence.",
    attachments=[
        llm.Attachment(path="pelican_bicycle.png")
    ]
)
print(response.text())
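
The script leaves it to the reader to judge whether the description counts as "a pelican on a bicycle". A minimal sketch of automating that check is below. It assumes a plain keyword match is good enough, and the helper name `mentions_pelican_on_bicycle` is hypothetical, not part of the original gist or the `llm` library.

```python
import re


def mentions_pelican_on_bicycle(description: str) -> bool:
    """Crude keyword check (an assumption, not the gist's method):
    True if the text names both a pelican and a bicycle or bike."""
    text = description.lower()
    has_pelican = re.search(r"\bpelicans?\b", text) is not None
    has_bicycle = re.search(r"\b(bicycles?|bikes?)\b", text) is not None
    return has_pelican and has_bicycle
```

In the script above this could be called as `mentions_pelican_on_bicycle(response.text())`. A keyword match will miss paraphrases such as "a seabird on a two-wheeler", so treat a negative result as a prompt to read the description rather than as a verdict.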