Created February 8, 2025 21:20
Ollama + Vision Example
from langchain_ollama import OllamaLLM
from langchain_core.messages import HumanMessage, SystemMessage
import base64

text_prompt = """
You are a robot for a homeowners insurance underwriter.
You observe and record the physical characteristics of residential property from aerial imagery.
Response Format: JSON
Fill out the following JSON object that contains the following attributes of the given image.
Do not include any other text. Do not include introduction or explanation. Return JSON only.
For the “hazards” field, include anything that could potentially damage the roof or surrounding property.
Return: object
Fields:
roof_color: <string>
roof_shape: <string>
roof_primary_material: <string>
roof_architecture_style: <string>
roof_maintenance_condition: <string>
hazards: [ <string> ]
roof_chimneys: <integer>
roof_dormers: <integer>
"""

# Read the image and base64-encode it; decode the bytes to a UTF-8 string
# so the payload can be passed along as text.
with open('testhouse.png', 'rb') as test_image:
    #image_data = base64.b64encode(test_image.read())
    image_data = base64.b64encode(test_image.read()).decode('utf-8')

model = OllamaLLM(model="llama3.2-vision:11b-instruct-fp16")
# Bind the encoded image so it is sent with every call on this model.
model_with_image = model.bind(images=[image_data])

# Chat-style multimodal message. Not used below: invoke() receives the
# plain-text prompt, and the image is supplied through bind() above.
message = HumanMessage(
    content=[
        {"type": "text", "text": "describe this image"},
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{image_data}"},
        },
    ],
)
#print(message)

print(model_with_image.invoke(text_prompt))
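
The prompt above asks for JSON only, but the script just prints whatever string comes back. Below is a minimal sketch of turning that string into a typed object using only the standard library. The names RoofReport and parse_roof_report are hypothetical and not part of the gist, and sample_reply is invented input for illustration, not real model output. The sketch assumes the reply may wrap the JSON in extra text and that counts may arrive as strings.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RoofReport:
    """Hypothetical container for the fields the prompt requests."""
    roof_color: str
    roof_shape: str
    roof_primary_material: str
    roof_architecture_style: str
    roof_maintenance_condition: str
    hazards: list = field(default_factory=list)
    roof_chimneys: int = 0
    roof_dormers: int = 0


def parse_roof_report(raw: str) -> RoofReport:
    """Load the span from the first '{' to the last '}' in the reply as JSON."""
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(raw[start:end + 1])
    # Keep only the keys the prompt asked for; ignore anything extra.
    known = {k: data[k] for k in RoofReport.__dataclass_fields__ if k in data}
    # A missing required string field raises TypeError here.
    report = RoofReport(**known)
    # Assumption: counts may come back as strings, so coerce them to int.
    report.roof_chimneys = int(report.roof_chimneys)
    report.roof_dormers = int(report.roof_dormers)
    return report


# Invented reply text, used only to exercise the parser.
sample_reply = '''Here is the result:
{"roof_color": "gray", "roof_shape": "gable",
 "roof_primary_material": "asphalt shingle",
 "roof_architecture_style": "ranch",
 "roof_maintenance_condition": "good",
 "hazards": ["overhanging tree"], "roof_chimneys": "1", "roof_dormers": 0}
'''
report = parse_roof_report(sample_reply)
print(report.hazards, report.roof_chimneys)
```

In the script above, the same function could be applied to the return value of model_with_image.invoke(text_prompt) instead of printing it directly.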
If you're suggesting that you're considering CrewAI, it's pretty straightforward, although I do believe it uses LangChain under the hood (or at least it used to). Pydantic works well with it. I haven't had much luck with local models, but that's a hardware issue on my end. The OpenAI API has made it pretty easy to get around that, though.