
@reedlaw
Created April 8, 2023 23:33
import csv

from langchain.chains import LLMChain
from langchain.llms import LlamaCpp
from langchain.prompts import FewShotPromptTemplate, PromptTemplate
# Few-shot examples pairing products with expense categories.
examples = [
    {"product": "Toothpaste", "category": "Health:Dental"},
    {"product": "Toilet Flapper", "category": "Home:Maintenance"},
    {"product": "Laptop Stand", "category": "Computer:Accessories"},
    {"product": "Pressure Cooker", "category": "Kitchen:Appliances"},
    {"product": "T-shirt", "category": "Clothing"},
    {"product": "Bananas", "category": "Grocery"},
]
# Local 4-bit quantized Alpaca 7B model served through llama.cpp.
llm = LlamaCpp(model_path="../llama.cpp/models/ggml-alpaca-7b-q4.bin")
# Template for rendering each example; separation between examples is
# handled by example_separator below, so no trailing newlines are needed.
example_formatter_template = """Product: {product}
Category: {category}"""

example_prompt = PromptTemplate(
    input_variables=["product", "category"],
    template=example_formatter_template,
)
few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Give the category of every product",
    suffix="Product: {product}\nCategory:",
    input_variables=["product"],
    example_separator="\n\n",
)
llm_chain = LLMChain(
    llm=llm,
    prompt=few_shot_prompt,
    verbose=True,
)
# Categorize every product in the Amazon order-history export.
with open('../../my/finances/amz.csv', 'r') as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(llm_chain.predict(product=row['Product Name']))
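For reference, a minimal sketch of the prompt this template renders for a single product, using the objects defined above (output abridged; the exact text depends on the prefix, separator, and examples):

# Render the few-shot prompt for one product without calling the model.
print(few_shot_prompt.format(product="Bananas"))
# Give the category of every product
#
# Product: Toothpaste
# Category: Health:Dental
# ...
# Product: Bananas
# Category: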
reedlaw commented Apr 8, 2023

This script takes an Amazon order-history CSV and categorizes each product for expense-tracking software such as Beancount (a sketch of that hand-off follows the sample output below). An example run produced:

Product: LEVN Bluetooth Headset with Microphone, Trucker Bluetooth Headset with AI Noise Cancelling & Mute Button, Wireless On-Ear Headphones 60 Hrs Working Ti
Category:

llama_print_timings:        load time =   534.18 ms
llama_print_timings:      sample time =    10.94 ms /    23 runs   (    0.48 ms per run)
llama_print_timings: prompt eval time = 15154.56 ms /   167 tokens (   90.75 ms per token)
llama_print_timings:        eval time =  5072.16 ms /    22 runs   (  230.55 ms per run)
llama_print_timings:       total time = 20240.41 ms

> Finished chain.
 Electronics:Headsets
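
Note that the completion arrives with a leading space (as in the ` Electronics:Headsets` line above), so it is worth calling `.strip()` on the prediction before using it. Below is a hedged sketch of the Beancount hand-off; it is not part of the gist, and the column names, account layout, and date handling are assumptions about a typical Amazon order-history export.

# Hypothetical sketch: turn one categorized CSV row into a Beancount entry.
# "Order Date" and "Item Total" columns and the Liabilities account are
# assumptions; Beancount also expects dates in YYYY-MM-DD format, so the
# CSV date may need reformatting first.
def to_beancount(row, category):
    amount = row["Item Total"].lstrip("$")
    return (
        f'{row["Order Date"]} * "Amazon" "{row["Product Name"]}"\n'
        f"  Expenses:{category}  {amount} USD\n"
        f"  Liabilities:CreditCard  -{amount} USD\n"
    )

# Usage inside the loop above:
#   category = llm_chain.predict(product=row['Product Name']).strip()
#   print(to_beancount(row, category))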
