Skip to content

Instantly share code, notes, and snippets.

@tjwebb
Created March 20, 2025 06:02
Show Gist options
  • Save tjwebb/217174924cd6489bbcefd4e729833236 to your computer and use it in GitHub Desktop.
Save tjwebb/217174924cd6489bbcefd4e729833236 to your computer and use it in GitHub Desktop.
mapping:
parcel_number,parcel_id
account_number,None
land_use_code,use_type
land_use_description,prop_type
zoning_code,None
num_buildings,None
num_stories,floor_count
year_built,dwelling_year_built
main_building_sqft,square_footage
improvement_value,impv
land_value,land_value
agriculture_value,None
total_value,total_value
last_sale_amount,total_value
owner_type,None
owner_name,owner_name
legal_description,legald_description
source_updated,None
create or replace function llm_prompt(model text, prompt text) returns text as $$
from ollama import Client
client = Client(host='http://ollama_harmonizer:11434')
response = client.generate(model=model, prompt=prompt, stream=False, options={ 'num_ctx': 5000 })
plpy.info(response)
return response['response']
$$ language plpython3u parallel safe;
select
column_name as source_column,
llm_prompt('gemma3:27b', parcel_llm_get_harmonize_prompt('parcel_1_ozamwt', column_name)) as target_column
from information_schema.columns
where
and table_name = 'parcel_training';
You are a database administrator.
You have a database table named "old_land_parcel". It has the following columns:
parcel_number,account_number,land_use_code,land_use_description,zoning_code,num_buildings,num_stories,year_built,main_building_sqft,improvement_value,land_value,agriculture_value,total_value,last_sale_amount,owner_type,owner_name,legal_descr
You also have a table called new_land_parcel. It has the following columns:
source_level,fid,objectid,lrs n,parcel_id,extension,extbldgnum,ext_description,parent_parcel_id,routing_number,leg_description,prop_street,prop_city,prop_state,prop_zip,owner1,own_street,own_city,own_state,own_zip,imagepath,sketchpath,strtnum
The following 4 rows of data are from the table "old_land_parcel":
00000003814600,0980349587,0000,Vacant,AXX-1,1,2,1999,1450,100000,75000,0,175000,199000,Individual,Panther Montgomery,LOT89 BLK A,2023-01-01
384-99-11111,0009919834,None,Commercial,HC-3,2,2,1922,3600,250000,100000,0,350000,410000,Government,State of Washington,PLAT 52 Part of B,2024-03-01
03814600,1540498893,0100,Single-Family,HC-1,1,2,1956,2450,150000,75000,0,225000,237000,Individual,Mowgli Jones,LOT89 BLK A,2024-01-01
239847,3245890,ABC,ALL,9990,1,2,2008,2450,150000,75000,0,225000,237000,Individual,Max Powers,None,2024-09-01
The following 4 rows of data are from the table "new_land_parcel":
1,3621,None,3622,9746,06910700,R01,1,,,1449718102,N 1/2-2 Blk 31 Brandon 2,3480 Johns St,Norfolk,VA,23513-1655,Martin, Herman W & Betty Jo,1440 Braden Cres,Norfolk,VA,23502,001\0691\0700R012.jpg,00000011\00000586.jpg,3480,,,Johns,ST,,,Norfo
1,31294,None,31295,41645,30337950,R01,1,,,1449592499,12 Blk 3 (Larrymore Lawns Sec 6),7437 Spartan Av,Norfolk,VA,23518-4333,Montano Living Trust,7437 Spartan Ave,Norfolk,VA,23518-4333,001\3033\7950R012.jpg,00000039\00000763.jpg,7437,,,Sparta
1,23582,None,23583,32817,23785900,None,None,None,,1438430124,617 25 Ft,2421 Middle Av,Norfolk,VA,23504,Butts, Brittany E,2425 Middle Ave,Norfolk,VA,23504-2030,None,None,2421,,,Middle,AV,,,Norfolk,23504,0,2581,0.0593,2581,0474,None,500,500 R
1,50294,None,50295,63330,46201360,R01,1,,,1438665154,15-16 Blk 19 Lafayette Res Pk Co,3026 Racine Av,Norfolk,VA,23509-1137,Beard, Olivia Et Al,3026 Racine Ave,Norfolk,VA,23509,001\4620\1360R012.jpg,00000058\00000442.jpg,3026,,,Racine,AV,,,No
Some known synonyms to "parcel_number" are:
- pid
- apn
- parcelno
Some known wrong matches (antonyms) to "parcel_number" are:
- fid
- house_number
Your job is to determine which column in new_land_parcel most closely matches the column "parcel_number" from table old_land_parcel.
Consider each the following independently of each other:
- The column must exist in the table new_land_parcel. Respond with 'None' if there is no good match.
- Consider the likely meaning of the column name "parcel_number" and the semantic similarities.
- Consider lexical similarity (e.g. levenshtein distance) between the name "parcel_number" and the names of the columns in the table old_land_parcel.
- Consider the provided synonyms and antonyms
- Your answer must be a named column from the table new_land_parcel that was provided earlier.
- Additionally, consider the sample rows from each table in CSV format provided above.
What is the name of the column in the table new_land_parcel most closely matches the column "parcel_number"?
If none of the columns in new_land_parcel match "parcel_number", respond with 'None'.
Respond with the column name only. Do not include any reasoning or other information.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment