@etozzato
Created January 18, 2020 19:03
California Bureau of Cannabis Control (BCC) License CSV Parser
require 'csv'
require 'date' # Date.strptime is used in #license_attributes
# LicenseReader receives a CSV file containing business license information. Each US state publishes a different format;
# the example below is for California. Integrated into the LIMS, the LicenseReader lets the laboratory type a license number such as
# `C11-0001123-LIC` and get back attribute data structures compatible with the creation of the Client and License models.
# The output must stay constant, while the input must be parsed with several regular expressions, depending on the data the
# regulatory agency makes available. The class must be fault tolerant when data is incomplete. The Business Contact Information
# column in the example is particularly badly structured!
#
# "License Number","License Type","Business Owner","Business Contact Information","Business Structure","Premise Address","Status","Issue Date","Expiration Date","Activities","Adult-Use/Medicinal"
# "C11-0001129-LIC","Cannabis - Distributor License","RICHARD SMITH","CLH Distribution, LLC : Email- [email protected] : Phone- 8057203518 ","undefined","LOMPOC, CA 934366120 County: SANTA BARBARA ","Active","12/17/2019","12/16/2020","N/A for this license type","BOTH"
# "C11-0001130-LIC","Cannabis - Distributor License","Steven Dang","XTRACTA DISTRIBUTION, LLC : Email- [email protected] : Phone- 6196345267 ","undefined","San Diego, CA 92126 County: SAN DIEGO ","Active","12/17/2019","12/16/2020","N/A for this license type","BOTH"
# "C11-0001123-LIC","Cannabis - Distributor License","Douglas Fierro","Humboldt Hills Distribution : Email- [email protected] : Phone- 2138001789 ","undefined","SHASTA LAKE, CA 96019 County: SHASTA ","Active","12/13/2019","12/12/2020","N/A for this license type","BOTH"
# "C11-0001122-LIC","Cannabis - Distributor License","Leo Shlovsky","ONE LED CA LLC : Green Revolution : Email- [email protected] : Phone- 9177545717 ","undefined","DESERT HOT SPRINGS, CA 92240 County: ","Active","12/12/2019","12/11/2020","N/A for this license type","BOTH"
# "C11-0001121-LIC","Cannabis - Distributor License","Austin Lubin","PENINSULA DISTRIBUTION : Email- [email protected] : Phone- 8313244917 ","undefined","SEASIDE, CA 939554321 County: MONTEREY ","Active","12/10/2019","12/09/2020","N/A for this license type","BOTH"
# "C11-0001120-LIC","Cannabis - Distributor License","Daniel Nathanson","SWEETWATER CANYON DEVELOPMENT, LLC : Malibu Gold : Email- [email protected] : Phone- 4243351086 ","undefined","SANTA ANA, CA 927074236 County: ORANGE ","Active","12/04/2019","12/03/2020","N/A for this license type","BOTH"
#
# Example of use
#   lr = LicenseReader.new(path: csv, regs: 'ca')
#   lr.query('C11-0001121-LIC') # => true
#   lr.client_attributes        # => {:business_name=>"PENINSULA DISTRIBUTION", ...
#   lr.license_attributes       # => {:name=>"PENINSULA DISTRIBUTION", ...
#
# Parse the whole file
#
#   lr.read_lines
#   lr.lines.each do |line|
#     next if line[0] == 'License Number'
#
#     if lr.query(line[0])
#       puts "\nclient_attributes"
#       puts lr.client_attributes
#       puts "\nlicense_attributes"
#       puts lr.license_attributes
#       puts
#     end
#   end
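class LicenseReader
  attr_accessor :path, :lines, :data, :regs, :match
  # `query` is also the name of the lookup method below, so only the writer is generated here.
  attr_writer :query

  def initialize(path:, regs:)
    self.path = path
    self.regs = regs
    self.lines = []
    self.data = {}
  end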

  # Loads every row of the CSV file into `lines`.
  def read_lines
    self.lines = CSV.read(path)
  end
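
  # Looks up the first line of the file that contains `q`; returns true when one is found.
  def query(q)
    self.query = nil
    # Scan the file in Ruby instead of shelling out to `grep`, so `q` and `path`
    # are never interpolated into a shell command.
    self.match = File.foreach(path).find { |line| line.include?(q) }.to_s
    return false if match == ''

    self.query = q
    true
  end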
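
  # Attributes for the Client model, built from the last matched line.
  def client_attributes
    # `match` is nil until #query has run, so guard against both nil and ''.
    return {} if match.to_s.empty?

    parse_line
    {
      business_name: data[:business_name],
      business_type: data[:business_type],
      url: data[:website],
      first_name: data[:first_name],
      last_name: data[:last_name],
      address: data[:location],
      phone1: data[:phone],
      email1: data[:email]
    }
  end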
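
  # Attributes for the License model, built from the last matched line.
  def license_attributes
    # `match` is nil until #query has run, so guard against both nil and ''.
    return {} if match.to_s.empty?

    parse_line
    {
      name: data[:dba] || data[:business_name],
      address: data[:location],
      number: data[:license_number],
      license_type: data[:license_type],
      verified: true,
      expires_on: Date.strptime(data[:expires], '%m/%d/%Y').to_s,
      deleted: false
    }
  end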

  # Breaks the matched line into named fields and stores them in `data`.
  def parse_line
    case regs
    when 'ca', 'california'
      # Drop the trailing newline and closing quote so the last field comes out clean,
      # then split on the `","` separators between quoted fields.
      license_number,
      license_type,
      name,
      business,
      business_type,
      location,
      _,
      assigned,
      expires,
      _,
      category = match.strip.delete_suffix('"').split(/\s*","\s*/)

      # Tolerate short or incomplete rows: missing fields become empty strings.
      license_number = license_number.to_s.delete('"')
      business = business.to_s
      business_array = business.split(/\s+:\s+/)

      # Each labelled value is optional; String#[] returns nil when the label is absent.
      email = business[/Email- (\S+)/, 1]
      phone = business[/Phone- (\S+)/, 1]
      website = business[/Website- (\S+)/, 1]

      business_name = business_array[0]
      dba = business_array[1]
      # The second segment is a DBA only when it is not one of the labelled values.
      dba = nil if dba =~ /Email|Phone|Website/

      # Keep everything after the first word as the last name.
      first_name, last_name = name.to_s.split(/\s+/, 2)

      self.data = {
        license_number: license_number,
        license_type: license_type,
        name: name,
        business: business,
        business_type: business_type,
        location: location,
        assigned: assigned,
        expires: expires,
        category: category,
        dba: dba,
        business_name: business_name,
        email: email,
        phone: phone,
        website: website,
        first_name: first_name,
        last_name: last_name
      }
    end
  end
end
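
The header comment singles out the Business Contact Information column as the hardest one to parse. As a minimal sketch of one fault-tolerant approach (not the author's code), the hypothetical helper `parse_contact` below splits the cell on ` : `, treats segments labelled `Email-`, `Phone-` or `Website-` as fields, and treats everything else as business names. The helper name and the `info@example.com` address used in the note that follows are assumptions made for illustration.

```ruby
# Hypothetical helper, not part of the original LicenseReader.
# Splits a "Business Contact Information" cell into named parts and
# returns nil for anything the agency left out.
def parse_contact(cell)
  parts = cell.to_s.split(/\s+:\s+/).map(&:strip).reject(&:empty?)

  # Labelled segments look like "Email- value"; every other segment is a name.
  labelled, names = parts.partition { |part| part.match?(/\A(Email|Phone|Website)-/i) }

  fields = labelled.each_with_object({}) do |part, acc|
    # Split on the first hyphen only, so hyphens inside the value survive.
    label, value = part.split(/-\s*/, 2)
    acc[label.downcase.to_sym] = value unless value.to_s.empty?
  end

  {
    business_name: names[0],
    dba: names[1],
    email: fields[:email],
    phone: fields[:phone],
    website: fields[:website]
  }
end
```

Called with `'ONE LED CA LLC : Green Revolution : Email- info@example.com : Phone- 9177545717 '`, the sketch reports `Green Revolution` as the DBA and leaves `:website` as `nil`. Partitioning by label, rather than by position, means a missing DBA or a reordered segment does not shift the other fields.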