Created
October 19, 2017 20:09
At some point in 2016 I wanted to figure out which car would have the highest ROI if I were to rent it out online. This J file analyzes a CSV file returned by a web scraper, cleans up the data, links renters to cars, and tallies the number of times each vehicle gets rented over time.
NB. Additional Helper Verbs
jdw1 =. (i(0 0 f) i (1 0 f) jd o ('reads count w1 from make where w1 = "'c , ] , '"'c))
space =. (>o[,' 'c,>o])/
getCSV =. i readcsv H,'/CSV/'c,],'.csv'c
shape20 =. (],(' 'c)#~(20 c)-$o])
isVerb =. 4!:0&bo
s =.(]$~1:,#)
NB. All Header Operations
removeTrashHdrs =.(}.o],~(1:,#o{.o])$(]-.(a. c)-.'-ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'c) e o {.)
replaceHdrs =. (}.o],~(1:,#o{.o])$(('null'c)`(rtb"1 o (1 0 f) o jd o ('reads new from headers where old = "'c,],'"'c))@.((0 0 f) o (1 0 f) o (jd o ('reads count old from headers where old = "'c , ] , '"'c))))e o {.)
removeNullHdrs =. ({"1~ i<i<i<i I.(i<'null'c)={.)
cleanHdrs =.(}.o],~rtb e o (i<"1,/o>o{.))
header =. cleanHdrs o removeNullHdrs o replaceHdrs o removeTrashHdrs
NB. Operations Per Column, Must Match Column Name
FreeDelivery =. ((('no 'c)`('yes'c)@.(*o+/)o(]=i<'FREE'c))"1 o;:o(]-.LF c)o> e o ])
Url =. (('Category';'City';'CarID'),i>((i<i _7&}.2 f),(3&{),(i{. '?'Tokenize _1 f)) o ('/'Tokenize]) e o }.o])
MakeModel =. (('Make'c;'Model'c) , >o(({. , bo o space o }.)`((i<i space 2&}.) ,~i<i space 2&{.) @. (jdw1 o (0 f)) e o ]) o ((a:c-.~ ,o>o(;: e o ('-'Tokenize ]))) e o }.o]))
Year =. ]
DayPrice =. ]
DayMiles =. (('-1'c) t (]-:'Untd'c)o(]-.', miles'c)L:0 o ])
WeekMiles =. (('-1'c) t (]-:'Untd'c)o(]-.', miles'c)L:0 o ])
MonthMiles =. (('-1'c) t (]-:'Untd'c)o(]-.', miles'c)L:0 o ])
trips =. ((]-.' trips'c) ` ('0'c) @. (]-:'null'c)L:0 o ])
Instant =. (('Yes'c)`('No 'c)@.(]-:'Rent this car'c)L:0 o ])
Age =. (((3 f) o ;:)`('0'c)@.(]-:'null'c)L:0 o ])
OwnerID =. ((_1 f)o ('/'Tokenize])L:0 o ])
OwnerName =. ]
Joined =. ]
TotalTrips =. ((]-.' trips'c) ` ('0'c) @. (]-:'null'c)L:0 o ])
NB. Simplification Verbs
group1 =. (i|:({.,i,/(0 f) apply }.)"1 o ({~ i<i<i<1 2 c) o |:)
group2 =.(('MakeModel'apply 2&{) ,.'Url'apply 1&{) o |:
body =. group2 ,. group1
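NB. Worked example for the DayMiles / WeekMiles / MonthMiles cleanup above.
NB. This is only a sketch: it uses standard J primitives rather than the
NB. aliases (o, c, t, ...) defined elsewhere in this project, and the names
NB. stripMiles and toMiles are hypothetical, not part of the original script.
NB. Removing the characters of ', miles' also strips l, i, m and e from a
NB. value such as 'Unlimited', which would explain the comparison with 'Untd'.
stripMiles =. -.&', miles'
stripMiles '1,000 miles'   NB. '1000'
stripMiles 'Unlimited'     NB. 'Untd'
NB. Assumed intent: unlimited mileage is mapped to the sentinel '-1'.
toMiles =. stripMiles`('-1'"_)@.('Untd' -: stripMiles)
toMiles 'Unlimited'        NB. '-1'
toMiles '200 miles'        NB. '200'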