Andy Clarke andymithamclarke

lng lat
-74.1179903701844 4.75055095371542
-74.1179903701844 4.75055095371542
-74.1179903701844 4.75055095371542
-74.1179903701844 4.75055095371542
-74.1179903701844 4.75055095371542
-46.632005 -23.519901
-46.725853 -23.548752
-46.615347 -23.650206
-46.508786 -23.482543
andymithamclarke / random-brazil-coordinates.csv
Created April 14, 2022 12:50
Randomly Generated Coordinates in Brazil
lng,lat
-53.64057227283197,-15.150617321621077
-42.99421760836617,-11.145332942844876
-44.41791507114796,-8.157790364362803
-43.66038973901276,-21.148947704964765
-50.294873612911196,-7.517881971758156
-39.23700952309843,-9.933767804515153
-56.39064428015981,-4.383264074730363
-45.240887220743545,-13.103282452658972
-42.645435327462764,-21.757497469260148
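The gist does not say how these coordinates were produced. As a minimal sketch of one possible approach, the Python below draws points uniformly from a rough bounding box around Brazil and writes them out with the same `lng,lat` header. The bounding-box values and the function names `random_coordinates` and `to_csv` are assumptions made for illustration, not taken from the gist.

```python
import csv
import io
import random

# Rough bounding box around Brazil (an assumption for illustration, not taken
# from the gist): (min_lng, min_lat, max_lng, max_lat).
BRAZIL_BBOX = (-73.99, -33.75, -34.79, 5.27)


def random_coordinates(n, bbox=BRAZIL_BBOX, seed=None):
    """Return n (lng, lat) pairs drawn uniformly from the bounding box."""
    rng = random.Random(seed)
    min_lng, min_lat, max_lng, max_lat = bbox
    return [
        (rng.uniform(min_lng, max_lng), rng.uniform(min_lat, max_lat))
        for _ in range(n)
    ]


def to_csv(points):
    """Serialise points as CSV text with a lng,lat header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["lng", "lat"])
    writer.writerows(points)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(random_coordinates(5, seed=42)))
```

A bounding box also covers ocean and parts of neighbouring countries, so a real generator would likely need to filter the samples against a border polygon of Brazil. That step is left out here.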
andymithamclarke / notebook.ipynb
Last active February 2, 2022 15:05
Reads a list of search queries generated by Google AdWords Keyword Planner and runs a search for each one, returning the top 10 URLs per query.
andymithamclarke / trustpilot-scraping.ipynb
Last active January 11, 2022 10:20
Notebook for scraping reviews from Trustpilot
andymithamclarke / customer-segmentation.csv
Created October 27, 2021 08:11
Customer Segmentation Example Dataset
Recency,MntWines,MntFruits,MntMeatProducts,MntFishProducts,MntSweetProducts,MntGoldProds,NumDealsPurchases,NumWebPurchases,NumCatalogPurchases,NumStorePurchases,NumWebVisitsMonth,Year_Birth,Education,Marital_Status,Income,Kidhome,Teenhome,Dt_Customer,AcceptedCmp3,AcceptedCmp4,AcceptedCmp5,AcceptedCmp1,AcceptedCmp2,Complain,Response,umap_cluster,month_name,weekday_name
58,635,88,546,172,88,88,3,8,10,4,7,1957-01-01T00:00:00Z,Graduation,Single,58138,0,0,2012-09-04T00:00:00Z,false,false,false,false,false,false,true,Cluster 1,January,Tuesday
38,11,1,6,2,1,6,2,1,1,2,5,1954-01-01T00:00:00Z,Graduation,Single,46344,1,1,2014-03-08T00:00:00Z,false,false,false,false,false,false,false,Cluster 12,January,Friday
26,426,49,127,111,21,42,1,8,2,10,4,1965-01-01T00:00:00Z,Graduation,Together,71613,0,0,2013-08-21T00:00:00Z,false,false,false,false,false,false,false,Cluster 2,January,Friday
26,11,4,20,10,3,5,2,2,0,4,6,1984-01-01T00:00:00Z,Graduation,Together,26646,1,0,2014-02-10T00:00:00Z,false,false,false,false,false,false,false,C
andymithamclarke / graphext-vaccines-nlp-steps.txt
Created June 8, 2021 17:58
NLP Steps Added to VAERS Study Project
# Configure English as language of text
make_constant(ds["Symptom Description"], {
"value": "en",
"out_type": "category"
}) -> (ds.lang)
# Parse and extract ADJECTIVES from SYMPTOM DESCRIPTION column.
extract_keywords(ds["Symptom Description"], ds.lang, {
"keywords": {
"pos_tags": [
andymithamclarke / 💉 VAERS 2021 | Cleaning & Joining.ipynb
Last active June 9, 2021 08:45
A notebook documenting the way a team at Graphext cleaned and joined data from the 2021 VAERS wave.
id gender age hypertension heart_disease ever_married work_type Residence_type avg_glucose_level bmi smoking_status stroke
9046 Male 67 0 1 Yes Private Urban 228.69 36.6 formerly smoked 1
51676 Female 61 0 0 Yes Self-employed Rural 202.21 N/A never smoked 1
31112 Male 80 0 1 Yes Private Rural 105.92 32.5 never smoked 1
60182 Female 49 0 0 Yes Private Urban 171.23 34.4 smokes 1
1665 Female 79 1 0 Yes Self-employed Rural 174.12 24 never smoked 1
56669 Male 81 0 0 Yes Private Urban 186.21 29 formerly smoked 1
53882 Male 74 1 1 Yes Private Rural 70.09 27.4 never smoked 1
10434 Female 69 0 0 No Private Urban 94.39 22.8 never smoked 1
27419 Female 59 0 0 Yes Private Rural 76.15 N/A Unknown 1
andymithamclarke / disneyland_reviews.csv
Created April 14, 2021 10:55
42,656 Reviews of 3 Disneyland Branches
Review_ID,Rating,Year_Month,Reviewer_Location,Review_Text,Branch
670772142,4,2019-4,Australia,If you've ever been to Disneyland anywhere you'll find Disneyland Hong Kong very similar in the layout when you walk into main street! It has a very familiar feel. One of the rides its a Small World is absolutely fabulous and worth doing. The day we visited was fairly hot and relatively busy but the queues moved fairly well. ,Disneyland_HongKong
670682799,4,2019-5,Philippines,"Its been a while since d last time we visit HK Disneyland .. Yet, this time we only stay in Tomorrowland .. AKA Marvel land!Now they have Iron Man Experience n d Newly open Ant Man n d Wasp!!Ironman .. Great feature n so Exciting, especially d whole scenery of HK (HK central area to Kowloon)!Antman .. Changed by previous Buzz lightyear! More or less d same, but I'm expecting to have something most!!However, my boys like it!!Space Mountain .. Turns into Star Wars!! This 1 is Great!!!For cast members (staffs) .. Felt bit MINUS point from before
andymithamclarke / Remote File.csv
Last active April 23, 2021 11:22
101 Scottish Whiskies with Tasting Notes
RowID Distillery Body Sweetness Smoky Medicinal Tobacco Honey Spicy Winey Nutty Malty Fruity Floral Postcode Latitude Longitude Features
0 1.0 Aberfeldy 2.0 2.0 2.0 0.0 0.0 2.0 1.0 2.0 2.0 2.0 2.0 2.0 PH15 2EB 286580.0 749680.0 10.0
1 2.0 Aberlour 3.0 3.0 1.0 0.0 0.0 4.0 3.0 2.0 2.0 3.0 3.0 2.0 AB38 9PJ 326340.0 842570.0 10.0
2 3.0 AnCnoc 1.0 3.0 2.0 0.0 0.0 2.0 0.0 0.0 2.0 2.0 3.0 2.0 AB5 5LI 352960.0 839320.0 8.0
3 4.0 Ardbeg 4.0 1.0 4.0 4.0 0.0 0.0 2.0 0.0 1.0 2.0 1.0 0.0 PA42 7EB 141560.0 646220.0 8.0
4 5.0 Ardmore 2.0 2.0 2.0 0.0 0.0 1.0 1.0 1.0 2.0 3.0 1.0 1.0 AB54 4NH 355350.0 829140.0 10.0
5 6.0 ArranIsleOf 2.0 3.0 1.0 1.0 0.0 1.0 1.0 1.0 0.0 1.0 1.0 2.0 KA27 8HJ 194050.0 649950.0 10.0
6 7.0 Auchentoshan 0.0 2.0 0.0 0.0 0.0 1.0 1.0 0.0 2.0 2.0 3.0 3.0 G81 4SJ 247670.0 672610.0 7.0
7 8.0 Auchroisk 2.0 3.0 1.0 0.0 0.0 2.0 1.0 2.0 2.0 2.0 2.0 1.0 AB55 3XS 340754.0 848623.0 10.0
8 9.0 Aultmore 2.0 2.0 1.0 0.0 0.0 1.0 0.0 0.0 2.0 2.0 2.0 2.0 AB55 3QY 340754.0 848623.0 8.0