Great news! You just landed a new job working for BigO Corp as their new hotshot programmer. Everyone is expecting great things from you. Your new boss stops by your office and says, "Welcome to the family, fam! We needed a rockstar and they sent us you. The last person never even finished the first task we gave them, said it wasn't worth it or something. Anyways, I forwarded you the email we sent them with all the instructions. This is sort of past due, so if you could get this done ASAP that would be great. Thanks!"
Hey Rockstar,
Attached is a file full of JSON blobs that we would like to know more about. Clean this dataset up a bit and write a Python script to break this big file into some smaller files, say maybe sorted by country, plus a file for blobs that don't have a country? Hey, you know, while you're at it, it might be nice to go ahead and break it down by ASN too, so why don't you do this: make a subfolder for countries and make a subfolder for ASNs. This way we'll have both sets available to us. Even better, when you're done, show us those Git skills and create a new repo for your results so we can use it in the next meeting.
Love ya fam
Big Boss
As the new rockstar of BigO, you need to meet the needs of the Big Boss so you can meet the needs of the Land Boss and not have to deal with the Weather Boss any more than you have to. Start a new Git repo and make sure you save all of your code. If you process the dataset in steps, make sure you create a README with clear steps on how to recreate what you have done. Save and upload your Git repo.
Eager to get started, you crack open the file and find it isn't even valid JSON. There are parsing errors all over the file, and even though you recognize that it resembles JSON, you can’t get jq to accept it. Worried, you email BigO’s head sage for advice. Luckily, the head sage is in a good mood and passes you this tidbit:
Hey Rockstar,
Don’t let Big Boss get to ya. It seems someone didn’t filter out any error messages or run this through a linter?!?! I whipped up a quick one-liner that should get you where you need to be. Run the following against your dataset:
cat big_boss_list | perl -p -e 's/429 Too Many Requests\n//gs' | perl -p -e "s/}\n\{/}{/gs" | sed -e 's/}{/},{/' -e 'H;$!d;x;s/^\n//' -e 's/^/[/;s/$/]/' > fixed_big_boss_list
Maybe that's worth a beer to ya?
Head Sage
With your new dataset and one-liner in hand, you set off to answer the needs of Big Boss and be the Rockstar of BigO Corp.
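If you would rather do the cleanup inside your Python script than in the shell, the sketch below shows one possible approach. It is not the Head Sage's method: instead of rewriting the text into a JSON array, it strips the "429 Too Many Requests" message named in the one-liner and then reads the back-to-back objects one at a time with the standard library's json.JSONDecoder.raw_decode. The function name parse_blobs is made up for illustration, and the sketch assumes the error text never appears inside a legitimate string value.

```python
import json

# The error text named in the Head Sage's one-liner.
ERROR_TEXT = "429 Too Many Requests"


def parse_blobs(text):
    """Return every top-level JSON value found in text, in order.

    Assumption: ERROR_TEXT only ever appears as stray noise, never inside
    a real string value, so removing it everywhere is safe.
    """
    cleaned = text.replace(ERROR_TEXT, "")
    decoder = json.JSONDecoder()
    blobs = []
    pos = 0
    while pos < len(cleaned):
        # raw_decode does not skip leading whitespace, so step over it here,
        # along with any commas or array brackets sitting between objects.
        if cleaned[pos] in " \t\r\n,[]":
            pos += 1
            continue
        # Parse one value starting at pos; raw_decode reports where it ended.
        blob, pos = decoder.raw_decode(cleaned, pos)
        blobs.append(blob)
    return blobs
```

To use it, read big_boss_list into a string, pass it to parse_blobs, and keep the returned list for the splitting step. If the file contains damage other than the error message, raw_decode raises json.JSONDecodeError at the offending position, which at least tells you where to look.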
Hint: use the following to save the big_boss_list directly to a local folder:
wget https://gist.githubusercontent.com/StephanieSunshine/04d4a4eee550b8dbbf30e5b4c6ae4ab1/raw/73965937d245322e16f7bb8ac376a2fb5dc5dfd8/big_boss_list
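Once the dataset parses, the splitting that Big Boss asked for can be sketched roughly as follows. This is a minimal illustration, not a reference solution. It assumes the cleaned data has already been loaded as a list of dicts (for example with json.load on fixed_big_boss_list) and that each blob stores its values under keys literally named "country" and "asn". Those key names are guesses, so check them against the real data. The names split_by_key and safe_name, the no_<key>.json file for blobs missing the field, and the folder names in the usage note are likewise made up for this example.

```python
import json
from collections import defaultdict
from pathlib import Path


def safe_name(value):
    """Make a field value usable as a file name (hypothetical helper)."""
    text = str(value).strip()
    # Keep letters, digits, '-' and '_'; replace everything else with '_'.
    return "".join(c if c.isalnum() or c in "-_" else "_" for c in text)


def split_by_key(blobs, folder, key):
    """Write one JSON file per distinct value of key into folder.

    Blobs that lack the key (or have an empty value) go to no_<key>.json.
    Returns a dict mapping each file stem to the number of blobs written.
    """
    groups = defaultdict(list)
    for blob in blobs:
        value = blob.get(key) if isinstance(blob, dict) else None
        if value is None or str(value).strip() == "":
            name = f"no_{key}"
        else:
            name = safe_name(value)
        groups[name].append(blob)

    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    for name, members in groups.items():
        path = folder / f"{name}.json"
        path.write_text(json.dumps(members, indent=2), encoding="utf-8")
    return {name: len(members) for name, members in groups.items()}
```

Calling split_by_key(blobs, "countries", "country") and then split_by_key(blobs, "asns", "asn") would produce the two subfolders from the email, and the returned counts are handy for a summary in the README.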