Hello fellow datanistas!
Welcome to the September edition of the programming-oriented data science newsletter. I hope you've all been staying safe amid the COVID-19 outbreak.
There's no special theme this month, just a smattering of cool tools and articles that I think will improve your productivity!
Firstly, a blog post by Steven Mortimer on how to set up VSCode, which is a really awesome IDE, in such a way that it behaves like RStudio. For R users who have to transition over to Python (e.g. for work, or for personal interest), this should help bridge the gap a bit!
Speaking of VSCode, I have been test-driving Pylance in my workflow at work, and it's blazing fast at code checking! As I wrote my code, the Pylance VSCode extension continually checked it, helping me catch errors before I even ran the code. Amazing stuff, Microsoft, I like what you've become now :).
Since learning about ECDFs a few years ago, I have advocated for visualizing distributions of data using ECDFs rather than histograms. Well, nothing beats having best practices available conveniently, so I'm super happy to see ECDFs conveniently available in seaborn!
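For anyone who hasn't met ECDFs yet, here is a minimal sketch of what one computes, using only the Python standard library. The `ecdf` helper and the sample data are made up for illustration; this is not seaborn's implementation.

```python
from bisect import bisect_right


def ecdf(values):
    """Return the sorted values and, for each, the fraction of observations <= it."""
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("ecdf needs at least one observation")
    # bisect_right counts how many observations are <= x,
    # so tied values share the same cumulative proportion.
    ps = [bisect_right(xs, x) / n for x in xs]
    return xs, ps


# Hypothetical sample data, made up for illustration.
heights = [150, 160, 160, 170, 180]
xs, ps = ecdf(heights)
for x, p in zip(xs, ps):
    print(x, p)
```

Unlike a histogram, there are no bins to choose: every data point appears on the curve, and quantiles can be read straight off the vertical axis. In seaborn 0.11 or later, something like `sns.ecdfplot(data=df, x="height")` should draw this kind of curve directly, where `df` and `height` are placeholder names.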
From experience at work, I can vouch for the idea that it's completely worthwhile for a data scientist to learn the ideas around containers, Kubernetes included. To help you get up to speed, my colleague Zach Barry found an awesome article titled "Stupid Simple Kubernetes". Lots of terms in the K8s world get clarified in that article. I hope you enjoy it!
This is an article that resonated deeply with me. Learning in public has been, for me, the biggest career hack that I have experienced. Now, Shawn Wang has articulated clearly the benefits of doing so! The biggest is being able to build a public-facing portfolio that you can point to that demonstrates your skill set.
Some things I recently wrote about:
- Software skills are important, for they help us data scientists think clearly.
- Some early thoughts test-driving pandera for data validation.
- The .also() method, which comes from the Kotlin programming language, has been proposed in pyjanitor as a new feature. I'm excited to see where this one goes!
- I'll be speaking at JupyterCon 2020 this year! Super excited to release a talk on how we compiled Network Analysis Made Simple into our eBook and website!
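To give a feel for the .also() idea mentioned in the list above: in Kotlin, `also` runs a function on an object for its side effect and then hands back the same object, so a method chain can keep going. Below is a rough plain-Python sketch of that behaviour. The `Pipeline` class and its `keep` method are hypothetical stand-ins for a DataFrame-like object, not pyjanitor's actual code.

```python
class Pipeline:
    """Hypothetical stand-in for a DataFrame-like object that supports method chaining."""

    def __init__(self, rows):
        self.rows = list(rows)

    def keep(self, predicate):
        # Return a new Pipeline containing only the rows that pass the predicate.
        return Pipeline(r for r in self.rows if predicate(r))

    def also(self, func, *args, **kwargs):
        # Call func for its side effect only, ignore whatever it returns,
        # and return self so that the chain continues uninterrupted.
        func(self, *args, **kwargs)
        return self


seen = []  # collects row counts observed along the chain
result = (
    Pipeline([1, 2, 3, 4])
    .also(lambda p: seen.append(len(p.rows)))
    .keep(lambda r: r % 2 == 0)
    .also(lambda p: seen.append(len(p.rows)))
)
print(seen)         # [4, 2]
print(result.rows)  # [2, 4]
```

The appeal is that logging, sanity checks, or saving an intermediate result can sit inside the chain without breaking it into separate statements.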
The final thing I'd like to include in this newsletter is a completely unsolicited but heartfelt advertisement for Samuel Oranyeli. He's been a consistent contributor to the pyjanitor project, and I have witnessed his skills grow over the past few months of contribution. The most important quality he possesses is consistent learning! If you're hiring a Python developer in the Sydney, Australia area or remotely, do consider him for your list!
As always, let me know on Twitter if you've enjoyed the newsletter, and I'm always open to hearing about the new things you've learned from it. Next month, we resume regularly scheduled, ahem, programming!
Meanwhile, if you'd like to get early access to new written tutorials, essays, 1-on-1 consulting and complimentary access to the Skillshare workshops that I make, I'd appreciate your support on Patreon!
Stay safe, stay indoors, and keep hacking!
Cheers, Eric