Probably the most important advice I could give right now:
Think about files first. Files are your core abstraction between pipeline steps. You already have code in many languages, so enforcing a standard on each step's inputs and outputs will let you keep existing code working while still allowing you to write new code. This will become more apparent once you have to deal with more than three generations of operating systems, programming languages, and grad students.
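To make that concrete, here is a minimal sketch of what a file contract between steps could look like. It assumes the agreed format is CSV with a fixed header; the column names (`sample_id`, `value`) and the `double_values` step are made up for illustration, so substitute whatever your pipeline actually exchanges.

```python
import csv
import io

# Hypothetical contract: every step reads and writes CSV with these columns.
REQUIRED_COLUMNS = ("sample_id", "value")


def read_records(stream):
    """Read rows from a CSV stream, failing early if the agreed header is missing."""
    reader = csv.DictReader(stream)
    present = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in present]
    if missing:
        raise ValueError(f"input is missing columns: {missing}")
    return list(reader)


def write_records(rows, stream):
    """Write rows using only the agreed columns, so the next step can rely on them."""
    writer = csv.DictWriter(
        stream,
        fieldnames=REQUIRED_COLUMNS,
        extrasaction="ignore",  # drop any extra keys a step added internally
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)


def double_values(rows):
    """Example step (hypothetical): the only part that is specific to this stage."""
    return [
        {"sample_id": r["sample_id"], "value": str(float(r["value"]) * 2)}
        for r in rows
    ]


if __name__ == "__main__":
    # In-memory streams stand in for real files here.
    source = io.StringIO("sample_id,value\na,1.5\nb,2\n")
    sink = io.StringIO()
    write_records(double_values(read_records(source)), sink)
    print(sink.getvalue(), end="")
```

The point is that only `double_values` knows anything about the step itself. A step written in another language only has to honor the same header to slot in before or after it, and a step that receives a malformed file fails at the boundary instead of somewhere deep in its own logic.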