My Software Stack
What levers and pulleys I’ve been pushing and pulling in 2021-2022…
The last year of my life has involved a lot of software engineering learning, more so than usual. On top of the standard tools I’ve been using for ~2 decades, I learned a whole host of new tools related to good software engineering practice. A recent tweet of mine seemed to gain traction:
Because I'm a bit of a masochist, I'm now making ipython notebooks for every figure in my dissertation inside a reproducible {venv + Docker}.
— elsewhere (@affineincontrol) March 20, 2022
It's weirdly painful, and fun, to revisit code from 8-2 years ago :( :)
So I thought I’d write down a list of tools that I’m currently using (around the code I’m building) so I can refer back to this and figure out what’s best to write more about.
Here’s a very brief overview of them 1
Standard Tools
It’s hard to keep learning software engineering in the middle of medical school and graduate school, but somehow I was able to do it!2
Linux - Linux is an operating system that gives you full flexibility and customizability over your physical computer. Sometimes it’s harder than your standard Windows/Mac setup, but for me it’s always been worth it because it keeps me relatively up to speed with the latest and greatest. Also, it’s free, open-source software.
i3 window manager - Tiling window managers let you focus solely on what you’re doing. I use i3, it’s brilliant, and I’ve come to completely depend on it. When combined with i3-bar and various dmenu offerings, i3 becomes an incredibly powerful tool for productivity… and unproductivity.3
Python - I switched everything over to Python. All my work, all my thinking, all my hobbies. Yes, it has its issues, but this makes it much easier to plug in and try out new ideas instead of constantly futzing with syntax and toolchains.
LaTeX/Overleaf - LaTeX is a way to write documents that separates content from formatting. This is crucial when you’re trying to publish, because a lot of time gets wasted revamping Word docs from one journal format to another. Instead, why not let the journal give you a template and let LaTeX translate your content into whatever template the journal wants? Even better, why not put it up on a preprint server and let the server do this as you one-click-submit?!
Coding Practices
git - Version control that gives peace of mind when refining and updating code. I even use this for my dissertation!
vscode - Microsoft really hit it out of the park with VS Code… It’s incredible. There are still some issues4 but overall it’s becoming my go-to IDE.
venv - In each of your projects you can set up a Python “virtual environment” that houses everything (Python interpreter + package library) needed to run the project. This is step 1 of modularity and persistence of your code for generations to come.
Docker - Docker’s an interesting way to containerize your projects. Basically, it gives your computer the instructions needed to set up your environment. Sounds a bit like venv, but venv already requires you to have Python. Docker tells your computer how to install Python, which version to install, and even what file structure you want set up so your project code can run smoothly with all its file references.
DVC - DVC is an interesting way to pipeline your ML work and version-control your data. Basically, you set up a separate repository that interfaces (roughly) with your git repo. DVC then tracks the dependencies of your data + processing/training/validation stages in a directed acyclic graph5
- S3 - I’m still very new at using S3 and the entire AWS ecosystem, but I’m starting to dive into it and finding that… well… it’s the future we have.
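To make the directed-acyclic-graph idea from the DVC entry a bit more concrete, here’s a minimal sketch in plain Python. It is not DVC’s API (DVC itself declares stages in a `dvc.yaml` file); it only uses the standard-library `graphlib` module (Python 3.9+), and the stage names and the `run_order`/`is_acyclic` helpers are made up for illustration. Each stage lists what it depends on, and a topological sort gives an order in which everything can run.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each stage maps to the set of things it depends on.
pipeline = {
    "preprocess": {"raw_data"},
    "train": {"preprocess"},
    "validate": {"train", "preprocess"},
}


def run_order(stages):
    """Return one valid execution order (dependencies always come first)."""
    return list(TopologicalSorter(stages).static_order())


def is_acyclic(stages):
    """Return True if the dependency graph has no cycles."""
    try:
        TopologicalSorter(stages).prepare()
    except CycleError:
        return False
    return True


order = run_order(pipeline)
print(order)
print(is_acyclic(pipeline))
```

The acyclic part matters because a stage that (indirectly) depends on its own output has no valid place in the ordering, so `is_acyclic` returns `False` for something like `{"a": {"b"}, "b": {"a"}}`.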
Extra Fun
Julia - I’m trying to learn Julia because it seems particularly powerful for the future of scientific computing. Not sure it will ever reach the level of Python when it comes to adoption, but it’s elegant in its own way and may be the best way to plug into cutting edge GNN efforts.
Haskell - I’ve always wanted to learn Haskell, mostly because I’m fascinated with alternative programming approaches (functional programming in particular). Seems like a great way to keep up with my effort to learn category theory as well…
JAX - I love JAX. It’s incredible. It’s basically a way to skip all the quirky ML/DL framework-specific setup and jump right into differentiable programming with numpy functions. Most of the coursework I took in my ECE MS focused on setting up the objective function for the problem before moving on to the next problem - JAX lets us kind of do the exact same thing in a practical setting.
Hugo - I love Hugo for static site generation. This blog uses Hugo, as does my main professional website. It really does make the whole process easier…6
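As a rough illustration of the “set up the objective function and go” workflow from the JAX entry, here’s a minimal sketch that assumes `jax` is installed. The linear-model objective, the toy data, and the `fit` helper with its step count and learning rate are all made up for the example; `jax.grad`, `jax.jit`, and `jax.numpy` are the only JAX pieces involved.

```python
import jax
import jax.numpy as jnp


def objective(w, X, y):
    """Mean squared error of a linear model (a hypothetical example objective)."""
    residual = X @ w - y
    return jnp.mean(residual ** 2)


# jax.grad differentiates with respect to the first argument (w) by default;
# jax.jit compiles the resulting gradient function.
grad_objective = jax.jit(jax.grad(objective))


def fit(X, y, steps=500, lr=0.1):
    """Plain gradient descent; steps and lr are arbitrary illustrative choices."""
    w = jnp.zeros(X.shape[1])
    for _ in range(steps):
        w = w - lr * grad_objective(w, X, y)
    return w


# Toy data generated from known weights, so the fit can be sanity-checked.
X = jnp.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
true_w = jnp.array([2.0, -1.0])
y = X @ true_w

w_hat = fit(X, y)
print(w_hat)
```

The only problem-specific part is `objective`, written with numpy-style functions; the gradient comes for free, which is the appeal.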
I’m working with several other tools7, but these are the main ones that people have been interested in. I’ll talk about how all of these swirl together into my workflow soon, and add some “best practices for neuroengineers” guides.
-V
This post is mostly to keep track of the tools that I’ll be making more full-fledged how-to guides and tutorials for. But that’s all after I’m done with all this MD business. ↩︎
As in, jobs have looked at my github and, in addition to face-to-face technical interviews, given me some nice jobs. I must have kept up just enough… ↩︎
One counterproductive feature of i3/Linux in general is the decades-long ability to have ~10+ separate virtual desktops/workspaces. That means with the press of a button I can switch to an entirely new screen/setup. This is one of those “1000 tabs open actually hurts productivity” vibes. ↩︎
Mayavi isn’t working and it worked pretty flawlessly with Spyder. Maybe I’ll just set up Spyder inside my pyenv? ↩︎
People are obsessed with DAGs, it’s a bit weird. Give me those cyclic graphs, that’s where control loops live! ↩︎
Once you’ve spent 7 years perseverating on the theme so you can have clean sidenotes/footnotes with dynamic content and other bells and whistles you’ll use just once. ↩︎
Most of them relate to specific types of science. For example, I’m working with PyMC3 at the moment, as well as Blackjax, to try to make my analyses more explicitly Bayesian. I’m working with pytorch-geometric to try to get up to speed with GNNs. I’m working with The Virtual Brain (TVB) to finish out some of the modeling work involving DBS for Depression… etc. I’ll talk about those later if there is interest. ↩︎