Learning to do things well...

As I mentioned in my post explaining my decision to leave academia, one of my frustrations about academic research was that I felt like I never had the time to figure out how to do things properly. That's a bit of a nebulous concept, but here's what I mean:

In most of the labs I belonged to, a large part of my work was computational. I, like almost everyone else, was self-taught when it came to analyzing data. We either figured it out on our own, or we shared tips on how to get things done. Most of it 'worked'[1], but as everything was cobbled together, there was no guarantee that it was particularly efficient. We got things done, but not necessarily done well.

When I started my new job, my first task was to analyze some next-gen sequencing data. This is something that I've been doing for >5 years, so no sweat. I started running some scripts and, within a couple of hours, a coworker stopped by my office. He said that he'd noticed that I was running an analysis on a SAM file that was taking up a lot of memory and that he could probably show me how to do it faster and more efficiently using a Python module called pysam.

Mind blown.

Some folks will probably read this post and think it's rather quaint that I'd never learned to use pysam. But guess what: I know that I'm far from the only one. Most of the folks I've worked with were happily Perl-ing their way through giant files one line at a time. Had I known about this stuff a few years ago, my productivity would've increased dramatically. Of course, this has led me to wonder what other efficiencies I could add to my workflow.
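For the curious, here's a rough sketch of the general idea: streaming through a SAM file with pysam and tallying as you go, rather than holding everything in memory. Counting mapped reads per reference is a hypothetical task I picked for illustration, not the analysis I was actually running, and the function names are made up.

```python
from collections import Counter


def count_mapped_reads(reads):
    """Tally mapped reads per reference sequence, one record at a time.

    `reads` is any iterable of objects exposing `is_unmapped` and
    `reference_name`, which pysam's aligned-read objects do.
    """
    counts = Counter()
    for read in reads:
        if not read.is_unmapped:
            counts[read.reference_name] += 1
    return counts


def count_mapped_reads_in_sam(path):
    """Stream a SAM file through pysam instead of loading it all at once."""
    import pysam  # third-party package: pip install pysam

    # Mode "r" reads a text SAM file; "rb" would read a BAM file.
    with pysam.AlignmentFile(path, "r") as sam:
        return count_mapped_reads(sam)
```

Calling `count_mapped_reads_in_sam("sample.sam")` (a placeholder file name) returns a `Counter` keyed by reference name. Only one read is held at a time, so memory use stays small no matter how big the file is.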

Since I'm not struggling to juggle the demands of doing a postdoc anymore, I've begun reading books about this stuff.

I've only been 'out of the game' for a few months, and I've already learned so many things that I think should be universally applied in academic labs. So much could be improved by using GitHub to version-control and share code, setting up Google Docs and Google Drive to track team projects, and writing your computational 'lab book' in Markdown to quickly organize information [2], for example.

Part of bettering one's skills is taking the time (i.e., evenings and weekends) to read guides and practice. It also helps to be surrounded by people with different specialties and skills, as long as those people have the time and inclination to share their skills with others. I'm worried that the rat-race, pedal-to-the-metal mentality of academic life is preventing the transfer of incredibly useful skills and reducing productivity overall [3].

Anyways, I'm hoping to use this blog to share some of the 'skillz' I've acquired with folks who may benefit from them, so stay tuned.


[1] 'Worked' is subjective here. The quality of scientific software is often depressingly poor. For example, I can't count the number of times that coworkers and I have found bugs in scripts and code. I've also frequently seen software made public with incomprehensible, idiosyncratic options and requirements. For instance, some programs require that all files be located in the same directory as the program's executable. Even worse, some require that the user produce a particular filename/directory structure because the program doesn't even allow you to specify the names of input files, let alone their location(s).
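It doesn't take much to avoid that last problem. Here's a minimal sketch using Python's standard argparse module, in which the user names the input files and the output directory on the command line. The tool and its option names are hypothetical and not taken from any real program.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a command-line parser that takes file locations from the user."""
    parser = argparse.ArgumentParser(
        description="Hypothetical analysis tool (illustration only)."
    )
    # One or more input files, which can live anywhere on disk.
    parser.add_argument(
        "inputs", nargs="+", type=Path, help="path(s) to the input file(s)"
    )
    # Where to write results; defaults to the current directory.
    parser.add_argument(
        "-o",
        "--output-dir",
        type=Path,
        default=Path("."),
        help="directory for output files (default: current directory)",
    )
    return parser
```

In a real script you'd call `build_parser().parse_args()` and read `args.inputs` and `args.output_dir` instead of hard-coding any paths.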

[2] For many years now, I've been storing notes with code snippets or descriptions of how to use various pieces of scientific software in plain text files. While plain text is great for portability, it becomes unwieldy to organize large text files into sections and clearly indicate what is comment vs. what is code. Of course, programmers have already solved this problem: Markdown is a super-streamlined markup language, easily converted to HTML, that allows you to create headings, code blocks, numbered/bulleted lists, embedded links, etc. simply by how text is laid out in a document. I recommend an awesome live Markdown editor called MacDown, which can be used to view/edit Markdown files, or export them to HTML/PDF. I've also started a GitHub repo where I'm going to store all my notes, so you can see what it looks like here.

[3] I think that it's possible to transfer 'knowledge' in the form of results without transferring the skills required to generate said results. A big difference between academia and industry that I've noticed is that the latter doesn't have a pathological aversion to discussing details. I don't know how many times I've heard academics ask me and others to skip the details of how we were going to do something and focus on what we were going to do. That seems like a great recipe for preventing the sharing of useful skills.