Wanted: Code Review

This post, as with many of my others, was inspired by my Software Carpentry instructor training. Nearly three years later, so much of what I learned still rattles around my brain and often comes out of my mouth.

Why is showing people your code super scary?

It is nerve-racking to have someone look over your words, whether they are code or prose. There is something personal about the words we worked so hard to put together or the script we struggled for days to code up.

I’ve heard it so many times when I teach GitHub or in study groups:

Some variation on “My code sucks, so I don’t want people to look at it” or “I’m just not ready for someone to look at this piece of code.”

It is very scary to put your code out there. I personally live in fear of people finding my code on GitHub and then sending me messages about how terrible it is.

Here is where I could tell you that you just have to get over it, but I think that would be ignoring where this fear comes from. I think it comes from a similar place as the fear I talked about in my first post. The internet is not always a safe place for people to share their thoughts. Aside from trolls, there are also people who don’t know how to give constructive feedback (a topic I will talk more about in a later blog post). But I think with some work we can make places where people can feel comfortable getting and giving code review.

The need for code review

Gaps in knowledge

Code review can actually be one of the easier ways to get better at programming. While code review is useful at any skill level, other learning formats may not work as well for intermediate+ programmers. When you are a beginner, you read a book or go to a class/workshop, and much of what you see is new and relevant. You’re a sponge soaking up everything you see. Once you’ve gotten some skill with programming, these formats require you to search or sit through a lot of information that you already know, making it harder to find the tidbits you need. If you are a self-taught programmer like me, you probably learned pieces of code as you needed them. Being able to find what you need is a good skill to have, but it can also mean you miss out on some pieces. I’ll give you a prime example from my own past with Python.

I never learned enumerate() (which is a way to automatically number the items you are looping over) when I learned to code in Python. But I could easily code around it, so for the longest time that is what I did. Then I helped teach a workshop and it blew my mind when I discovered this function. How had I gone so long without it!?!? But it was one of the only things in the workshop I’d never seen before.
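For anyone who hasn’t run into it either, here is a minimal sketch of the difference. The sample list and the two helper function names are made up for illustration; only enumerate() itself is the Python built-in. The first function is roughly the kind of "coding around it" I mean, and the second does the same job with enumerate().

```python
# Hypothetical sample data, used only for illustration.
samples = ["control", "treated", "recovered"]


def number_with_counter(items):
    """Number items by hand, the way you might 'code around' enumerate()."""
    numbered = []
    index = 0
    for item in items:
        numbered.append((index, item))
        index += 1  # easy to forget or misplace this line
    return numbered


def number_with_enumerate(items, start=0):
    """Number items with the built-in enumerate()."""
    # enumerate() yields (index, item) pairs; start sets the first index.
    return list(enumerate(items, start=start))


if __name__ == "__main__":
    for index, name in enumerate(samples, start=1):
        print(index, name)
```

Both helpers return the same list of (index, item) pairs. The enumerate() version is just shorter and has no counter to forget to update.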

If I’d had someone else look over my code, they could have told me about this much sooner. Code review is very useful because it helps us fill gaps in our knowledge and improve the code we are producing.

Mistakes

The fear I talked about above also comes from the worry that we’ve made a mistake in our code. That happens; we’re human. But if we keep our code to ourselves because it is “too ugly” or might have a mistake in it, then we might be using scripts with errors in them and, worst case, publishing results based on flawed code.

Since I’ve learned more about code review, it is scary for me to think about the possibility that there might be errors in my own published code and analyses. With more journals requesting code and data, it is becoming commonplace for people to publish their code along with their results. This makes it more important than ever that we have a better system of code review for scientific programming and data analysis.

Possible solutions

So I’ve come to the conclusion that I really need code review but how do I get it?

I want to:

  • Get regular feedback
  • From people I can trust and who know how to give constructive feedback
  • In the easiest possible way for all parties involved

I’ve heard of some labs having formal code review, which seems like it might be helpful. Here are a couple of models I can think of for getting code review.

Lab Meeting - Regular meetings where people take turns sharing their code and giving feedback
- Pros: Meets regularly
- Cons: Might not have experts, have to moderate group/have a code of conduct

Study Group - Like Lab Meeting but with (probably) a more diverse set of people

Buddy System - Share code with a friend/co-worker and review code
- Pros: Can get feedback when you need it, friendly form of feedback
- Cons: The other person might miss things or not know enough to be helpful

My ideal

Virtual Code Review Group

  • Slack Group (perhaps?)
  • Code of conduct (+moderators)
  • List of known proficiencies or shared language requirements
  • People share code on GitHub that they want feedback on and post about it in a Slack channel
  • Others volunteer to look at this code and give feedback

This does feel like I’m sort of reinventing Stack Overflow, but I’m hoping it will create a community where scientists (including me) can get review and help with their code from each other.

Let me know what you think!

I’m hoping to put together my ideal group based around ComBEE, my local computational biology study group, but it doesn’t have to be limited to local people. Let me know if you are interested in the comments below, or message me via email/Twitter.

I’d also love feedback on my plan above if you have ideas or experience you’d like to share.

Finally, it would be great to hear more about how other groups do code review. Let me know how you or your group review code in the comments below.

Written on December 7, 2017