Why You Should Practice Reading Code
Understanding unfamiliar code is an essential career skill. Read on for practice suggestions.
In my first few months at Amazon, I was asked to implement a paginated list for a mobile application. As is usual at large tech companies, we had an internal framework for developing user interfaces.
There was one problem: The internal framework had almost zero documentation.
To do non-trivial things, engineers often searched internal codebases for examples.
To deliver on this task, I had to:
Find a few codebases that had already implemented paginated lists.
Read parts of those codebases to reverse-engineer how they worked.
Attempt to implement it in my team’s codebase.
Test that my reverse-engineered understanding was correct.
It took me days to get things working. Still, I walked away from the experience with an important takeaway: Reading and understanding unfamiliar code is an essential software engineering skill.
As Paul Graham once wrote in his essay, The Need to Read:
You can't think well without writing well, and you can't write well without reading well. And I mean that last "well" in both senses. You have to be good at reading, and read good things.
Though Paul Graham was referring to reading books, this quote also generalizes to code. When we read code written by more experienced engineers, we pick up new patterns and language tricks we can use in our own code.
How to Practice Reading Code
Many software engineers develop their “software reading comprehension” while working full-time. Consider the following quote from a polarizing book by Robert C. Martin, Clean Code:
“Indeed, the ratio of time spent reading versus writing is well over 10 to 1. We are constantly reading old code as part of the effort to write new code.”
The quote above illustrates one reason we read code: to understand it well enough to make changes.
As a new software engineer develops this skill, it is common to feel overwhelmed. Production codebases are massive and intimidating.
To practice reading code outside of the workplace, I have a few suggestions.
Read Code You Wrote In The Past (Easy)
There is a famous software engineering law called Eagleson’s Law, which goes like this:
“Any code of your own that you haven’t looked at for six or more months might as well have been written by someone else.”
Some people would say six months is optimistic. Either way, if you have code artifacts written a few months ago or longer, revisiting those codebases is an easy way to practice.
You likely remember some details about the codebase’s domain and tooling, given that you once wrote this code, so it shouldn’t be too difficult to reach a point where you understand the codebase enough to modify it.
Read Small, Open-Source Codebases (Medium)
The next step in difficulty is reading code you didn’t write yourself. Code written by someone else can be more difficult to read because the author might:
Have a different coding style.
Use libraries and/or frameworks you are unfamiliar with.
Do things differently than how you would.
While you could try to explore well-known codebases like React, Redis, or Spring, consider starting with smaller, lesser-known codebases.
For example, if you were interested in Java web servers, you could consider reading a codebase like microhttp, which is small, solves one problem, and has a good amount of documentation.
Another source of codebases to explore is the set of published solutions to John Crickett’s Coding Challenges.
With a quick Google search, I found a solution to Coding Challenge #5 (implementing a load balancer) written in Go.
The purpose of this exercise is to understand a codebase written by someone else enough to make a change to it. To check your understanding, you should try to actually change the code.
For example, the load balancer solution above implements a round-robin algorithm. Instead of this algorithm, try changing it to something else (e.g., you could pick a server at random!).
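To make the exercise concrete, here is a minimal sketch in Go of what such a change might look like. It is not the code from the solution mentioned above: the `Picker` interface, the `RoundRobin` and `Random` types, and the `ErrNoBackends` error are all hypothetical names invented for illustration, and a real load balancer would also need to handle concurrency and health checks.

```go
package main

import (
	"errors"
	"math/rand"
)

// Picker chooses the backend that should receive the next request.
// Hypothetical interface, not taken from any real load balancer.
type Picker interface {
	Pick() (string, error)
}

// ErrNoBackends is returned when there is nothing to pick from.
var ErrNoBackends = errors.New("no backends configured")

// RoundRobin cycles through the backends in order.
// Not safe for concurrent use; a real server would guard next with a mutex.
type RoundRobin struct {
	backends []string
	next     int
}

func (r *RoundRobin) Pick() (string, error) {
	if len(r.backends) == 0 {
		return "", ErrNoBackends
	}
	b := r.backends[r.next]
	// Advance the cursor and wrap around at the end of the list.
	r.next = (r.next + 1) % len(r.backends)
	return b, nil
}

// Random picks a backend uniformly at random on every call.
type Random struct {
	backends []string
}

func (r *Random) Pick() (string, error) {
	if len(r.backends) == 0 {
		return "", ErrNoBackends
	}
	return r.backends[rand.Intn(len(r.backends))], nil
}
```

If the codebase you are reading hides its selection logic behind an interface like this, swapping algorithms is a small change. If it doesn't, finding where the selection happens and extracting it is part of the exercise.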
The point is to be active in the reading and learning process. Don’t read the code passively.
Read Larger, Open-Source Codebases (Hard)
The ultimate test of your ability to read code is to explore large, open-source codebases. This exercise best simulates what being a professional software engineer is like.
Unlike reading a small codebase, you can’t sit down and read everything in a few minutes or hours. Codebases that have been around for years could have hundreds of thousands of lines of code, so you can’t keep the whole thing in your head.
When reading large codebases, you develop the ability to discern what is important right now, and what is noise.
For example, when I was diving into other teams’ codebases to figure out how paginated lists worked, I had to tune out unrelated details in the code (e.g., I didn’t care about how other teams fetched their list data).
To practice reading larger codebases, I suggest starting with core libraries for your programming language of choice.
These types of large codebases are easier to explore than others. Core libraries give you the benefits of reading a large codebase but are easier to digest because you can use a “divide and conquer” approach.
For example, if you wanted to explore Guava, you might start by focusing on immutable collections (e.g., ImmutableList and ImmutableSet). Afterward, you could repeat this process on other parts of the codebase until you’ve read most of it.
By the end of this exercise, you will have learned something new!
To my readers: Do you read code? What codebases do you like to explore? Leave a comment down below!