weakty

Approaching new codebases

I've been thinking about the things that make it easier for me to approach new codebases, and I'm going to try to list a few. I have titled this post as a pt. 1 because I imagine I'll come back with new ideas as I spend more and more time working on new codebases as a consultant (at least, much more frequently than when I was working as an in-house engineer).

You don't have to know everything!

When I'm starting on a new codebase I sometimes feel that I need to know the entire codebase; that I need to be an expert, or be able to answer questions that might come out of left field.

In reality, this has not been the case at all. I've now worked on codebases large enough that many people who have been working internally on them happily share that they don't know what X part of the code does or what state Y-corner-of-the-code is in. At a certain point, programs grow too big to hold in your head, and teams grow too, adding communication overhead and requiring many talented hands.

Even when I'm asked a question about the codebase that I can't answer, it's a lot faster for me to just say so: I don't know, I haven't looked at that code yet. Give me an afternoon and I'll give you an update, and probably ask some questions on top of that.

Have a goal

I've found that having a goal/ticket/task to work on really helps narrow the search-light on what you need to know. Whatever the goal may be, I look at what I need to do in pseudocode terms, and then often start by looking for other places where similar work has already been done. If a UI is part of the codebase, I'll start by thinking about it from the user's perspective; I find it a lot easier to "work backward" (or forward?) from there. Having a goal makes it easier for me to say to myself: I probably shouldn't be in this part of the codebase, and then re-orient myself.

Don't be afraid to go spelunking

This used to be tough for me. In the beginning, I always started with questions, often before looking at any code. Over time, instead of jumping to questions, I would just try and solve it/build it/fix it/understand it/etc first. I would usually do that until I got frustrated (or until a timebox "ran out"), and then I'd ask any pertinent questions.

Now, I try and simply read more code before I ask questions (and before trying to build). If I can't find an answer in the immediate context that I'm working in, I try and look around. Poke my nose into someone else's code. Go look for resources in adjacent repos. For me, spelunking can be unstructured, and it can sometimes feel like being lost in the caves without making much progress, but it's a skill that I'm getting better at.

Rely on Git

Some folks I've paired with rely on Git when getting going on a new codebase and others don't. There's no rule here, but for me, Git has become pretty integral to my workflow. Perhaps that's because paging through Git history is super easy in Emacs (thank you, git timemachine), but being able to page back through a file's history can provide context regardless of whether the people who wrote the code are still around. Of course, a clean, well-committed Git history is required…

I also like to look at recently merged pull requests to see the dialogue surrounding the work that has been happening and what files are being worked with frequently.
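If you'd rather get a rough picture of which files are being worked with frequently straight from the repo, something like the sketch below can help. It's a minimal example and not part of my actual workflow: the function names, the "3 months ago" window and the top-10 cutoff are all made up for illustration, and it assumes git is on your PATH and that you point it at a checked-out repository.

```python
import subprocess
from collections import Counter


def count_touched_files(log_output: str) -> Counter:
    """Count how often each path appears in `git log --name-only` output.

    With an empty --pretty format, each commit contributes only its
    changed file paths, so every non-empty line is treated as a path.
    """
    return Counter(
        line.strip() for line in log_output.splitlines() if line.strip()
    )


def recently_touched_files(repo_path=".", since="3 months ago", top=10):
    """Return the `top` most frequently changed files since `since`.

    Hypothetical helper: the defaults are arbitrary, tune them to taste.
    """
    result = subprocess.run(
        ["git", "log", f"--since={since}", "--name-only", "--pretty=format:"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. when repo_path is not a repo
    )
    return count_touched_files(result.stdout).most_common(top)
```

Calling recently_touched_files() from inside a repo gives back a list of (path, count) pairs, busiest files first. It only counts how many commits touched each file, so it's a starting point for deciding where to read, not a substitute for reading the pull request conversations themselves.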


That's all I've got for now, but I'm sure some new thoughts will spring up on this in the future.

Thanks for reading o/.

WT