Written by Henri Normak
Note: This is a modified version of an internal document we’ve had in place at Sixfold since April 2020.
This is the first post of two about the code review process at Sixfold. In this post we’ll focus on the aspects of ensuring code health via code review. In the followup post, we’ll discuss the effective communications involved in code review.
Terms & Background
- Pull Request (PR) — Method for submitting changes to a project. In our context, method for submitting changes to our production environment.
- Author — Person(s) that are driving the PR, aiming to make changes to the production environment.
- Reviewer — Person(s) that are conducting the code review of the PR.
The primary purpose of code review is to make sure that the overall code health of the codebase is improving over time. All of the tools and processes of code review should be designed to this end.
Paraphrasing of Google’s engineering practices
In an organisation, where several people/teams operate together, it is vital to maintain a level of quality over the shared assets — in our case, the codebase(s) of our services.
Our chosen method for achieving this, is using proactive code review. It is important to distinguish between proactive and retroactive code review, as the former is aimed at stopping the code health from ever decreasing, while the latter is aimed at detecting deteriorated code health after the fact. They are not mutually exclusive, but in the scope of this post, we’ll focus on the proactive approach.
Although not done actively in our case, the retroactive approach can still occur when a PR that has already been reviewed and merged receives new comments. In such cases, it is expected of the author and the reviewer to work together at figuring out the best solution to who should file the followup PR.
Code review as a process involves two parties — author and reviewer — both of which can be represented by one or more people. However, it should never be the case that a person acts as both, an author and a reviewer, on the same piece of code in the same PR.
While the primary purpose of code review is to maintain the health of the codebase, it also serves secondary purposes:
- Mentoring — thanks to feedback received in code review engineers would become more experienced over time
- Knowledge sharing — new programming methods, tools and design principles can be spread across the team via code reviews
Aspects of Healthy Code
As the primary purpose of code review is to maintain a healthy codebase, it makes sense to start with the aspects that we consider as healthy and how these aspects should be validated/reviewed as part of code review.
Each of the following sections focuses on one distinct aspect of code health, providing some illustrative examples as well as some suggestions from the point of view of the reviewer. It is assumed that whenever acting as the author, all engineers ultimately aim to make it as easy as possible for the reviewer to carry out the best practices outlined here.
All code in a production environment is expected to perform the tasks it was designed to do.
If the functionality of the codebase changes as part of the PR being reviewed, it is expected of the author to provide the reasoning/justification for this new functionality. This can take several forms:
- Linking to a product ticket, explaining the motivation behind the change
- Giving an overview of the parts of code that were changed and what are the expected outcomes
- Providing an overview of the testing of these changes, for example listing of test cases or a strategy
It is expected of the reviewer to verify these claims and to agree with the changes. This verification can take several forms:
- Reading through the code itself, verifying statically that it functions as expected
- Verifying the code by checking the tests that have been added or suggesting additional test cases to be written
- Verifying the functionality by manually testing the solution or asking for a demo from the author
By the end of the code review process, it is expected that the reviewer and the author are in agreement about how the changed codebase functions and how it affects its users (either end-users or other developers).
It is important to emphasise, that unless the functional change is in a purely engineering-focused resource, it is not always possible for the author or the reviewer to change the product related requirements. As such, the code review process should refrain from challenging the product requirements, deferring to a wider audience if there is a question about the validity of the product requirement — code review is not the place to change or argue against/for a product decision.
Healthy code is understandable, understandable code is maintainable. Code review should not only encourage understandability and maintainability, it should actively protect the codebase from overt complexity.
This doesn’t mean the code has to be simplistic or toy-like, rather, the codebase should aim at avoiding negative reactions from engineers that work with it. Google’s engineering practices sums this up as “developers are likely to introduce bugs when they try to call or modify this code”.
As part of the review, it is expected of the reviewer to understand the code that the author has written. If there are parts that are not clear, or that raise questions along the lines of “What does this code do?”, then in all likelihood, the author should try and rewrite the code. Adding comments to the code may also help, but the best documentation for any code, is the code itself.
A common source of complexity is over-engineering, which often surfaces itself as code that is too generic. It is up to the reviewer to be vigilant and identify code that seems too generic, however, ultimately the responsibility of reducing over-engineering lies with the author.
It is encouraged of the reviewer to provide specific suggestions on how to reduce the complexity/genericity, however, ultimately if the complexity is identifiable by questions raised by the reviewer, it is up to the author to choose the best option for reducing it.
In general we don’t measure test-coverage as a benchmark, however, it is highly advisable to provide test coverage for your changes as an author.
It is expected from the reviewer to remind the author about tests that might be important to add, similarly, it is expected that the reviewer verifies the test cases added as part of a PR.
Context and Consistency
All changes to any codebase live in a bigger context, either on the service level, or intra-service level.
In the code review process, it is important to take a step back and look at the bigger picture, not just the specific lines that change.
This can take the form of several actions — such as looking at the entire file instead of the new function that was added to it, or looking at the service that this new part is added to.
While looking at these higher levels, it is expected of the author to align with existing paradigms or programming approaches, favouring consistency over novelty.
If an existing part of the code can be made useful for the proposed change, that should be done over writing additional code. As part of the code review process, it is expected of the reviewer to bring some domain-knowledge about the bigger context into the review, ideally identifying immediately parts that are inconsistent with the existing codebase.
Examples of inconsistency can vary from naming to API signatures to design patterns. A concrete example of inconsistency could be use of
Many of these inconsistencies in code style we have codified into our linter rules and prettier config, in which case, a PR should never go against what the tools propose/suggest.
More complex forms of inconsistency may take the form of lapses in architectural choices, for example, inconsistencies in API requests/paths/responses. These are often identifiable by comparing a feature/functionality against some similar function in the same service — for example, comparing two Kafka consumers in terms of their dataflow.
Additional form of context can be comments and documentation of code. While we don’t expect all functions to be fully documented, it does often help provide context about the need for a function to exist — answering the question why?.
Our aim is to produce code that is self-evident, in which case comments are generally only needed to offer context about how parts of the program behave and why that is. This is more important on interfaces that bridge the gap between different codebases, be it on APIs surfaced in packages, or code that is exposed between different modules of the same service.
In this post we examined the four aspects that help ensure a healthy codebase and how these aspects are enforced in the code review process.
For each of these aspects we looked at positive examples as well as the common pitfalls that we all might fall into. Being aware of these is the first step to ensuring that the reviews we deliver are useful and a positive experience for all parties involved.
Let us know what you consider when keeping code healthy — things to look out for during code review — or perhaps some entirely different approaches?
In the next post in this series, we’ll talk about effective communications of feedback that surfaces during the code review process, and how to ensure that the process is one of knowledge transfer, not of blunt nitpicking.