In the Android team at Memrise, we’ve gotten a lot of value from tracking code coverage. It’s something we’ve worked very hard to improve over the last few years, for two main reasons:
- Tests are great. They help us move faster and more safely! Coverage is an indicator of how well we’re doing.
- Coverage helps us find areas that need some love ❤️ Setting coverage goals as quarterly KPIs is a great motivator to improve our codebase.
How do you track coverage? Well, as you’re probably aware, there are companies out there providing ready-made solutions so you don’t have to do it yourself. We previously used one of the popular solutions but found that their pricing model wasn’t a good fit for our use case. After researching potential alternatives and not being convinced by any, we decided that building our own coverage tracking was the way to go!
Implementation speed and maintenance were key aspects for us, as we prefer to spend our time making Memrise great! With this in mind, we wanted our solution to cover two requirements:
- Easy visualization of coverage stats. At Memrise we use Data Studio dashboards to track health metrics and KPIs, so we wanted to have a page to see stats like global coverage and coverage breakdown by module.
- Coverage reports in Pull Requests. Coverage reports in PRs raise quality awareness and give immediate feedback, nudging you to add those unit tests that you were lazy about.
To generate the coverage of our codebase we use JaCoCo, an open-source code coverage library for Java and Kotlin. JaCoCo gives you some goodies for free, such as the ability to exclude code (e.g. generated code or UI classes like Activities, Fragments, or Adapters) or to choose the format of the generated report. It’s worth saying that JaCoCo can be run at either the global or the module level. We do the latter, since in the Memrise Android app we go big on modularisation.
To get an idea of what a JaCoCo CSV report looks like, have a look at the screenshot.
Every row is a different class and every column an attribute. We’ll be using the LINE_MISSED and LINE_COVERED attributes.
To generate the report whenever we want it, we need to create a Gradle task (:allCoverage) that will do the following:
- For each module, it runs JaCoCo and generates coverage.csv. It then sums the lines missed and the lines covered to come up with a coverage percentage for the module: covered / (covered + missed).
- Once it’s done with all the modules, it calculates the global coverage.
- Finally, it builds a JSON file and stores it in /build/coverage.json
The generated JSON looks like this:
[ ["global","88.29%"], ["billings","80%"], ["downloader","23.76%"], ["module-1","99.47%"], ["module-2","0.47%"], ["module-n","0.47%"], ... ]
Disclaimer: all coverage values in this article are dummy!
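To make the arithmetic concrete, here is a minimal sketch of what the task computes. Our real implementation is a Gradle task, so treat this Python version purely as an illustration. The function names (line_totals, percentage, build_report) are made up for this example, and two details are assumptions: the global figure is derived from the summed line counts of all modules, and a module with no measurable lines reports 0%.

```python
import csv
import io
import json


def line_totals(csv_text):
    """Sum LINE_MISSED and LINE_COVERED over every class row of one CSV report."""
    missed = covered = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        missed += int(row["LINE_MISSED"])
        covered += int(row["LINE_COVERED"])
    return missed, covered


def percentage(missed, covered):
    """Format covered / (covered + missed) as a percentage string."""
    total = missed + covered
    if total == 0:
        return "0%"  # assumption: nothing measurable counts as 0%
    return f"{round(100 * covered / total, 2):g}%"


def build_report(reports_by_module):
    """Return [[name, percentage], ...] with the global entry first."""
    totals = {name: line_totals(text) for name, text in reports_by_module.items()}
    # Assumption: global coverage uses summed line counts, not an average of percentages.
    global_missed = sum(missed for missed, _ in totals.values())
    global_covered = sum(covered for _, covered in totals.values())
    report = [["global", percentage(global_missed, global_covered)]]
    report += [[name, percentage(m, c)] for name, (m, c) in sorted(totals.items())]
    return report


if __name__ == "__main__":
    # Dummy reports, trimmed to the columns the sketch reads.
    header = "GROUP,PACKAGE,CLASS,LINE_MISSED,LINE_COVERED\n"
    billings = header + "app,com.example,A,1,5\napp,com.example,B,1,3\n"
    downloader = header + "app,com.example,C,3,1\n"
    print(json.dumps(build_report({"billings": billings, "downloader": downloader})))
```

Summing line counts before dividing means a large module weighs more than a small one in the global number, which is usually what you want from a codebase-wide figure.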
Storing reports remotely
Now that we’re able to generate our coverage report, we need to put it somewhere. This could be a database or, if you want to go for simplicity like us, a spreadsheet. The steps are relatively simple:
- On every commit merged into develop, our CI will run the Gradle :allCoverage task.
- It will then look for our generated coverage.json inside the build directory and send it with a curl POST request to an Apps Script function linked to our new coverage spreadsheet.
- The script will parse the JSON and dump the data into our spreadsheet.
This is the result! 🎉 We store unique records per day and every new module will be added as a new column.
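If you’d like to see the moving parts, here is a rough sketch of both sides, written in Python for illustration only (our CI uses curl and the real script lives in the spreadsheet). The sheet is modelled as a plain list of rows whose first row is the header, the function names are hypothetical, and we assume that a second run on the same day overwrites that day’s values.

```python
import json
import urllib.request


def build_post_request(script_url, coverage_json):
    """Build (but do not send) the POST that CI would make to the script."""
    return urllib.request.Request(
        script_url,  # placeholder: the URL of your own spreadsheet script
        data=coverage_json.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def upsert_daily_record(sheet, coverage_json, day):
    """Keep one row per day and one column per module in `sheet`."""
    header = sheet[0]
    # Reuse today's row if it already exists, otherwise append an empty one.
    row = next((r for r in sheet[1:] if r[0] == day), None)
    if row is None:
        row = [day] + [""] * (len(header) - 1)
        sheet.append(row)
    for name, value in json.loads(coverage_json):
        if name not in header:
            # A module we haven't seen before becomes a new column.
            header.append(name)
            for existing in sheet[1:]:
                existing.append("")
        row[header.index(name)] = value
    return sheet


if __name__ == "__main__":
    sheet = [["date", "global"]]
    upsert_daily_record(sheet, '[["global","88.29%"],["billings","80%"]]', "2021-01-01")
    for line in sheet:
        print(line)
```

Sending the request would be a call to urllib.request.urlopen with the built request. It is left out here so the sketch runs without a network.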
We needed to be able to easily visualize coverage reports. Since spreadsheets can be added as data sources in Data Studio, all we had to do was decide what data we wanted to see. Some examples could be plotting global coverage to make sure it keeps going up, or plotting a specific feature module where the team wants to refactor or increase coverage.
With this we’ve completed our first requirement: Easy visualization of coverage stats ✅ Let’s now move on to coverage PR reports!
Coverage reports in Pull Requests
Now that we’ve got a nice way to visualize our current coverage, it’s time to add reports to our Pull Requests to see if our changes are increasing or decreasing code coverage. To be more precise: for any branch with an open PR, we want to highlight coverage changes (both global and per module) versus develop.
At Memrise, our Mobile Release Pipeline project automates different steps of our releases for both iOS and Android. The pipeline is composed of Firebase Functions that will execute different tasks, such as creating a release candidate in GitHub, posting a comment in a Jira ticket, or posting release notes in a Slack channel. In order to have coverage reports in Pull Requests, we use our release pipeline, but keep in mind it could be done with any system capable of making a call to GitHub’s API.
We’re going to create a new coverage Firebase Function in our Mobile Release Pipeline that can take the following parameters:
- Pull request URL
- Branch name
- Coverage JSON
Then, let’s split our commit workflows into 2 different scenarios:
- Committing to develop (AKA merging a PR). Here we don’t want to show any report, only update our coverage spreadsheet. In other words, we want the same behavior as before, but through the release pipeline. Instead of having our CI hit the spreadsheet script directly, it hits our coverage Firebase Function. Then, our pipeline does the POST to the spreadsheet script to update our coverage records.
- Committing to any other branch with an open PR. Our CI will hit the coverage Firebase Function providing all the requested parameters. Then, the following steps will be performed in order:
  - Fetch the develop coverage JSON by doing a GET to the spreadsheet script.
  - Compare the two coverage reports and create a diff report using Markdown syntax.
  - Find our PR in GitHub using GitHub’s API and post the report as a comment. It’s worth saying that if a coverage comment already exists from a previous commit, we edit it instead of adding a new one.
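The last two steps are the interesting ones, so here is a hedged sketch of how they could look. It is written in Python for illustration, while our pipeline runs as Firebase Functions. Several things here are assumptions rather than a description of our code: the hidden MARKER used to recognise our own comment, the table layout, listing only entries that changed, and the repo and pr_number parameters. The endpoints are GitHub’s REST API for issue comments, which PRs share.

```python
import json

MARKER = "<!-- coverage-report -->"  # hypothetical hidden tag identifying our comment


def to_number(pct):
    """Turn a string like '88.29%' into the float 88.29."""
    return float(pct.rstrip("%"))


def diff_report(base_json, head_json):
    """Return a Markdown table of entries whose coverage differs from develop."""
    base = dict(json.loads(base_json))
    head = dict(json.loads(head_json))
    lines = [MARKER, "| Module | develop | This branch | Change |", "|---|---|---|---|"]
    for name, value in head.items():
        old = base.get(name)
        if old is None:
            lines.append(f"| {name} | n/a | {value} | new |")
            continue
        delta = round(to_number(value) - to_number(old), 2)
        if delta != 0:
            lines.append(f"| {name} | {old} | {value} | {delta:+g}% |")
    if len(lines) == 3:  # only the marker and table header: nothing changed
        return f"{MARKER}\nNo coverage changes versus develop."
    return "\n".join(lines)


def plan_comment_request(repo, pr_number, existing_comments, body):
    """Decide whether to edit our earlier comment or create a new one.

    `existing_comments` is the decoded JSON list returned by
    GET /repos/{repo}/issues/{pr_number}/comments.
    """
    api = f"https://api.github.com/repos/{repo}"
    for comment in existing_comments:
        if MARKER in comment["body"]:
            # Our report is already on the PR: update it in place.
            return "PATCH", f"{api}/issues/comments/{comment['id']}", {"body": body}
    return "POST", f"{api}/issues/{pr_number}/comments", {"body": body}


if __name__ == "__main__":
    develop = '[["global","88.29%"],["billings","80%"]]'
    branch = '[["global","90%"],["billings","80%"]]'
    print(diff_report(develop, branch))
```

Editing one comment in place keeps the PR conversation tidy, since every new commit would otherwise add another report.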
And with this, we’ll get beautiful reports in our Pull Requests, completing our code coverage tracking work! ✅ 🚀🚀🚀
At this stage, we’ve got a fully functional solution to track coverage, so future improvements are optional and something to add when/if needed. Some ideas could be:
- PR Checks (as in failing PRs if coverage goes down).
- Different base branches for coverage diff reports. Because we use a spreadsheet to store coverage records, we’re currently limited to develop for the diff reports. But it doesn’t need to be this way. A database would provide more flexibility.
- Using it on other platforms. The fact that it was built for Android doesn’t mean much since the tool itself is platform agnostic.
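As an example of the first idea, a PR check could be as small as the sketch below. We haven’t built this, so everything in it is hypothetical, including the coverage_check name and the tolerance parameter that allows for small dips.

```python
def coverage_check(base_pct, head_pct, tolerance=0.0):
    """Return (passed, message) for a PR, comparing global coverage strings."""
    # A positive drop means the branch has lower coverage than the base.
    drop = round(float(base_pct.rstrip("%")) - float(head_pct.rstrip("%")), 2)
    if drop > tolerance:
        return False, f"Coverage dropped by {drop:g}% versus the base branch"
    return True, "Coverage is within tolerance of the base branch"


if __name__ == "__main__":
    print(coverage_check("88.29%", "88%"))
    print(coverage_check("88.29%", "88%", tolerance=0.5))
```

The boolean would then decide whether the CI step reports success or failure on the PR.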
Tracking coverage has been valuable for our team and is something every team can do to improve the quality of the code they ship. There are good solutions out there, but if you’re looking for minimum cost and maximum flexibility, building an in-house tool to track coverage is much less complicated than you might think!
Hope you enjoyed the article. For any questions, my Twitter account is the best place. Thanks for your time!
NOTE: This article was first published in the Memrise Engineering Blog. Check the blog out, there’s great content!