Modular indirection with a dagger

In my last blog post I gave a top-level overview of how a modular codebase can look with a level of indirection. The api and implementation division addresses build time issues across our code base. But this architecture alone isn’t enough.

There is a problem in our architecture that we should address in more detail: if no modules reference an :implementation module, we can’t create the objects we need.

This post introduces the :di module and discusses how it enables dependency injection. All while hiding :implementation and maintaining acceptable incremental build times.

What is Dependency Injection?

I don’t want to go into the details of dependency injection in this blog post. But to ensure you and I are on the same page, I would suggest a read of this post from the Android developer website.

Dependency injection should provide dependencies to objects without revealing the source of those dependencies. Dependency injection plays well with the dependency inversion principle; you may have spotted dependency inversion throughout the previous post.

In this project architecture, think of dependency injection as a supply chain. It starts with an :implementation and then delivers :api to consumers.

The :di module is going to use Dagger to build our dependency injection graph.

How does dependency injection work? 

To construct a class we need to know about the dependencies of the class. For example, the parameters to a class’ constructor.

To create an object we need to create the dependencies. To create dependencies we need to know their dependencies. Then the dependencies of those dependencies. So on.. so forth. You get the idea. This forms a dependency graph.
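As a tiny sketch of that idea (all class names here are hypothetical), building an object by hand means walking the graph from the leaves upwards:

```kotlin
// Hypothetical classes: each constructor parameter is an edge
// in the dependency graph.
class HttpClient
class JsonParser
class ProfileApi(val httpClient: HttpClient, val parser: JsonParser)
class ProfileRepository(val api: ProfileApi)

// To create a ProfileRepository we must first create its
// dependencies, and the dependencies of those dependencies.
val repository = ProfileRepository(
    api = ProfileApi(
        httpClient = HttpClient(),
        parser = JsonParser(),
    ),
)
```

Dependency injection frameworks exist to generate exactly this kind of wiring for us.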

This dependency graph is a directed graph, like the Gradle project graph. Each node represents an object, or how to make an object.

Each edge represents a dependency required to create the object the edge departs from. 

My tool of choice for building the graph is Dagger. There are many ways to build the graph, but I have experience with Dagger and I am confident with it. I like the compile time guarantees that come along with it.

Next up, a whistle stop tour of some of the key parts of Dagger and where they fit into our dependency graph.


@Component

A @Component annotated class is a dependency graph. It can act as a complete or partial dependency graph. We can add nodes and edges to a graph with a reference to a @Module. We can connect another graph to ours by depending on another @Component.
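As a sketch (all names here are hypothetical), a component declaration pulls both of those ideas together:

```kotlin
// NetworkingModule contributes nodes and edges to this graph;
// LoggingComponent is a separate graph we connect to ours.
@Component(
    modules = [NetworkingModule::class],
    dependencies = [LoggingComponent::class],
)
interface NetworkingComponent {
    // An object this graph knows how to build.
    fun httpClient(): HttpClient
}
```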


@Module

A @Module annotated class defines nodes and edges in a dependency graph. A node is either a @Provides or @Binds annotated function.

Edges are the parameters to a @Provides function, or the parameters to the constructor of the class passed to a @Binds function.
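A hedged sketch of what that looks like (the OkHttp types are real; the other names are hypothetical):

```kotlin
@Module
internal abstract class NetworkingModule {

    // A @Binds node: the concrete DefaultHttpClient (from :implementation)
    // is bound to the HttpClient abstraction (from :api). The parameters
    // of DefaultHttpClient's constructor become this node's edges.
    @Binds
    abstract fun bindHttpClient(impl: DefaultHttpClient): HttpClient

    companion object {
        // A @Provides node: its parameters (here, a Cache) are the
        // edges into the node.
        @Provides
        fun provideOkHttpClient(cache: Cache): OkHttpClient =
            OkHttpClient.Builder().cache(cache).build()
    }
}
```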


@Inject

Annotating a constructor with @Inject tells Dagger that this class is a node. Whilst useful, we should limit usages of @Inject: it couples consumers to concretions, so it belongs on classes we know are only consumed in :implementation modules.
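For example, inside an :implementation module the concrete class can declare its own node (a hypothetical class again):

```kotlin
// Dagger reads the node and its edges straight from the constructor,
// so no @Provides function is needed for this class.
internal class DefaultHttpClient @Inject constructor(
    private val okHttpClient: OkHttpClient,
) : HttpClient {
    // Implementation of the :api abstraction.
}
```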

A visualisation

This sounds great in text. But as always a visualisation of how this relates to a dependency graph will help.

I’ve used OkHttp in the example and I hope there are some familiar concepts in there.

A Dagger Component that references a Module and @Injected classes, contrasted against the Directed Acyclic Graph it creates.
Dagger components and the graph they create.

We can see how the Dagger infrastructure builds us a dependency graph. The component contains the graph, and modules contribute nodes and edges to it.

Components can reference components. This is useful when we want to use dependencies that come from a separate dependency graph. This is going to be a key concept going forwards.

Aligning the graphs

The project graph tells us how different projects relate and form an application. The dependency graph tells us how the objects inside glue together to make other objects.

The two graphs relate, but.. also don’t. Gradle creates the project graph, which enables the dependency graph created by Dagger, but Gradle doesn’t know anything about that graph.

Using the :di module we can align the two graphs and attempt to create a single mental model. This will help developers working with this architecture.

If we draw both graphs separately and then together we can see that the two graphs do align.

Project relationships enable objects inside those projects to have relationships with each other.

The combined graph above gives us information about the structure of our code. It will help us in our mission to modularise with indirection:

  • Cross-boundary object references form our public API. They belong in the :api module.
  • Classes not referenced outside of a project must be encapsulated within an :implementation module.

Modifying our :networking project to align with our new model gives us the same graph, but with a bit of a split between the projects. We are going to use the :di module to indirectly maintain the relationships between projects and objects that we saw above.

Introducing the :di module

The :di module handles providing the :implementation in the shape of the :api module. To me, this is the module that makes our Dagger graph align with the project graph.

A lot of lines creeping into these diagrams!

The :di module above contains component and module classes; the graph and the nodes and edges. 

We should treat the component as the single Dagger entry point into a Gradle project. There are a couple of rules to follow in our :di module: 

  • Components should be public. They are the only objects in our :di module that can be referenced across project boundaries.
  • Modules should only have an internal modifier. Never reference a module across project boundaries.
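Applied to a hypothetical :networking:di module, the two rules look something like this:

```kotlin
// The component is public: the single Dagger entry point into
// this Gradle project, exposed in terms of :api types.
@Component(modules = [NetworkingModule::class])
interface NetworkingComponent {
    fun httpClient(): HttpClient
}

// The module is internal: it knows about :implementation types
// and must never leak across a project boundary.
@Module
internal abstract class NetworkingModule {
    @Binds
    abstract fun bindHttpClient(impl: DefaultHttpClient): HttpClient
}
```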

The :di project itself should reference two other local projects: :api and :implementation. This gives the project the ability to upcast concrete classes into their abstractions.

Since :implementation modules are only referenced in a couple of places, and never in the places they are actually needed, there is no way to create the objects without the :di module.

How to use the :di module

The :di module should only be referenced directly by its parent project. The :di module glues together components. For example, a :profile:di module that depends on :networking:di might look something like this:

@Component(dependencies = [NetworkingComponent::class])
interface ProfileComponent {

    fun inject(profileActivity: ProfileActivity)

    @Component.Builder
    interface Builder {
        fun addDependency(networkingComponent: NetworkingComponent): Builder
        fun build(): ProfileComponent
    }
}
A real project is going to be a bit more complex than this small example. As more projects reference each other, our components compose together, creating a larger and larger dependency graph.

The composed components will be referenced in a project that applies an Android application plugin. If the projects and components are correctly set up, then any projects depending on an :api module will receive the fully resolved implementations.

Whilst this does bring some complexity to our project set up, it does maintain the incremental build time improvements we really want to keep from the previous blog post. If an :implementation module changes, it will only invalidate any related :di projects. This is an acceptable cost for delivering concretions throughout our application.


In the last blog post I wrapped up with a short bit on how we can validate the project graph and prove that the :implementation module is used correctly.

We can do the same here and prove our :di module is used correctly. Here are a couple of rules:

  • A :di module can only be referenced by another :di module or by its parent module.
  • A :di module can only reference another :di module if its sibling :implementation module references that project’s :api module.

The combination of the above ensures the module structure has been correctly wired together and the object supply chain will work as expected.

I think this post has only really scratched the surface of how dependency injection is important. I could go into much, much more detail about each point here, but I don’t want to write a book! I hope I’ve at least painted a picture of how crucial dependency injection is in our modular indirection.

Modular indirection is the best direction

(If you think this title is bad you should see how I name variables!)

Whilst at the Guardian I worked in a monolithic codebase. The majority of code lived in a single module. This had the expected downsides:

  • slow incremental builds,
  • hard to discover code,
  • quite a bit of known, but hidden, complexity!

Our aim was to increase modularisation. This would bring faster incremental builds and encourage better engineering practices. With lower coupling and increased cohesion we would see better code. We also had business goals to increase cross-functional collaboration, and modularisation should encourage this.

Modularising unearthed some impressive levels of coupling and a general lack of cohesiveness. Let’s not talk about our Dagger set up! This melange made it very difficult to break apart our monolith.

It became clear that to tackle this problem we needed to standardise our approach. So, as with all good projects this meant a good deal of investigation into the best approaches.

There were two sources I kept coming back to over and over again:

Both talks give excellent insights into the world of modularisation in large codebases, and into how we can tackle it with dependency injection (this post is going to avoid dependency injection in detail because it deserves its own!). I regularly re-watch those talks and I feel like I pick up on something new every time I watch them. Here are some top-level topics that both talks cover:

  • Creating modular code bases that don’t compromise on build speeds
  • Making good use of dependency injection (Dagger @Square and Scythe @Twitter) in a modular code base
  • Moving our monolithic code (:hairball @Square and :legacy @Twitter) into smaller modules

I left the Guardian and joined Twitter before we made large amounts of progress. Twitter is much further along in the modularisation process discussed above. It is so exciting to see something I have spent a whole year thinking about in action.

Over this year I’ve done a fair bit of thinking and prototyping based upon the talks above and from my own ideas. I’ve gotten to a place where I am happy with what I’ve come up with. I will write a series of blog posts to delve into the benefits and some tooling ideas I have floating around in my head.

This post explains my mental model of the problem we want to solve, provides some concrete examples of a solution, and then validates the improved approach.

The Gradle graph

I am not a Gradle pro; I am working on a mental model that I think makes sense. I would love some feedback on this!

Let’s look at how Gradle works and build a mental model of the problem. I’m going to come back to this mental model a number of times in this post.

An important primitive in the world of Gradle is a project, commonly referred to as a module. We can spot a project as it is a directory containing a build.gradle file. Gradle projects have 0..n nested sub-projects. Projects can nest content like code or resources in other sub-directories.

Gradle builds a directed graph: a graph where the edges between nodes have a direction, and where we cannot create a circular reference. Nodes are projects. Edges are relationships defined by a Gradle dependency, e.g. implementation, api or kapt.
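In the Gradle Kotlin DSL those edges are declared in a project’s dependencies block. This sketch uses hypothetical project names (and an illustrative Dagger version):

```kotlin
// feature/build.gradle.kts
dependencies {
    // implementation: an edge visible only to this project
    implementation(project(":networking"))
    // api: an edge that is also re-exported to our consumers
    api(project(":design-system"))
    // kapt: an edge to an annotation processor
    kapt("com.google.dagger:dagger-compiler:2.51")
}
```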

Project graph of a small modular application.

Changing the code in a project invalidates our project or any related projects. In the above example, a change in the :networking module may invalidate the six other modules. 

Project graph with invalidated dependencies.

In general the code in each of these modules recompiles (not always true, but it is true in this mental model). We can push our black box of knowledge a little bit deeper by looking at what a task is.


What is a task?

A task is an idempotent piece of work that Gradle may execute to do a job. Because it is idempotent, Gradle is able to cache the result. If the inputs are the same as a previous run of the task, we know we can use the cached result.

Every Gradle plugin contains a set of tasks. For example, the Kotlin JVM plugin contains tasks that compile using kotlinc. The kapt plugin contains tasks that create Java stubs from Kotlin files. The Android application plugin contains tasks to generate an AAB or APK. Tasks are added to a project by applying a plugin to it. The plugin helps us to identify the type of project we are working with.

Task invalidation

A Gradle task is invalidated when its inputs differ from the inputs to a cached version. The task must then be re-run to produce a new result. An input to a task can be a file or even the result of another task. Tasks may cross project boundaries and can invalidate another project’s tasks. This explains the large quantity of red arrows in the last graph.

Incremental compilation is when Gradle only runs invalidated tasks. We want to keep the set of invalidated tasks as small as possible, which means reducing the number of edges in the graph. The more modules that need to be recompiled, the longer your incremental builds may be.

Our job as developers of a modularised codebase is to maintain a healthy project graph. As we begin to modularise a monolithic application we will see an improvement in build time. If we don’t manage this correctly, our invalidated project graph will begin to rival the incremental build time of our monolith. 

Another benefit, for Gradle Enterprise consumers, is that smaller cached tasks can be easily shared across many users. We will also begin to see improvements in clean build times for developers.

Tackling with indirection

To resolve this we can make use of a bit of indirection and play some tricks with Gradle to reduce invalidated tasks. 

The talks linked above both suggest an architecture that divides an :api from an :implementation. Let’s delve into this a bit deeper. Top-level projects must be split into at least three smaller projects:

  • :api – Defines a static api that other modules can use
  • :implementation – Contains implementations for all abstractions defined in the :api module
  • :di – This contains the code that wires together the :api to the :implementation

There are a few rules. Other projects that want to use our feature should only refer to the :api project. The :implementation module should only be consumed by its parent project and the :di module.
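As a sketch, those rules translate into build files like these (paths hypothetical):

```kotlin
// :profile/build.gradle.kts — a consumer only ever sees the static api
dependencies {
    implementation(project(":networking:api"))
}

// :networking:di/build.gradle.kts — only :di (and the parent project)
// may see the implementation
dependencies {
    implementation(project(":networking:api"))
    implementation(project(":networking:implementation"))
}
```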

The relationships in a module split by an :api and :implementation

You may notice that the parent :networking project references all internal modules: :api, :implementation and :di. This is not necessary, but I like it as it makes the parent project a manifest of all its required modules.

Let’s recreate our first project graph again with our new :networking module structure.

The original graph now with a networking module with an :api and :implementation split.

Looking at this graph again we can see the improvement this should make. We’ve reduced the number of direct edges to our volatile :implementation module. If we imagine invalidating the :implementation module, we can see it has a much smaller impact when compared to :image-loading.

There are a lot of interesting ways we can take this module structure. It simplifies creating one-off applications (called sandboxes at Twitter), and lets us nest our tests and separate our fakes.

The eagle-eyed among you may notice that the :di module also references the :implementation module. This allows our :di module to upcast our implementations into the abstractions contained in our :api module. Like I mentioned above, this really deserves its own blog post!


Wouldn’t it be great if we could validate that this approach actually brings the improvements I am promising? Well.. I can’t prove this right now, but I have some ideas, and it is actually the reason I wrote this blog post.

I plan to create some tooling that can be used on a project implementing a module structure as I’ve defined above. The tooling will live within Android Studio/IntelliJ or separately through CI tooling.

Invalidated Tasks

We measure our builds using time (or success or failure). Time has many variables:

  • your machine (slow/fast processor),
  • concurrent processes (1 or 100 browser tabs open),
  • or the status of your local Gradle daemon.

Gradle has another indicator: measuring the number of tasks invalidated by a change. Whilst this doesn’t give us a concrete time impact, we can infer the impact of those invalidated tasks.

Searching the graph

We can validate two properties if we look at the shape of the Gradle project graph:

  1. An :implementation module should only have a depth of two from the application node
  2. Any :implementation module should have one direct sibling with an implementation edge (:di) and another reference from a parent module (it may also have one or more testImplementation references)
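A minimal sketch of the property #1 check, assuming we already have the project graph as an adjacency map. The post describes a depth-first search; I’ve used a breadth-first traversal here because it yields shortest distances, which makes the “depth of two” check unambiguous when a module is reachable via several paths:

```kotlin
// Adjacency map: project path -> the project paths it depends on.
typealias ProjectGraph = Map<String, List<String>>

// Returns the :implementation modules whose shortest distance from
// the application node is greater than two — violations of property #1.
fun findDeepImplementations(graph: ProjectGraph, appNode: String): Set<String> {
    val distance = mutableMapOf(appNode to 0)
    val queue = ArrayDeque(listOf(appNode))
    while (queue.isNotEmpty()) {
        val node = queue.removeFirst()
        for (next in graph[node].orEmpty()) {
            if (next !in distance) {
                distance[next] = distance.getValue(node) + 1
                queue.add(next)
            }
        }
    }
    return distance
        .filterKeys { it.endsWith(":implementation") }
        .filterValues { it > 2 }
        .keys
}
```

The same traversal could count visits to each :implementation node to check property #2.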

If these properties are true, we know we have created an isolated :implementation module. We can safely change it without worrying about the impact of our changes throughout the rest of our project graph.

We can validate property #1 by finding an application module node and starting a depth-first search from that node until we have found every :implementation module throughout our code base. If we find such a module after visiting more than two nodes, we know this module needs to be improved.

We can confirm property #2 by checking that every :implementation module is visited exactly twice. If we encounter it more than twice, it is an indicator that the graph is incorrect.

I hope this has been interesting for some developers out there, but this post has gotten a bit too lengthy! The next post in this series will talk about dependency injection in a modularised world and look into how the directed graph of dependency injection can work alongside our project graph!

Learning a lot at Twitter

I recently joined Twitter as an Android Developer. So far, it is unlike anywhere I have ever worked before. In two months I have learned a lot and have been incredibly overwhelmed. I’ve recently felt like I have started to get on top of the learning curve and I can start to reflect a bit on the first two months. To do that, I’m going to make an effort to blog three times a week focusing on a few topics. 

I already have so many thoughts on MVI, dependency injection (and annotation processors!), lifecycles and memory leaks, build times, working with developers across the pond, finally learning how to rebase, saying phab instead of pull request, senior developers and not knowing everything about the app you work on. Have I mentioned build times?

Merging multiple files into one with Kotlin

Kotlin lets us write top-level functions, enabling us to write code that isn’t necessarily constrained to the concept of classes. It frees us from “util” classes of static methods (but it doesn’t free us from dumping methods or functions in one place).

Under the hood, Kotlin is constrained to classes: the compiler must generate bytecode that will run on the JVM (multiplatform is another story). To do this, it must put your functions into a class. It will take your file name and create a class from it. Functions in StringExtensions.kt will be placed in a class named StringExtensionsKt.

You may write a set of extensions on the Fragment type that are responsible for aiding the retrieval of arguments:

// FragmentArgumentExtensions.kt
fun Fragment.requireStringArgument(name: String): String {
    return arguments?.getString(name, null)
        ?: throw IllegalArgumentException("No argument found with name $name")
}

The Kotlin compiler translates it into some bytecode that roughly looks like this: 

public final class FragmentArgumentExtensionsKt {
    public static String requireStringArgument(@NonNull Fragment fragment, String name) {
        // Implementation
    }
}

You may also have another file containing extensions to help you create a ViewBinding for this Fragment:

// FragmentViewBindingExtensions.kt
fun <T : ViewBinding> Fragment.viewBinding(factory: (View) -> T): T {
    // Implementation
}

This Kotlin would then be compiled into a class named FragmentViewBindingExtensionsKt.

This all makes sense, we’ve kept our logically different extension functions in separate files. Sometimes we might want to combine our extensions into a single file:

  • If we had Java consumers of our extensions we might want to present the extensions in a single class named FragmentExtensionsKt.
  • Splitting our functions apart internally may not always be the best for a public API.
  • We could be working in an environment that requires we keep our class or method count as low as possible, e.g. two classes create two constructor methods, whilst one class creates one constructor method

Kotlin provides a couple of handy annotations to support this functionality, @JvmMultifileClass and @JvmName.


@JvmName

This annotation tells the compiler what to call the class your file will be mapped into. This is useful if you want your API to look nice for Java users or you want to provide some API compatibility across a Java to Kotlin conversion.


@JvmMultifileClass

This annotation tells the compiler that this file will be contributing to a class that other files may also be contributing to.

When used, they should have the file qualifier and be the first two lines of code in your file.
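For example, applied to the two files from earlier (the shared class name is our choice):

```kotlin
// FragmentArgumentExtensions.kt
@file:JvmMultifileClass
@file:JvmName("FragmentExtensionsKt")

fun Fragment.requireStringArgument(name: String): String = TODO()

// FragmentViewBindingExtensions.kt
@file:JvmMultifileClass
@file:JvmName("FragmentExtensionsKt")

fun <T : ViewBinding> Fragment.viewBinding(factory: (View) -> T): T = TODO()
```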


When added to our two files above, the Kotlin compiler will produce a single class under the hood.

Experimenting in a legacy code base

I work on what could be called a “legacy code base”. We’ve just crossed the 10 year anniversary of the first commit. Between then and now over 40 developers have contributed. Many features have come and gone, and the platform we develop for has changed beyond recognition and so have our ways of writing code.

Because of these reasons, we have a vibrant, frustrating, yet interesting code base. Over the past three or so years we have systematically refactored and improved it, but we have a lot further to go. We’re in a place where we can start to think about adopting new technologies to modernise our code base.

Using the newest technologies available to us has a lot of benefits; the biggest for me is that developers get the satisfaction of using the newest and greatest tools.

But before we can adopt new technologies we have to make sure that developers have a shared understanding of how to use them and what we want to achieve by adopting them.

The best way to do this is to experiment.

When we experiment with code we learn new ways to write code, and more often than not, we learn why our previous ways of writing code aren’t as good as we thought!

Legacy code bases come with a cost. You are surrounded with code full of history and reasons why you just can’t change it, and often there aren’t a lot of tests to make you feel safe! To make matters worse, you can’t just add a new technology for the sake of it – that’s the kind of thing that gives you a legacy code base in the first place. This combination makes experimenting a tricky thing in legacy code bases.

So how do we experiment in a legacy code base? Here are some ideas that I have been trying to adopt over the last six months:

Create disposable or small applications to demo your ideas. You aren’t constrained by your legacy code base and you can move quicker. Don’t forget that you will ultimately be integrating into a larger code base. If you can reuse these applications, even better: keep them in a separate repository so you can use code reviews to explain your experiments to colleagues.

Create a “bleeding edge” application (better name pending), this is an app that can be used to incubate new technology before folding it into your main code base. Think Google Inbox and Google Gmail. If you can roll these changes out to users you get a better idea of what works and what doesn’t.

Design your code base with small modules and strong boundaries defined by abstract types. You can peacefully change the implementation of one module without impacting the rest of your application.

Each of the above options is just a different way of saying that you need to find somewhere outside, or inside, of your code base that allows you to make changes without the repercussions being felt elsewhere.

When you’ve finished experimenting, you will know how best to propagate your changes safely into the rest of the code base without creating more legacy code.

To abstract or not to abstract

The longer I’ve written software, the more I debate with myself about whether or not I should add an abstraction.

Let us define an abstraction: it could be an interface, a trait, a protocol, or an abstract class. It is a structure that defines how a piece of code should interact with the outside world, but not how that interaction is handled.

Abstractions are a powerful tool, but they should be used appropriately. They are powerful at the boundaries of your code but introduce too much indirection when used overzealously.

A good abstraction lets a developer switch out an implementation without any effort. A bad abstraction finds a developer clicking through many files, trying to hold a lot of information in their head.

I have a few rules of thumb that I try to follow:

  1. An abstraction is useful when there is a genuine reason to swap out the implementation.
  2. If you are at a boundary of a key separation of concern, use an interface to define that boundary.
  3. If you are designing a library, use an abstraction to define a public API that can be added to or sensibly deprecated.
  4. If you know your class isn’t going to be swapped out, don’t use an abstraction.

These are not a definitive set of rules but I find they rein me in from creating an abstraction for everything under the sun!

Some thoughts on testing

Photo by Oğuzhan Akdoğan on Unsplash

I associate a number of things with writing test code.

The first is finding peace of mind. In years gone by I have written some dodgy code that has gone to production; I still think about some of this code to this day. I still write dodgy code, but I’m able to stop it from going to production with a superpower I have gained. That superpower is to write tests for my code and mostly stop that code from being released (Crashlytics will sometimes disagree). A good set of tests should be enough to give me confidence that what I have written actually works.

Testing is the quickest way to validate your code. As an app developer running a suite of tests from your IDE, in a matter of seconds, is far quicker than navigating to the relevant screen in your app and then doing a sequence of actions to find out your code doesn’t work! 

The code you write to test code is a good indicator of the complexity of the code you have written; large test functions, repeated test code or long lists of dependencies all indicate that your test code is a bit complex. I try to let this guide me when I am writing code. 

I’m not going to tell you I know how to write proper tests, because I don’t and have a long way to go before I start to write good tests. However, over the past few years I’ve started to pick up a few things and form some – hopefully useful – opinions. 

Concise test naming

Don’t be too descriptive, get to the point quickly, and make sure the name matches what is in the body of the test. You and your colleagues will need to review this code or refer back to tests in the future. Make your tests easy to understand now and avoid regret in the future.

I like to imagine non-technical colleagues might want to read a report on test coverage and then share it to other teams. If you think your tests are easy to understand you are going in the right direction. If you aren’t sure, why not ask someone else?

I think testing code with a small public API helps keep your test names concise. The more API you have to test, the more words you need to describe what you are testing. If you really can’t avoid a large public API, split your tests into a number of different files focusing on a particular method or function of that API to help reduce potential confusion. 

Consistent structure for tests

If you are working on code in the same project you will want to see consistency in the code that is written. If every test has a familiar structure, you or a colleague won’t have to spend time getting up to speed with the general shape of the code; you can just get on with the testing.

I think this fits nicely alongside the idea that your test code will inform you of the complexity of the code you are testing; if your tests are consistently different or hard to understand you should probably change the code you have written. 

White box or black box?

For the longest time I was an advocate for white box testing; I wanted to know that my tests rigorously tested the internals of the code I had written, which was great for my peace of mind. However, changing the tiniest implementation detail would cause a butterfly effect of failing tests through the entire test code base. This is alarming and stressful for whoever is making a change; not a good developer experience! This led me to become a fan of black box testing.

I do still think white box testing is helpful: it is a great way to understand difficult code. Writing tests verifying the behaviour of different parts of this code can help you to understand what is happening. I like to think of it as writing notes: “this function does this.. and causes this to happen..”. The best thing is those notes will tell you if you are right or wrong as soon as you execute them!

Nowadays, I like to just test an output for a given set of inputs. It is a much nicer developer experience and your test code is less intimidating to look at. I also think it has helped to inform how I write code. I try to ensure any function returns something that can easily be used in a test, and that there are no hidden side effects.

The three points above are things I regularly think of when I write tests. They certainly aren’t a recipe for cooking up the perfect tests but they do help me write better tests bit by bit. 

Slice don’t Splice

Photo by Juja Han on Unsplash

This weekend I’ve spent some time working on a side project written in TypeScript. I’ve never used it before, so I’ve spent a lot of time referring to documentation and learning a lot. One thing stood out.

I had an array of data from which I wanted to create a sub-list of elements, starting from an index, i, up to an end index. You can do this by calling array.slice:

“Extracts a section of the array and returns the new array”

TypeScript, or rather the underlying JavaScript, also has a function that adds or removes elements from an array. This is called array.splice:

“Adds or removes elements from the array”
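A quick demonstration of the trap. The crucial difference is that slice copies, whilst splice mutates the original array:

```typescript
const letters = ["a", "b", "c", "d", "e"];

// slice copies elements from index 1 up to (not including) index 3.
// letters is left untouched.
const sliced = letters.slice(1, 3); // ["b", "c"]

// splice removes two elements starting at index 1, in place.
// It returns the removed elements — and letters is now ["a", "d", "e"].
const spliced = letters.splice(1, 2); // ["b", "c"]
```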

I suppose I don’t need to go into detail about what caused me to write this blog post, but I have some lessons:

  • Pay close attention to the functions you are writing or selecting from autocomplete.
  • Test every single piece of code you change, even if you think you are making a small change.
  • Unit testing isn’t always enough to catch issues, especially when your unit is manipulating data being passed into it.

I’d also like to call out the concept of immutability. This would have saved a stupid developer from a stupid mistake.