Dependency Graph
Definition
Wikipedia defines it as4:
A dependency graph is a directed graph representing dependencies of several objects towards each other. It is possible to derive an evaluation order or the absence of an evaluation order that respects the given dependencies from the dependency graph.
In software engineering, a dependency graph is a directed acyclic graph - a directed graph with no cycles. In the context of a module dependency graph, nodes represent modules, and edges represent dependencies between modules.
A dependency graph, in general, can have more than one root. A root module (indgree = 0) is one on which no other module depends, and a leaf module (outdegree = 0) is one that does not depend on any module.
Notation
Let’s say module A depends on module B (Figure 1). In this case, a directed edge is shown by an arrow, with its tail at node A and its head at node B. Module A is said to be dependent on module B if it uses classes or methods from B. While module B is unaware of module A’s existence, module A knows about module B.
Why use module dependency graphs2?
The structure of your code’s module dependency graph greatly affects the build times
As the development of the code progresses, the module graph may unintensionally evolve into a list-like structure
Gives an idea whether modules have clear separation of concerns and adhere to appropriate dependency structure
It is cheaper to prevent creating complicated module dependencies that would be difficult to fragment later
Unwanted module dependencies will emerge, if proper module structure is not implemented.
The diagram below shows the dependency graph of the older Kotlin multi-module architecture of mifos-wallet app (generated using Modules Graph Assert):
Significance of the Red Line
The red line represents the longest path in the graph from a root node to a leaf node. In Figure 2, the red line from the “:mifospay” module to the “:core:datastore-proto” module is the longest path from “:mifospay” module to any other module.
The path of the red line gives an idea about how good the modularization is. Ideally, a modularization is considered good if the red line does not pass through more than one module in the same layer, meaning the modules within a layer are independent of one another.
In practice, achieving this scenario is rare. One should always strive for highly cohesive (self-contained) and loosely coupled modules. Loose coupling should exist among the modules within the same layer of the architecture. For example, let’s look at the figures below (Figures 3-7, adapted from a talk by Josef Raska1):
Estimation of build time
This is a canonical application of topological sorting. The total build time of a project depends on scheduling of tasks (building each module) based on their dependencies5, as well as the build times of individual modules. See Appendix I for more details.
How do we improve modularization?
If an implementation depends on another implementation (as shown in Figure 4), the dependent part of the code should be moved into an API (see Figure 6). Similarly, for other layers in the architecture, similar measures can be taken to enforce a proper dependency structure.
The module rules above can be applied to improve the “bad” module graph (Figure 4), as shown below:
Can “greatness” be achieved?
Yes definitely! Add interfaces for parts of one module that another module is depends on.
You can use the Modules Graph Assert, a Gradle plugin, to analyze the dependency graph of the codebase and ensure that the modularization stays on track as you add new modules. It’s always better to prevent any problematic dependencies than to spend time refactoring them later.
Beginner’s Dilemma
A common question from a beginner might be: why build a module dependency graph using a library or plugin when the developers of the project already know the dependencies of the new module they are creating? Can’t the dependency graph be created manually? Not really!
In a project with just over ten modules, building the dependency graph might seem trivial, but it’s not. Tracking the edges between modules can become cumbersome. As the project matures and new features are added to meet evolving requirements, this task become even more complex.
The Real Problem
The problem is that adding dependencies is too easy—just one line of code in the Gradle build file, and you can access any class from any module, across any layer. Since Android studio or any other IDE doesn’t prevent one from doing this, as Murphy’s Law of dependencies states: Whatever they can access, they will access. This practice leads to tightly coupled modules.
In reality, the graphs (see Figure 9) can become very complicated, making it difficult to track the dependencies among modules let alone drawing them manually. The number of modules may increase linearly with time, the dependencies often grow at a much faster rate.
Advantages of generating a module dependency graph using a library or plugin (beyond the technical benefits):
Dependency graphs provide new contributors with a head start in understanding the existing codebase. These graphs eliminate the need to dig through the codebase to figure out dependencies manually.
Updating dependency graphs is quick and straightforward whenever new modules are added.
Apart from creating dependency graphs, Modules Graph Assert also has several features to improve modularization. It helps maintain a lean architecture. For example, you can set a maximum height for the tree with a root module. Enforcing a max height ensures that the graph tree doesn’t degenerate into a list.2
Acknowledgements: I would like to thank Rajan Maurya, Lead Engineer of Mobile Development at Mifos Initiative, for inspiring me to share my learnings about dependency graphs through this article.
If you are interested in contributing to open-source projects, I encourage you to explore the Mifos Initiative projects on GitHub: Mifos Initiative
References:
https://www.droidcon.com/2022/11/15/modularization-flatten-your-graph-and-get-the-real-benefits/ - Talk by Josef Raska, author of Modules Graph Assert.
GitHub - jraska/modules-graph-assert: Gradle plugin to keep your modules graph healthy and lean.
Appendix I
Estimation of build time
Build Time of Good Module Graph Example
Let’s estimate the build time for the example in “Good Module Graph” (Figure 3) using a diagram with simplified module names for clarity.
The valid topological sorts of the above graph are as follows:
A, B, C, D, E
A, C, D, B, E
A, D, B, C, E
A, B, D, C, E
A, D, C, B, E
A, C, B, D, E
For our purpose, the build path (reverse order of topological sort) can be represented as
E --> (B, C, D) --> A ,
Once module E is built, modules B, C, and D can be built in parallel, followed by module A. The total build time for the modules built in parallel, will be determined by the module with longest build time among B, C, and D.
Total build time = Build time of E + Max build time of (B, C, D) + Build time of A
= 3 s + max(15 s, 10 s, 10 s) + 2 s
= 3 s + 15 s + 2 s = 20 s
Build Time of Bad Module Graph Example
Let’s estimate the build time for the example in “Bad Module Graph” (Figure 4) using a diagram with simplified module names for clarity.
In this case, there is only one valid topological sort, i.e., A, B, C, D, E and the total build time is given by
Total build time = Build time of (E + D + C + B + A)
= 3 + 10 + 10 + 15 + 2 = 40 s