Team Ownership in a Monolithic Codebase
At incident.io, our codebase is a monolith, and we value the benefits that brings. We don’t have to worry about the complexities of network requests, deployment processes, or cross-service optimizations. However, managing ownership within a monolith can be challenging.
It’s essential that errors reach the right people, so that nobody is interrupted by problems in code they don’t own. Microservices inherently provide clear ownership boundaries, but in a monolith it can feel like everything is lumped together with no clear owner.
In this post, I’ll discuss how we establish ownership within our codebase to ensure that errors and alerts are directed to the appropriate team with minimal maintenance overhead. These strategies enable us to maintain agility and efficiency as our organization and codebase scale.
Defining Ownership
When determining ownership within our codebase, we aim to divide it into manageable chunks that can be assigned to individual teams effectively. We categorize our core business logic into:
- Apps: Separate applications outside the monolith server.
- Packages: Core backend packages representing distinct features or business functionalities.
- Integrations: Connections with external providers like Sentry, Jira, Datadog, etc.
Each subfolder within these categories should be assigned to a specific team to establish clear ownership. While this division can be challenging, especially with older or shared code, reaching a consensus within the team is crucial for effective ownership distribution.
Encoding Ownership
Once ownership definitions are established, we encode them in a module file within each subfolder. These files outline ownership details, criticality levels, and associated features, ensuring that every aspect of the codebase has a designated owner. We enforce the presence of these module files through CI checks to maintain ownership clarity.
Additionally, we consolidate all package ownership details into a single CODEOWNERS file for easy reference and searchability.
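One way to keep that file in sync is to generate it from the ownership data rather than edit it by hand. The sketch below assumes the ownership data has already been loaded into a map of folder path to team handle; the renderCodeowners function, the paths, and the example-org handles are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// renderCodeowners turns a folder-to-owner map into CODEOWNERS lines.
// Paths are sorted so the generated file is stable between runs.
func renderCodeowners(owners map[string]string) string {
	paths := make([]string, 0, len(owners))
	for path := range owners {
		paths = append(paths, path)
	}
	sort.Strings(paths)

	var b strings.Builder
	for _, path := range paths {
		// A leading slash anchors the pattern at the repository root and
		// a trailing slash matches everything inside the folder.
		fmt.Fprintf(&b, "/%s/ %s\n", path, owners[path])
	}
	return b.String()
}

func main() {
	// Hypothetical folders and team handles, for illustration only.
	fmt.Print(renderCodeowners(map[string]string{
		"pkg/alerts":        "@example-org/response",
		"integrations/jira": "@example-org/integrations",
	}))
}
```

Running the generator in CI and failing when its output differs from the committed file is one way to keep the two from drifting apart.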
Routing Errors
Application errors are transmitted from the code to monitoring and alerting tools like Sentry and incident.io. To effectively route errors and determine appropriate actions and on-call teams, clear ownership tags are essential. By tagging errors at the lowest possible team level, we ensure that issues are directed to the most relevant stakeholders for resolution.
By implementing these strategies for defining ownership, encoding ownership details, and routing errors effectively, we streamline our error handling processes in the monolithic codebase.
Improving Error Tagging for Team Accountability
Let’s talk about how we can enhance error tagging to ensure accountability within our team. When errors occur outside of our application, it’s crucial to assign them to the right team for resolution.
Use your standard error tagging approach to add a team tag that your monitoring and alerting tools can parse easily. In the snippet below, an explicit team_override field takes precedence over the team value, when one is set:
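// An explicit override always wins.
if fields["team_override"] != nil {
	fields["team"] = fields["team_override"]
} else if team != nil {
	// Otherwise use the team that was passed in, if there is one.
	fields["team"] = *team
}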
Ensuring Error Handling Fallbacks
Treat error routing as best effort. Most errors will be tagged correctly, but you still need a plan for unassigned errors that don’t fall under any team’s ownership.
In such cases, consider routing unowned errors to a default team, like the On-call team in your alerting tool (e.g., incident.io). Treat the team tag as an optional field, and make sure your error handling copes when it is missing.
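The fallback can be sketched as a single resolution step. The resolveTeam function and the "on-call" default below are assumptions for illustration; the team_override and team names follow the snippet earlier in this post.

```go
package main

import "fmt"

// defaultTeam is a hypothetical catch-all owner for untagged errors.
const defaultTeam = "on-call"

// resolveTeam picks the team an error should be routed to: an explicit
// override first, then the supplied team, then the default team.
func resolveTeam(fields map[string]any, team *string) string {
	if override, ok := fields["team_override"].(string); ok && override != "" {
		return override
	}
	if team != nil && *team != "" {
		return *team
	}
	return defaultTeam
}

func main() {
	response := "response"

	fmt.Println(resolveTeam(map[string]any{}, &response))
	fmt.Println(resolveTeam(map[string]any{"team_override": "integrations"}, &response))
	fmt.Println(resolveTeam(map[string]any{}, nil))
}
```

Because the function always returns a team, every error ends up with somewhere to go, even when ownership data is incomplete.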
In Conclusion
While monoliths have their advantages, defining ownership within a large codebase can be challenging. Once your code is broken into manageable chunks, each with an assigned team, routing errors to the right people becomes straightforward.
Automated error routing reduces the burden on on-call engineers and ensures accountability for code changes. This system enables scalability while maintaining the benefits of a monolithic architecture.