The Allure (and Danger) of Simple Metrics
In software delivery, we’re constantly trying to simplify the complex. We want to measure quality with a single score. We want to assess team productivity through a dashboard. We want to track delivery performance by counting how many tickets were closed, how often we deploy, or how many bugs remain.
It’s understandable. Simplicity is attractive, especially for decision-makers. A clean number promises clarity. It makes it easier to prioritise, justify, and feel in control.
But in our experience — having worked with dozens of teams across industries — this mindset often backfires. Problems arise when the metric in use is too narrow. It reflects part of the system but not the whole. And that’s when improvement efforts get misdirected. Teams chase the wrong metrics, optimise for local improvements, or embark on major software rewrites based on intuition rather than evidence.
Before we can improve anything, we need to frame the problem differently.
Why Quality and Productivity Can't Be Measured in Isolation
Let’s begin with software quality.
There is no single metric that defines it. You can measure Cyclomatic Complexity, code duplication, bug count, and other valid proxies, but they only tell part of the story. Software quality is a composite of many concerns, including maintainability, correctness, performance, scalability, security, testability, architecture, and others.
We’ve seen systems with clean, readable code that are still painful to evolve. Why? Because the architecture is brittle. The domain logic is fragmented. The deployment pipeline is fragile. Although low-level code quality might be high, system-level complexity may introduce friction in unexpected places. The opposite is also true. Some companies invest heavily in creating a good architecture and utilising modern technology. Still, the complexity of the code, bad coding practices, and lack of test automation make the system difficult to maintain and hard to test, causing many bugs to slip into production.
The same goes for productivity. In theory, it’s about delivering valuable outcomes efficiently. But in practice, it’s mainly about perception. If stakeholders feel that delivery is slow — even if dozens of stories have been completed — the team is perceived as unproductive. The stakeholders’ perception of productivity is shaped by alignment with the development team, timely feedback, good expectation management, and low (or no) occurrences of unpleasant surprises.
To make things more complex, quality and productivity interact — but not always in obvious ways. Poor quality might not slow a team down today, but it introduces friction that compounds over time, ultimately hindering progress. Conversely, high-quality code alone won’t make a team more productive when the real problem is vague requirements, continual shifts in direction (or misdirection), or decision paralysis. Some teams can be pretty efficient at building the wrong thing.
That’s why meaningful measurement — whether for internal improvement or executive reporting — must be grounded in a structured, multi-dimensional view. When metrics are grouped thoughtfully and interpreted in context, they can be far more helpful in effectively improving quality and productivity. They can tell a more complete story. But when they’re isolated or reduced to a single average or number, they mislead.
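To make the point about averages concrete, here is a toy sketch (the dimension names and scores are purely illustrative): a team that looks "fine" on a single averaged score while one dimension is quietly blocking delivery.

```python
from statistics import mean

# Hypothetical per-dimension quality scores (0-100); numbers are illustrative.
scores = {
    "code_readability": 90,
    "test_coverage": 85,
    "architecture": 88,
    "deployment_pipeline": 20,  # the real bottleneck
}

average = mean(scores.values())        # a single number looks acceptable...
weakest = min(scores, key=scores.get)  # ...but the weakest link tells the story

print(f"average score: {average:.1f}")    # 70.8 reads as "decent" at a glance
print(f"weakest dimension: {weakest}")    # deployment_pipeline
```

The averaged score suggests a healthy system; inspecting the dimensions individually is what reveals where delivery is actually degrading.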
Ultimately, software quality is an enabler of delivery productivity. Software quality (or the lack of it) directly impacts productivity, but it is not the only thing that does. We need to broaden our idea of what software quality is. We need to think about the quality of the entire production line, not only the quality of the code. That’s the only way we can understand what is impacting productivity and where we should focus to improve it.
Introducing the Concept of Holistic Engineering Performance
When companies come to us looking to improve software quality or productivity — and quite often both — the first challenge is framing the problem correctly.
Too many improvement efforts fail because they focus on the wrong thing or the wrong data. How many times have we seen time and effort poured into Agile and technical coaching, refactoring low-impact areas of the code, or re-architecting and rewriting modules that are neither on the critical path nor causing significant issues — all without concrete data or any way to measure improvement? The result is often the same: substantial investment with negligible impact on productivity, product stability, or client satisfaction.
In our experience, sustainable productivity comes from something deeper: the structure and health of the engineering system as a whole. That’s why we’ve learned to look at engineering performance not through isolated metrics or maturity checklists, but as a system of interdependent forces.
To support this thinking, we developed a model that we are calling Holistic Engineering Performance. It’s not a rigid framework, and it’s not about scoring teams. It gives us — and our clients — a shared language to understand what’s holding teams back, where quality is silently degrading, and where improvements will have the most significant impact.
The model is structured around eight interdependent pillars. Each one represents a different perspective through which software quality and productivity manifest, and each plays a unique role in how effectively teams deliver value.
1. Code Quality & Maintainability
“Is the codebase structurally helping or blocking teams from delivering safely and quickly?”
This is where many teams instinctively focus — and for good reason. In our experience, delivery slowdowns often blamed on the process are rooted in the code itself.
We’ve worked with teams who tick every Agile box — standups, sprints, retrospectives — but still struggle to deliver, tripping over tightly coupled logic, sprawling complexity, or unstable module boundaries. Even when static analysis tools like SonarQube are in place, teams often lack clarity on how to prioritise or address issues meaningfully.
Key productivity levers:
- Clean, modular code → faster onboarding and safer changes
- Change locality → smaller blast radius, fewer regressions
- Reduced cognitive complexity → lower mental load, fewer mistakes
- Stable APIs and contracts → predictable integrations
- Security hygiene → fewer disruptions from vulnerabilities
- Code standards and static analysis → improve familiarity and maintainability
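As a sketch of one such proxy, cyclomatic complexity can be crudely approximated by counting branch points in a function’s AST. This is a simplified McCabe count for illustration only — real analysers such as SonarQube handle far more constructs and languages:

```python
import ast

def approx_cyclomatic_complexity(source: str) -> int:
    """Crude McCabe approximation: 1 + number of branch points in the code."""
    branch_nodes = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                    ast.BoolOp, ast.IfExp)
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, branch_nodes) for node in ast.walk(tree))

# A small example function to measure (elif counts as a nested If node).
snippet = """
def classify(x):
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    for _ in range(3):
        pass
    return "positive"
"""
print(approx_cyclomatic_complexity(snippet))  # 4: two ifs + one loop + 1
```

Even a crude count like this makes the article’s point: the number is a useful proxy for one concern (branching complexity), but says nothing about architecture, domain clarity, or pipeline health.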
2. Knowledge Distribution & Capability
“Is critical knowledge shared broadly or concentrated in a few individuals?”
Some teams appear unproductive not because of poor practices or lack of effort, but because only a handful of people truly understand the system. Progress slows when others are constantly blocked, unsure, or second-guessing decisions.
Knowledge bottlenecks are among the most invisible and damaging constraints we encounter. Without active investment in knowledge sharing and onboarding, organisations become fragile, and progress becomes difficult to parallelise and scale.
Key productivity levers:
- Distributed knowledge → increased parallelisation of work and throughput
- High bus factor → protects continuity and reduces key-person risk
- Shared code familiarity → smoother collaboration, fewer dependencies
- Strong domain expertise → valuable contributions and less need for micro-management
- Proactive onboarding → faster integration and productivity of new hires
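Knowledge concentration can be made visible from version-control data. The sketch below computes a crude "bus factor" — the smallest set of people covering the majority of the codebase — from per-file ownership; the authors and files are hypothetical, and real data would come from `git log`:

```python
from collections import Counter

# Illustrative: which author last touched each file (real data: git log).
file_owners = {
    "billing.py": "alice", "invoices.py": "alice", "auth.py": "alice",
    "api.py": "bob", "ui.py": "carol",
}

ownership = Counter(file_owners.values())
total = len(file_owners)

# Crude bus factor: fewest people who together own more than half the files.
covered, bus_factor = 0, 0
for _, count in ownership.most_common():
    covered += count
    bus_factor += 1
    if covered / total > 0.5:
        break

print(f"bus factor: {bus_factor}")  # 1 -> one person dominates the codebase
```

A bus factor of 1 is exactly the invisible constraint described above: losing (or even just blocking on) a single person stalls most of the system.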
3. Automation & Tooling
“How automated, reliable, and consistent is the delivery pipeline?”
Many organisations claim to have CI/CD, but beneath the surface, the pipelines reveal a different story. The tests are flaky, the builds break without warning, the releases require weekend shifts, and the environments drift over time.
Automation isn’t just about tooling. It’s about reliability, repeatability, and confidence. In our work, we’ve seen that automation maturity — not just the presence of tools — is a key multiplier of delivery speed and stability.
Key productivity levers:
- Build automation → fast feedback and faster cycles
- Test automation → safer releases, earlier defect detection
- Deployment automation → lower-risk releases, fewer human errors
- Environment automation → fewer delays, easier debugging
- Mature CI/CD → consistent delivery, reduced context switching
- Quality gates → early defect and maintainability checks
- Dependency automation → less risk from outdated or vulnerable packages
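A quality gate of the kind listed above can start as something very simple: a script that fails the build when any metric breaches its threshold. This is a minimal sketch — the metric names and thresholds are hypothetical, and in a real pipeline the values would be parsed from coverage and static-analysis reports:

```python
# Thresholds a change must satisfy before it can be merged/deployed.
# Names and limits are illustrative, not a recommendation.
THRESHOLDS = {
    "test_coverage_pct": (">=", 80.0),
    "duplicated_lines_pct": ("<=", 5.0),
    "critical_vulnerabilities": ("<=", 0),
}

def gate(metrics: dict) -> list[str]:
    """Return a list of violations; an empty list means the gate passes."""
    violations = []
    for name, (op, limit) in THRESHOLDS.items():
        value = metrics[name]
        ok = value >= limit if op == ">=" else value <= limit
        if not ok:
            violations.append(f"{name}={value} (required {op} {limit})")
    return violations

# Example run with illustrative measurements:
report = {"test_coverage_pct": 74.2, "duplicated_lines_pct": 3.1,
          "critical_vulnerabilities": 1}
failures = gate(report)
print("PASS" if not failures else "FAIL: " + "; ".join(failures))
```

In CI, a non-empty violation list would simply exit non-zero, stopping the pipeline — catching defects and maintainability regressions at the earliest, cheapest point.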
4. Delivery Scalability & Architectural Fitness
“Can the system and teams scale safely and effectively as the business grows?”
Architecture is one of the most misunderstood drivers of delivery performance. We’ve worked with companies that utilise modern paradigms — including microservices, containers, serverless, and cloud-native platforms — yet still experience delivery bottlenecks, duplicated efforts, or chaotic integrations.
Scalable delivery isn’t just about the system design — it’s about how the architecture enables or restricts team autonomy, parallelisation, and safe evolution.
Key productivity levers:
- Domain-centric modularity → enables team autonomy and parallel work
- Clear data ownership → isolates changes and reduces regressions
- Architectural scalability → supports growth in business volume without chaos
- Tech stack fitness → current and future business needs supported by technology
- Cloud-native delivery platform → streamlined provisioning and safer deployments
5. Software Delivery Efficiency
“How efficiently does work move through the pipeline once development starts?”
Many teams have a high workload but still struggle to release regularly. PRs linger. Builds fail. Releases get postponed or rushed. In our view, delivery efficiency isn’t about rituals — it’s about flow.
Work should move through the system with minimal friction and clear feedback. When it doesn’t, quality drops, morale suffers, and iteration slows.
Key productivity levers:
- Short coding cycles → faster feedback, reduced inventory
- Efficient PR reviews → less idle WIP, fewer bottlenecks
- Reliable builds → higher trust and reduced waste
- Automated test presence → safety net for frequent change
- Fast test/build feedback → smoother iteration loops
- Frequent deployments → tighter business feedback and lower release anxiety
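Flow becomes visible once it is measured. A minimal sketch, assuming illustrative timestamps: lead time from first commit to deployment, summarised by the median (real data would come from the VCS and deployment logs):

```python
from datetime import datetime
from statistics import median

# Illustrative (commit_time, deploy_time) pairs for recent changes.
changes = [
    ("2024-03-01T09:00", "2024-03-01T17:30"),
    ("2024-03-02T10:15", "2024-03-04T11:00"),  # a change that sat in review
    ("2024-03-03T14:00", "2024-03-03T16:45"),
]

def lead_time_hours(commit: str, deploy: str) -> float:
    """Hours elapsed between commit and deployment."""
    fmt = "%Y-%m-%dT%H:%M"
    delta = datetime.strptime(deploy, fmt) - datetime.strptime(commit, fmt)
    return delta.total_seconds() / 3600

lead_times = [lead_time_hours(c, d) for c, d in changes]
print(f"median lead time: {median(lead_times):.1f}h")  # 8.5h
```

The median (rather than the mean) keeps a single stuck PR from masking the typical experience — and the outliers themselves point at where flow is breaking down.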
6. Outcome Alignment & Delivery Governance
“Is work being prioritised, sliced, and governed to achieve real business outcomes?”
Some of the costliest delivery delays happen inside the backlog, before a single line of code is written. Vague stories, shifting priorities, unclear trade-offs, and ungoverned scope expansion create silent drag.
Excellent delivery requires not just great execution, but good framing of the objective to be achieved, well-planned slicing and distribution of work, and overall good decision-making upstream.
Key productivity levers:
- Well-defined product strategy → sense of purpose, better focus, reduced waste
- Small, clear work items → faster progress and feedback, less rework
- Stable priorities → better focus, less context switching
- Transparent trade-offs → better collaboration across roles and decision making
- Product roadmap → aligns expectations and improves capacity planning
- Holistic backlog management → avoids feature-only delivery at the expense of sustainability
7. Team Health & Morale
“Are the teams healthy, motivated, and set up for sustainable delivery?”
Delivery performance is not just technical — it’s human. We’ve worked with technically capable teams whose output collapsed due to burnout, attrition, or dysfunctional leadership.
When engineers are overwhelmed, disengaged, or unsure of their value, no process will fix the problem. But when they’re supported and empowered, progress accelerates.
Key productivity levers:
- High satisfaction → stronger engagement and creativity, faster delivery
- Balanced cognitive load → clearer decisions, fewer mistakes
- Low attrition → preserves knowledge, stable continuity and speed
- Empowered teams → stronger ownership, faster decisions, less coordination
8. AI-Enabled Delivery Productivity
“Are you harnessing AI to improve delivery speed and confidence?”
The rise of generative AI is reshaping how teams code, plan, and troubleshoot. But the pressure to “adopt AI” often leads to misfires — premature tooling decisions, shallow integrations, or governance panic.
The key isn’t just adoption — it’s alignment, experimentation, and measured outcomes.
Key productivity levers:
- AI-assisted coding → speeds up delivery, research, and prototyping
- AI-refined backlogs → clearer scope, better refinement
- AI-driven incident support → faster diagnosis and recovery
- AI estimation → better forecasting, smoother planning
- AI documentation → reduced gaps, easier onboarding
- Responsible AI governance → enabling innovation with safeguards
Conclusion – Measuring What Matters, Holistically
Quality and productivity are not opposites. And they’re certainly not single numbers. They are emergent properties — outcomes of complex, interdependent forces across code, systems, people, and process.
The model we’ve shared here — Holistic Engineering Performance — is our way of making those forces visible. It’s not about simplifying judgment. It’s about providing a structured way to ask better questions and uncover what’s really shaping your delivery outcomes.
The eight pillars in this model don’t exist in isolation. Like in any healthy system, they reinforce and constrain each other. Progress in one area often requires awareness of another. That’s why we don’t expect teams to excel across every pillar at once — nor should they try.
Instead, the point is to provide a map. A way to surface hidden constraints, identify leverage points, and guide improvement where it will matter most. It also helps teams avoid over-focusing on one area while neglecting others that may be silently eroding delivery performance.
This model will continue to evolve. As delivery practices mature, tools advance, and our understanding deepens, so will our view of what shapes great engineering teams. But even now, it’s helped us and our clients have more grounded, evidence-based conversations about quality, productivity, and what “good” really looks like.