
Does AI Make Software Delivery Faster? What We Learned from Measuring It

Does AI make us faster? We posed that question to our data and analytics team, who answered it with real delivery data, not gut feel. The answer is yes, but not in the way you'd expect: AI’s most measurable benefit is not speed, but risk reduction and variance suppression.


Why We Started Measuring AI Productivity

As a software development firm, Enlighten Designs has focused on agentic code development and engineering workflows, where we expected AI to have the most immediate and material impact.

Every developer works with coding agents and receives regular training on how to use them effectively; many run multiple agents in parallel as part of their daily workflow. Agentic development is now part of our baseline delivery model, and we are continuing to push further.

Because we are investing heavily in AI-driven development and actively expanding what we do with it, we designed and ran a structured experiment, measuring delivery data across November 2025, to understand how AI is actually changing software delivery here at Enlighten — not based on impressions, but on data.

Our Measurement Approach

One advantage we have at Enlighten is a custom time-sheeting and delivery telemetry platform. We added a simple indicator that developers could use to mark when they were using AI on a task, capturing not just whether AI was used, but what proportion of time on that task was AI-assisted.

That one small change made it possible to compare AI-assisted and non-AI delivery work across real projects and teams. The result was a statistically meaningful foundation based on operational delivery data rather than surveys alone.
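For readers who want to run a similar comparison, the records such an indicator produces might look like the sketch below. This is a minimal illustration in Python; the field names are invented for this example, not the actual schema of our platform.

```python
# A minimal sketch of a delivery telemetry record with an AI-usage indicator.
# Field names are illustrative, not the actual schema of our platform.
from dataclasses import dataclass

@dataclass
class TaskRecord:
    task_id: str
    estimated_hours: float
    actual_hours: float
    ai_assisted: bool        # did the developer use AI on this task at all?
    ai_time_fraction: float  # proportion of task time that was AI-assisted, 0.0 to 1.0

# Example: a task estimated at 10 hours, delivered in 11, roughly half AI-assisted.
example = TaskRecord("TASK-1042", estimated_hours=10.0, actual_hours=11.0,
                     ai_assisted=True, ai_time_fraction=0.5)
```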

How We Use AI at Enlighten Designs

During the measurement period, AI usage in our dev team was not uniform, and we were deliberate about allowing that.
 
Some developers used AI primarily for troubleshooting, debugging, and clarification. Others used it to scaffold entire systems using disciplined Test-Driven Development (TDD) practices. A smaller group actively pushed the boundaries of what current models and tools could support, experimenting with new ways to automate and coordinate development work.
 
We treated all of these as valid points on the same maturity curve rather than competing approaches.
 
At any given time, several AI experiments were running in parallel across different teams and delivery contexts:

  • Results were shared openly
  • Patterns were discussed
  • Workflows evolved through practice rather than prescription

This approach has allowed knowledge to move quickly across the organisation and has been valuable in identifying which patterns deliver the most consistent results. As our AI adoption matures, we expect our practices to converge around the workflows and techniques that have proven most effective — moving from broad experimentation toward a more standardised and reliable delivery capability.
 
From a measurement perspective, this diversity was useful. It allowed us to observe AI impact across conservative, moderate, and advanced usage patterns rather than only at the extremes. From a delivery perspective, it allowed teams to adopt AI in ways that matched both their risk profile and the nature of their work.

We Expected Speed. We Found Stability.

We initially focused on time savings, using both self-reported estimates and delivery data. It became clear that time alone was not a reliable way to understand AI’s impact. 

Developers often felt faster when using AI, but those perceptions did not consistently translate into shorter task durations. The most consistent change appeared in delivery stability. 

Across fixed-price delivery work, AI-assisted tasks were less likely to exceed their estimates by large margins. Extreme overruns were rarer, and delivery outcomes were more tightly clustered. Typical tasks were not dramatically faster, but the overall distribution of results was noticeably tighter. 
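To illustrate the shape of that analysis (a sketch only, not our production pipeline): given task records like the earlier example, the overrun ratio and its spread can be summarised per group. The 2x threshold for "extreme" overruns is an illustrative choice.

```python
# Sketch: summarise overrun distributions for AI-assisted vs non-AI tasks.
# Assumes a DataFrame with the illustrative columns from the earlier record sketch.
import pandas as pd

def overrun_summary(df: pd.DataFrame) -> pd.DataFrame:
    # Overrun ratio: 1.0 means delivered exactly on estimate, 2.0 means double it.
    df = df.assign(overrun=df["actual_hours"] / df["estimated_hours"])
    return df.groupby("ai_assisted")["overrun"].agg(
        median="median",
        iqr=lambda s: s.quantile(0.75) - s.quantile(0.25),  # spread of typical tasks
        p95=lambda s: s.quantile(0.95),                     # the tail that hurts plans
        extreme_rate=lambda s: (s > 2.0).mean(),            # share of tasks over 2x estimate
    )
```

A tighter distribution shows up here as a smaller iqr and p95 for the AI-assisted group, even when the medians barely move.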


Overrun Distributions at Enlighten Designs: AI vs Non-AI

For software delivery leaders, this is where AI starts to change the commercial equation. Most delivery risk doesn't come from tasks being slightly slow; it comes from the small number that go badly wrong and distort plans, budgets, and client confidence.

AI reduced how often that happened. As delivery becomes more predictable, risk profiles change. Fixed-price work behaves differently. Forecasting stabilises. Commercial conversations become simpler — and easier to win.

This pattern aligns with how agentic workflows operate in practice. Agents are most valuable when they support reasoning, explore solution paths, surface edge cases earlier, improve test coverage, and reduce cognitive load during complex work. These effects do not always shorten a task on paper, but they reduce rework, late discovery, and escalation. 

In delivery terms, AI reduces delivery variance before it reduces time. The impact appears first as a tighter distribution of outcomes — not a faster average.


Overrun Variance by AI Usage at Enlighten Designs

Self-Reported Time Savings Are Not a Reliable Productivity Metric

We compared self-reported time savings with measured delivery outcomes. They did not correlate in a stable or defensible way. 

People often felt more efficient, but estimating how long a task would have taken without AI proved unreliable. We now treat self-reported savings as sentiment rather than measurement and focus instead on what can be observed directly in delivery results. 
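As a concrete illustration of that check (with hypothetical column names, and rank correlation as one reasonable method rather than necessarily the one we used): if self-reports tracked reality, reported savings would correlate with measured overruns.

```python
# Sketch: test whether self-reported time savings track measured outcomes.
# Column names are hypothetical; a near-zero result is the "sentiment, not
# measurement" signal described above.
import pandas as pd

def sentiment_vs_measurement(df: pd.DataFrame) -> float:
    # Spearman rank correlation: robust to outliers, and neither quantity
    # is normally distributed.
    overrun = df["actual_hours"] / df["estimated_hours"]
    return df["self_reported_saving_pct"].corr(overrun, method="spearman")
```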

This finding is consistent with broader research on AI and developer productivity. A July 2025 METR study found developers believed AI had made them 20% faster, while measured data showed they were actually 19% slower.

As internal workflows have continued to evolve, we are beginning to see second-order effects in speed and throughput as teams adapt more deliberately around agentic development patterns. 

This is not a one-time study. We are committed to ongoing statistical analysis of our AI productivity so we can continue to learn how and where to invest. New findings will follow.

What is already clear is that AI productivity follows a curve rather than a single step. Organisations that invest in workflows, training, and measurement tend to move along that curve more quickly. 

Next Steps

We are continuing to monitor our AI productivity data as our tools, workflows, and team maturity evolve. This initial study covered November 2025, but measurement is ongoing.

Future blog posts will share updated findings as new patterns emerge from our continued analysis, including deeper insights into where speed gains begin to materialise and how different project types respond to AI-assisted delivery.

Building Our AI Capability

As we started to understand this more clearly, we invested further in training and upskilling our teams to use AI well, including:

  • AI-assisted development workflows that reduce manual effort
  • Structured experimentation to find what works
  • Active sharing of techniques and patterns across teams

The aim has been to turn AI from an individual productivity tool into an organisational capability. This approach has kept us at the front of practical AI development, not because we talk about it, but because it is how our delivery now operates.

For delivery leaders and technology executives, AI changes how confidently teams can plan and commit. It improves predictability, reduces surprise, and makes delivery conversations more grounded. These effects accumulate over time and shape the operating environment long before they show up as headline speed metrics. 

These observations come from building and delivering with AI every day inside our own organisation. They are grounded in operational data, evolving workflows, and continued measurement. We expect the story to continue to change as AI capability and usage mature, but the location of the first measurable impact has been consistent. 

AI Productivity: Key Findings

Finding 1: AI reduces delivery variance before it reduces delivery time.
Across fixed-price work, AI-assisted tasks overran estimates less often and by smaller margins. The average task duration changed less than expected. The distribution of outcomes tightened.

Finding 2: Agentic development amplifies stability gains.
Coding agents that support reasoning, surface edge cases, and improve test coverage reduce rework and late discovery — effects that lower variance rather than raw duration.

Finding 3: Self-reported time savings are unreliable.
Developers felt faster with AI, but those perceptions did not consistently match delivery data. Sentiment and measurement are different things.

Finding 4: Delivery environments become easier to plan, lead, and trust.
The headline benefit for delivery leaders is not speed. It is confidence — in estimates, in teams, and in commercial commitments.

Finding 5: AI maturity is a curve, not a switch.
Workflow design, training, and experimentation culture determine how quickly an organisation moves along the AI productivity curve.

We continue to analyse this data as our internal AI maturity progresses, and we can help other organisations run similar evaluations and build this capability in their own delivery environments.

AI Productivity: Frequently Asked Questions

Does AI make software development faster? Yes, but not in the way you'd expect. Average task duration remained largely unchanged; what fell was overruns. AI-assisted tasks exceeded estimates less often and by smaller margins, and that turns out to be a more meaningful gain than raw speed.

How should you measure AI productivity in software delivery? Start with operational delivery data, not surveys or gut feel. We developed our own approach to track AI usage against real tasks and projects — and we can help you build a measurement approach that works for your organisation.

What is delivery variance in software development? Delivery variance is the spread of outcomes around an estimate: how often and by how much tasks exceed their planned duration. For example, a task estimated at 10 hours that takes 14 has a 40% overrun. High variance means unpredictable delivery; low variance means outcomes cluster tightly around estimates and plans are easier to trust.

Why don't self-reported AI productivity gains hold up under measurement? Estimating how long a task would have taken without AI is inherently unreliable. Developers tend to feel more efficient with AI assistance, but that perception does not consistently translate into shorter measured durations. Sentiment and delivery data need to be tracked separately.

How long does it take to see measurable AI productivity gains? In our experience, the first measurable gains (reduced delivery variance) appeared within weeks of structured AI adoption. Speed and throughput gains followed a longer curve and were shaped by workflow design, training investment, and experimentation culture.

Ready to Measure AI Productivity in Your Organisation?

We can help you run an AI productivity evaluation and deliver a report with findings and recommendations tailored to your needs.
