DevOps · CI/CD · Testing · QA · Pipeline Optimization
January 20, 2026
9 min read
# The Fastest Way to Fix a Slow CI Pipeline Isn't More Hardware — It's Fixing QA
If your CI pipeline takes close to an hour, people will tell you the same three things every time. Add more runners. Throw better hardware at it. Cache everything that doesn't move.
Sometimes that works. A lot of the time, it just makes an already messy system slightly faster at being frustrating.
The real problem usually lives somewhere less exciting: QA.
That was the case for one team whose pipeline averaged 58 minutes end to end. On paper, nothing looked especially broken. Builds were fine. Deploy steps were fine. The bottleneck sat squarely in automated tests, quietly eating 42 minutes of every run.
They deployed roughly ten times a day. Do the math: at 58 minutes a run, that's close to ten hours of pipeline time every working day. The effects were predictable. Context switches everywhere. PRs stacking up. People pushing changes and walking away because "CI will take a while anyway."
A month later, the same pipeline averages 14 minutes. Developers wait for it again. They trust the results. And the biggest gains didn't come from faster machines or exotic tooling. They came from being honest about what tests are for — and when you actually need them.
## Long pipelines don't just slow code, they change behavior
There's a psychological tax to slow CI that rarely shows up in dashboards.
When feedback takes an hour, developers stop caring about it in real time. They open a PR, kick off the pipeline, and move on to something else. When it fails, the failure feels disconnected from the change that caused it. That's how broken tests get retried without investigation, or worse, ignored.
Over time, teams adapt in unhealthy ways. People batch changes to "make the wait worth it." They merge late in the day and hope nothing explodes overnight. They treat CI as a formality instead of a safety net.
The irony is that these behaviors make pipelines slower and flakier, which reinforces the cycle. QA becomes something that happens to the code, not something that helps it.
## The uncomfortable truth: not all tests deserve equal treatment
The first big change this team made was also the most controversial internally. They stopped pretending that every test needed to run on every pull request.
Instead, they split their test suite into two clear tiers.
The critical suite runs on every PR. It covers authentication, payments, and the core user flows that would be catastrophic to break. It takes about seven minutes when run efficiently. If this suite fails, nothing ships.
The full suite still exists. It's just not in the hot path anymore. It runs nightly and on release branches, where catching obscure edge cases is worth the extra time.
This wasn't about lowering quality. It was about aligning feedback speed with risk. Most changes don't touch the deepest corners of the system. Forcing every developer to wait for exhaustive coverage on every commit doesn't make software safer — it just makes teams slower and more cynical.
Once that mental shift clicked, everything else became easier to justify.
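The post doesn't say which test framework the team used, so treat this as a rough sketch of how a two-tier split can look with pytest markers; the `critical` marker name, the file, and the `charge_total` helper are all invented for the example.

```python
# test_payments.py: a sketch of a two-tier split using pytest markers.
# The PR job runs only the critical tier:        pytest -m critical
# The nightly / release-branch job runs it all:  pytest
# (Register the marker in pytest.ini so pytest doesn't warn about it:
#   [pytest]
#   markers = critical: auth, payments, and core user flows; blocks every PR)
import pytest


def charge_total(items):
    # Stand-in for real payment logic so the example runs on its own.
    return sum(item["price_cents"] * item["qty"] for item in items)


@pytest.mark.critical
def test_charge_total_sums_line_items():
    # Payments sit in the catastrophic-to-break bucket, so this gates every PR.
    assert charge_total([{"price_cents": 1999, "qty": 2}]) == 3998


def test_charge_total_handles_empty_cart():
    # Lower-risk edge case: left to the nightly full suite.
    assert charge_total([]) == 0
```

Keeping the tiers in markers rather than in two separate test trees means a test can move between tiers with a one-line change, and the nightly job still runs everything.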
## Parallelization is obvious — until it isn't
Everyone knows parallel tests are faster. Actually making them work is another story.
The critical suite was originally running serially, which meant every test inherited the sins of every test before it. One slow setup step could ripple through the entire run.
By splitting the suite across six runners, the team cut execution time almost in half immediately. That was the biggest single win on paper.
But parallelism comes with sharp edges. Tests that share data, rely on global state, or make assumptions about execution order tend to fall apart when run side by side. Fixing that required touching test design itself, not just CI configuration.
Each test had to be responsible for its own setup and cleanup. Shared resources needed isolation. A few tests that were "fine" in serial turned out to be quietly brittle.
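The article doesn't name the test runner, but assuming a pytest-style setup, splitting across six workers is typically a `pytest -n 6` run with the pytest-xdist plugin, and "own your setup and cleanup" tends to look like a yield fixture: everything a test needs is created fresh and torn down even if the test fails. The `workspace` fixture and the report file below are invented for the sketch.

```python
# Parallel-safe isolation: every test gets its own private directory and cleans
# it up, instead of sharing one path whose contents depend on which test ran
# first. Run the suite across six workers with:  pytest -n 6  (pytest-xdist)
import shutil
import tempfile
from pathlib import Path

import pytest


@pytest.fixture
def workspace():
    # Setup: a directory that belongs to this test alone, so six workers
    # writing at the same time never collide.
    path = Path(tempfile.mkdtemp(prefix="test-ws-"))
    yield path
    # Teardown runs even when the test fails, so no state leaks into later runs.
    shutil.rmtree(path, ignore_errors=True)


def test_report_is_written_to_its_own_workspace(workspace):
    # Before: tests wrote to a shared /tmp/report.csv and assumed an ordering.
    report = workspace / "report.csv"
    report.write_text("total,39.98\n")
    assert report.read_text().startswith("total,")
```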
This was work. Real work. And it paid off beyond speed. Tests that can run in parallel are usually better tests, full stop.
## Flaky tests are worse than slow ones
If slow pipelines teach developers to ignore CI, flaky pipelines teach them not to believe it.
Before the overhaul, about 18 percent of failures were false positives. Random timing issues. UI tests breaking because a button rendered a few milliseconds late. Failures that vanished on rerun.
That number is brutal. It means developers are conditioned to assume CI is lying to them almost one out of every five times.
The team attacked this directly by replacing their worst offenders. Some legacy browser tests were rewritten using more stable tools. Others were rethought entirely, shifting away from brittle UI checks toward approaches that validated behavior without pixel-level precision.
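The post doesn't say which tools replaced which, but one concrete pattern behind "validate behavior without pixel-level precision" is swapping a fixed sleep plus a UI assertion for a bounded poll against an observable result. The `wait_until` helper and the `fake_order_status` stand-in below are illustrative only.

```python
# Instead of sleep(2) and a pixel-level check, poll for the behavior you care
# about with a hard upper bound: the test waits only as long as it has to and
# fails with a clear message if the condition never becomes true.
import time


def wait_until(condition, timeout=10.0, interval=0.25):
    """Poll `condition` until it returns truthy or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    raise AssertionError(f"condition not met within {timeout}s")


def fake_order_status():
    # Stand-in for a real status endpoint; the old UI test approximated this
    # with a hard-coded sleep and a screenshot-level assertion.
    return "paid"


def test_order_eventually_marked_paid():
    assert wait_until(lambda: fake_order_status() == "paid")
```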
The result wasn't perfection, but it was dramatic. False failures dropped to roughly three percent. That change alone probably saved more developer time than the raw speed improvements.
## Auto-retry isn't cheating — if you treat it honestly
One of the more pragmatic choices was adding automatic retries for single-test failures.
If a test fails once and passes on retry, the pipeline doesn't block the PR. But it does flag the test. Nothing gets swept under the rug.
This approach acknowledges reality. Some failures really are random. Infrastructure hiccups happen. Timing issues sneak through even well-designed tests.
The key is accountability. Retried tests are tracked and reviewed. Patterns emerge quickly. Tests that need frequent retries become candidates for fixes or removal. The retry mechanism becomes a diagnostic tool instead of a crutch.
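The post doesn't describe how the retries were wired up. One hedged way to get "retry once, but flag it" on top of pytest is a small wrapper that reruns only the failed tests and appends their ids to a log the team reviews; the file names here are invented, and the cache path is where recent pytest versions keep the last-failed list.

```python
# retry_failed.py: a sketch of "retry once, but flag it". Run the suite, and
# if anything failed, rerun only those tests. A green rerun doesn't block the
# PR, but every retried test id is appended to a report for later review.
import json
import subprocess
import sys
from pathlib import Path

LASTFAILED = Path(".pytest_cache/v/cache/lastfailed")  # adjust if yours differs
RETRY_LOG = Path("retried-tests.log")


def main() -> int:
    first = subprocess.run(["pytest"])
    if first.returncode == 0:
        return 0

    # Record which tests failed on the first attempt so they can be flagged.
    candidates = list(json.loads(LASTFAILED.read_text())) if LASTFAILED.exists() else []

    second = subprocess.run(["pytest", "--last-failed"])
    if second.returncode == 0:
        # Passed on retry: don't block the PR, but log what needed a rerun.
        with RETRY_LOG.open("a") as log:
            log.writelines(f"{test_id}\n" for test_id in candidates)
        return 0

    return second.returncode  # failed twice: treat it as a real failure


if __name__ == "__main__":
    sys.exit(main())
```

A plugin such as pytest-rerunfailures can take over the rerun half; the log is the piece that keeps retries diagnostic rather than cosmetic.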
Without that follow-up, retries can absolutely hide real regressions. With it, they reduce noise while still pushing the system toward stability.
## Speed changed trust — and trust changed everything else
Once the pipeline reliably finished in around 14 minutes, something subtle but important happened.
Developers started waiting for CI again.
They'd open a PR and actually watch the checks complete. When something failed, it felt immediate and actionable. Fixing a broken test no longer meant losing an hour of momentum.
That trust had knock-on effects. Smaller PRs. More frequent deploys. Less temptation to push risky changes late in the day. The pipeline stopped being an obstacle and started feeling like part of the workflow.
It also made conversations with management easier. A fast, reliable pipeline is easier to defend than an expensive one that still frustrates everyone.
## The real cost wasn't compute — it was focus
This entire effort took about a month. Not because the ideas were complex, but because they required sustained attention across tooling, tests, and team habits.
That's the part many organizations underestimate. Buying faster runners is easy. Carving out time to fix test design, reduce flakiness, and rethink QA strategy is harder. It competes with feature work. It doesn't ship anything visible to customers.
And yet, the return on investment is massive. Faster feedback loops don't just save minutes. They save mental energy. They reduce friction. They make engineering feel less like waiting and more like building.
## Fix QA first, then worry about everything else
If your CI pipeline is slow, it's tempting to start with infrastructure. Scale up. Optimize caches. Shave seconds off builds.
But if QA dominates your runtime, that's where the real leverage is. Ask which tests actually need to run. Ask which ones are flaky and why. Ask whether your test suite reflects how your team ships software today, or how it shipped three years ago.
Most teams don't have a tooling problem. They have a trust problem.
Fixing QA won't just make your pipeline faster. It'll make it worth paying attention to again. And that's when CI starts doing the job it was always supposed to do.