How to Compare Accessibility Scan Results Over Time

Learn how to compare accessibility scan results over time to track progress, catch regressions, and keep WCAG issues moving in the right direction.

To compare accessibility scan results over time, run scans on a consistent schedule against the same set of pages, then review issue counts, issue types, and affected pages side by side across each period. The goal is to spot regressions early, verify that fixes hold, and watch the overall trendline move downward. Scans detect approximately 25% of issues, so this approach tracks automated signals only and does not confirm WCAG conformance. Used correctly, scan comparisons give teams an early warning system between audit cycles and a clear record of how a site has changed.

Scan Comparison at a Glance
Scope. Same URLs, same depth, same settings across every scan.
Cadence. Weekly or monthly, depending on release frequency.
Core metrics. Total issues, issues by severity, issues by WCAG criterion, affected pages.
Trendline. Period-over-period deltas, regressions vs. new fixes.
Limitation. Scans flag about 25% of issues and cannot determine WCAG conformance.

Why comparing scans over time matters

A single scan is a snapshot. It tells you what the scanner flagged on one day against one set of pages. That is useful, but it does not tell you whether the site is improving, holding steady, or sliding backward.

Comparison is where scans start to earn their keep. When you look at the same pages across weeks or months, patterns appear. A drop in color contrast issues after a design system update. A spike in missing alt text after a content push. A template change that reintroduces a problem someone fixed six months ago.

Scans will not confirm WCAG conformance. Only a manual accessibility audit can do that. But scan trends are a strong early warning signal between audit cycles.

What should stay consistent across scans

For a comparison to mean anything, the inputs have to match. If the scope shifts between periods, the numbers are not actually comparable.

Keep these constant:

URL list. Scan the same pages each time. Adding or removing pages changes the totals for reasons unrelated to accessibility work.

Scan settings. Same rule set, same depth, same authentication state if you are scanning logged-in views.

Environment. Production vs. staging matters. Pick one and stay there.

Cadence. Weekly works for active development. Monthly works for stable sites. Pick a rhythm and hold it.

If you need to expand the URL list, note the change and track the new pages as a separate baseline until you have enough history on them.
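
One way to hold the scope steady is to keep it in version control so every scheduled scan reads the same list and settings. The sketch below is not any particular scanner's API: the ScanConfig shape and its field names are hypothetical, and you would map them onto whatever tool you actually run.

```typescript
// scan-config.ts -- a version-controlled scan scope (hypothetical shape).
// Every scheduled run imports this so the inputs never drift between periods.

export interface ScanConfig {
  environment: "production" | "staging"; // pick one and stay there
  urls: string[];                        // same pages every period
  newUrls: string[];                     // added later; tracked as a separate baseline
  ruleSet: string;                       // e.g. the WCAG 2.1 AA rule tag your scanner uses
  maxDepth: number;                      // same crawl depth each run
  authenticated: boolean;                // same login state each run
}

export const scanConfig: ScanConfig = {
  environment: "production",
  urls: [
    "https://www.example.com/",
    "https://www.example.com/pricing",
    "https://www.example.com/contact",
  ],
  newUrls: ["https://www.example.com/blog"],
  ruleSet: "wcag21aa",
  maxDepth: 1,
  authenticated: false,
};
```

Because the file lives in version control, any scope change shows up as a diff, which makes it easy to annotate the trendline when the URL list grows.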

Which metrics tell the real story?

Total issue count is the headline number, but it hides the detail that matters most. A site can drop 200 issues in a week and still have gotten worse if all the fixes were trivial and the new problems are critical.

Track these together:

Total issues. The top line. Useful for direction, not diagnosis.

Issues by severity. Critical and serious issues carry more weight than minor ones.

Issues by WCAG criterion. Which success criteria keep appearing? This points to systemic patterns.

Affected pages. An issue on one template that repeats across 400 pages is one problem, not 400.

New vs. recurring. New issues signal regressions. Recurring ones signal unresolved work.
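
If your scanner exports results as JSON, these roll-ups are easy to compute yourself. Here is a minimal sketch that assumes issues have already been normalized into a flat list; the Issue shape is hypothetical, so map your tool's output onto it first.

```typescript
// metrics.ts -- roll-ups for a single scan (hypothetical Issue shape).

export interface Issue {
  page: string;                                            // URL where the issue was flagged
  severity: "critical" | "serious" | "moderate" | "minor"; // scanner's impact rating
  criterion: string;                                       // e.g. "1.4.3 Contrast (Minimum)"
  ruleId: string;                                          // scanner rule that fired
}

// Count items by an arbitrary key, e.g. severity or WCAG criterion.
function countBy<T>(items: T[], key: (item: T) => string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const item of items) {
    const k = key(item);
    counts[k] = (counts[k] ?? 0) + 1;
  }
  return counts;
}

export function summarize(issues: Issue[]) {
  return {
    total: issues.length,                                   // direction, not diagnosis
    bySeverity: countBy(issues, (i) => i.severity),
    byCriterion: countBy(issues, (i) => i.criterion),
    affectedPages: new Set(issues.map((i) => i.page)).size, // templates repeat; pages collapse
  };
}
```

Storing one summary like this per scan gives you the raw material for the trend and regression checks that follow.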

The Accessibility Tracker Platform organizes scan history around these metrics so comparisons happen inside the platform rather than inside a spreadsheet.

How to read a trendline

Trendlines are easier to misread than they appear. A downward line feels like progress, but context decides whether it actually is.

Ask three questions when reviewing a trend:

First, did the scope stay the same? If pages were removed, the drop may be mechanical rather than real.

Second, what kind of issues moved? Falling contrast issues after a palette update is genuine progress. A drop that coincides with a template being temporarily hidden is not.

Third, are regressions creeping in? The healthiest sites do not have flat or zero new-issue counts. They have low ones that get addressed quickly. A gradual rise in new issues week over week is the pattern to watch for.
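
To make those questions concrete, a period-over-period delta can be computed from stored scan summaries. A sketch, assuming one summary per scan in a hypothetical shape (the roll-up above plus a date):

```typescript
// trend.ts -- period-over-period deltas between stored scan summaries.

interface ScanSummary {
  date: string;                            // ISO date of the scan
  total: number;
  bySeverity: Record<string, number>;
}

interface Delta {
  from: string;
  to: string;
  totalChange: number;                     // negative means the total fell
  severityChange: Record<string, number>;  // e.g. { critical: +2, minor: -40 }
}

export function deltas(history: ScanSummary[]): Delta[] {
  const out: Delta[] = [];
  for (let i = 1; i < history.length; i++) {
    const prev = history[i - 1];
    const curr = history[i];
    const severities = new Set([
      ...Object.keys(prev.bySeverity),
      ...Object.keys(curr.bySeverity),
    ]);
    const severityChange: Record<string, number> = {};
    severities.forEach((s) => {
      severityChange[s] = (curr.bySeverity[s] ?? 0) - (prev.bySeverity[s] ?? 0);
    });
    out.push({
      from: prev.date,
      to: curr.date,
      totalChange: curr.total - prev.total,
      severityChange,
    });
  }
  return out;
}
```

A falling total paired with a rising critical count is exactly the "downward line that is not progress" case described above.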

Spotting regressions early

Regressions are the single most useful thing scan comparisons catch. A developer ships a component update. A marketer swaps in a new hero image. A theme update rolls through. None of those changes were meant to affect accessibility, but any of them can.

When you compare scan results week over week, regressions show up as new issues on pages that were previously clean. That is the signal. Without comparison, those issues blend into the overall count and get lost.
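
A simple diff between two periods can surface that signal automatically. This sketch keys issues by page plus rule so recurring issues are not mistaken for new ones, then flags new issues that land on pages that had no issues last period; the Issue shape is the same hypothetical one used in the earlier examples.

```typescript
// regressions.ts -- diff two scan periods to separate new, recurring,
// and regression issues (hypothetical Issue shape, trimmed to what is needed).

interface Issue {
  page: string;
  ruleId: string;
}

// Key by page plus rule so a recurring issue is not counted as new.
const keyOf = (i: Issue) => `${i.page}::${i.ruleId}`;

export function compareScans(previous: Issue[], current: Issue[]) {
  const previousKeys = new Set(previous.map(keyOf));
  const pagesWithPriorIssues = new Set(previous.map((i) => i.page));

  const newIssues = current.filter((i) => !previousKeys.has(keyOf(i)));
  const recurring = current.filter((i) => previousKeys.has(keyOf(i)));

  // Regressions: new issues on pages that were clean last period.
  const regressions = newIssues.filter((i) => !pagesWithPriorIssues.has(i.page));

  return { newIssues, recurring, regressions };
}
```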

Teams that catch regressions within a week of shipping fix them faster and cheaper than teams that find the same issues six months later during an audit.

Where scans stop and audits start

Scan comparisons are a monitoring tool. They are not a conformance tool. Automated checks catch a fraction of WCAG issues, roughly a quarter, and they miss most of what users with disabilities actually encounter: keyboard traps that require interaction to surface, screen reader output that needs a human ear, focus order problems, form logic that fails in unexpected ways.

A healthy program uses both. Scans run continuously and flag what they can between audits. A manual accessibility audit identifies the full picture at a defined point in time. Comparing scan results over time fills the space in between.

Frequently asked questions

How often should we scan to get useful comparisons?

Weekly for sites with active development. Monthly for stable sites. Daily is usually overkill and produces noise without a proportional gain in signal. Whichever cadence you choose matters less than keeping it consistent.

Can scan comparisons replace an accessibility audit?

No. Scans flag approximately 25% of issues and cannot determine WCAG conformance. They are a monitoring layer, not an evaluation. An audit is the only way to know where a site actually stands against WCAG 2.1 AA or WCAG 2.2 AA.

What if our scan totals go up even though we are fixing issues?

This usually means new content or new code is introducing issues faster than old ones are being closed. That is a process signal. Either fixes need to speed up, or the team shipping new work needs accessibility guidance earlier in the cycle.

Should we compare scan results across different tools?

Avoid it. Different scanners apply different rule sets and weight issues differently. Comparisons only work cleanly when the same tool is running the same rules against the same pages across every period.

Scan comparisons work best when they are part of a larger program that includes audits, remediation, and validation. Looking at the same pages over time, with the same settings, turns a scanner from a one-time checker into an ongoing record of how accessibility is trending across a site.

Contact the Accessibility Tracker team to see how scan history and trend reporting work inside the platform: Contact Accessibility Tracker.

Kris Rivenburgh

Founder of Accessible.org
