
It Looked Fine on My Machine — Until We Built a Visual Comparison Tool
The Journey of Catching Invisible Bugs
Every SDET or developer has said it at least once.
“It looks fine on my machine.”
For years, that phrase was our safety net — a polite way of saying “I can’t reproduce your issue.”
We’ve all been there: a release goes out, the functionality is solid, the tests pass — yet somehow, the UI feels… off. A button shifts a couple of pixels, a gradient changes tone slightly, or a layout breaks on one browser in production for our customers.
As our funnel product evolved rapidly, “fine” started meaning very different things to different people. On my setup, a button aligned perfectly; on someone else’s, it floated half an inch off-center.
We found ourselves testing dozens of funnel and website variations every day. The problem wasn’t whether they worked — it was whether they looked right after every deploy.
That’s when we realized — our functional tests were solid, but our visual quality was invisible to automation.
So we built something new: a visual comparison tool that could literally see what our tests couldn’t.
🧩 The Hidden Problem: Websites Are More Fragile Than They Look
Funnels and websites are living systems. Every tweak — a color update, a headline change, a spacing adjustment — has ripple effects.
Traditional SDET automation checks functionality:
Does the order form submit? ✅
Does the next step load? ✅
Is the pricing logic correct? ✅
After one release, a few customers reached out saying, “The page looks weird on my end,” or “The button seems off-center.”
Our tests had all passed. Functionality was perfect.
But when we compared the versions, we saw the problem — a “Buy Now” button had shifted just a few pixels after a style merge.
Functionally fine. Visually broken. That’s when we realized automation wasn’t seeing what our users were seeing.
That was our wake-up call.
How It Works: Seeing Differences Like a Human Would
At its core, our visual comparison system works in three stages:
1. Render and Capture
Using Playwright, we render each funnel step in a controlled viewport.
We capture screenshots under stable conditions — consistent fonts, no animations, masked timestamps.
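Here is roughly what that capture stage looks like, as a minimal TypeScript sketch using Playwright. The URL, the viewport size, and the .countdown-timer / .user-avatar selectors are illustrative placeholders, not our real funnel markup.

```ts
import { chromium } from 'playwright';

// Minimal capture sketch: render one funnel step in a fixed viewport,
// freeze animations, mask dynamic regions, then take a screenshot.
async function captureStep(url: string, outPath: string): Promise<void> {
  const browser = await chromium.launch();
  const page = await browser.newPage({ viewport: { width: 1280, height: 720 } });

  await page.goto(url, { waitUntil: 'networkidle' });

  // Disable CSS animations and transitions so every capture is deterministic.
  await page.addStyleTag({
    content: '*, *::before, *::after { animation: none !important; transition: none !important; }',
  });

  // Mask regions that change on every render (selectors are illustrative).
  await page.screenshot({
    path: outPath,
    fullPage: true,
    mask: [page.locator('.countdown-timer'), page.locator('.user-avatar')],
  });

  await browser.close();
}
```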
2. Compare
We feed those images into Pixelmatch, a lightweight library for pixel-level comparison.
It returns a “diff” — highlighting any changed pixels in red.
We then apply our own logic on top (sketched after the list below):
Ignore dynamic zones (like timers or user avatars).
Allow small thresholds (minor antialiasing differences).
Only flag meaningful visual shifts.
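A sketch of that comparison step, assuming both screenshots are same-size PNGs. Pixelmatch’s threshold option absorbs minor antialiasing noise, and the returned pixel count becomes a ratio we can gate on; the file paths are illustrative.

```ts
import fs from 'fs';
import { PNG } from 'pngjs';
import pixelmatch from 'pixelmatch';

// Compare a baseline and a fresh capture; return the fraction of pixels that changed.
function diffRatio(baselinePath: string, currentPath: string, diffPath: string): number {
  const baseline = PNG.sync.read(fs.readFileSync(baselinePath));
  const current = PNG.sync.read(fs.readFileSync(currentPath));
  const { width, height } = baseline;
  const diff = new PNG({ width, height });

  // threshold (0..1) tolerates small per-pixel color differences such as antialiasing;
  // changed pixels are painted into `diff`, which we save for the report stage.
  const changedPixels = pixelmatch(baseline.data, current.data, diff.data, width, height, {
    threshold: 0.1,
  });

  fs.writeFileSync(diffPath, PNG.sync.write(diff));
  return changedPixels / (width * height);
}
```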
3. Report
Results are piped into our CI dashboard.
If a diff exceeds our visual tolerance (say 3%), the pipeline marks it for human review.
Engineers see a side-by-side comparison — before, after, and diff.
It’s visual QA, automated.
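And the gate itself is small. A minimal sketch: the 3% tolerance matches the figure above, the step name is illustrative, and the real version pushes the before/after/diff images into our dashboard rather than just logging.

```ts
// Gate a single funnel step on its visual diff ratio (from the compare sketch above).
const VISUAL_TOLERANCE = 0.03; // 3% of pixels

function gateVisualDiff(stepName: string, ratio: number): void {
  if (ratio > VISUAL_TOLERANCE) {
    console.error(
      `[visual-diff] ${stepName}: ${(ratio * 100).toFixed(2)}% of pixels changed ` +
      `(tolerance ${VISUAL_TOLERANCE * 100}%). Marking for human review.`
    );
    process.exitCode = 1; // non-zero exit flags the CI step for review
  } else {
    console.log(`[visual-diff] ${stepName}: within tolerance.`);
  }
}
```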
Challenges We Faced
Building it wasn’t smooth sailing. Here’s what nearly broke us:
Dynamic Data: Funnel pages with changing timestamps or customer names triggered false positives.
Fix: We masked those content regions dynamically before capturing screenshots.
Responsive Layouts: Mobile and desktop views looked different by design.
Fix: We captured and compared each viewport separately.
Baseline Drift: Over time, small approved changes made older baselines obsolete.
Fix: We introduced versioning and tagging to manage baselines gracefully (both the viewport and baseline handling are sketched after this list).
Cross-System Visual Drift: Screenshots from macOS (local) didn’t match Jenkins (Linux) due to different Chrome builds, font stacks/hinting, device pixel ratio/scale, color profiles, and headless GPU paths, so text antialiasing and layout rounded differently.
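To make the viewport and baseline fixes concrete, here is a small sketch of how captures can be keyed by viewport and baseline tag, so mobile and desktop are never compared against each other and old baselines can be retired cleanly. The directory layout and names are illustrative, not our exact scheme.

```ts
// Each viewport gets its own baseline; baselines are grouped under a tag/version.
interface Viewport {
  name: string;
  width: number;
  height: number;
}

const VIEWPORTS: Viewport[] = [
  { name: 'desktop', width: 1280, height: 720 },
  { name: 'mobile', width: 390, height: 844 },
];

// e.g. baselines/2024-10-release/checkout/mobile.png
function baselinePath(baselineTag: string, stepName: string, viewport: Viewport): string {
  return `baselines/${baselineTag}/${stepName}/${viewport.name}.png`;
}
```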
The biggest learning? Visual QA is never just about pixels — it’s about perception.
🚀 Impact: From Reactive QA to Visual Confidence
Within weeks, the impact was visible (pun intended):
We caught subtle regressions before they reached customers.
Designers could validate visual consistency faster.
Our SDET team shifted from pixel-pushing to insight-driven testing.
One engineer put it best:
“This tool gave our tests eyes.”
As SDETs, building this tool taught us something unexpected: quality isn’t just a checkbox or a pass/fail — it’s something you see.
We used to say, “It looks fine on my machine.”
Now we say, “Let’s see what the diff says.”


