Every computer vision project starts the same way: a demo that works perfectly in the lab. Good lighting, clean backgrounds, cooperative subjects. The model hits 98% accuracy. Everyone celebrates.
Then you deploy it in the real world and everything breaks.
We’ve shipped CV systems across retail stores, public buses, and industrial settings. Here’s what we’ve learned about the gap between demo and production — and how to close it.
The Demo-to-Production Gap
A CV model trained on curated data behaves very differently when it encounters:
- Lighting variation: A model that works under fluorescent lights fails at sunset, in shadow, or under mixed lighting. Retail stores have spotlights on some products and dark corners elsewhere. Buses go from bright sunshine to tunnel darkness in seconds.
- Camera quality: Your training data was collected on a high-res camera. The deployment uses 720p CCTV from 2018 with compression artifacts and a dirty lens.
- Edge cases at scale: A 99% accurate model sounds great until you realize that at 1000 events per day, you’re getting 10 wrong answers daily. In passenger counting, that’s 10 fare disputes. In quality inspection, that’s 10 defective products shipped.
- Environmental factors: Dust on lenses, vibration, heat throttling the compute board, network dropouts mid-inference, power cuts that corrupt the model file on disk.
What Actually Matters in Production CV
1. Train on Ugly Data
Your training dataset should look like your worst deployment day, not your best. We deliberately include:
- Blurry frames from camera shake
- Overexposed and underexposed images
- Partially occluded subjects
- Unusual angles and perspectives
- Frames with compression artifacts
If the model hasn’t seen it in training, it will fail on it in production.
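To make the idea concrete, here is a minimal stdlib-only sketch of a degradation pass over a grayscale image (represented as a list of rows of 0-255 ints). It is illustrative, not our production code: a real pipeline would apply blur, exposure shifts, occlusion, and compression artifacts to full frames with an augmentation library such as albumentations or torchvision.

```python
import random

def degrade(image, seed=None):
    """Apply one random 'ugly' corruption to a grayscale image
    (a list of rows of 0-255 ints). Illustrative sketch only."""
    rng = random.Random(seed)
    choice = rng.choice(["overexpose", "underexpose", "quantize", "occlude"])
    if choice == "overexpose":       # push pixels toward white
        return [[min(255, p + 80) for p in row] for row in image]
    if choice == "underexpose":      # push pixels toward black
        return [[max(0, p - 80) for p in row] for row in image]
    if choice == "quantize":         # crude stand-in for compression artifacts
        return [[(p // 32) * 32 for p in row] for row in image]
    # occlude: blank out the left half of every row
    return [[0 if x < len(row) // 2 else p
             for x, p in enumerate(row)] for row in image]
```

Running every training image through a pass like this (alongside the clean copy) is cheap insurance against the model memorizing lab conditions.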
2. Design for Failure
Every component will fail. The question is how gracefully.
- Camera goes offline? The system should detect it, log it, and resume when it returns — not crash and require a manual restart.
- Model inference hangs? A watchdog process should kill and restart it.
- Storage fills up? The system should prune the oldest data automatically, not silently stop recording.
- Network drops? Data should queue locally and sync when connectivity returns.
We’ve built all of these recovery mechanisms into our systems because we’ve encountered every one of these failures in the field.
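As one example of what "gracefully" looks like, here is a sketch of the storage-pruning case: delete the oldest recordings until usage fits under a budget. The function name and layout are illustrative assumptions; a real recorder would also guard against deleting the file currently being written.

```python
from pathlib import Path

def prune_oldest(data_dir, max_bytes):
    """Delete the oldest files in data_dir until the total size
    fits under max_bytes. Minimal sketch of self-healing storage."""
    files = sorted(Path(data_dir).iterdir(), key=lambda p: p.stat().st_mtime)
    total = sum(p.stat().st_size for p in files)
    for p in files:
        if total <= max_bytes:
            break
        total -= p.stat().st_size
        p.unlink()                   # oldest first
    return total
```

Run on a timer or whenever a disk-usage check trips, this keeps the recorder alive instead of letting a full disk silently halt it.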
3. Build the Annotation Pipeline First
Most teams treat annotation as a one-time task before training. In reality, it’s an ongoing operation.
Every deployment generates new edge cases. You need a pipeline that:
- Flags low-confidence predictions for human review
- Makes it easy for annotators to correct mistakes
- Feeds corrections back into the training set
- Triggers retraining when the correction set is large enough
The annotation pipeline is as important as the model itself.
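The first step of that loop, routing predictions by confidence, can be sketched in a few lines. The tuple shape and the 0.8 default threshold are illustrative assumptions; in practice the threshold is tuned per deployment against annotator capacity.

```python
def route_predictions(predictions, threshold=0.8):
    """Split model outputs into auto-accepted results and a review
    queue for human annotators. `predictions` is a list of
    (frame_id, label, confidence) tuples. Illustrative sketch."""
    accepted, review_queue = [], []
    for frame_id, label, conf in predictions:
        target = accepted if conf >= threshold else review_queue
        target.append((frame_id, label, conf))
    return accepted, review_queue
```

Everything in the review queue becomes annotation work, and every correction becomes a training example.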
4. Measure Everything in Production
Lab metrics (precision, recall, F1) are necessary but not sufficient. In production, you also need:
- Latency per frame — is inference keeping up with the camera’s frame rate?
- Drift detection — is accuracy degrading over time as conditions change?
- Hardware health — CPU temperature, memory usage, disk space, network throughput
- Business metrics — are the downstream decisions based on CV output actually improving outcomes?
If you’re not measuring these, you’re flying blind.
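Drift detection in particular needs no heavy machinery. Here is a hedged sketch of a rolling-accuracy monitor fed by spot-checked frames; the window size and accuracy floor are illustrative defaults, and in a real system the "correct" signal would come from the human-review pipeline above.

```python
from collections import deque

class DriftMonitor:
    """Track a rolling accuracy estimate from spot-checked frames
    and flag when it drops below a floor. Illustrative sketch."""
    def __init__(self, window=100, floor=0.95):
        self.results = deque(maxlen=window)
        self.floor = floor

    def record(self, correct):
        self.results.append(1 if correct else 0)

    def accuracy(self):
        return sum(self.results) / len(self.results) if self.results else 1.0

    def drifting(self):
        # Only alarm once the window has enough samples to be meaningful.
        return (len(self.results) == self.results.maxlen
                and self.accuracy() < self.floor)
```

Wire the `drifting()` flag into the same alerting channel as hardware health, and accuracy decay stops being something you discover from customer complaints.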
5. Keep the Human in the Loop
Fully autonomous CV is the goal, but for most real-world applications, you need a human verification step — at least initially.
In our retail analytics system, store managers can flag incorrect counts. In our passenger counting system, operators can review video clips of disputed boarding events. These corrections feed back into the model, creating a virtuous cycle of improvement.
The Boring Parts Matter Most
The glamorous part of CV is the model — the architecture, the training, the accuracy benchmarks. But in production, the model is maybe 20% of the system.
The other 80% is:
- Data pipelines and storage
- Edge device management and monitoring
- Over-the-air (OTA) model updates without downtime
- Error recovery and self-healing
- API design for downstream consumers
- Annotation tooling and retraining workflows
These aren’t exciting topics, but they’re what separates a demo from a product.
Conclusion
If you’re building a CV system, plan for production from day one. The model is the easy part. The hard part is making it work reliably, at scale, in conditions you haven’t anticipated, with graceful failure handling and continuous improvement.
That’s the engineering we do at Thoht Delta. Not just models — complete systems that see.
Building a CV system that needs to work in the real world? Let’s talk.