Pothole Patrol: How a Bad Road Became a Computer Vision Project

May 2026

I was driving on an unfamiliar road when I hit a pothole I never saw coming.

It wasn't catastrophic. But it was jarring enough to make me think: why does Google Maps know everything about traffic and nothing about the road itself? Traffic lights, congestion, accidents — all mapped. But the surface you're actually driving on? You're on your own.

In India, that gap isn't just inconvenient. It's dangerous. Potholes cause thousands of accidents every year — many of them because drivers, especially on unfamiliar routes, simply had no warning.

That's what Pothole Patrol is built to fix.

The idea: Google Maps, but for the road surface

Everyone understands Google Maps' colour logic without thinking about it.

🔵 Blue — clear, drive normally
🟡 Yellow — slow down, moderate congestion
🔴 Red — stop-start traffic, expect delays

I wanted the same thing for road damage. If I'm travelling from Manipal University Jaipur to the Railway Station and I can see a red stretch on the map ahead, I know to ease off the accelerator before I get there. No surprise. No damage. No accident.

So I built it. I fetched the route using the Google Maps API, collected dashcam footage across that stretch, and built an end-to-end pipeline to detect, classify, and map every piece of road damage — colour-coded by severity, plotted on an interactive map anyone can read.

What it detects

The model identifies five types of road damage:

Potholes
Longitudinal cracks
Transverse cracks
Alligator cracks
Other surface damage

Each detection is mapped to a GPS coordinate and colour-coded. Red markers cluster into red route segments. If you're approaching a red zone, you slow down. Simple.

The engineering: what actually made this hard

The idea was straightforward. The execution had four problems nobody warns you about.

1. Dataset label conflicts — the 10GB merge problem

I used the RDD2022 dataset as the foundation — a massive global road damage dataset. But it lacked enough high-quality pothole examples for Indian roads, so I brought in a second dataset: Pothole.v1.

Problem: the two datasets used completely different class numbering schemes. In Pothole.v1, "Pothole" was Class 0. In RDD2022, Class 0 was "Longitudinal Crack" and Pothole was Class 3. Merging them blindly would have labelled every Pothole.v1 pothole as a longitudinal crack, and the model would have learned to confuse the two.

The fix was a custom Python script (train_colab_merged.py) that iterated through thousands of .txt annotation files in the new dataset, parsed the class IDs, and dynamically re-mapped them to match the master schema before merging into one unified 10GB dataset. Classic MLOps problem. Nobody writes tutorials about this part.
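
For anyone hitting the same problem, here is a minimal sketch of that kind of re-mapping, assuming YOLO-style label files where each line starts with an integer class ID. It is not the actual contents of train_colab_merged.py: the function names and the POTHOLE_V1_TO_MASTER table are hypothetical, and only the 0 to 3 entry comes from the numbering described above.

```python
from pathlib import Path

# Hypothetical mapping from the incoming dataset's class IDs to the master schema.
# Only the Pothole entry (0 -> 3) is taken from the article.
POTHOLE_V1_TO_MASTER = {0: 3}


def remap_label_line(line: str, mapping: dict) -> str:
    """Rewrite the class ID at the start of one YOLO annotation line."""
    parts = line.split()
    if not parts:
        return line  # keep blank lines untouched
    # A KeyError here flags a class ID that has no entry in the mapping.
    parts[0] = str(mapping[int(parts[0])])
    return " ".join(parts)


def remap_label_dir(src_dir: Path, dst_dir: Path, mapping: dict) -> int:
    """Copy every .txt label file from src_dir to dst_dir with remapped class IDs.

    Returns the number of files written.
    """
    dst_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for src in sorted(src_dir.glob("*.txt")):
        remapped = [remap_label_line(line, mapping) for line in src.read_text().splitlines()]
        text = "\n".join(remapped) + ("\n" if remapped else "")
        (dst_dir / src.name).write_text(text)
        count += 1
    return count
```

Writing to a separate output directory, rather than editing in place, means a bad mapping never corrupts the original labels.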

2. Motion blur destroying real-world accuracy

The model trained well on static images. Then I ran it on actual dashcam footage and it missed obvious potholes.

The issue: a dashcam at 40 km/h introduces motion blur that clean dataset images don't have. The model's confidence on blurry pothole frames was dropping to 60–70%, and my confidence threshold was set at 90%. The model was detecting the potholes; the threshold was throwing those detections away.

Dropping the threshold to 45% fixed it immediately. The missed potholes showed up again, without a meaningful increase in false positives. One config change. Massive real-world difference. This is why you can't evaluate a model only on a clean test set: you have to run it on data from the actual deployment environment.
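
To make the effect concrete, here is a small sketch of confidence filtering in plain Python. The Detection class, the keep_confident helper, and the scores are illustrative assumptions, not output from the real model; the two thresholds are the ones mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    confidence: float  # model score in [0, 1]


def keep_confident(detections, threshold):
    """Return only the detections whose score is at or above the threshold."""
    return [d for d in detections if d.confidence >= threshold]


# Illustrative scores only: one sharp frame and two motion-blurred ones.
frame_detections = [
    Detection("pothole", 0.93),
    Detection("pothole", 0.66),
    Detection("pothole", 0.61),
]

strict = keep_confident(frame_detections, 0.90)
relaxed = keep_confident(frame_detections, 0.45)
print(len(strict), len(relaxed))  # prints "1 3"
```

If the model is run through the Ultralytics package, the same cut-off is normally passed as the conf argument at prediction time, so no separate filtering step is needed.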

3. 10GB of data breaking everything

The full merged dataset crashed my local machine. When I tried to push to GitHub, the push was rejected: GitHub blocks files larger than 100 MiB, and the .pt model weight files and .mp4 test videos had bloated the repository history.

The solution had two parts. Training moved to Google Colab, which gave me GPU access and the memory headroom to actually run the training job. For version control, I rewrote the .gitignore, used git commands to untrack the large files and take them out of the commits, and restructured the repository to push only what matters: the inference code (app.py), the dataset merging scripts, and the final HTML map outputs.
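
A check like the one below would have caught the problem before the first push. It is a sketch I am adding for illustration, not part of the project: find_oversized_files is a hypothetical helper that walks a working tree and lists files above GitHub's 100 MiB per-file limit.

```python
import os
from pathlib import Path

# GitHub blocks pushes that contain a file larger than 100 MiB.
GITHUB_FILE_LIMIT_BYTES = 100 * 1024 * 1024


def find_oversized_files(repo_root, limit_bytes=GITHUB_FILE_LIMIT_BYTES):
    """Return sorted (relative_path, size_in_bytes) pairs for files above limit_bytes."""
    root = Path(repo_root)
    oversized = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into git's own object store.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = Path(dirpath) / name
            size = path.stat().st_size
            if size > limit_bytes:
                oversized.append((path.relative_to(root).as_posix(), size))
    return sorted(oversized)
```

Anything it reports is a candidate for .gitignore. Files that were already committed still have to be untracked with git itself (for example with git rm --cached) before a push will go through.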

4. Making the output actually useful

A video with bounding boxes is technically impressive. It's not something a driver can use.

The final output is an interactive HTML map built with the Folium library. Every detection is plotted as a colour-coded marker — red for potholes, orange for cracks — with the route segmented by severity. The map loads in a browser. No app install. No account. Anyone travelling the route can open it and see exactly what's ahead.
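
The plotting step can be sketched in a few lines of Folium. This is a simplified illustration rather than the code in the repository: the class-name strings, the grey fallback for anything that is not a pothole or a crack, the marker size, and the output file name are all assumptions. Only the red and orange choices come from the description above.

```python
# Marker colours from the article: red for potholes, orange for cracks.
# The class-name strings and the grey fallback are assumptions.
MARKER_COLOURS = {
    "pothole": "red",
    "longitudinal_crack": "orange",
    "transverse_crack": "orange",
    "alligator_crack": "orange",
}


def marker_colour(class_name):
    """Pick a marker colour for a damage class, defaulting to grey."""
    return MARKER_COLOURS.get(class_name, "gray")


def build_map(detections, output_path="road_damage_map.html"):
    """Plot (lat, lon, class_name) detections on a Folium map and save it as HTML."""
    import folium  # third-party dependency: pip install folium

    if not detections:
        raise ValueError("need at least one detection to centre the map")
    # Centre the map on the mean position of all detections.
    centre_lat = sum(d[0] for d in detections) / len(detections)
    centre_lon = sum(d[1] for d in detections) / len(detections)
    road_map = folium.Map(location=[centre_lat, centre_lon], zoom_start=14)
    for lat, lon, class_name in detections:
        folium.CircleMarker(
            location=[lat, lon],
            radius=6,
            color=marker_colour(class_name),
            fill=True,
            popup=class_name,
        ).add_to(road_map)
    road_map.save(output_path)
    return output_path
```

The saved file is a single self-contained HTML page, which is what makes the "no app install, no account" part possible.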

Why YOLOv8m specifically

The model choice was deliberate, not default.

Faster R-CNN was too slow — it can't process video at anywhere near 30 frames per second, which makes it unusable for dashcam inference. YOLOv8 Nano (YOLOv8n) was fast enough but missed small cracks entirely. The medium variant (YOLOv8m) hit the right balance: fast enough for real-time inference, accurate enough to catch crack types that matter.

What's honest about the current version

The GPS coordinates in this version are simulated. I mapped the route using Google Maps API data and correlated detections to that route, but there's no live hardware GPS module attached yet.
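
One simple way to simulate coordinates like this is to assume the car moved at constant speed and place each video frame proportionally along the route. The sketch below shows that idea only; it is an assumption about how such a correlation could work, not a description of the project's exact method, and haversine_m and position_for_frame are hypothetical names.

```python
import math


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))


def position_for_frame(route, frame_index, total_frames):
    """Estimate (lat, lon) for a frame, assuming constant speed along the route.

    route is a list of (lat, lon) waypoints; frame 0 maps to the first waypoint
    and the last frame maps to the last waypoint.
    """
    if len(route) < 2 or total_frames < 2:
        raise ValueError("need at least two route points and two frames")
    leg_lengths = [haversine_m(p, q) for p, q in zip(route, route[1:])]
    # Distance travelled by this frame, as a share of the whole route length.
    target = sum(leg_lengths) * frame_index / (total_frames - 1)
    for (p, q), length in zip(zip(route, route[1:]), leg_lengths):
        if length > 0 and target <= length:
            t = target / length  # fraction of the way along this leg
            return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))
        target -= length
    return route[-1]  # guards against floating-point overshoot on the last leg
```

The constant-speed assumption breaks down at traffic lights and in slow traffic, which is one more reason a real GPS module is the right fix.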

There's also a counting problem: if a pothole appears in 10 consecutive frames of footage, it registers as 10 detections instead of one. Without object tracking, the map over-counts.

Both of these are known, solvable limitations. The next version connects to a Raspberry Pi with a real GPS module for live coordinate capture, and adds DeepSORT object tracking so each physical pothole gets one ID, one pin, one record — regardless of how many frames it appears in.
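
Until tracking is in place, a much cruder stand-in gets part of the way: treat a run of same-class detections in neighbouring frames as one event. The sketch below is that heuristic, not DeepSORT. The merge_consecutive name and the max_gap default are assumptions, and it will wrongly merge two separate potholes that are visible in the same frame.

```python
def merge_consecutive(detections, max_gap=2):
    """Collapse runs of same-class detections in nearby frames into single events.

    detections: iterable of (frame_index, class_name) tuples.
    Returns a sorted list of (first_frame, last_frame, class_name) tuples.
    """
    open_runs = {}  # class_name -> [first_frame, last_frame] of the current run
    events = []
    for frame, class_name in sorted(detections):
        run = open_runs.get(class_name)
        if run is not None and frame - run[1] <= max_gap:
            run[1] = frame  # the same defect is assumed to still be in view
        else:
            if run is not None:
                events.append((run[0], run[1], class_name))  # close the old run
            open_runs[class_name] = [frame, frame]
    # Close whatever runs are still open at the end of the video.
    events.extend((first, last, name) for name, (first, last) in open_runs.items())
    return sorted(events)


# Ten consecutive pothole frames, one later pothole, and one crack.
raw = [(f, "pothole") for f in range(10)] + [(50, "pothole"), (3, "alligator_crack")]
print(len(raw), len(merge_consecutive(raw)))  # prints "12 3"
```

A real tracker assigns identities per bounding box, so it can keep two nearby potholes apart where this frame-gap rule cannot.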

What this is really about

Road damage data exists. Dashcam footage is everywhere. The problem isn't the data — it's that nobody has built the pipeline to turn it into something a driver can actually use in the 10 seconds before they reach a bad stretch of road.

Pothole Patrol is a proof of concept that this pipeline works. Real route. Real footage. Real detections. Real map.

The vision is every road, mapped. So the next time someone drives an unfamiliar route, they're not discovering the potholes by hitting them.

Stack: YOLOv8m · Python · Folium · Google Maps API · Google Colab · Vercel

Live demo: project-pothole.vercel.app

Source code: github.com/piyushhvarma/ProjectPothole