Every car marketplace that accepts user-uploaded photos is sitting on a GDPR compliance problem. Sellers photograph their vehicles on public streets. The front plate is visible. The rear plate is visible. The photo goes live. Under GDPR Article 4, a license plate is personal data because it identifies a natural person through a registered vehicle. Publishing it without a lawful basis is a violation, and data protection authorities in Germany, Spain, and the Netherlands have all issued enforcement actions in the transport sector.
Manual review does not scale. A marketplace processing 5,000 listings per day cannot employ enough moderators to check every photo for visible plates. The practical answer is an automated redaction step built into the photo ingestion pipeline: detect the plate region, apply a blur, replace the original before the listing goes live. This article covers how to build that pipeline, which approaches work at different volumes, and what the real costs are.
The compliance reality
GDPR Article 4(1) defines personal data as any information relating to an identified or identifiable natural person. European courts and the Article 29 Working Party have confirmed that vehicle registration plates fall within this definition because a plate number can be cross-referenced with vehicle registration databases to identify the registered owner. A frequently cited German ruling came from the Federal Constitutional Court, which held in a decision published in 2019 that automated number plate recognition interferes with the right to informational self-determination.
For a marketplace, publishing a visible plate means publishing personal data about the seller or registered owner without a documented lawful basis. Consent is the most common basis, but consent buried in terms and conditions does not satisfy GDPR requirements for specific, informed, freely given consent. German regulators fine at scale: the federal DPA (BfDI) issued a 9.55 million euro fine to the telecom provider 1&1 in 2019 over inadequate safeguards for customer data. That case did not involve user-generated content, but vehicle photos on car marketplaces fall under the same fining framework.
Outside the EU, the exposure is lower but not zero. California Vehicle Code Section 1808.47 restricts the commercial use of vehicle registration information. UK GDPR, which mirrors EU GDPR post-Brexit, applies the same personal data definition. Any marketplace operating in or targeting European users should treat visible plates as a data protection risk.
Three approaches to automated redaction
The technical problem is straightforward: detect the bounding box of the plate in the image, apply a redaction effect to that region, return the modified image. There are three practical paths to get there, each with different tradeoffs on cost, accuracy, and engineering overhead.
Self-hosted detection with YOLOv8 or a fine-tuned ANPR model gives you full control over accuracy and latency. Roboflow hosts several open-weight license plate detection models trained on European plates. The engineering investment is largely up-front: set up an inference endpoint, write the crop-and-blur step, wire it into your ingestion queue. The ongoing cost is GPU compute and model maintenance when plate formats change or accuracy degrades on new vehicle types. On compute cost alone, self-hosted becomes the cheapest option per image at around 50,000 images per day; that figure excludes engineering time, which dominates the total cost of ownership.
Commercial detection APIs - Platerecognizer, Sighthound, and Amazon Rekognition with a Custom Labels model - offer pay-per-call pricing with no model maintenance. Platerecognizer is the most widely used for European plates, with documented support for formats from 83 countries. At low to medium volume, the per-image cost is low enough that the API model is cheaper than the engineering time to build and maintain a self-hosted solution.
Orchestrated image pipelines - tools like Runflow - let you chain the detection step with the blur step and the replacement step in a single workflow without writing the glue code. The workflow accepts an image URL, calls the detection model, applies the redaction, and returns the processed image URL. The advantage is operational: you get retry logic, error handling, and monitoring without building it yourself. The tradeoff is that you are dependent on the pipeline platform for uptime.








| Stack | Infra /mo | AI team | Total cost | Revenue | Margin |
|---|---|---|---|---|---|
| Runflow (pay-per-use, no commitment) | $500 | $0 | $500 | $1.5K | 67% |
| Cloud API + manual QA (similar pricing, no auto-QA, part-time engineer needed) | $500 | ~$5K | $5.5K | $1.5K | loss |
| Self-hosted GPU (raw compute, full-time AI engineer required) | $400 | $12K | $12.4K | $1.5K | loss |

Runflow Sentinel: built-in quality control layer that automatically detects and discards failed or low-quality outputs before delivery. You only pay for images that pass QA. No engineer needed to babysit the pipeline.
Pricing based on Runflow published rates (June 2026) with automatic volume discounts. Revenue column is illustrative; actual client pricing varies by vertical and contract size. GPU self-hosted estimate uses $0.04/img raw compute cost.
Approach comparison
| Metric | Self-hosted (YOLOv8) | Commercial API (Platerecognizer) | Orchestrated pipeline (Runflow) |
|---|---|---|---|
| EU plate accuracy | 92-96% (fine-tuned) | 97-99% | Depends on underlying model |
| Latency p95 | 80-150ms on GPU | 200-400ms | 300-600ms (detection + blur + return) |
| Cost at 10K imgs/day | ~$3-5/day (GPU) | ~$10/day ($0.001/img) | ~$15-20/day (API + orchestration) |
| Cost at 100K imgs/day | ~$15-25/day (GPU) | ~$70/day (volume tier) | Negotiated |
| Engineering to deploy | 3-6 weeks | 1-2 days integration | 1-3 days integration |
| Ongoing maintenance | High (model drift, format updates) | None | None |
| Batch support | Yes (native) | Yes (async endpoint) | Yes (queue mode) |
Production architecture: the pre-publish hook
The redaction step belongs in the ingestion pipeline, not as a post-publication fix. A photo that goes live for even a few seconds before redaction is applied may already have been fetched by search engine crawlers or cached by CDNs, and you cannot reliably recall those copies. The only safe architecture is to hold the photo in a staging bucket, run detection and redaction, and only release the processed version to the public CDN.
The sequence is: seller uploads photo to a private S3 or GCS bucket with no public access. An upload event triggers a queue message. A worker pulls the message, sends the image URL to the detection API, receives the bounding box coordinates, applies a Gaussian blur to that region using ImageMagick, Pillow, or Sharp, uploads the processed image to the public CDN bucket, and marks the listing as ready for publication. The original photo stays in the private bucket for audit purposes. The entire operation adds 300-800 milliseconds to the listing creation flow, which is imperceptible to the seller.
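The worker step in that sequence can be sketched as follows. This is a minimal illustration rather than a production implementation: the `Buckets` and `Listing` classes are in-memory stand-ins for the storage buckets and the listing record, and `detect` and `redact` are injected callables representing whichever detection API and image library you choose. All of these names are hypothetical. Confidence handling is left out to keep the flow visible.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class Listing:
    listing_id: str
    ready: bool = False  # True once the processed photo is available


@dataclass
class Buckets:
    # In-memory stand-ins for the private staging bucket and the public CDN bucket.
    private: Dict[str, bytes] = field(default_factory=dict)
    public: Dict[str, bytes] = field(default_factory=dict)


def process_upload(
    key: str,
    listing: Listing,
    buckets: Buckets,
    detect: Callable[[bytes], List[Box]],
    redact: Callable[[bytes, List[Box]], bytes],
) -> None:
    """Handle one queue message: detect, redact, release, mark ready."""
    original = buckets.private[key]      # the original never leaves the private bucket
    boxes = detect(original)             # bounding boxes from the detection model or API
    processed = redact(original, boxes) if boxes else original
    buckets.public[key] = processed      # only the processed copy reaches the CDN bucket
    listing.ready = True                 # the listing may now be published


# Example run with fake detector and redactor callables.
buckets = Buckets(private={"car-1.jpg": b"raw-bytes"})
listing = Listing("listing-1")
process_upload(
    "car-1.jpg",
    listing,
    buckets,
    detect=lambda image: [(10, 20, 110, 50)],
    redact=lambda image, boxes: b"redacted:" + image,
)
```

Injecting the detector and redactor keeps the worker independent of any one provider, so switching detection models later does not touch the publish logic.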
Edge cases to handle: images with no detectable plate (the model returns no bounding box - pass the original through); images with multiple plates visible in a single photo (apply redaction to all bounding boxes returned); very small plates in wide-angle shots where the detection confidence is below threshold (flag for manual review rather than publishing without redaction).
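Those three cases amount to a small routing decision per image. The sketch below assumes the detector returns every candidate box with a confidence score, including low-confidence ones; some APIs filter these out before responding, in which case the manual-review branch needs a different signal. The `Detection` type, the action names, and the default threshold are illustrative choices, not part of any particular API.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class Detection:
    box: Box
    confidence: float  # detector score in [0, 1]


def route(detections: List[Detection], min_confidence: float = 0.75) -> Tuple[str, List[Box]]:
    """Return (action, boxes_to_redact) for one image."""
    if not detections:
        # No plate found: pass the original through unchanged.
        return "publish_original", []
    if any(d.confidence < min_confidence for d in detections):
        # At least one uncertain plate: hold the image for a human moderator.
        return "manual_review", []
    # One or more confident plates: redact every returned box.
    return "redact", [d.box for d in detections]
```

Sending the whole image to review when any single box is uncertain is the conservative choice: a photo with one well-detected plate and one doubtful plate is not safe to publish after redacting only the first.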
Cost at scale
| Cost item | Self-hosted | Commercial API | Orchestrated pipeline |
|---|---|---|---|
| GPU / API cost | ~$600/mo | ~$1,500/mo | ~$1,800/mo |
| Engineering maintenance | $8,000-12,000/mo | $0 | $0 |
| Total monthly TCO | $8,600-12,600/mo | ~$1,500/mo | ~$1,800/mo |
| Break-even vs. API | Above ~500K imgs/day | N/A | N/A |
Self-hosting only wins on unit cost if you already have GPU infrastructure for other workloads and the incremental cost of running the plate detection model is near zero. For a marketplace that does not already run GPU workloads, the engineering overhead of standing up and maintaining a self-hosted inference endpoint makes the commercial API cheaper in practice at almost any realistic volume.
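The break-even figure can be reproduced with simple arithmetic. The sketch below makes two simplifying assumptions that are not in the tables: it holds the self-hosted GPU bill fixed at ~$600/mo regardless of volume, and it uses the volume-tier API price implied by ~$70/day at 100K images per day, which is $0.0007 per image. The function name is illustrative.

```python
def break_even_images_per_day(
    gpu_monthly: float,
    engineering_monthly: float,
    api_price_per_image: float,
    days_per_month: int = 30,
) -> float:
    """Daily volume at which the monthly API bill equals the self-hosted TCO."""
    self_hosted_tco = gpu_monthly + engineering_monthly
    return self_hosted_tco / (api_price_per_image * days_per_month)


# Low and high ends of the engineering maintenance range from the table.
low = break_even_images_per_day(600, 8_000, 0.0007)
high = break_even_images_per_day(600, 12_000, 0.0007)
print(f"break-even between {low:,.0f} and {high:,.0f} images per day")
```

Under these assumptions the break-even falls between roughly 410,000 and 600,000 images per day, which brackets the ~500K figure in the table. In practice GPU cost also grows with volume, which pushes the real break-even higher still.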
Detection accuracy by vehicle type
Detection accuracy varies by vehicle type, plate position, and photo quality. Sedans photographed from the standard 3/4 front angle with the plate centered and well-lit are the easiest case - commercial APIs report 97-99% recall on these. Cargo vans and commercial vehicles present a harder problem when the front plate is positioned close to the ground and the photo is taken from a high angle. Sports cars with low front ends and non-standard plate mounting positions have lower recall on some models. SUVs photographed from the rear at a distance are the most challenging because the plate occupies a smaller fraction of the frame.
The practical implication: do not use a single confidence threshold for all vehicle types. Log the detection confidence per image and per vehicle category. Images with confidence below 0.75 from sellers in the van and sports car categories should go to a manual review queue rather than publishing automatically. Over time this data tells you whether your chosen detection model has systematic blind spots that warrant switching providers or fine-tuning on your own data.
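A per-category threshold table plus a simple aggregation over the confidence log is enough to start. In this sketch only the 0.75 threshold for vans and sports cars comes from the text above; the default threshold for other categories and the category names are assumed values you would tune against your own data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# 0.75 for vans and sports cars is from the article; the default is an assumption.
THRESHOLDS: Dict[str, float] = {"van": 0.75, "sports_car": 0.75}
DEFAULT_THRESHOLD = 0.60


def needs_review(category: str, confidence: float) -> bool:
    """True when the detection confidence is below the category's threshold."""
    return confidence < THRESHOLDS.get(category, DEFAULT_THRESHOLD)


def review_rate_by_category(log: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Share of logged images per category that fell below their threshold."""
    totals: Dict[str, int] = defaultdict(int)
    flagged: Dict[str, int] = defaultdict(int)
    for category, confidence in log:
        totals[category] += 1
        if needs_review(category, confidence):
            flagged[category] += 1
    return {category: flagged[category] / totals[category] for category in totals}


# Example log of (vehicle category, detection confidence) pairs.
log = [("van", 0.70), ("van", 0.90), ("sedan", 0.70), ("sports_car", 0.50)]
rates = review_rate_by_category(log)
```

A category whose review rate stays high over several weeks is the signal that the model has a systematic blind spot there.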
Blur method: Gaussian vs. pixelation vs. solid fill
All three redaction methods can satisfy the compliance requirement that the plate is unreadable, provided the effect is strong enough. The choice affects visual quality in the listing photo. Gaussian blur with a radius of 20-25 pixels produces a soft, photographic result that does not call attention to itself. Most buyers viewing the listing photo accept it without a second thought. Pixelation produces a more obvious, blocky rectangle that some platforms prefer because it signals intentional redaction rather than looking like a camera artifact. Solid fill (black or white rectangle) is the most bulletproof option from a data elimination standpoint - there is no residual plate texture in the image at all - but it looks conspicuous in listing photos and can reduce buyer trust.
Gaussian blur is the correct default for consumer-facing marketplaces. The blur radius should be calibrated so that the plate text is not recoverable with standard image sharpening tools. A radius that makes the text unreadable to the naked eye is not sufficient - a motivated actor can apply deconvolution filters to partially recover low-radius blurs. A radius of 20+ pixels on a 1080p image is the practical safe minimum.
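For comparison, here is a minimal sketch of the three redaction methods using Pillow. The function names, the pixelation block size, and the linear scaling of the blur radius with image height are assumptions for illustration; the only figure taken from the text is the 20 pixel radius at 1080p. Whether a given radius resists deconvolution on your images is something to verify, not something this code guarantees.

```python
from typing import Tuple

from PIL import Image, ImageDraw, ImageFilter

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def scaled_radius(image_height: int, base_radius: int = 20, base_height: int = 1080) -> int:
    """Scale the 20 px at 1080p guideline with resolution, never going below it."""
    return max(base_radius, round(base_radius * image_height / base_height))


def gaussian(img: Image.Image, box: Box, radius: int) -> Image.Image:
    """Blur only the plate region and paste it back onto a copy."""
    out = img.copy()
    region = out.crop(box).filter(ImageFilter.GaussianBlur(radius))
    out.paste(region, box)
    return out


def pixelate(img: Image.Image, box: Box, block: int = 12) -> Image.Image:
    """Downscale the region, then upscale with nearest-neighbour for a blocky look."""
    out = img.copy()
    region = out.crop(box)
    width, height = region.size
    small = region.resize((max(1, width // block), max(1, height // block)), Image.BILINEAR)
    out.paste(small.resize((width, height), Image.NEAREST), box)
    return out


def solid_fill(img: Image.Image, box: Box, color: Tuple[int, int, int] = (0, 0, 0)) -> Image.Image:
    """Overwrite the region entirely; no plate texture survives."""
    out = img.copy()
    ImageDraw.Draw(out).rectangle(box, fill=color)
    return out
```

Each function returns a new image and leaves the input untouched, so the original can still be written to the private bucket unchanged.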








Storing the original: audit trail and retention policy
Deleting the original photo immediately after redaction is the simplest approach but creates operational problems. If the redaction model makes an error - either missing a plate or over-redacting a non-plate region - you have no way to reprocess the image. A better architecture stores the original in a private bucket with access restricted to system accounts only, no public URLs, and a documented retention period.
A 90-day retention period for originals covers the window during which listing disputes and model reprocessing needs are most likely to arise. After 90 days, the original is automatically deleted by a lifecycle policy on the storage bucket. Your data processing record should document this policy: personal data (the plate) is processed for the purpose of redaction, stored temporarily for quality assurance, and deleted on a fixed schedule. This documentation is what a regulator will ask for if there is ever an audit.
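On Amazon S3, the automatic deletion can be expressed as a lifecycle rule. The sketch below builds the rule as a plain dictionary and applies it with boto3's `put_bucket_lifecycle_configuration`. The `originals/` prefix, the rule ID, and the function names are hypothetical; Google Cloud Storage uses a different lifecycle format, so this applies to S3 only.

```python
def retention_lifecycle(prefix: str = "originals/", days: int = 90) -> dict:
    """Build an S3 lifecycle configuration that expires originals after `days`."""
    return {
        "Rules": [
            {
                "ID": f"expire-originals-after-{days}-days",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},   # only objects under this prefix
                "Expiration": {"Days": days},   # deleted this many days after creation
            }
        ]
    }


def apply_policy(bucket: str) -> None:
    """Apply the retention rule to a bucket (requires AWS credentials)."""
    import boto3  # imported lazily so the policy can be built and reviewed offline

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=retention_lifecycle(),
    )


policy = retention_lifecycle()
```

Keeping the rule in version-controlled code gives you a dated record of the retention policy that you can attach to the data processing documentation.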