Deceptive Pattern Detector
Dark patterns are UI designs engineered to extract unintended actions from users. The Deceptive Pattern Detector is a Chrome extension that runs a two-stage machine learning pipeline - MiniLM semantic embeddings computed in-browser and a suite of per-pattern XGBoost classifiers on the server - to flag eight classes of dark patterns in real time as users browse.
The Problem
Deceptive design patterns, commonly known as dark patterns, are manipulative UI techniques deployed by websites and applications to influence user behavior in ways that benefit the business at the expense of the user. These range from guilt-laden decline buttons (confirm shaming) and fabricated stock limits (scarcity) to deliberately confusing checkbox language (trick questions) and interfaces that bury privacy controls to maximize data collection (privacy zuckering).
Unlike outright fraud, dark patterns operate in a legal gray zone: they are psychologically coercive but technically compliant. Regulatory attention is growing - the FTC and European regulators have issued guidance and enforcement actions - yet users remain largely unaware of the techniques being applied against them in ordinary browsing. A 2019 study by Mathur et al. found dark patterns on over 11% of 11,000 shopping websites surveyed.
The core challenge is detection at scale. Dark patterns are embedded in real page structure and depend on contextual signals - visual asymmetry, linguistic framing, interface hierarchy - that are difficult to capture with keyword rules alone. Existing academic datasets label patterns at the UI-element level but generalize poorly to live, adversarially designed production pages. We built a system that combines semantic understanding with structural and CSS analysis to detect dark patterns robustly on real websites.
Technical Approach
The Deceptive Pattern Detector implements a two-stage pipeline designed to balance latency, accuracy, and privacy. The first stage runs entirely in the browser: the extension extracts choice sets (clusters of related UI elements that present the user with a decision) from the Accessibility Tree and DOM, discards obvious non-decision UI with a lightweight noise filter, embeds the remaining text using MiniLM via ONNX Runtime WebAssembly, and applies a similarity gate before making any network call. The second stage runs on a Flask server: candidates that pass the gatekeeper threshold are scored by eight independent XGBoost binary classifiers - one per pattern type - each trained on 80+ engineered features covering text signals, CSS visual asymmetry, and structural properties.
Why Two Stages?
A naive approach - sending every UI element to a server - would be too slow for real-time browsing and would raise privacy concerns by transmitting raw page content for every visited URL. The client-side noise filter and gatekeeper together eliminate roughly 80% of candidate elements before any server call, reducing latency and the volume of data transmitted while preserving recall on genuine dark patterns.
Running MiniLM in-browser via ONNX Runtime WebAssembly means semantic embeddings are generated without sending text to any external service. Only candidates above the gate threshold, stripped of personally identifiable context, reach the classification endpoint.
Choice Set Abstraction
Rather than evaluating individual DOM elements in isolation, the system groups related elements into choice sets: the accept button and shame-laden decline link on a cookie banner form a single choice set; the pre-checked newsletter checkbox and its double-negative label form another. This framing allows each model to reason about the relationship between options, which is where most dark patterns manifest their manipulative intent.
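One way to picture a choice set is as a small record that bundles the options with their text and computed styles. The sketch below is illustrative only: the class and field names (`Option`, `ChoiceSet`, `font_size_px`, `is_dialog`) are assumptions, not the extension's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Option:
    """One selectable element inside a choice set (hypothetical schema)."""
    label: str                  # visible text of the button, link, or checkbox
    role: str                   # accessibility role, e.g. "button" or "link"
    font_size_px: float = 16.0  # computed CSS font size
    opacity: float = 1.0        # computed CSS opacity


@dataclass
class ChoiceSet:
    """A group of options that together present one decision to the user."""
    options: list[Option] = field(default_factory=list)
    is_dialog: bool = False     # whether the group sits inside a dialog or modal

    def labels(self) -> list[str]:
        return [o.label for o in self.options]


# A cookie banner whose decline path is smaller, fainter, and guilt-laden.
cookie_banner = ChoiceSet(
    options=[
        Option("Accept all cookies", "button", font_size_px=18.0),
        Option("No thanks, I don't care about my experience", "link",
               font_size_px=11.0, opacity=0.6),
    ],
    is_dialog=True,
)
```

Keeping both options in one record is what lets downstream features compare them directly, rather than scoring each element in isolation.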
Pipeline Architecture
The extension processes each page through six sequential stages:
- Choice Set Extraction - choice_set_extractor.js walks the Accessibility Tree and DOM to identify decision-point UI groups: clusters of elements (buttons, links, checkboxes, form fields) that together present a binary or multi-way choice. Each choice set captures option labels, CSS computed properties, and subtree structure.
- Noise Filtering - noise_classifier.js uses Jaccard similarity and keyword matching to discard navigation chrome, footers, social share buttons, and other non-decision UI before any embedding work is performed. This stage eliminates the majority of candidate elements with negligible compute cost.
- In-Browser MiniLM Embedding - offscreen_embedder.js runs all-MiniLM-L6-v2 via ONNX Runtime WebAssembly in an offscreen document, generating 384-dimensional semantic vectors for choice-set text without any network call. Per-choice-set embedding completes in under 150 ms on typical hardware.
- Gatekeeper Filter - dp_gatekeeper.js applies a cosine-similarity threshold against a set of dark-pattern anchor embeddings. Only candidates whose similarity exceeds the gate pass to the server.
- XGBoost Classification + Sigmoid Calibration - The Flask API receives gated candidates and scores them against each of the eight per-pattern models on 80+ V3 features - text signals, CSS aggregates, and structural metrics. Raw XGBoost probabilities are then passed through Platt scaling (sigmoid) so that the reported confidence scores can be read as probabilities.
- User Alert - The extension popup displays flagged patterns with plain-English descriptions and confidence scores. The toolbar badge icon updates to reflect the current page risk level.
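The two client-side filters in the list above can be sketched roughly as follows. The extension implements them in JavaScript; Python is used here only to keep the examples in one language. The function names, the noise phrase list, and both threshold values (0.5 and 0.35) are placeholders, not the values used in noise_classifier.js or dp_gatekeeper.js.

```python
import math


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase token sets of two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def is_noise(text: str, noise_phrases: list[str], threshold: float = 0.5) -> bool:
    """Treat text as noise when it closely matches any known non-decision phrase."""
    return any(jaccard(text, p) >= threshold for p in noise_phrases)


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def passes_gate(embedding: list[float], anchors: list[list[float]],
                gate: float = 0.35) -> bool:
    """Pass a candidate when its best anchor similarity exceeds the gate."""
    return max((cosine(embedding, a) for a in anchors), default=0.0) > gate
```

Running the cheap token-overlap check first means the embedding model is only invoked on text that survived it, and the gate then decides whether a server call is made at all.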
Data Collection and Model Training
Each of the eight target patterns is detected by a dedicated binary XGBoost classifier. Training followed a structured pipeline covering data collection, synthetic augmentation, feature engineering, model fitting, and calibration.
Training Data Collection
Real labeled data was collected using a Playwright-based browser automation pipeline that navigated live websites, extracted choice sets from the Accessibility Tree and DOM, and stored the results - including option text, CSS computed properties, and subtree structure - in a PostgreSQL database. Captured choice sets were then reviewed and labeled by human annotators using a custom labeling toolchain, assigning each a decision (dark or not dark) and, for dark examples, one or more pattern type tags.
Synthetic Data Augmentation
Real labeled data for rare or hard-to-capture patterns was supplemented with pattern-specific synthetic choice sets generated programmatically. Synthetic distributions were calibrated to match the statistical properties of real captures - dialog rates, option counts, CSS aggregate targets, and text-regex hit rates - to reduce distributional shift between training and inference. Synthetic positives were mixed with real negatives to maintain realistic class ratios.
Per-Pattern Binary Classifiers
One XGBoost model is trained per pattern using the binary:logistic objective with log-loss as the evaluation metric. Each training run draws positive examples labeled for that pattern and a stratified sample of negatives drawn from unlabeled choice sets, with a configurable negative-to-positive ratio. Class imbalance is handled via XGBoost's scale_pos_weight parameter (set to the neg/pos ratio). Early stopping is applied against a held-out validation split, which is built with grouped shuffle splitting so that choice sets from the same site never appear in both the training and validation sets. Hyperparameters include max depth 5, learning rate 0.05, and subsampling rates of 0.9 for both rows and columns.
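The settings described above can be written out as a parameter dict plus a site-grouped split. This is a dependency-free sketch: the helper names are hypothetical, `colsample_bytree` is an assumption about which column-subsampling parameter was used, and the 20% validation fraction is a placeholder.

```python
import random


def scale_pos_weight(labels: list[int]) -> float:
    """Negative-to-positive count ratio, as used for XGBoost's scale_pos_weight."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0:
        raise ValueError("need at least one positive example")
    return neg / pos


def xgb_params(labels: list[int]) -> dict:
    """Parameter dict mirroring the settings described in the text."""
    return {
        "objective": "binary:logistic",
        "eval_metric": "logloss",
        "max_depth": 5,
        "learning_rate": 0.05,
        "subsample": 0.9,         # row subsampling
        "colsample_bytree": 0.9,  # column subsampling (assumed to be per tree)
        "scale_pos_weight": scale_pos_weight(labels),
    }


def grouped_split(sites: list[str], val_fraction: float = 0.2, seed: int = 0):
    """Split row indices so that every site lands wholly in train or validation."""
    unique_sites = sorted(set(sites))
    random.Random(seed).shuffle(unique_sites)
    n_val = max(1, round(len(unique_sites) * val_fraction))
    val_sites = set(unique_sites[:n_val])
    train_idx = [i for i, s in enumerate(sites) if s not in val_sites]
    val_idx = [i for i, s in enumerate(sites) if s in val_sites]
    return train_idx, val_idx
```

A dict like this could be passed to `xgboost.train` together with `early_stopping_rounds` and the validation set. scikit-learn's `GroupShuffleSplit` provides the same grouping guarantee as `grouped_split` and is the more likely choice in practice.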
Calibration
After training, raw XGBoost output probabilities are post-hoc calibrated using Platt scaling: a logistic regression is fit on the logit of raw scores against held-out labels. Calibrated probabilities are validated for reduced Brier score and log-loss relative to uncalibrated scores on a golden set of real labeled examples. Only calibrated scores are surfaced in the extension popup and stored in the database.
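To make the calibration step concrete, here is a minimal pure-Python sketch of Platt scaling and the Brier score. It fits the two sigmoid parameters by plain gradient descent; the function names, learning rate, and step count are illustrative assumptions, and a real pipeline would more likely rely on scikit-learn (for example `LogisticRegression` or `CalibratedClassifierCV(method="sigmoid")`).

```python
import math


def logit(p: float, eps: float = 1e-6) -> float:
    """Log-odds of p, clipped away from 0 and 1 to stay finite."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_platt(raw: list[float], labels: list[int],
              lr: float = 0.01, steps: int = 5000) -> tuple[float, float]:
    """Fit p_cal = sigmoid(a * logit(p_raw) + b) by gradient descent on log-loss."""
    z = [logit(p) for p in raw]
    a, b = 1.0, 0.0  # start at the identity mapping
    n = len(z)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for zi, yi in zip(z, labels):
            err = sigmoid(a * zi + b) - yi
            grad_a += err * zi
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(p_raw: float, a: float, b: float) -> float:
    """Map a raw probability to its calibrated value."""
    return sigmoid(a * logit(p_raw) + b)


def brier(probs: list[float], labels: list[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)
```

For an overconfident model that outputs 0.99 but is right only three times out of four, the fitted slope `a` drops below 1 and pulls the calibrated score toward 0.75, which lowers the Brier score on the same held-out labels.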
Feature Engineering: V3 Feature Groups
The V3 feature set comprises 80+ features organized into four groups designed to capture the multiple dimensions through which dark patterns manifest:
Text Signals
Lexical and semantic features targeting manipulative language: urgency phrase detection (countdown timers, deadline language), scarcity claims ("only N left," "selling fast"), social proof language ("X people viewing," "bestseller"), guilt-framing in decline buttons, and promotional/discount language. Phrase lists are curated from academic dark-pattern taxonomies and augmented with real-world examples collected from the capture pipeline.
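A rough sketch of how such phrase detectors might be turned into count features follows. The regular expressions are illustrative stand-ins, not the curated phrase lists, and the feature names are hypothetical.

```python
import re

# Illustrative patterns only; the curated phrase lists are not reproduced here.
SIGNALS = {
    "urgency": re.compile(
        r"\b(hurry|ends (today|soon|in \d+)|last chance|limited time)\b", re.I),
    "scarcity": re.compile(r"\b(only \d+ left|selling fast|almost gone)\b", re.I),
    "social_proof": re.compile(r"\b(\d+ people (are )?viewing|bestseller)\b", re.I),
    "guilt": re.compile(r"\bno thanks?,? i\b", re.I),
}


def text_signal_features(labels: list[str]) -> dict[str, int]:
    """Count, per signal, how many option labels match at least once."""
    return {
        name: sum(1 for text in labels if pattern.search(text))
        for name, pattern in SIGNALS.items()
    }


features = text_signal_features([
    "Only 3 left in stock!",
    "No thanks, I like paying full price",
    "12 people are viewing this",
])
```

Counting matching labels rather than raw matches keeps a single repetitive banner from dominating the feature value.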
CSS Aggregates
Visual asymmetry is a defining characteristic of many dark patterns: accept buttons are rendered prominently while decline paths are styled as tiny gray text. CSS features include font-size and font-weight per option, opacity and visibility flags, low-contrast and gray-text counts, z-index and position: absolute outlier detection, and button sizing differentials between accept and reject paths.
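As an illustration of the sizing and styling differentials, the sketch below compares the computed styles of an accept option and a decline option. The dict keys and feature names are hypothetical, and the real feature set covers more properties than these four.

```python
def css_asymmetry_features(accept: dict, decline: dict) -> dict[str, float]:
    """Compare computed styles of the accept and decline options.

    Each argument is a dict with hypothetical keys: font_size_px,
    font_weight, opacity, width_px, height_px.
    """
    accept_area = accept["width_px"] * accept["height_px"]
    decline_area = decline["width_px"] * decline["height_px"]
    return {
        # Values above 1 mean the accept path is rendered larger.
        "font_size_ratio": accept["font_size_px"] / max(decline["font_size_px"], 1.0),
        "font_weight_diff": accept["font_weight"] - decline["font_weight"],
        "opacity_diff": accept["opacity"] - decline["opacity"],
        "area_ratio": accept_area / max(decline_area, 1.0),
    }


asymmetry = css_asymmetry_features(
    {"font_size_px": 18, "font_weight": 700, "opacity": 1.0,
     "width_px": 200, "height_px": 40},
    {"font_size_px": 11, "font_weight": 400, "opacity": 0.5,
     "width_px": 80, "height_px": 20},
)
```

Ratios and differences are used instead of absolute values so that the features stay comparable across sites with different base typography.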
Structural Features
DOM and Accessibility Tree structural features: AX subtree size, option count, flag label lengths, dialog and modal detection, duplicate container counts, and nesting depth. These capture patterns like hidden cancel links buried deep in a modal hierarchy or privacy settings buried behind multiple confirmation steps.
Embedding Features
Semantic features derived from MiniLM embeddings: cosine similarity between option labels (high similarity may indicate a trick question), accept-versus-decline role scores, choice asymmetry signals, and cross-option semantic distance. Embedding features provide robustness against paraphrase attacks that evade keyword-based detection.
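The cross-option similarity features could be computed along these lines. The sketch assumes each option label has already been embedded into a vector; the feature names `max_pair_sim` and `mean_pair_distance` are hypothetical.

```python
import math
from itertools import combinations


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def option_similarity_features(embeddings: list[list[float]]) -> dict[str, float]:
    """Summarize pairwise cosine similarity across a choice set's option embeddings."""
    sims = [cosine(u, v) for u, v in combinations(embeddings, 2)]
    if not sims:  # fewer than two options: nothing to compare
        return {"max_pair_sim": 0.0, "mean_pair_distance": 0.0}
    return {
        "max_pair_sim": max(sims),
        "mean_pair_distance": 1.0 - sum(sims) / len(sims),
    }
```

A high `max_pair_sim` flags options that read almost identically, which is the situation the text associates with trick questions.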
Dark Pattern Taxonomy
The classifier targets eight classes drawn from established academic taxonomies (Gray et al. 2018, Mathur et al. 2019) and refined based on what is detectable from client-side structure:
| Pattern | Risk | Description |
|---|---|---|
| Confirm Shaming | High | Decline buttons use guilt-inducing language to pressure acceptance. |
| Urgency | High | Artificial deadlines or countdown timers rush decision-making. |
| Scarcity | High | Fabricated low-stock or high-demand claims pressure immediate action. |
| Privacy Zuckering | High | Interfaces obscure privacy controls or defaults to maximize data collection beyond user intent. |
| Asymmetric Choice | Medium | Accept path is visually prominent; reject path is styled to be ignored. |
| Trick Question | Medium | Double-negative or confusing phrasing makes the correct opt-out action unclear. |
| Misdirection | Medium | Visual emphasis steers users away from their preferred option. |
| Social Proof | Low | Unverifiable peer-pressure claims nudge decisions through false consensus. |
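One plausible way to turn per-pattern scores into a single page risk level is to take the highest risk tier among the flagged patterns. The sketch below uses the tiers from the table; the aggregation rule, the snake_case pattern keys, and the 0.5 display threshold are assumptions, not the extension's documented behavior.

```python
# Risk tiers copied from the taxonomy table; keys are hypothetical identifiers.
RISK_BY_PATTERN = {
    "confirm_shaming": "high", "urgency": "high", "scarcity": "high",
    "privacy_zuckering": "high", "asymmetric_choice": "medium",
    "trick_question": "medium", "misdirection": "medium", "social_proof": "low",
}
RISK_ORDER = {"none": 0, "low": 1, "medium": 2, "high": 3}


def page_risk(scores: dict[str, float], threshold: float = 0.5) -> str:
    """Highest risk tier among patterns whose calibrated score meets the threshold."""
    flagged = [RISK_BY_PATTERN[p] for p, s in scores.items() if s >= threshold]
    return max(flagged, key=RISK_ORDER.__getitem__, default="none")
```

For example, a page with a confident social-proof hit and a confident misdirection hit, but only a weak urgency score, would be reported as medium risk under this rule.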
Key Technical Achievements
Fully In-Browser First Stage
Running MiniLM via ONNX Runtime WebAssembly in a Chrome MV3 extension is technically non-trivial: the model is approximately 23 MB and must execute without blocking the main thread or requiring remote computation. We use a dedicated offscreen document and message-passing to perform embedding in a background context, completing per-choice-set embedding in under 150 ms on representative hardware.
Gatekeeper Efficiency
The dual noise-filter and cosine-similarity gate reduces server calls by approximately 80% compared to a naive all-elements approach, enabling the extension to run unobtrusively on content-rich pages without perceptible latency or privacy leakage.
Calibrated Probability Outputs
Raw XGBoost outputs separate the classes well but cannot be read reliably as probabilities. Sigmoid (Platt) calibration was applied post-training, validated against a golden set of real labeled choice sets, and confirmed to reduce Brier score and log-loss across all eight pattern classes. Calibrated scores are displayed in the extension popup to give users meaningful confidence context.
Production Deployment
The classification API is deployed at darkpatterns.duckdns.org behind HTTPS. The Flask server handles concurrent requests from multiple active extension users, with average response latency under 200 ms for a full scoring request.
Technology Stack
| Component | Technologies |
|---|---|
| Extension (Client) | Chrome MV3, JavaScript (ES2022), WebExtensions API, Offscreen Documents API |
| ML Runtime (Client) | ONNX Runtime Web (WebAssembly), all-MiniLM-L6-v2 sentence transformer |
| Classification (Server) | XGBoost, scikit-learn (calibration, preprocessing), pandas, numpy |
| API & Backend | Flask, Python 3.11, SQLAlchemy, PostgreSQL |
| Data Infrastructure | Playwright-based capture pipeline, custom choice-set labeling toolchain, calibrated synthetic data generation |
| Deployment | Docker, HTTPS reverse proxy, DuckDNS dynamic DNS |
Impact & Future Work
The Deceptive Pattern Detector demonstrates that robust dark-pattern detection is achievable with a privacy-preserving, low-latency architecture that runs on consumer hardware without requiring page content to leave the browser in raw form. By surfacing plain-English explanations alongside calibrated confidence scores, the extension aims to raise user awareness and support informed consent rather than silently blocking content.
Future work includes expanding the training dataset with additional labeled captures from e-commerce and SaaS checkout flows, improving per-class precision by refining the synthetic augmentation distributions, exploring cross-browser compatibility (Firefox, Edge) through a unified WebExtensions build, and evaluating additional server-side models to improve performance and scalability.
