PII Redaction in Session Replay: Patterns That Work in Production

How to redact PII from rrweb session replays, console logs, and network HAR data — GDPR and CCPA-compliant patterns we ship in production at BugMojo.

BugMojo Team · May 22, 2026 · 7 min read

Key takeaways

  • PII redaction must happen client-side, before capture data leaves the browser. With server-side redaction, the raw personal data has already been transmitted to and processed by your servers.
  • Three independent layers: DOM input masking, text-node regex sweeping, and network payload sanitization.
  • A 12-pattern regex catalog covers ~95% of real-world PII (emails, phones, credit cards, JWTs, auth headers, SSNs, IBANs).
  • The consent flow is a three-state machine — unset / granted / denied — and recording is gated on granted.

What counts as PII in session-replay context

GDPR Article 4 defines personal data as "any information relating to an identified or identifiable natural person." In session-replay context that includes obvious cases (names, emails, government IDs, credit cards) and context-dependent ones (a user ID in a URL, an IP in a console.log, an order number that maps to a customer). Default-deny is safer than default-allow.

The mistake teams make: they redact the obvious PII (the email field, the credit-card input) and miss the implicit PII that flows through console logs and network payloads. A typical SaaS app logs the authenticated user's id and email on every API call for debugging — that's PII bleeding into your replay. Same for Authorization: Bearer eyJ... headers, OAuth tokens in URLs, and webhook payloads with subscriber emails.

Three layers of redaction (DOM inputs, text nodes, network)

A production-grade redaction pipeline has three independent layers. Each layer runs in the user's browser before data is buffered for upload. If any one layer leaks, the others act as defense in depth. The layers are: DOM input masking (handled by rrweb), text-node regex sweeping (custom), and network payload sanitization (custom).

DOM layer: rrweb input masking

rrweb's built-in masking handles inputs cleanly. Set maskInputOptions to default-mask everything and selectively allow the inputs you actually want to capture:

import * as rrweb from 'rrweb';

rrweb.record({
  emit: bufferEvent,
  // maskAllInputs: true would mask every input type and make maskInputOptions
  // redundant, so it is left unset and each type is listed explicitly.
  maskInputOptions: {
    password: true,
    email: true,
    tel: true,
    text: true,           // mask plain text inputs by default
    textarea: true,
    color: false,         // safe to capture
    date: false,
    number: false,
  },
  // CSS selector for regions to exclude from the recording entirely
  blockSelector: '.rr-block, [data-rr-block]',
  // CSS selector for elements whose rendered text should be masked
  // (applies to text nodes, not to input values)
  maskTextSelector: '.rr-mask, [data-rr-mask]',
});

Text-node layer: regex sweep

Inputs aren't the only PII source in the DOM. Marketing pages render the user's name in a greeting, dashboards show order totals with credit-card last-4s, support widgets surface email threads. A text-node regex sweep runs before serialization and replaces matched substrings with placeholders.

Network layer: header + body sanitization

Console and network captures are the leakiest surfaces. Headers like Authorization, Cookie, X-Api-Key, and Set-Cookie should be dropped or hashed entirely. Bodies (JSON, form-data) should be regex-swept the same way text nodes are.
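As a rough illustration of the header and body rules above, here is a minimal sketch. The `CapturedRequest` shape, the `sanitizeRequest` and `redactJson` names, and the `[REDACTED:header]` placeholder are assumptions made for this example, not part of rrweb or any capture library. The `redact` parameter stands in for whatever string sweep you use.

```typescript
// Hypothetical shape for a captured request; adapt to your own capture format.
type CapturedRequest = {
  url: string;
  headers: Record<string, string>;
  body?: string;
};

// Header names listed in the text above, compared case-insensitively.
const SENSITIVE_HEADERS = new Set(['authorization', 'cookie', 'set-cookie', 'x-api-key']);

// Recursively apply `redact` to every string value in a parsed JSON value.
function redactJson(value: unknown, redact: (s: string) => string): unknown {
  if (typeof value === 'string') return redact(value);
  if (Array.isArray(value)) return value.map((v) => redactJson(v, redact));
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      out[k] = redactJson(v, redact);
    }
    return out;
  }
  return value; // numbers, booleans, and null pass through unchanged
}

function sanitizeRequest(
  req: CapturedRequest,
  redact: (s: string) => string,
): CapturedRequest {
  const headers: Record<string, string> = {};
  for (const [name, value] of Object.entries(req.headers)) {
    // Sensitive headers lose their value entirely; others are swept.
    headers[name] = SENSITIVE_HEADERS.has(name.toLowerCase())
      ? '[REDACTED:header]'
      : redact(value);
  }
  let body = req.body;
  if (body !== undefined) {
    try {
      body = JSON.stringify(redactJson(JSON.parse(body), redact));
    } catch {
      body = redact(body); // not JSON: fall back to sweeping the raw string
    }
  }
  return { url: req.url, headers, body };
}
```

This sketch keeps header names and JSON keys as-is and only rewrites values. If your keys can carry personal data, or you need hashing rather than dropping, that would need extra handling.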

A PII regex catalog you can ship today

The 12 regex patterns below cover roughly 95% of real-world PII without producing significant false positives. We compose them into a single sweep function applied to text nodes, console arguments, network bodies, and URL query strings. Credit-card matches are additionally checked with a Luhn checksum inside the sweep function to reduce false hits on order numbers.

const PII_PATTERNS = {
  email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,
  phoneUS: /\b(\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b/g,
  phoneIntl: /(?<!\w)\+\d{1,3}[\s-]?\d{2,4}[\s-]?\d{2,4}[\s-]?\d{2,4}\b/g,  // a leading \b would never match before "+" at the start of a string or after a space
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
  creditCard: /\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b/g,  // matches are Luhn-checked in redactPii
  iban: /\b[A-Z]{2}\d{2}[A-Z0-9]{4}\d{7}([A-Z0-9]?){0,16}\b/g,
  jwt: /\beyJ[a-zA-Z0-9_-]{10,}\.[a-zA-Z0-9_-]{10,}\.[a-zA-Z0-9_-]{10,}\b/g,
  authHeader: /\b(authorization|bearer|x-api-key|x-auth-token)\s*[:=]\s*((bearer|basic)\s+)?\S+/gi,  // also consumes the token that follows a Bearer/Basic scheme
  ipv4: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g,
  awsAccessKey: /\bAKIA[0-9A-Z]{16}\b/g,
  githubToken: /\bghp_[A-Za-z0-9]{36}\b/g,
  stripeKey: /\b(sk|pk)_(test|live)_[A-Za-z0-9]{24,}\b/g,
};

function luhnValid(cc: string): boolean {
  const digits = cc.replace(/\D/g, '').split('').reverse().map(Number);
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    const d = i % 2 === 1 ? digits[i] * 2 : digits[i];
    sum += d > 9 ? d - 9 : d;
  }
  return sum % 10 === 0;
}

export function redactPii(input: string): string {
  let out = input;
  for (const [name, pattern] of Object.entries(PII_PATTERNS)) {
    out = out.replace(pattern, (match) => {
      // Only redact CC if Luhn passes — reduces false positives on order numbers
      if (name === 'creditCard' && !luhnValid(match)) return match;
      return `[REDACTED:${name}]`;
    });
  }
  return out;
}
Warning

Regex is not perfect. Test with a fuzz corpus of known PII strings before shipping. The Luhn check on credit-card matches reduces but doesn't eliminate false positives; some user IDs and order numbers will still trip the pattern. Accept the over-redaction — over-redacting is safer than leaking.

Wiring redaction into rrweb's pipeline

The redaction sweep runs before the event is added to the buffer. rrweb's emit callback fires for every event; we intercept it, run the sweep over any text content in the event, then forward the redacted event to the actual buffer. This keeps redaction in a single chokepoint that's easy to audit and unit-test.

import * as rrweb from 'rrweb';
import { EventType, IncrementalSource } from '@rrweb/types';
import type { eventWithTime } from '@rrweb/types';

rrweb.record({
  emit(event: eventWithTime) {
    const redacted = redactEvent(event);
    captureBuffer.push(redacted);
  },
  maskAllInputs: true,
  // ... other config
});

function redactEvent(event: eventWithTime): eventWithTime {
  // FullSnapshot: sweep text nodes in the serialized tree
  if (event.type === EventType.FullSnapshot) {
    redactSerializedTree(event.data.node);
    return event;
  }
  // IncrementalSnapshot: mutation events carry changed text and newly added nodes
  if (
    event.type === EventType.IncrementalSnapshot &&
    event.data.source === IncrementalSource.Mutation
  ) {
    for (const t of event.data.texts) {
      if (t.value) t.value = redactPii(t.value); // value can be null
    }
    // Nodes added after the initial snapshot carry their own text content
    for (const a of event.data.adds) redactSerializedTree(a.node);
  }
  return event;
}
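The wiring code calls redactSerializedTree without showing it. A minimal sketch of what such a helper could look like follows. The `SerializedNode` type here is a simplified stand-in for rrweb's serialized node types (the real ones have more fields and node kinds), and the sweep function is passed in as a second parameter only to keep the example self-contained. In the pipeline above it would be redactPii.

```typescript
// Simplified stand-in for rrweb's serialized node shape (an assumption for
// this sketch; the real types come from rrweb-snapshot).
type SerializedNode = {
  type: number;
  textContent?: string;
  isStyle?: boolean;
  childNodes?: SerializedNode[];
};

const TEXT_NODE = 3; // numeric value of NodeType.Text in rrweb-snapshot

// Walks the tree depth-first and rewrites text nodes in place.
function redactSerializedTree(
  node: SerializedNode,
  redact: (s: string) => string,
): void {
  // Skip stylesheet text: it is CSS, not user-visible content.
  if (node.type === TEXT_NODE && typeof node.textContent === 'string' && !node.isStyle) {
    node.textContent = redact(node.textContent);
  }
  for (const child of node.childNodes ?? []) {
    redactSerializedTree(child, redact);
  }
}
```

This sketch only touches text nodes. Attribute values such as href, title, or placeholder can also carry PII and would need a similar pass.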

GDPR consent flow architecture

Consent has three states: unset (first visit, recording disabled), granted (user opted in, recording active), and denied (user opted out, no further prompts). The state lives in chrome.storage.local for extensions, or localStorage for in-page SDKs. Recording is gated on state === 'granted'. The dialog must appear before any capture data is buffered.

type ConsentState = 'unset' | 'granted' | 'denied';

async function getConsent(): Promise<ConsentState> {
  const { consent } = await chrome.storage.local.get('consent');
  return consent ?? 'unset';
}

async function ensureConsent(): Promise<boolean> {
  const state = await getConsent();
  if (state === 'granted') return true;
  if (state === 'denied') return false;
  // unset — show the dialog, await the user's choice
  const choice = await showConsentDialog();
  await chrome.storage.local.set({ consent: choice });
  return choice === 'granted';
}

The consent dialog should disclose what's recorded (DOM, inputs masked, console, network with PII redacted), how long it's retained, who has access, and a link to the privacy policy. The dialog must be explicitly dismissible without granting consent — the user must be able to say "no" with a single click.
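For in-page SDKs that keep the state in localStorage, the same three-state gate could look roughly like the sketch below. The `StorageLike` interface and the `readConsent`, `writeConsent`, and `canRecord` names are hypothetical, introduced only for this example. `window.localStorage` satisfies the interface, and the key name mirrors the 'consent' key used in the extension code.

```typescript
type ConsentState = 'unset' | 'granted' | 'denied';

// Subset of the Web Storage API used here; window.localStorage satisfies it.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const CONSENT_KEY = 'consent';

function readConsent(storage: StorageLike): ConsentState {
  const raw = storage.getItem(CONSENT_KEY);
  // Anything missing or unexpected is treated as unset, so recording stays off.
  return raw === 'granted' || raw === 'denied' ? raw : 'unset';
}

function writeConsent(storage: StorageLike, choice: 'granted' | 'denied'): void {
  storage.setItem(CONSENT_KEY, choice);
}

// Recording is gated on the granted state only.
function canRecord(storage: StorageLike): boolean {
  return readConsent(storage) === 'granted';
}
```

Treating corrupted or unknown values as unset is a deliberate choice in this sketch: a storage glitch re-prompts the user instead of silently enabling capture.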

Pitfalls we hit in production

Three classes of redaction bugs cost us real outages in early development. First, async input events fired before our mask propagated to the DOM, leaking 1-2 characters per keystroke for a 50ms window. Second, our credit-card regex matched a 16-digit order number that happened to pass Luhn. Third, gzip-compressed network bodies bypassed regex sweeps because we forgot to decompress before scanning.

  • Async masking race. Solution: capture inputs synchronously in the rrweb event handler rather than relying on a downstream DOM mutation.
  • False-positive credit-card matches. Solution: combine Luhn with a context-aware filter (only match in input values and known billing-field selectors).
  • Compressed body leaks. Solution: decompress before redaction and recompress after, or drop compressed payloads from the capture entirely, using the Content-Encoding header to detect them.

Auditing your own redaction

A fuzz test belongs in CI. Construct a corpus of ~50 known-bad strings (an email, a credit card, an SSN, a JWT, an AWS key, etc.), inject them into a test page, trigger a capture, then assert that none of those exact strings appear in the buffered events. Run this on every PR.

The fuzz test should also include false-positive cases — a 16-digit order ID, an internal username that contains an @ symbol, a UUID that looks like a JWT prefix. If those get redacted, the dev experience suffers; if PII gets through, the company has a breach.
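The two assertions described above can be sketched as a pair of small helpers. The `findLeaks` and `findOverRedacted` names and the corpus entries are illustrative assumptions. How you inject the strings into a test page and collect the buffered events depends on your test harness and is not shown.

```typescript
// Returns every corpus string that still appears verbatim in the serialized events.
function findLeaks(corpus: readonly string[], events: readonly unknown[]): string[] {
  const haystack = JSON.stringify(events);
  // Compare against the JSON-escaped form so quotes and backslashes still match.
  return corpus.filter((s) => haystack.includes(JSON.stringify(s).slice(1, -1)));
}

// Returns every benign string that was expected to survive but is missing.
function findOverRedacted(benign: readonly string[], events: readonly unknown[]): string[] {
  const haystack = JSON.stringify(events);
  return benign.filter((s) => !haystack.includes(JSON.stringify(s).slice(1, -1)));
}

// Illustrative corpus entries; a real corpus would have around 50 of each kind.
const KNOWN_BAD = ['jane.doe@example.com', '123-45-6789', 'AKIAIOSFODNN7EXAMPLE'];
const KNOWN_BENIGN = ['Order 1234567890123456'];
```

In a CI test you would then assert that `findLeaks(KNOWN_BAD, bufferedEvents)` is empty, and report `findOverRedacted(KNOWN_BENIGN, bufferedEvents)` as a warning or a failure depending on how much over-redaction you tolerate.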

Common mistakes

  • Doing redaction server-side. Once data hits your wire, GDPR considers it processed.
  • No consent dialog. "Implied consent" by visiting the site is not GDPR consent under Article 7.
  • Forgetting query-string secrets. Tokens in URLs are PII too — sweep location.search and any captured URLs.
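  • Trusting rrweb defaults. By default rrweb masks only password inputs: maskAllInputs is off unless you set it, and rendered text is masked only where it matches maskTextClass or maskTextSelector. Configure input masking and text masking explicitly.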
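For the query-string item in the list above, a sweep over captured URLs could look roughly like this. The `redactUrl` name, the `SECRET_PARAMS` list, and the `[REDACTED]` placeholder are assumptions for this sketch. The `redact` parameter stands in for your string sweep.

```typescript
// Hypothetical list of query parameters whose values are always dropped.
const SECRET_PARAMS = new Set(['token', 'access_token', 'id_token', 'code', 'api_key']);

function redactUrl(rawUrl: string, redact: (s: string) => string): string {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return redact(rawUrl); // relative or malformed: sweep the raw string instead
  }
  const cleaned = new URLSearchParams();
  url.searchParams.forEach((value, key) => {
    // Known secret parameters lose their value; everything else is swept.
    cleaned.append(key, SECRET_PARAMS.has(key.toLowerCase()) ? '[REDACTED]' : redact(value));
  });
  url.search = cleaned.toString();
  return url.toString();
}
```

The placeholder is percent-encoded when the URL is serialized, so compare parsed parameter values rather than raw strings in tests. The sketch leaves the path and the fragment untouched; both can carry identifiers or tokens and may need their own handling.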

Next steps

The rrweb privacy recipes doc is the authoritative reference for the masking config. For the regex catalog, the Detect Secrets project maintains a battle-tested set of patterns broader than the 12 above.

Want to see this pipeline shipped in production? BugMojo implements all three redaction layers, the consent dialog, and the fuzz tests — open-source the patterns or use the Chrome extension out of the box.

Sources

  1. GDPR Article 4 — Definitions — gdpr-info.eu (2018)
  2. CCPA — California Consumer Privacy Act — California Attorney General (2024)
  3. rrweb input masking documentation — rrweb-io (2026)