How to Redact Cardholder Data from Call Recordings Stored in AWS S3
by Ali Rind, last updated May 5, 2026

Modern contact center platforms drop call recordings into AWS S3 by default. Five9, Genesys Cloud, NICE CXone, RingCentral, and Amazon Connect all use S3 as the system of record for the recordings their softphones produce. The architecture is convenient until a customer reads a card number aloud during a refund call. From that point forward, the bucket holding that recording is in PCI scope, whether or not the team running it knows.
PCI scope discovery usually happens during a QSA assessment. The contact center has been operating for years. The S3 bucket retention is set to seven years for QA and dispute resolution. The PAN, CVV, and expiration dates spoken during thousands of calls are sitting unredacted in tens of thousands of recordings. The remediation work is now both a backfill project on the existing archive and an ongoing redaction pipeline on new recordings. This post covers the architectural choices that determine how that work runs.
Three architectures for redacting S3 recordings
Pick the architecture that matches the data sensitivity and compliance posture, not the one with the fewest moving parts.
Architecture A: Download, redact, re-upload
The default for desktop tools and small projects. A user pulls the recording from S3 to a local workstation, runs it through a redaction tool, and uploads the redacted copy back. AWS data egress charges apply on the download. The audit trail is whatever the desktop tool produces. The orchestration is manual. This pattern fits one-off compliance projects in low single-digit volumes. It does not fit ongoing PCI compliance for a working contact center, and at backfill scale it becomes operationally expensive on egress alone.
Architecture B: API-triggered same-region redaction
New recordings land in S3. An S3 event notification triggers redaction (typically through Lambda or a queue worker) that calls the redaction platform's API with a reference to the S3 object. The platform processes the recording inside the same AWS region, reads the source from S3, writes a redacted copy to a designated output bucket, and returns the audit log to the caller. No egress out of the region. Automated end to end. The audit trail is generated as part of the workflow.
VIDIZMO Redactor's S3 connector supports this pattern as an out-of-the-box integration. Combined with the Redactor API and webhook support, the architecture wires up cleanly inside AWS. Most ongoing PCI compliance pipelines for contact center recordings end up here.
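The event-driven wiring in Architecture B can be sketched as a small Lambda handler. This is an illustrative sketch, not the Redactor connector itself: the `REDACTOR_API_URL` endpoint, the payload fields, and the rule names (`PAN`, `CVV`, `EXPIRY`) are assumptions standing in for whatever schema the redaction platform's API actually defines. The key property to preserve is that the job carries only a reference to the S3 object, so the audio never leaves the region.

```python
import json
import os
import urllib.request

# Hypothetical endpoint -- substitute the real redaction API route.
REDACTOR_API_URL = os.environ.get(
    "REDACTOR_API_URL", "https://redactor.example.com/api/jobs"
)

def build_job_payload(record):
    """Turn one S3 event record into a redaction job request.

    The request contains only a reference to the object; the recording
    itself is read by the redaction service in-region, never downloaded
    through this function.
    """
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]
    return {
        "source": {"bucket": bucket, "key": key},
        "output": {"bucket": f"{bucket}-redacted", "key": key},  # assumed naming
        "rules": ["PAN", "CVV", "EXPIRY"],  # assumed rule identifiers
    }

def handler(event, context):
    """Lambda entry point wired to the bucket's ObjectCreated notification."""
    for record in event.get("Records", []):
        payload = build_job_payload(record)
        req = urllib.request.Request(
            REDACTOR_API_URL,
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        # Fire-and-forget: completion and the audit log arrive via webhook.
        urllib.request.urlopen(req)
```

The same payload builder works unchanged if the trigger is an SQS queue worker instead of Lambda; only the entry point differs.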
Architecture C: VPC-deployed redaction inside the customer's AWS account
For organizations whose compliance posture forbids data leaving their AWS environment, the redaction platform deploys inside the customer's own VPC. The recordings never leave the customer's AWS boundary. This pattern fits regulated workloads where data residency, contractual restrictions, or regulator preference point to in-account processing. It is also the pattern for AWS GovCloud workloads where federal data handling requirements apply.
PCI DSS requirements your architecture must satisfy
The architecture choice is downstream of the PCI DSS requirements that apply to the recordings.
- Requirement 3.2.1: Sensitive authentication data (CVV, magnetic stripe, PIN) cannot be retained after authorization. CVVs spoken during calls fall under this rule. Redaction has to remove CVV before any long-term storage.
- Requirement 3.5: Stored PAN must be rendered unreadable. For audio, that means permanent mute or bleep on the timestamps where PAN is spoken.
- Requirement 7: Access to cardholder data must be restricted by business need to know. Bucket policies and IAM roles tightly scope access to the unredacted source bucket; broader access stays on the redacted output bucket only.
- Requirement 10: Access must be logged. The redaction platform's audit log plus AWS S3 server access logging on both buckets together form the audit evidence a QSA will request.
The full PCI DSS v4 documentation covers each requirement in operational detail. For an audio-specific overview, see the PCI DSS audio redaction page.
How to backfill historical call recordings without breaking compliance
The backfill problem is the harder of the two. New recordings can flow through Architecture B from day one. The existing archive sits in S3 (and often Glacier for older recordings) at scale. Handling it requires three steps.
Sample first: Pull 100 to 200 recordings across the archive. Run them through the pipeline. Count PAN and CVV occurrences. The detection rate gives a defensible scope estimate and surfaces format variations (deprecated codecs, unusual sample rates) that need workflow adjustments before bulk processing.
Segment the archive: Divide by business unit, call type, date range, or retention tier. Each segment may have different urgency and different stakeholders.
Sequence the work: Newest-first prioritizes recordings most likely to be subject to active QA or audit. Retention-driven prioritizes older recordings approaching deletion under retention policy.
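The sample-and-sequence steps above can be sketched in a few lines. The key layout (`recordings/YYYY/MM/DD/...`) is an assumption; many contact center platforms date-prefix keys this way, but adjust the sort to the archive's actual naming. The seeded sample keeps the pilot reproducible, which matters when the detection-rate estimate goes into the scope document.

```python
import random

def pilot_sample(keys, n=150, seed=42):
    """Draw a reproducible pilot sample of recordings across the archive.

    150 sits inside the 100-200 range suggested for the pilot batch;
    the seed makes the sample repeatable for the scope estimate.
    """
    rng = random.Random(seed)
    return rng.sample(keys, min(n, len(keys)))

def newest_first(keys):
    """Sequence the backfill newest-first, assuming zero-padded
    date-prefixed keys like 'recordings/2026/05/05/call-123.wav'
    so lexicographic order matches chronological order."""
    return sorted(keys, reverse=True)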
VIDIZMO Redactor has been tested on bulk redaction at over 1.1 million recordings, putting most contact center backfills within capacity. For organizations that prefer not to run the backfill internally, Redaction-as-a-Service handles the operational work with dual QA review and chain of custody documentation.
See how VIDIZMO Redactor handles your S3 call recordings
Whether you're standing up a new redaction pipeline or backfilling years of archived calls, VIDIZMO Redactor's S3 connector, API, and bulk processing engine handle the architecture decisions covered above. Start a free trial on your own recordings, or talk to an expert to scope your backfill project.
Audit trail requirements for QSA review
A PCI QSA will ask three questions about the redaction pipeline.
What was redacted, and when?
The redaction platform's audit log answers this with operator or service principal, timestamp, action type, and the detection rule that fired. VIDIZMO Redactor stores audit logs in tamper-proof storage.
Who could access the unredacted version?
IAM role policies on the source bucket plus S3 server access logs answer this. The combination shows which identities had read access and which exercised it.
How do you verify it worked?
Sampling-based verification. A random sample of redacted recordings is reviewed by a human operator to confirm PAN and CVV occurrences were caught. VIDIZMO Redactor supports configurable confidence thresholds (25 to 90 percent) and a human-in-the-loop review step that produces this verification artifact as part of the standard workflow.
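A minimal sketch of the verification sampler, under stated assumptions: the 5 percent rate and the 25-recording floor are illustrative defaults, not figures PCI DSS mandates, and the sample size should be agreed with the QSA. The function just selects which redacted recordings go to the human reviewer.

```python
import random

def verification_sample(redacted_keys, rate=0.05, floor=25, seed=7):
    """Pick a random QA sample of redacted recordings for human review.

    rate and floor are illustrative defaults; agree the actual
    sampling plan with your assessor. Seeding makes the sample
    reproducible for the audit record.
    """
    n = max(floor, int(len(redacted_keys) * rate))
    rng = random.Random(seed)
    return rng.sample(redacted_keys, min(n, len(redacted_keys)))
```

The list this returns, together with each reviewer's pass/fail notes, is the verification artifact a QSA will ask to see.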
S3 redaction pipeline implementation checklist
A short checklist of the work from the cloud team's point of view:
- Pilot batch (100 to 200 recordings) to confirm detection accuracy
- Validate redaction output against PCI DSS Requirements 3.2.1 and 3.5 by hand
- Set up IAM role for the redaction service with minimum viable permissions
- Decide on output bucket strategy (separate bucket, versioning, encryption)
- Wire up S3 event triggers or scheduled batch for ongoing recordings
- Plan the historical archive backfill sequencing
- Update retention policy to align unredacted source retention with PCI scope
- Document the pipeline for QSA review (architecture diagram, IAM policies, audit log samples, verification procedure)
People Also Ask
Are call recordings stored in S3 in scope for PCI DSS?
Yes, when those recordings contain cardholder data and are retained beyond authorization. PCI DSS 4.0 prohibits storing sensitive authentication data (CVV, full track, PIN) after authorization regardless of encryption (Requirement 3.2.1) and requires PAN to be rendered unreadable when stored (Requirement 3.5). Call recordings retained for QA, dispute resolution, training, or compliance fall under both. Redaction is the practical mechanism for meeting these requirements when the recordings cannot simply be deleted.
Can recordings be redacted without downloading them out of AWS?
Yes, with an architecture that runs redaction in the same AWS region as the source bucket. The redaction service reads the source object from S3, processes it in-region, and writes redacted output back to a designated bucket. No egress, no workstation download, no manual orchestration. VIDIZMO Redactor's S3 connector supports this pattern through its REST API and webhook integration.
Should the unredacted source recording be deleted after redaction?
It depends on policy. Some organizations delete the unredacted source immediately after the redacted output is verified. Others retain the source under tighter access controls (separate bucket, restricted IAM, dedicated KMS key) for a defined window before deletion. The choice should be documented in retention policy before the pipeline goes live, because changing it mid-project creates audit complexity.
About the Author
Ali Rind
Ali Rind is a Product Marketing Executive at VIDIZMO, where he focuses on digital evidence management, AI redaction, and enterprise video technology. He closely follows how law enforcement agencies, public safety organizations, and government bodies manage and act on video evidence, translating those insights into clear, practical content. Ali writes across Digital Evidence Management System, Redactor, and Intelligence Hub products, covering everything from compliance challenges to real-world deployment across federal, state, and commercial markets.