PCI Compliance for AI Companies: What Developers Need to Know

If your AI application accepts payments, you're subject to PCI DSS (Payment Card Industry Data Security Standard). There's no exemption for startups, no grace period for MVPs, and no exception because you "only use Stripe." PCI applies to every entity that stores, processes, or transmits cardholder data — and the definition of "transmits" is broader than most developers realize.

The good news: for most AI companies, PCI compliance is manageable. The bad news: AI introduces compliance risks that don't exist for traditional software companies — risks around training data, agent access to payment information, and logging practices that can inadvertently capture card data. This guide covers what you need to know.

What Is PCI DSS and Why It Applies to AI Companies

PCI DSS is a set of security standards created by Visa, Mastercard, American Express, Discover, and JCB. If you accept cards from any of these networks, you must comply. The standard covers how you handle cardholder data: primary account numbers (PANs), expiration dates, CVVs, and cardholder names.

"But I use a payment processor that handles all the card data," you might say. True — and using a processor significantly reduces your compliance scope. But it doesn't eliminate it. Even if card numbers never touch your servers, PCI still applies to the web pages that host the payment form, the servers that redirect to the processor, and any system that could influence the payment transaction. You're always in scope to some degree.

For AI companies specifically, PCI matters because:

AI products process high-value transactions. Enterprise AI tools bill thousands per month. A data breach affecting these accounts has outsized financial impact.
AI systems interact with data in ways traditional software doesn't. A model trained on customer support logs might inadvertently learn card numbers. An agent with API access might programmatically retrieve payment details. These are new attack surfaces.
Compliance failures have real consequences. Fines range from $5,000 to $100,000 per month, and your acquiring bank can terminate your merchant account — which means you can't accept card payments at all.

The 4 PCI Compliance Levels

PCI compliance requirements scale with your transaction volume. Most AI startups are Level 4 (the lightest requirements), but growth can push you into higher levels faster than you expect.

Level	Annual Transactions	Requirements	Typical For
Level 4	Under 20,000 e-commerce or under 1M total	SAQ + quarterly vulnerability scan	Early-stage AI startups
Level 3	20,000 to 1M e-commerce	SAQ + quarterly vulnerability scan	Growing SaaS with moderate billing
Level 2	1M to 6M total	SAQ + quarterly vulnerability scan by ASV	Scaled AI platforms
Level 1	Over 6M total	Annual on-site audit by QSA + quarterly ASV scan	Enterprise AI infrastructure

A key nuance: "e-commerce transactions" means card-not-present (online, API-based). Since AI products are almost exclusively card-not-present, you hit the lower e-commerce thresholds, not the higher "total transaction" thresholds. An AI startup doing 25,000 API-billed charges per year is Level 3, not Level 4.
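The threshold logic in the table can be sketched in a few lines. This is a rough illustration only: the function name `pci_level` is hypothetical, the cutoffs are taken from the table above, and in practice your acquiring bank or the card brands assign your level.

```python
def pci_level(ecommerce_txns: int, total_txns: int) -> int:
    """Estimate the merchant level (1-4) from annual transaction counts.

    Thresholds mirror the table above; e-commerce transactions are
    assumed to be included in total_txns.
    """
    if total_txns > 6_000_000:
        return 1
    if total_txns >= 1_000_000:
        return 2
    if ecommerce_txns >= 20_000:
        return 3
    return 4


# The example from the text: 25,000 API-billed (card-not-present) charges.
startup_level = pci_level(ecommerce_txns=25_000, total_txns=25_000)
```

Because nearly every charge for an AI product is card-not-present, the `ecommerce_txns` check is the one that moves a startup from Level 4 to Level 3.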

SAQ Types Explained

The Self-Assessment Questionnaire (SAQ) is how most AI companies demonstrate compliance. But there are multiple SAQ types, and using the wrong one is a compliance violation in itself.

SAQ A — Fully Outsourced

For merchants who fully outsource all payment processing. Card data is entered on the processor's page (hosted checkout), and your systems never touch, see, or influence the payment form. This is the shortest SAQ — about 22 questions.

Applies if: You redirect users to a hosted checkout page (like Stripe Checkout or PayPal) and they return to your site after payment. Your site doesn't host any payment form elements.

SAQ A-EP — Partially Outsourced (E-Commerce)

For merchants whose website hosts payment form elements (even via iframe) but doesn't directly receive card data. The processor's JavaScript or iframe captures the card details, but your page is the context in which the form renders.

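Applies if: You use an embedded payment form on your own page, and your page controls how that form is built or delivered. Card data goes directly from the browser to the processor, so your server never sees it, but your page could be compromised to capture the data (e.g., via XSS). About 139 questions. Which SAQ a specific component such as Stripe Elements or Braintree Drop-in falls under depends on how it is delivered: processors generally treat fields served entirely inside their own iframes as eligible for SAQ A, so confirm the classification with your processor and acquirer.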

SAQ D — Full Scope

For merchants who directly handle card data on their servers. If raw card numbers pass through your backend at any point — even briefly, even in memory — you're SAQ D. This is the most comprehensive assessment, with 329 questions and significant infrastructure requirements.

Applies if: Your server receives, processes, or stores raw card numbers. Almost no AI startup should be in this category. If you are, you're probably doing something wrong architecturally.

Tokenization is not encryption. Encryption transforms card data into ciphertext that can be reversed with a key. If you hold the decryption key, you're still in scope for SAQ D. Tokenization replaces card data with a meaningless token that has no mathematical relationship to the original number. The token is useless without the processor's systems. This is why tokenization reduces your PCI scope and encryption doesn't.

How Tokenization Reduces Your PCI Scope

Tokenization is the single most important architectural decision for PCI compliance. Here's how it works in practice:

The user enters their card number in a form element hosted by your processor (an iframe, a JavaScript component, or a mobile SDK).
The card data goes directly from the user's browser to the processor's servers. It never touches your infrastructure.
The processor returns a token — a random string like tok_a1b2c3d4e5 — that represents that card.
You store the token in your database and use it for all future charges. Your systems never hold, see, or transmit the actual card number.

With tokenization, your PCI scope drops from SAQ D (329 questions) to SAQ A-EP (139 questions) or even SAQ A (22 questions), depending on how you render the payment form. The token is worthless outside the context of your processor relationship: an attacker who steals your token database gets no card numbers and nothing that can be used with another merchant or processor.
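The flow above can be illustrated with a toy model. This is a sketch, not a real processor integration: `ToyTokenVault` is a hypothetical stand-in for the processor's side, and the point is only that the token is random, so nothing about the card number can be derived from it.

```python
import secrets


class ToyTokenVault:
    """Hypothetical stand-in for the processor. A merchant never runs this."""

    def __init__(self) -> None:
        self._pan_by_token: dict[str, str] = {}

    def tokenize(self, pan: str) -> str:
        # The token is random: it has no mathematical link to the PAN.
        token = "tok_" + secrets.token_hex(8)
        self._pan_by_token[token] = pan
        return token

    def charge(self, token: str, amount_cents: int) -> dict:
        # Only the vault can map a token back to the card it represents.
        pan = self._pan_by_token[token]
        return {"status": "succeeded", "amount_cents": amount_cents, "last4": pan[-4:]}


vault = ToyTokenVault()

# In a real integration this call happens browser-to-processor.
token = vault.tokenize("4242424242424242")

# The merchant side stores and uses only the token.
merchant_db = {"customer_123": token}
receipt = vault.charge(merchant_db["customer_123"], amount_cents=4900)
```

The merchant's database holds `tok_...` strings only; the mapping back to a card lives in the vault.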

AI-Specific Compliance Considerations

Here's where PCI compliance for AI companies diverges from the standard guidance. These are the risks that your QSA (Qualified Security Assessor) might not think to ask about, but that can break your compliance nonetheless.

Card Data in Training Sets

If your AI model is trained on customer data — support tickets, chat logs, transaction records, emails — there's a risk that card numbers, expiration dates, or CVVs are embedded in that training data. A model that has "learned" card numbers from training data is technically storing cardholder data, even though it's encoded in model weights rather than a database row.

The mitigation: scrub all training data for PCI-relevant patterns before it enters your training pipeline. Use regex patterns to detect and redact 13-19 digit sequences (the range of valid PAN lengths, with or without space or hyphen separators) that pass Luhn validation, 3-4 digit CVV patterns adjacent to card numbers, and expiration dates in MM/YY format. This should be an automated step in your data pipeline, not a manual review.
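A minimal sketch of the PAN part of that scrubbing step follows. The names `scrub_pans` and `luhn_valid` are illustrative, the regex matches runs of 13 to 19 digits with optional single space or hyphen separators (an assumption covering common PAN formats), and CVV and expiration-date patterns are left out for brevity.

```python
import re

# 13-19 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scrub_pans(text: str) -> str:
    """Replace Luhn-valid card-number candidates with a placeholder."""

    def _redact(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        return "[REDACTED_PAN]" if luhn_valid(digits) else match.group()

    return CANDIDATE.sub(_redact, text)


ticket = "My card 4242 4242 4242 4242 was declined, order 1234567890123456."
clean_ticket = scrub_pans(ticket)
```

The Luhn check keeps order IDs and other long numbers intact while still catching real card numbers; a production pipeline would likely need to handle more separators and formats than this.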

Agent Access to PANs

If you're building AI agents that handle payments, the agent's context window might include card data. A customer support agent that can "look up the last four digits of your card" needs careful scoping — does the underlying API return only the last four, or does it return the full PAN with masking applied at the display layer? If the full PAN is in the API response, even if the agent's UI only shows the last four, the agent's memory/context has the full number.

Design agent APIs so they never return full PANs. The truncated version (last four digits) should be what the API returns, not what the display layer masks.
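One way to enforce this is to build the agent-facing payload from an allow-list of fields rather than masking a full record. The sketch below assumes a hypothetical processor record that is a dict with `brand` and `last4` keys; `CardSummary` and `card_summary_for_agent` are illustrative names, not part of any real SDK.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CardSummary:
    """The only card fields an agent tool is allowed to see."""

    brand: str
    last4: str


def card_summary_for_agent(processor_card: dict) -> dict:
    # Copy allow-listed fields only; anything else in the record is dropped
    # before it can reach the agent's context window.
    summary = CardSummary(
        brand=processor_card["brand"],
        last4=processor_card["last4"],
    )
    return asdict(summary)


# Hypothetical upstream record that (wrongly) includes the full number.
upstream_record = {"brand": "visa", "last4": "4242", "number": "4242424242424242"}
agent_payload = card_summary_for_agent(upstream_record)
```

Because the payload is constructed from named fields, a new sensitive field added upstream does not leak to the agent by default.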

Logging and Data Retention

AI applications tend to log aggressively — request/response payloads, model inputs/outputs, debugging traces. If any of these log streams capture card data (even accidentally, such as a customer pasting their card number into a chat input), your log storage becomes part of your PCI scope.

Implement log scrubbing that runs before data hits your logging infrastructure. Filter for card number patterns, and either redact them or reject the log entry entirely. This applies to application logs, API gateway logs, model inference logs, and any observability platform (Datadog, Splunk, etc.) that ingests your data.
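With Python's standard `logging` module, one possible approach is a filter attached to each handler so that records are redacted before they are written or shipped. In this sketch, `RedactPanFilter` is an illustrative name, and the pattern redacts any 13 to 19 digit run without a Luhn check, on the assumption that over-redacting logs is preferable to leaking a card number.

```python
import io
import logging
import re

PAN_LIKE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


class RedactPanFilter(logging.Filter):
    """Redact card-number-like digit runs from log records."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so values passed as args are covered too.
        message = record.getMessage()
        record.msg = PAN_LIKE.sub("[REDACTED_PAN]", message)
        record.args = None  # message is already fully formatted
        return True  # keep the record, now redacted


stream = io.StringIO()  # stands in for your real log destination
handler = logging.StreamHandler(stream)
handler.addFilter(RedactPanFilter())

logger = logging.getLogger("chat")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

logger.info("user message: my card is %s", "4242-4242-4242-4242")
logged_output = stream.getvalue()
```

To reject the entry entirely instead of redacting it, the filter can return `False` when the pattern matches.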

Data Retention Policies

PCI DSS requires that cardholder data be deleted when no longer needed for business purposes. AI companies often retain data indefinitely for model improvement. If any retained dataset contains cardholder data (even tokenized data in some interpretations), you need a documented retention policy that specifies what's kept, why, and when it's purged.
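A documented policy is easier to enforce when it also exists as data that a scheduled job can check. The sketch below is one hypothetical shape for that: the `RetentionRule` fields map to "what's kept, why, and when it's purged", and the dataset name and 365-day limit are made-up examples, not recommendations.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    dataset: str       # what is kept
    reason: str        # why it is kept
    max_age_days: int  # when it must be purged


def is_due_for_purge(rule: RetentionRule, created_on: date, today: date) -> bool:
    """Return True once a record is older than the rule allows."""
    return today - created_on > timedelta(days=rule.max_age_days)


chat_logs_rule = RetentionRule(
    dataset="support_chat_logs",
    reason="model improvement",
    max_age_days=365,
)

old_record_due = is_due_for_purge(chat_logs_rule, date(2023, 1, 1), date(2024, 6, 1))
recent_record_due = is_due_for_purge(chat_logs_rule, date(2024, 5, 1), date(2024, 6, 1))
```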

Building a Compliant Architecture

Here's a practical checklist for building a PCI-compliant AI application. The goal: minimize what touches card data, and protect everything that does.

What Should Touch Card Data

Your processor's hosted form or JavaScript SDK (this is where card data is entered)
Your processor's API (this is where tokens are converted to charges)
Nothing else.

What Should NOT Touch Card Data

Your application servers
Your database (store tokens, not card numbers)
Your logging infrastructure
Your AI model training pipeline
Your AI agents' context windows
Your analytics or observability systems
Your customer support tools (truncated PANs only)

Infrastructure Controls

TLS everywhere. All data in transit must be encrypted. This includes internal service-to-service communication, not just external-facing endpoints.
Network segmentation. Systems that process payments should be in a separate network segment from your general application infrastructure. If your AI inference servers are compromised, they shouldn't have a path to your payment processing components.
Access controls. Limit who can access payment-related systems. Use role-based access, require MFA, and log every access event.
Vulnerability scanning. Quarterly external vulnerability scans by an Approved Scanning Vendor (ASV) are required for all PCI levels. Internal scans should be more frequent.

Common Mistakes That Break Compliance

These are the mistakes we see most often with AI companies:

Logging full API request/response payloads that include card data from customer-facing inputs. Even if your payment form is tokenized, a chat input or support ticket might contain a card number that a customer typed in plain text.
Using card data as a feature in ML models. Transaction amount, merchant category, and time of purchase are fine as model features. Card number, even hashed, is not. Hashing is not tokenization — a hash of a card number can be brute-forced because the input space (valid card numbers) is relatively small.
Storing tokens without access controls. Tokens are lower risk than raw card data, but they're still credentials. Treat your token store with the same access controls as any sensitive data.
Forgetting about the developer environment. If developers pull production data (including tokens or card data) into local development environments for debugging, those laptops are now in PCI scope. Use synthetic test data in non-production environments.
Not updating your SAQ when your integration changes. You started with hosted checkout (SAQ A), then switched to embedded forms (SAQ A-EP), but never updated your SAQ. You're now non-compliant.
Ignoring third-party risk. Every third-party service that touches or could influence your payment flow is part of your PCI scope. That includes your CDN, your analytics provider, and any JavaScript you load on payment pages. If a third-party script is compromised, it could capture card data from your embedded payment form.
No incident response plan. PCI requires a documented plan for what happens when (not if) a security incident occurs. Most startups skip this until they need it. Write the plan before you need it.

How AI Payware Handles PCI for You

AI Payware is a PCI Level 1 certified service provider. When you process payments through us, the PCI burden shifts almost entirely to our infrastructure:

Client-side tokenization. Our JavaScript SDK captures card data in a hosted field and returns a token. Card numbers never touch your servers — you stay at SAQ A or SAQ A-EP.
Token storage. We store the encrypted card data; you store the token. Your database contains nothing an attacker could use to make a charge.
Compliant tokenization infrastructure. Our tokenization uses industry-standard format-preserving encryption with per-merchant isolation. Tokens from one merchant can't be used with another.
AI-aware compliance guidance. We help AI companies identify AI-specific PCI risks — training data scrubbing, agent PAN exposure, logging hygiene — that generic processors don't consider.
Quarterly ASV scans included. We include Approved Scanning Vendor scans for merchants who need them, so you don't have to source your own ASV.

The goal is simple: you focus on building your AI product, and we make sure the payment layer is compliant by default. If you're evaluating how to embed payments in your AI application, starting with a PCI-compliant foundation saves you from retrofitting compliance later.
