Deterministic Defense: How mDLs End Generative AI Attacks on Identity

Topics

Verifiable Digital Credentials, AI, Digital Customer Experience, Security

For the last decade, digital identity verification (IDV) has relied on a visual proxy: scan a physical driver’s license, capture a video selfie, and hope the person in the scan and video match. It approximated the in-person handshake well enough, at a time when images were relatively trustworthy and alternatives didn’t exist yet.

But that era is effectively over. Advances in generative AI and accessible deepfake tools have collapsed the trust boundary around visual verification methods, turning what was once a reasonable control into a growing liability.

We’ve reached the practical ceiling of what traditional IDV can deliver. The industry is now locked in an escalating arms race: fraudsters produce ever more convincing fakes, while vendors compensate with heavier, more intrusive liveness detection to catch them, adding friction without restoring trust.

Traditional IDV is inherently probabilistic: automated algorithms estimate whether an image is genuine. But when AI can generate hyper-realistic "evidence" in milliseconds, estimation becomes an untenable security strategy. The rollout of mobile driver’s licenses (mDLs) across the US offers a path to exit this arms race entirely.

In this post, we explore why the industry must shift from probabilistic visual checks to deterministic cryptographic verification, both to combat new AI-based attack vectors and to significantly simplify the user experience. We also explain why mDLs, which address the vast majority of attacks against video selfie verification, are the superior technology for the next generation of digital identity systems.

Note: We’ll explore this topic under the general expectation that you have substantial identity verification needs and you’re aiming to meet NIST’s IAL2 identity assurance guidelines, which we believe to be a practical minimum bar for online use cases.

How we got here

To understand why selfies are unnecessary for mDLs, we have to understand why selfie-based checks for physical IDs were introduced in the first place.

The internet matured faster than civil identity documents and systems could evolve. As more of our lives moved online, so did the need to verify identity remotely. In the absence of a native digital credential, businesses were forced to rely on documents designed for physical, face-to-face interactions. Early approaches simply asked users to upload a photo of their physical driver's license. But these are static images: easy to copy, alter, and counterfeit. In the age of generative AI, producing a convincing fake license is not just possible, it’s trivially easy.

Enter selfie checks

Even when data from a static ID image is validated against an authoritative database, such as the AAMVA DLDV lookup service, the check only confirms that the data is accurate and was issued by a DMV. It doesn’t confirm that the person presenting the ID is the person to whom it was issued. Attacks that exploit this gap are real: legitimate license data is superimposed on a blank card next to an unrelated photo and used to pass verification checks.

To close that gap, the industry adopted biometric holder binding, commonly known as the “selfie check” or “liveness check.” If you have been asked to hold your camera up to your face, move your head side to side, or blink, that process is a biometric capture that tries to prove you are the valid holder of the ID you're presenting.

The goal was to bind a static document to a living human. When physical, analog documents are used online, this step becomes non-negotiable. Without it, systems are vulnerable to basic presentation attacks: screenshots, replays, and stolen documents.

And so this became the standard pattern. Take a photo of your driver’s license (in front of a high-contrast background, and make sure your lighting is good, too!), then perform a series of gestures on video. After all that friction, the verifier still doesn’t get a definitive answer, only a confidence score. Not a yes or no, but a probability. And to top it all off, the probability might be too low and require an alternative method of verification anyway.

Why mDLs are different

A mobile driver’s license is not a picture of a card on a phone. It’s a set of verifiable claims encoded in a digitally signed document — also known as a verifiable digital credential — backed by an extensive library of industry standards. In simple terms: it’s a digital version of the physical driver’s license issued by your state DMV. When we compare mDLs to physical IDs, three fundamental differences emerge.

1. Issuer signature

Physical IDs in the US, which generally lack embedded secure elements, rely on holograms and UV light features to deter counterfeiting. mDLs rely on strong cryptography.

Every mDL carries a digital signature from the issuing authority (e.g., the DMV). When a relying party (RP) receives the data, it verifies that signature. If even a single bit has been altered, verification fails. No guessing or estimation is required: the RP knows definitively that the credential came from the DMV. On top of that, the cryptography makes it effectively impossible to counterfeit an mDL.
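
To make the "single bit" property concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how a production verifier works: Python's standard library has no public-key signature primitive, so HMAC-SHA256 stands in for the issuer's signature, and the names `ISSUER_KEY`, `issue`, and `verify` are hypothetical. A real mDL is signed with the issuer's private key, checked against the issuer's public certificate, and encoded as CBOR rather than JSON.

```python
import hashlib
import hmac
import json

# Hypothetical issuer key. HMAC-SHA256 is only a standard-library stand-in:
# a real mDL carries a public-key signature, so verifiers never hold an
# issuer secret.
ISSUER_KEY = b"demo-issuer-key-not-for-production"


def canonical_bytes(claims: dict) -> bytes:
    """Serialize claims deterministically so equal claims give equal bytes."""
    return json.dumps(claims, sort_keys=True, separators=(",", ":")).encode("utf-8")


def issue(claims: dict) -> bytes:
    """Issuer side: compute the integrity tag over the claims."""
    return hmac.new(ISSUER_KEY, canonical_bytes(claims), hashlib.sha256).digest()


def verify(claims: dict, tag: bytes) -> bool:
    """Verifier side: recompute the tag and compare in constant time."""
    expected = hmac.new(ISSUER_KEY, canonical_bytes(claims), hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


claims = {"family_name": "Doe", "birth_date": "1990-01-01", "age_over_21": True}
tag = issue(claims)

# Change a single character of one claim and keep the original tag.
tampered = dict(claims, birth_date="1990-01-02")
```

Here `verify(claims, tag)` returns `True` and `verify(tampered, tag)` returns `False`. There is no score or threshold in between, which is the contrast with a selfie match.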

2. Possession (device binding)

The ability for a user to cryptographically prove possession of an mDL is entirely new, and fundamentally different from presenting a physical ID. With physical documents, possession is implied. With mDLs, it is provable.

Each mDL is bound to non-extractable cryptographic material stored in the device’s secure element. The private keys used to present the credential can’t be exported, replayed, or even copied to another phone. Proof of possession is enforced at the hardware level, making it one of the strongest authentication factors available today.

This is not a heuristic. It is a cryptographic fact.
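
A proof of possession is usually a challenge-response exchange: the verifier sends a fresh random value, and only a device holding the key can produce the matching response. The sketch below shows that shape in Python under loud assumptions. `DEVICE_KEY`, `device_respond`, and `Verifier` are hypothetical names, and a shared HMAC key stands in for the device-bound key pair only because the standard library lacks public-key signatures. On a real phone the private key never leaves secure hardware, and the verifier holds only the public key carried in the credential.

```python
import hashlib
import hmac
import secrets

# Stand-in for the device-bound key. Sharing one secret between both sides
# is purely a standard-library simplification (see the paragraph above).
DEVICE_KEY = secrets.token_bytes(32)


def device_respond(nonce: bytes) -> bytes:
    """Holder side: prove possession by keying a MAC over the verifier's nonce."""
    return hmac.new(DEVICE_KEY, nonce, hashlib.sha256).digest()


class Verifier:
    def __init__(self) -> None:
        self._outstanding = set()  # nonces issued but not yet redeemed

    def new_challenge(self) -> bytes:
        nonce = secrets.token_bytes(16)
        self._outstanding.add(nonce)
        return nonce

    def check(self, nonce: bytes, response: bytes) -> bool:
        # Each nonce is accepted once, so a captured response cannot be replayed.
        if nonce not in self._outstanding:
            return False
        self._outstanding.discard(nonce)
        expected = hmac.new(DEVICE_KEY, nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)


verifier = Verifier()
nonce = verifier.new_challenge()
response = device_respond(nonce)
first_try = verifier.check(nonce, response)  # fresh challenge, valid response
replayed = verifier.check(nonce, response)   # the same bytes presented again
```

In this run `first_try` is `True` and `replayed` is `False`: a recorded response is useless against the next challenge, which is why an injected video has no counterpart in this model.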

Note: Issuers actively enforce scarcity. For example, Colorado strictly limits mDL issuance to two active devices per resident. You can’t ‘farm’ these credentials.

3. User activation (wallet unlock)

In the physical world, “holder binding” is performed visually: a human compares the photo on the card to the face of the person standing in front of them. In the digital world, the wallet facilitates holder binding by authenticating the user at the moment of use, often using platform biometric capabilities.

This process, called user activation, typically requires platform-level authentication, such as FaceID or fingerprint unlock. The act of unlocking the wallet gates access to the credential’s signing key and authorizes its use for a specific presentation.

Because mDLs establish a strong cryptographic chain of custody from issuer to device to user, the issuing authority (e.g., the DMV) trusts the wallet to enforce holder binding on its behalf.

User presence is mandatory. An mDL can’t be silently shared or replayed in the background. The user must actively unlock the device or credential to authorize each presentation.
Biometrics stay local. The key that unlocks the mDL is bound at issuance and typically requires the device owner's biometric profile before it can be used. No biometric data is transmitted to, or stored by, the relying party.

At a minimum, the wallet is responsible for authenticating the user and ensuring user presence, which is a significant privacy and security win. It moves away from server-side biometric authentication, which is brittle (biometrics can’t be revoked), privacy-invasive (it requires storing biometric data in the relying party’s database), and a large, scalable remote attack vector (an attacker who breaches that database obtains biometric profiles for every user).

The flow looks like this:

The DMV issues the credential to the wallet → The wallet secures the user’s (aka the credential subject’s) signing key, which is stored in the secure enclave behind platform authentication (typically biometrics) → The user unlocks the key to present.
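
The gating step in that flow can be modeled in a few lines of Python. This is a toy, not a wallet implementation: `Wallet`, `WalletLocked`, and the `platform_auth_ok` flag are hypothetical names, and the flag merely models the operating system reporting that a biometric or PIN check succeeded. The point it illustrates is that the biometric sample never appears in application code, and that each presentation needs its own unlock.

```python
class WalletLocked(Exception):
    """Raised when a presentation is attempted without fresh user activation."""


class Wallet:
    def __init__(self, credential: dict) -> None:
        self._credential = credential
        self._activated = False

    def unlock(self, platform_auth_ok: bool) -> None:
        # Models the platform's yes/no answer after a biometric or PIN check.
        # No biometric data is passed in, stored, or forwarded.
        self._activated = platform_auth_ok

    def present(self) -> dict:
        if not self._activated:
            raise WalletLocked("user activation required")
        # One unlock authorizes exactly one presentation.
        self._activated = False
        return dict(self._credential)


wallet = Wallet({"family_name": "Doe", "age_over_21": True})
wallet.unlock(platform_auth_ok=True)
shared = wallet.present()
```

Calling `wallet.present()` a second time without another `unlock` raises `WalletLocked`, mirroring the rule that a credential can't be presented silently in the background.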

In short, holder binding now occurs at the device and user level, not the relying party. The verifier no longer needs to perform biometric checks because it trusts the issuer — and by extension, the wallet — to have done so correctly.

NIST’s evolution of SP 800-63-4

NIST Special Publication 800-63-4 has quickly become the de facto reference for identity, authentication, and federation in the US. More than a checklist of requirements, it provides a framework for modeling identity risk and assurance across the entire user lifecycle.

Revision 4, finalized in July 2025, introduced meaningful changes to how identity evidence is evaluated — most notably, formal treatment of mobile driver’s licenses. The full details are extensive, but one change is particularly relevant: the introduction of a “Superior” evidence classification capable of supporting IAL2 identity verification at the highest level. Excitingly, under this framework, mDLs are classified as Superior evidence, while physical identity documents remain classified as “Strong” evidence.

Strong evidence (physical ID)

In SP 800-63A-4, NIST classifies physical identity documents (like physical driver's licenses) as Strong evidence. While reliable, they are fundamentally static.

“Evidence that… is reasonably resistant to tampering, but for which the binding to the subscriber can be forged or mimicked (e.g., by use of a high-quality photocopy or video injection).”

Because physical IDs aren’t holder bound in a mathematically provable way (e.g., one can present someone’s stolen ID), relying parties must perform additional biometric comparison — typically facial match and liveness — to bind that credential to the presenter.

Superior evidence (mDL)

By contrast, NIST classifies cryptographic credentials — specifically those held in subscriber-controlled wallets, such as mDLs — as Superior evidence.

“Evidence that… contains a digital signature or message authentication code (MAC) over the evidence data that validates the integrity of the evidence... and is bound to an authenticator possessed by the subscriber.”

The definition maps directly to how mDLs work:

Cryptographic integrity is enforced by the issuer’s digital signature
Holder binding is enforced by device-bound keys
User presence is enforced at the time of presentation

No probabilistic matching. No visual inspection. No selfie.

Shift in binding model

NIST SP 800-63A-4 (§ 4.2.6.2) clarifies that for Superior evidence, when using the digital evidence pathway for verification, the user activation itself proves possession and control.

For physical IDs: You need visual binding (selfie check).
For mDLs: You need cryptographic binding unlocked by user activation (digital signature and wallet unlock).

Visualizing the shift

| Feature | Physical ID (legacy flow) | Mobile driver’s license (mDL) |
| --- | --- | --- |
| Evidence type | Static image: a picture of a physical document | Verifiable digital credential: a digitally signed data object |
| Tamper proofing | Analog security: holograms, UV, and microtext | Digital signature: cryptography; if a single bit changes, verification fails |
| Holder binding | Visual binding: a human or AI compares a selfie to the ID photo | Cryptographic binding: proof of possession of the device key |
| User presence | Liveness check: "Turn your head," "Blink." | Device authentication: face, fingerprint, or PIN unlock |
| NIST classification | "Strong" evidence: requires a selfie check and additional evidence for IAL2 | "Superior" evidence: no extra selfie needed |
| Primary risk | Presentation attacks: deepfakes, masks, counterfeits | Device compromise: coerced device unlock, stolen device PIN |

The status quo has changed

In addition to introducing the Superior evidence classification, NIST also raised the bar for meeting an IAL2 standard with Strong evidence alone.

Under Revision 3 of SP 800-63 (published in 2017), a single piece of Strong evidence paired with a selfie-based check was sufficient to meet IAL2. Revision 4 acknowledges that this model no longer reflects the current threat landscape.

Under the updated guidance, NIST now requires two pieces of Strong evidence — or one Strong plus one Fair — to meet IAL2 in the absence of Superior evidence.

The implication is clear: if you’re still relying on a basic ID scan plus selfie check and assuming you meet IAL2, you do not. As of Aug 2025, that model falls short.
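
As a quick self-check, the evidence rule described above can be written as a small Python function. This sketch encodes only the combinations as this post summarizes them (one Superior, or two Strong, or one Strong plus one Fair). The function name and the lowercase strength labels are hypothetical, and the sketch says nothing about the validation and verification steps NIST also requires for each piece of evidence.

```python
from collections import Counter

ALLOWED = {"fair", "strong", "superior"}


def meets_ial2_evidence(evidence: list) -> bool:
    """Return True if the evidence strengths satisfy the rule as summarized above."""
    unknown = set(evidence) - ALLOWED
    if unknown:
        raise ValueError(f"unknown evidence strength: {sorted(unknown)}")
    counts = Counter(evidence)
    if counts["superior"] >= 1:
        return True  # e.g., an mDL presented on its own
    if counts["strong"] >= 2:
        return True
    return counts["strong"] >= 1 and counts["fair"] >= 1


legacy_flow = meets_ial2_evidence(["strong"])   # a single scanned physical ID
mdl_flow = meets_ial2_evidence(["superior"])    # a single mDL
```

Here `legacy_flow` is `False` and `mdl_flow` is `True`. A scanned license needs a second piece of evidence regardless of how well the selfie matched.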

While we’ll likely see stopgap solutions emerge — such as combining selfies with phone number verification — the most direct and robust path to IAL2 is the use of mDLs as Superior evidence.

Addressing concerns

Moving from visual identity verification to cryptographic verification represents a genuine paradigm shift, and skepticism is natural. The most common reaction is:

“If mDLs don’t involve a selfie check, that feels less secure. I’d only be comfortable adopting them alongside a selfie.”

However, when viewed through the lens of modern threats (particularly AI-driven fraud), the arguments for retaining the selfie quickly fall apart.

1. Don’t mDLs just move the selfie check from verification time to issuance time?

Yes — and that is exactly where it belongs. This is a move from probabilistic matching to deterministic assertion.

When a vendor asks a user to take a selfie today, they are performing a probabilistic match against a photo of a physical document. This process is highly susceptible to "injection attacks," where AI-generated images or video feeds mimic a live user. Defending against these attacks requires continuous tuning of liveness detection algorithms — an endless game of cat-and-mouse.

By contrast, the DMV performs the biometric verification once during issuance, against an authoritative biometric record captured in person during a visit to the DMV.

Two critical properties follow:

Reduced surface area for AI injection: The DMV matches the individual against high-fidelity biometrics captured in person (at a specialized workstation) — not against low-resolution images uploaded from an unmanaged device.
The cryptographic chain of trust: When you verify an mDL, you are not merely "trusting" that the DMV did a good job. You are cryptographically verifying the integrity of that check — consuming the result of a high-assurance, in-person binding event rather than attempting to recreate a low-assurance, remote one.

NIST’s definition of Superior evidence explicitly requires the issuer to have performed an in-person identity proofing step. Wallet-based issuance flows are deliberately downstream of the high-assurance onsite visit to the DMV.

Accepting an mDL is not outsourcing your security decision — it is consuming the result of the highest-assurance verification event in the ecosystem. This moves the security burden to the point of highest friction and highest data fidelity — issuance — drastically reducing the attack surface for everyone else. This is the shift from a transactional model of identity to a credential-based one.

2. Isn't a selfie still useful for “defense in depth”?

True defense in depth means layering distinct types of security controls, not simply stacking redundant ones. NIST’s stance is clear: the cryptographic proof of possession and user activation (wallet unlock) inherent to mDLs qualify them as "Superior" evidence, a higher level of assurance than a visual selfie provides.

Adding a selfie to an mDL flow doesn't increase assurance; it primarily increases friction. Selfies were designed to stop someone from holding up a stolen physical card, but mDLs solve this via user verification. You can’t "hold up" or remotely present someone else's mDL because you can’t extract it from their device. Furthermore, NIST guidelines have evolved precisely because selfie checks are increasingly vulnerable to generative AI injection attacks. Adding a weak, AI-susceptible check on top of a cryptographically secure one does not make you safer.

If your concern is that a legitimate user might hand off the session to a fraudster immediately after verification, a selfie won't save you. That’s a session security problem, not an identity verification problem. Solving it requires token binding (like OAuth DPoP), not more video gestures.
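
For readers unfamiliar with token binding, the Python sketch below shows one small piece of a DPoP-style check: confirming that the public key presented with a request matches the key thumbprint recorded in the access token's `cnf.jkt` claim. The thumbprint follows the JWK thumbprint construction (required members only, sorted, no whitespace, SHA-256, unpadded base64url). The function names and key values are hypothetical, the coordinates are placeholders rather than real curve points, and a complete check would also verify the proof's signature, HTTP method, URL, and freshness, all omitted here.

```python
import base64
import hashlib
import json


def jwk_thumbprint(jwk: dict) -> str:
    """SHA-256 thumbprint of an EC public JWK, as unpadded base64url."""
    required = {name: jwk[name] for name in ("crv", "kty", "x", "y")}
    canonical = json.dumps(required, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hashlib.sha256(canonical).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def token_bound_to_key(access_token_claims: dict, proof_jwk: dict) -> bool:
    """True only if the token was issued for the key that made this request."""
    bound = access_token_claims.get("cnf", {}).get("jkt")
    return bound is not None and bound == jwk_thumbprint(proof_jwk)


# Placeholder coordinates for illustration; not real curve points.
user_key = {"kty": "EC", "crv": "P-256", "x": "placeholder-x-user", "y": "placeholder-y-user"}
other_key = {"kty": "EC", "crv": "P-256", "x": "placeholder-x-other", "y": "placeholder-y-other"}

token_claims = {"sub": "user-123", "cnf": {"jkt": jwk_thumbprint(user_key)}}
```

With these values, `token_bound_to_key(token_claims, user_key)` is `True` and `token_bound_to_key(token_claims, other_key)` is `False`, so a token handed to someone holding a different key fails the binding check.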

3. What if my partner’s biometric data can also unlock my phone?

This question touches on the concept of “intimate collusion”. There is an important distinction between scalable criminal fraud and local authorized delegation.

mDLs eliminate scalable fraud (bot farms, remote attackers) because the credentials can’t be farmed. That leaves local scenarios: a spouse or child who knows the device PIN or has a registered face or fingerprint.

For many real-world services, delegated authority is normal and necessary: a parent managing a child’s healthcare portal, or an adult assisting an elderly family member with government services. Legacy "scan + selfie" systems often fail these legitimate guardian scenarios.

The existential threat to digital identity today is not local, one-off misuse. It’s scalable, AI-driven fraud: bot farms creating thousands of synthetic identities, and deepfake-based remote attacks. mDLs directly neutralize this threat because the credentials are hardware-bound and non-exportable.

The "spouse unlocking the phone" scenario is a non-scalable, local event — and a selfie check does little to prevent it anyway. A user willing to unlock their phone for someone else is just as likely to hold the phone up for the selfie first.

If your threat model can’t tolerate that level of intimate collusion, the only solution is onsite attended verification — requiring the user to be physically present and supervised for the duration of the session.

4. Okay, but what about adversarial intimate collusion?

If you are operating a product or service in an environment where adversarial intimate collusion is a concern, you know who you are. Your concern is valid, and the industry is working to make it clearer (1) what type of verification process was used to issue an mDL and (2) what type of local authentication occurred when the user activated the credential in their wallet. That way, if you require a biometric factor and one was not present, you can engage additional checks.

It is still preferable to perform the biometric check locally on the user’s device (for privacy and friction reasons), and an mDL plus biometric activation signaling is the mechanism to make that happen.

NIST is actively working on building assurance in the mDL ecosystem, including developing a specific profile of SP 800-63A-4 for mDL issuance that addresses these concerns. This creates a unified baseline for security across the ecosystem, ensuring that when you accept an mDL, the intense scrutiny of this regulated issuance process is transitively conferred to you at verification time — and that the signals you need are present to inform your authorization policy engine.
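
Once such signals are available, a relying party's policy could consume them along the lines of the Python sketch below. Everything here is an assumption for illustration: the `Presentation` fields, the `activation_method` values, and the three outcomes are hypothetical, since the signaling described above is still being defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Presentation:
    issuer_signature_valid: bool
    device_binding_valid: bool
    activation_method: str  # hypothetical signal: "biometric", "pin", or "unknown"


def authorize(p: Presentation, require_biometric: bool) -> str:
    """Return "allow", "step_up", or "deny" for a verified presentation."""
    # The cryptographic checks are pass/fail; nothing is scored.
    if not (p.issuer_signature_valid and p.device_binding_valid):
        return "deny"
    # Only stricter services ask for more, and only when the signal is missing.
    if require_biometric and p.activation_method != "biometric":
        return "step_up"
    return "allow"


pin_unlock = Presentation(True, True, "pin")
everyday_decision = authorize(pin_unlock, require_biometric=False)
high_risk_decision = authorize(pin_unlock, require_biometric=True)
```

In this example `everyday_decision` is `"allow"` and `high_risk_decision` is `"step_up"`: the same presentation is enough for most services, and only the stricter one engages additional checks.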

The path forward is certain

Mobile driver’s licenses are not simply digital copies of physical cards. They represent a fundamental architectural shift in how identity is proven. We’re moving from a world of visual approximation to a world of cryptographic certainty.

The "scan + selfie" model has served its purpose, but it has reached its technical limits. Generative AI has eroded the trust we can place in optical evidence. Continuing to layer visual checks on top of digital credentials isn't defense in depth — it is adherence to a legacy security model that is rapidly becoming obsolete.

Conclusion

The technology to defeat modern identity fraud already exists, and it no longer requires a camera. This is one of the rare moments where stronger security and a better user experience move in the same direction. Take advantage of it.

If you’re building for the future of identity, the path forward is deterministic, and Okta can help. We’re building the platform for enterprises to incorporate high-assurance, cryptographically verifiable digital credentials directly into their identity flows.

Trust the math, not the image: Cryptographic signatures can’t be deepfaked. Device-bound keys can’t be injected.
Embrace Superior evidence: NIST has made its position clear: mDLs classified as Superior evidence provide stronger assurance than physical IDs classified as Strong evidence, by design.
Reduce friction without reducing security: Eliminating the selfie removes one of the highest-friction steps in identity verification, improving conversion rates and reducing manual review without sacrificing security.

Supporting mDLs in your identity verification flows does not create a security gap. It closes one. You replace a fragile, probabilistic, friction-heavy image-based process with a seamless interaction rooted in cryptographic proof.

To learn more about how Okta is enabling this shift, explore our Okta Digital ID Verification Beta and understand how deterministic, cryptographic identity can fit into your existing architecture.

Want more? Try VDCs in the real world with a live, hands-on demo straight from your phone.

About the Author

David Cowden

Principal Engineer

David Cowden is a Principal Engineer at Okta dedicated to uniting civil and digital identity. He is focused on replacing friction-heavy processes with the vastly superior experiences enabled by verifiable digital credentials (VDCs). Drawing on a background in credential managers, passkeys, workload identity, and network security, David designs solutions that make high-assurance identity verification seamless and accessible for Okta's customers.
