Okta Threat Intelligence has detected and dissected multiple custom phishing kits that have evolved to meet the specific needs of voice-based social engineers (“callers”) in vishing campaigns.
These custom kits are made available on an as-a-service basis and are increasingly used by a growing number of intrusion actors targeting Google, Microsoft, Okta and a range of cryptocurrency providers.
The kits are capable of intercepting the credentials of targeted users, while also presenting the supporting context required to convince users to approve MFA challenges, or to take other actions in the interests of the attacker on the phone. Callers can adapt them on the fly to control what pages are presented in the user’s browser, in order to sync with the caller’s script and whatever legitimate MFA challenges the caller is presented with as they attempt to sign in.
“Once you get into the driver’s seat of one of these tools, you can immediately see why we are observing higher volumes of voice-based social engineering,” said Moussa Diallo, threat researcher at Okta Threat Intelligence. “Using these kits, an attacker on the phone to a targeted user can control the authentication flow as that user interacts with credential phishing pages. They can control what pages the target sees in their browser in perfect synchronization with the instructions they are providing on the call. The threat actor can use this synchronization to defeat any form of MFA that is not phishing-resistant.”
Okta Threat Intelligence has published a detailed threat advisory for customers that provides an inside look at the capabilities of two such kits used by intrusion actors. This blog post summarizes the key features that make these kits so effective.
When all else fails, hit the phones
The phishing kits appear, based on common features, to have evolved from the same lineage to specifically meet the needs of callers who are interacting with targeted users in real time.
The most critical of these features are client-side scripts that allow threat actors to control the authentication flow in the browser of a targeted user in real time while they deliver verbal instructions or respond to verbal feedback from the targeted user. It’s this real-time session orchestration that delivers the plausibility required to convince the threat actor’s target to approve push notifications, submit one-time passcodes (OTPs) or take other actions the threat actor needs to bypass MFA controls.
Figure 1. A conceptual view of this hybrid social engineering attack
Attacks tend to follow a similar sequence:
The threat actor performs reconnaissance on a target, learning the names of users, the apps they commonly use, and phone numbers used in IT support calls;
The threat actor sets a customized phishing page live and calls targeted users, spoofing the phone number of the company or its support hotline;
The threat actor convinces the targeted user to navigate in their browser to the phishing site under the pretext of an IT support or security requirement;
The targeted user enters their username and password, which are automatically forwarded to the threat actor’s Telegram channel;
The threat actor enters the username and password into the legitimate sign-in page of the targeted user and assesses what MFA challenges they are presented with;
The threat actor updates the phishing site in real time with pages that support their verbal ask for the user to enter an OTP, accept a push notification, or complete other MFA challenges.
This real-time session orchestration provides a new level of control and visibility to the social engineer. If presented with a push notification (a type of MFA challenge), for example, an attacker can verbally tell the user to expect a push notification, and select an option from their C2 panel that directs their target’s browser to a new page displaying a message implying that a push message has been sent. This lends plausibility to what would ordinarily be a suspicious request for the user to accept a challenge the user didn’t initiate.
Figure 2. A C2 panel analyzed by Okta Threat Intelligence shows how callers can control the authentication flow on Microsoft-themed pages
It’s worth noting that these hybrid phishing operations are also capable of bypassing push notifications that use number challenge/number matching as an additional method of verification. Push with number matching/challenge is not phishing-resistant by definition, as a social engineer interacting on the phone with a targeted user can simply ask the user to choose or enter a specific number.
By contrast, users who are required to sign in with phishing-resistant methods such as Okta FastPass or FIDO passkeys are protected from these attacks.
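The difference comes down to origin binding. As a minimal sketch in Python, assuming a WebAuthn-style relying party that has already received the browser's clientDataJSON: the browser, not the user, records the origin of the page that requested the signature, so a response collected on a look-alike domain fails the check regardless of what the caller says on the phone. The function name, the example origins and the challenge value below are hypothetical, and a real relying party must also verify the signature and authenticator data, which this sketch omits.

```python
import base64
import hmac
import json


def client_data_is_acceptable(client_data_json: bytes,
                              expected_origin: str,
                              expected_challenge: bytes) -> bool:
    """Check the browser-supplied fields of a WebAuthn assertion.

    Illustrative only: signature and authenticator-data verification
    are deliberately left out.
    """
    data = json.loads(client_data_json)

    # The browser sets "origin" to the page that called the WebAuthn API.
    # A phishing page cannot choose this value, so a mismatch ends the sign-in.
    if data.get("origin") != expected_origin:
        return False

    if data.get("type") != "webauthn.get":
        return False

    # In clientDataJSON the challenge is base64url-encoded without padding.
    encoded = base64.urlsafe_b64encode(expected_challenge).rstrip(b"=")
    received = str(data.get("challenge", "")).encode("utf-8")
    return hmac.compare_digest(received, encoded)


def make_client_data(origin: str, challenge: bytes) -> bytes:
    """Build a hypothetical clientDataJSON payload for the example."""
    encoded = base64.urlsafe_b64encode(challenge).rstrip(b"=").decode()
    return json.dumps(
        {"type": "webauthn.get", "challenge": encoded, "origin": origin}
    ).encode("utf-8")


challenge = b"server-issued-random-bytes"
genuine = make_client_data("https://login.example.com", challenge)
phished = make_client_data("https://login-example.com.evil.test", challenge)

genuine_ok = client_data_is_acceptable(genuine, "https://login.example.com", challenge)
phished_ok = client_data_is_acceptable(phished, "https://login.example.com", challenge)
```

Here only the response produced on the genuine origin is accepted. An OTP or a number-matching prompt carries no equivalent browser-asserted origin, which is why a caller can relay them.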
This is how it’s done now
Diallo predicts that we’re only at the beginning of a wave of voice-enabled phishing attacks, augmented by tools that provide real-time session orchestration.
“Vishing is becoming such an in-demand area of expertise that, much like access to these kits, that expertise is also sold on an as-a-service basis,” Diallo said.
Further, he has observed the real-time session orchestration features of earlier kits being copied into new phishing kits designed exclusively to serve the needs of callers.
Where threat actors could once pay for access to a kit with basic features that targeted all popular Identity Providers (Google, Microsoft Entra, Okta, etc.) and cryptocurrency platforms, a new generation of fraudsters is attempting to sell access to bespoke panels for each targeted service.
Recommendations
Thankfully, there is no doubt about what defenders need to do.
“In a workplace context, there is no substitute for enforcing phishing resistance for access to resources,” said Diallo.
When using Okta for workforce authentication, that would equate to enrolling users in Okta FastPass, passkeys or “both for the sake of redundancy,” he said.
Social engineering actors can also be frustrated by setting network zones or tenant access control lists that deny access via the anonymizing services favoured by threat actors.
“The key is to know where your legitimate requests come from, and allowlist those networks,” Diallo said.
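The allowlisting idea can be sketched in a few lines of Python using the standard-library ipaddress module. This is an illustration of the logic only, not how Okta network zones are configured: the network ranges are hypothetical placeholders drawn from the RFC 5737 documentation ranges, and the function name is an assumption.

```python
import ipaddress

# Hypothetical corporate egress ranges (RFC 5737 documentation addresses).
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_allowlisted(source_ip: str) -> bool:
    """Return True if source_ip falls inside any allowlisted network."""
    try:
        address = ipaddress.ip_address(source_ip)
    except ValueError:
        # Unparseable addresses are denied rather than trusted.
        return False
    return any(address in network for network in ALLOWED_NETWORKS)


office_request = is_allowlisted("203.0.113.7")
outside_request = is_allowlisted("198.51.100.200")
```

The design choice worth noting is default deny: a request is accepted only when its source is positively known, so an anonymizing service the defender has never seen is rejected without needing a blocklist entry.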
Some banks and cryptocurrency exchanges are also experimenting with live caller checks, in which a user can sign in to a mobile app to find out whether they are on a phone call with an authorized representative at the time.
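How such a check is built is not described here, so the following Python sketch is purely an assumption about one possible shape: the support system registers a call when an authorized representative dials a customer, and the authenticated mobile app asks whether a call is currently registered for the signed-in user. The class, method names and the 15-minute expiry are all hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ActiveCallRegistry:
    """Tracks which users are on a call with an authorized representative."""

    def __init__(self, ttl_seconds: float = 900.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # user_id -> (agent_id, expiry timestamp)
        self._calls: Dict[str, Tuple[str, float]] = {}

    def start_call(self, user_id: str, agent_id: str) -> None:
        """Record that an authorized agent has started a call with the user."""
        self._calls[user_id] = (agent_id, self._clock() + self._ttl)

    def end_call(self, user_id: str) -> None:
        """Remove the record when the call ends."""
        self._calls.pop(user_id, None)

    def authorized_agent(self, user_id: str) -> Optional[str]:
        """Return the agent on an unexpired call with the user, else None."""
        entry = self._calls.get(user_id)
        if entry is None:
            return None
        agent_id, expires_at = entry
        if self._clock() >= expires_at:
            # Stale records expire so a forgotten entry cannot vouch for a later caller.
            del self._calls[user_id]
            return None
        return agent_id
```

A user who receives an unexpected call would open the app, and a result of None from authorized_agent for their account would indicate that nobody from the organization is on the line with them.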
Read More
Okta Threat Intelligence has published threat advisories on voice-enabled phishing campaigns in April 2025 and January 2026 that are available exclusively to the security contacts of Okta customers.
These threat advisories include:
Indicators of Compromise (IoCs)
Analysis of multiple phishing kits
TTPs of intrusion actors conducting these attacks
Detailed control recommendations
To learn more about Okta’s approach to phishing resistance, start here.