Deepfakes have been around for a few years now. The technology is being used in malicious ways, like spreading false information on social media, as well as for more comedic purposes, like putting Nicolas Cage in not-terrible movies. To some of you, this might be old news, since deepfakes first got widespread attention back in 2017. We even named deepfakes a “Cybersecurity Threat for 2020” in a recent post. So, why bring it up again?


The Target Has Changed

Previously, creating a believable deepfake required hours of source video showing the target’s face, which initially limited its use to celebrities, politicians, and other public figures. Recent advancements in machine learning allow fakes to be created from a single picture of the target and just 5 seconds of their voice. Nowadays, it is common for people to post pictures and videos of themselves on social media, and that could be all an attacker needs to create a realistic deepfake. Sound scary? It is. The target has changed. Anyone with a presence on social media could be vulnerable to impersonation over the phone and potentially even video calls. Let’s go over how these attacks work and how you can defend yourself and your company against them.

On the Phone

Voice deepfakes have been around for quite a while. A few years ago, Adobe showed off a program called VoCo. It required about 20 minutes of a person’s speech and was able to imitate them surprisingly well. Although the product was intended for audio editing professionals, it appears to have been discontinued, likely due to ethical and security concerns. More recently, other companies have picked up where Adobe left off. There are now commercially available products, such as Lyrebird and Descript, that replicate or even improve on this technology. An open-source project called “Real-Time Voice Cloning” can generate believable voice clips from only seconds-long samples of a person’s speech.
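
To make the idea concrete, here is a minimal sketch of the speaker-encoder, synthesizer, and vocoder pipeline that projects like “Real-Time Voice Cloning” use: a short reference clip is reduced to a voice embedding, arbitrary text is synthesized into a spectrogram conditioned on that embedding, and a vocoder turns the spectrogram into audio. The module paths, function names, and file names below are assumptions based on the project’s demo scripts and may not match the current version; treat this as an outline of the flow, not working code.

# Sketch of a speaker-encoder -> synthesizer -> vocoder pipeline, loosely
# following the layout of the Real-Time Voice Cloning project. Module paths,
# function names, and model file names are assumptions and may differ.
from encoder import inference as encoder          # speaker encoder
from synthesizer.inference import Synthesizer     # text-to-spectrogram
from vocoder import inference as vocoder          # spectrogram-to-waveform

encoder.load_model("encoder.pt")                  # pretrained models (placeholder paths)
synthesizer = Synthesizer("synthesizer.pt")
vocoder.load_model("vocoder.pt")

# 1. Embed the target's voice from a reference clip only seconds long.
reference_wav = encoder.preprocess_wav("target_sample.wav")
voice_embedding = encoder.embed_utterance(reference_wav)

# 2. Synthesize a mel spectrogram of arbitrary text, conditioned on that voice.
spectrograms = synthesizer.synthesize_spectrograms(
    ["This is not something I ever actually said."],
    [voice_embedding],
)

# 3. Convert the spectrogram into an audible waveform.
generated_audio = vocoder.infer_waveform(spectrograms[0])

The point is not the specific library; it is the shape of the attack. Once a few seconds of someone’s voice are public, the first step is essentially free.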

Unfortunately, this kind of attack is no longer hypothetical. In 2019, a voice deepfake was used to scam a CEO out of $243,000. The CEO thought he was speaking to the chief executive of the firm’s German parent company. What convinced him? He recognized his boss’s slight German accent and the melody of his voice on the phone. In this situation, having the correct voice gave the attacker enough credibility to extract $243,000 from his target. We have talked in the past about how powerful vishing attacks can be, but with a tool like this in an attacker’s arsenal, vishing becomes far more dangerous.

On a Video Call

Imagine you are working from home due to COVID-19. You receive a link over email from a coworker you have talked to a few times before, requesting that you join him in a video conference. The call proceeds as expected: you exchange greetings and discuss some sensitive company data. If the person looks and sounds like they normally do, what reason would you have to doubt their identity? Unfortunately, in this example, the coworker is a scammer intent on stealing company info. It may seem farfetched, but advancements in deepfake tech make it clear that this kind of attack will soon be possible.

What previously took many hours of source video and processing time can now be done with a single picture in a fraction of the time. One of the more recent tools available for free on GitHub does exactly this:

While it might look like science fiction, this is real. The program only has access to one image of each actor, but as you can see, it’s able to copy blinking, eye movements, mouth movements, and even head tilts with minimal distortion. Tools like this one are iterating quickly and are now usable in real time. This opens the door to vishing-like attacks over video conferencing tools like Zoom.
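
To illustrate how these single-image animation tools are typically structured, the sketch below pairs one photo of the target with a “driving” video (for example, the attacker’s own webcam feed) and transfers the driving motion onto the photo frame by frame. The load_motion_transfer_model and animate calls are hypothetical stand-ins for whatever inference interface a given tool exposes; only the overall loop is the point.

# Sketch of single-image face animation: one photo of the target plus a
# driving video of the attacker produces an animated fake, frame by frame.
# load_motion_transfer_model() and animate() are hypothetical stand-ins for
# the inference functions a real tool would provide.
import cv2

source_image = cv2.imread("target_photo.jpg")          # one picture of the target
driving_video = cv2.VideoCapture("attacker_cam.mp4")   # or a live webcam stream

model = load_motion_transfer_model("checkpoint.pt")    # hypothetical loader

output_frames = []
while True:
    ok, driving_frame = driving_video.read()
    if not ok:
        break
    # Transfer the driving frame's pose and expression (blinks, head tilts,
    # mouth movement) onto the single source image.
    fake_frame = model.animate(source_image, driving_frame)
    output_frames.append(fake_frame)

driving_video.release()

Run fast enough over a live webcam stream, that same loop is what makes real-time impersonation on a video call feasible.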

How Can We Defend Against This?

Deepfakes are getting harder and harder to detect with our eyes and ears. AI-based detection methods are being developed that can help us identify fakes, but it’s important to keep in mind that these will likely never be foolproof. It’s a game of cat and mouse; as detection gets better, so will the fakes. You need to stay vigilant for the attacks that slip through the cracks.
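
As a rough idea of what AI-based detection looks like in practice, many detectors score individual video frames for manipulation artifacts and aggregate those scores, which is also why they can be undermined: fakes improve until the scores slip back under the alarm threshold. The detector object and its predict_fake_probability method below are hypothetical placeholders for whatever model a real system would load.

# Sketch of frame-level deepfake detection: score sampled frames and flag the
# video if the aggregate score crosses a threshold. The detector and its
# predict_fake_probability() method are hypothetical placeholders.
import cv2

def looks_fake(video_path, detector, threshold=0.5, sample_every=10):
    video = cv2.VideoCapture(video_path)
    scores = []
    frame_index = 0
    while True:
        ok, frame = video.read()
        if not ok:
            break
        if frame_index % sample_every == 0:
            # Probability (0..1) that this frame was synthetically generated.
            scores.append(detector.predict_fake_probability(frame))
        frame_index += 1
    video.release()
    # Average the per-frame scores; a single odd frame should not decide alone.
    return bool(scores) and sum(scores) / len(scores) > threshold

The threshold is exactly where the cat-and-mouse game plays out: set it low and you drown in false alarms, set it high and well-made fakes slide through.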

It’s important to enforce strict verification procedures and to practice them even when you recognize someone’s voice or face. Which verification method you choose depends on the security requirements of your company. Once employees have been educated, you need to be sure that verification procedures are being followed. You can test your employees by having them receive live calls from trained professionals who can emulate the tactics of real attackers.

You can protect yourself personally by limiting your public presence on social media. By enabling privacy restrictions, you can prevent scammers from easily stealing your voice and likeness. It’s also important to practice good account security; one of the main ways to do this is to use multi-factor authentication on every account that supports it. We list more actions you can take in our article, “Secure It – Keep Your Digital Profile Safe from Vishers and Phishers.”

Stay up to date on current attacks happening in the wild and spread awareness so others can defend against them.

Sources:

https://www.wired.com/story/future-of-artificial-intelligence-2018/
https://www.social-engineer.com/cybersecurity-threats-for-2020/
https://www.youtube.com/watch?v=I3l4XLZ59iw
https://www.bbc.com/news/technology-37899902
https://github.com/CorentinJ/Real-Time-Voice-Cloning
https://www.wsj.com/articles/fraudsters-use-ai-to-mimic-ceos-voice-in-unusual-cybercrime-case-11567157402
https://www.social-engineer.com/have-you-ever-received-one-of-those-calls/