
Faking It (Again): There’s More Than Meets The Eye With Deepfake Audio

Who would have thought the old maxim that “if it walks like a duck and quacks like a duck” would no longer hold true? The meteoric growth of computing power and the internet has given rise to capabilities we once thought impractical, if not impossible (streaming video on demand, anyone?). Unfortunately, such powerful capabilities can also be put to more nefarious purposes. Enter the rise of “deepfakes” (a portmanteau of “deep learning” and “fake”). As I have written here before, such deepfakes are getting more realistic and harder to detect. This is not limited to photos and videos: audio deepfakes are on the rise as well. Although still somewhat detectable, the technology continues to improve. Scarier still, this technology may prove more disruptive to intellectual property and privacy law than you might think.

If you think I am being alarmist, think again. According to Siwei Lyu, director of SUNY Albany’s machine learning lab, as quoted in Axios, “having a voice [that mimics] an individual and can speak any words we want it to speak” will be a reality in a couple of years. Realistic audio deepfakes are not something on the horizon; they are on the doorstep. In this political season, it is easy to see how such deepfakes might be used. For example, it is not hard to imagine deepfake audio of Bernie Sanders’ voice designed to erode his primary chances, or audio attributed to President Trump, pieced together from his numerous interviews and appearances, designed to disrupt and damage his 2020 presidential re-election campaign.

Deepfake audio, however, is in some ways more complex than deepfake video: it requires mimicking the modulation and intonation of a person’s voice, as well as synthesizing phraseology that sounds like the target individual. Without visual imagery to help convince the listener, the audio deepfake must be incredibly close to the original voice. Without question, the technology is getting there, but the law addressing it has a lot of catching up to do.
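For the technically curious, the sketch below illustrates the three stages most voice-cloning pipelines share: a speaker encoder that distills a target’s recorded voice (its pitch, modulation, and intonation) into a compact “voice print,” a synthesizer that maps new text plus that voice print to a spectrogram, and a vocoder that renders the audible waveform. This is a minimal, purely illustrative sketch with stand-in stubs; every function name, shape, and value is a hypothetical placeholder rather than any real system’s API.

```python
# Illustrative only: toy stand-ins for the three stages of a voice-cloning
# pipeline. Real systems use trained neural networks at each step; these
# stubs just show how the pieces fit together.
import numpy as np

def speaker_encoder(reference_audio: np.ndarray) -> np.ndarray:
    """Stage 1: distill a reference recording into a fixed-size 'voice print'
    capturing traits like pitch, modulation, and intonation (stub)."""
    return np.random.default_rng(0).normal(size=256)   # placeholder embedding

def synthesizer(text: str, voice_embedding: np.ndarray) -> np.ndarray:
    """Stage 2: map text plus the voice print to a spectrogram, i.e. the
    phrasing and cadence of the target speaker saying those words (stub)."""
    frames = 10 * len(text.split())                     # fake time dimension
    return np.zeros((frames, 80)) + voice_embedding[:80]

def vocoder(spectrogram: np.ndarray) -> np.ndarray:
    """Stage 3: render the spectrogram into an audible waveform (stub)."""
    return np.zeros(spectrogram.shape[0] * 256)         # placeholder samples

if __name__ == "__main__":
    reference = np.zeros(16000)                         # one second of "audio"
    embedding = speaker_encoder(reference)
    spec = synthesizer("Words the target never actually said.", embedding)
    waveform = vocoder(spec)
    print(f"Generated {waveform.size} toy audio samples in the cloned voice.")
```

The point of the sketch is simply that, once the first stage has captured a voice, the remaining stages can be pointed at any text at all, which is exactly the risk the experts quoted above describe.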

From a copyright perspective, I have written about the application of the fair use doctrine to deepfake works and how it takes the fair use analysis to a whole new level. I have also addressed the Digital Millennium Copyright Act (DMCA) and the application of state privacy torts (such as misappropriation of name or likeness for commercial gain) to such deepfakes, and why that patchwork of state torts is ultimately inadequate. With the inevitable advent of realistic deepfake audio, I am beginning to think there needs to be a more meaningful way to address the problem:

  1. Deepfakes Require A Reassessment Of Section 230 Of The Communications Decency Act (CDA). CDA Section 230 provides that “[n]o provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider.” Although Section 230 was originally designed as a mechanism to protect providers from liability for third-party content, the federal courts have interpreted it to shield internet service providers (as interactive computer services) from liability for third-party defamatory content online. The problem is that this liability shield disincentivizes such providers from taking action against deepfakes (which they may well be monetizing). As tools are developed to identify deepfakes (both video and audio), providers should be incentivized to implement them so that users know when content contains deepfake elements. I am not advocating censorship here; some content may in fact be satire or commentary that is legitimate fair use under Section 107 of the Copyright Act. I am merely advocating that users be told when deepfake elements are present, so they can weigh that knowledge should they choose to view or listen.
  2. Deepfakes Require A Uniform Federal Legislative Response. I do not take this point lightly, but federal privacy legislation designed to preempt certain state privacy torts may be a necessary response to the growing impact of deepfake content. The current state tort framework is simply inadequate to address the improper use of a person’s voice or likeness in deepfakes. Federal legislation targeting fraudulent use of a person’s voice or likeness in a deepfake may be a start. I realize this approach may open up a host of other problems (such as drawing the line between legitimate criticism or commentary under copyright law and outright fraud), but I feel strongly that the discussion needs to occur. Existing legal frameworks are simply not enough.

When it comes to deepfake audio, realistic content is right around the corner. Although tools are being developed to address such content, the law needs both to incentivize the adoption of those tools and to protect against improper deepfake use. Hopefully, this discussion will happen sooner rather than later. Until then, the next time you hear something from a public figure that sounds crazy, listen once and think twice.


Tom Kulik is an Intellectual Property & Information Technology Partner at the Dallas-based law firm of Scheef & Stone, LLP. In private practice for over 20 years, Tom is a sought-after technology lawyer who uses his industry experience as a former computer systems engineer to creatively counsel and help his clients navigate the complexities of law and technology in their business. News outlets reach out to Tom for his insight, and he has been quoted by national media organizations. Get in touch with Tom on Twitter (@LegalIntangibls) or Facebook (www.facebook.com/technologylawyer), or contact him directly at tom.kulik@solidcounsel.com.