Content provenance is our best chance in the fight against deepfakes

According to Sophie Nightingale (Lancaster University) and Hany Farid (UC Berkeley), AI-synthesized content has passed through the uncanny valley.

In their recent study, which is set to be published next week, participants were unable to distinguish between AI-synthesized faces and real faces; even more disconcerting, the participants found the AI-synthesized faces to be more trustworthy.

Unfortunately, identifying synthetic media at scale is no easy task. In agreement with the majority of deepfake experts, Nina Schick, who penned the book Deepfakes in 2020, believes we’ll never be able to detect all manipulated content.

As AI-synthesized content becomes increasingly prevalent, it will take more than deepfake detection tools to keep society on an even keel. Schick believes that we’ll need the ability to ascertain the provenance of any given piece of media: Where did it come from originally? And has it been altered?

Why we can’t rely solely on deepfake detection

Speaking with journalists at Rest of World, Sam Gregory, executive director of the international non-profit Witness, says,

“Deepfake detection tends to only work well if you know the method of how the deepfake was created, and when you’re working with high-quality media.”

Unfortunately, due to the nature of JPEGs, video files, and the internet as a whole, most of the content circulating on the web is low-quality media that has already been modified. There have been successful deepfake detection systems, but detection quickly becomes a game of cat and mouse: each detection algorithm only works for a short time before new generation methods get around it.

Hao Li, a scholar at UC Berkeley who creates generative adversarial networks, realized three years ago that deepfake detection systems would only work for a brief period. As Li bluntly told The Verge, “At some point it’s likely that it’s not going to be possible to detect [AI fakes] at all.” Li and his colleague Hany Farid agree that deepfake detection systems are not the sole answer.

Social engineering attacks are the tip of the iceberg

Deepfakes have the potential to facilitate devastating social engineering attacks on organizations, government agencies, and individuals. In a recent conversation with Nina Schick, Hany Farid admits that he is quite worried about phishing and other small-scale attacks that are facilitated by synthetic media.

As Farid laments, “Phishing scams, spams, things that are highly targeted to individuals are going to disrupt our economy. [Think about] insurance companies. Think about all the e-commerce we do, and how much we have to trust, not just the websites we go to, but the content.”

Put simply, if we cannot trust the content we come across on the Internet, there’s no limit to the level of disruption we’ll encounter.

In the same conversation, Farid offers Schick another example of the potential havoc caused by synthetic media. Farid says, “Here’s just one example, I create a deepfake of Mark Zuckerberg saying, ‘Facebook profits are down 20%.’ They’re not. I let that go viral. How quickly can I move the market to the tune of billions of dollars before anybody sorts it out?”

It’s a good point. However, deepfakes are more than just fodder for social engineering attacks, phishing efforts, and stock market manipulation. The potential repercussions are far greater. As Farid says,

“I don’t think it’s hyperbolic to say that deepfakes have the potential to disrupt democracies, societies, economies, and individual lives—across the board.”

The liar’s dividend

The very existence of synthetic media is a blessing for nefarious actors and unscrupulous people in positions of power, because it allows them to cast doubt on credible news clips. In the near future, anyone may be able to deny the veracity of a real media clip, as Trump allegedly did with the Access Hollywood tape. Legal scholars Danielle Citron and Robert Chesney have popularized a term for this phenomenon: the liar’s dividend.

In their PNAS study on AI-synthesized faces, Nightingale and Farid explain, “Perhaps most pernicious is the consequence that in a digital world in which any image or video can be faked, the authenticity of any inconvenient or unwelcome recording can be called into question.”

Without a doubt, synthetic media has the potential to muddy the waters. As Farid puts it, “When anything can be fake, then nothing has to be real, and anyone can easily dismiss inconvenient facts.”

So, what’s the solution to this forthcoming existential nightmare?

At least in part, the solution is content provenance.

On January 26, 2022, the Coalition for Content Provenance and Authenticity (C2PA) released an open technical specification for digital provenance.

Formed initially as an alliance between Adobe, Arm, BBC, Intel, Microsoft, and Truepic, C2PA aims to help hardware and software companies, social media platforms, government agencies, and regulatory bodies make strides toward the global adoption of media content provenance.

The C2PA Specification defines secure, tamper-evident, cross-platform, standardized techniques for determining content authenticity. When implemented, the C2PA spec lets content creators and editors disclose who created the content; how, when, and where it was created; and when and how it was edited throughout its life. In turn, content consumers can view those disclosures and verify authenticity. (The C2PA homepage includes a video introduction to the C2PA spec and a few use cases.)
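To make "tamper-evident" concrete, here is a minimal sketch of the general idea, not the C2PA format itself. Each record binds a claim about who did what to a hash of the content and to the record before it, so changing the content or any record breaks verification. The record fields, function names, and key are hypothetical, and an HMAC with a shared secret stands in for the certificate-based digital signatures a real provenance system would use.

```python
import hashlib
import hmac
import json
from typing import Optional

# Hypothetical demo key. A real system would sign with a private key
# tied to a certificate, not a shared secret.
SIGNING_KEY = b"demo-key-not-for-production"


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(data).hexdigest()


def sign(payload: dict) -> str:
    """Sign a payload; canonical JSON keeps the signed bytes stable."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(SIGNING_KEY, encoded, hashlib.sha256).hexdigest()


def make_record(content: bytes, creator: str, action: str,
                previous: Optional[dict] = None) -> dict:
    """Create a signed record that links back to the previous record, if any."""
    payload = {
        "creator": creator,
        "action": action,
        "content_hash": sha256_hex(content),
        "previous_signature": previous["signature"] if previous else None,
    }
    return {"payload": payload, "signature": sign(payload)}


def verify_chain(content: bytes, records: list) -> bool:
    """Check every signature, every link, and the final content hash."""
    previous_signature = None
    for record in records:
        payload = record["payload"]
        if not hmac.compare_digest(record["signature"], sign(payload)):
            return False  # the record was altered after signing
        if payload["previous_signature"] != previous_signature:
            return False  # a record was removed, reordered, or inserted
        previous_signature = record["signature"]
    return bool(records) and records[-1]["payload"]["content_hash"] == sha256_hex(content)


original = b"raw image bytes"
edited = b"raw image bytes, cropped"
captured = make_record(original, "Photographer A", "captured")
cropped = make_record(edited, "Editor B", "cropped", previous=captured)

print(verify_chain(edited, [captured, cropped]))       # True
print(verify_chain(b"tampered", [captured, cropped]))  # False
```

The point of the sketch is that a consumer does not have to judge whether the pixels look fake. They only have to check whether the content still matches the history its creators and editors signed.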

Although it is probably the most visible group, C2PA is not the only organization on the front lines. The aforementioned international non-profit, Witness, which uses technologies to help prevent human rights abuses, has established a “Prepare, Don’t Panic” initiative that also focuses on the threats of synthetic media.

As a caveat, it’s worth highlighting that synthetic media isn’t inherently bad (just as Adobe Photoshop isn’t an evil tool in and of itself); for example, think of the waves that deepfake technologies will likely make in the film and television industries. Thus, the outright banning of deepfake technologies is arguably not beneficial to society, nor is it even really possible, for that matter; the horse has left the stable, so to speak. The question is, in addition to content provenance initiatives, how else can we handle synthetic media moving forward?

Legal issues

For better or worse (mostly worse), social media platforms hold a great deal of power in our current societal landscape. As synthetic media becomes increasingly easy to create, deepfakes will be ubiquitous on social media platforms, and legislation will likely be required.

As Metaphysic’s Henry Ajder and M.I.T.’s Joshua Glick, in conjunction with folks from Witness and the M.I.T. Open Documentary Lab, write in their “Just Joking” report, “Too light a [regulatory] touch could confuse viewers or lead to the spread of online misinformation. Too aggressive an approach could result in deepfake art being unable to find a platform.” The problem is complicated, and these early thinkers are seeking a Goldilocks approach.

Historically—and much to the ire of politicians on both sides of the aisle—social media companies’ content hosting has largely been protected by Section 230 of the Communications Decency Act, which says: “No provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider.”

Section 230 effectively grants a social media company, say Facebook, the permission to moderate its users’ content without being legally considered a publisher itself. Given the increased power held by Big Tech, Congress is currently looking closely at the scope of Section 230.

One piece of legislation that does specifically address synthetic media is California’s AB 730. Under AB 730, political deepfakes are banned for the 60 days before a California election unless the content is clearly identified as satire. In the coming months, we will likely see more bills like this one making the rounds on Capitol Hill.

Key takeaways

Deepfakes are already here, and it’s inevitable that synthetic media will become more widespread and more convincing in the future. Not to be alarmist, but the potentially dire ramifications of synthetic media are hard to overstate.

Due to the nature of the internet, including image/video construction and modification, it simply isn’t possible to engage in deepfake detection at scale. The best course of action, besides measured legislation at the state and federal level, is to embrace media content provenance initiatives.

John Donegan
