Deepfake Detectors Can Be Fooled into Identifying Fake Images as Real

venturebeat | April 09, 2020

  • Forensic classifiers, AI systems trained to distinguish between real and synthetic content, are susceptible to adversarial attacks, that is, attacks leveraging inputs designed to cause mistakes in models.

  • Researchers demonstrated that it’s possible to bypass fake video detectors by adversarially modifying videos synthesized using existing AI generation methods.

  • It’s a troubling, if not necessarily new, development for organizations attempting to productize fake media detectors, particularly considering the meteoric rise in deepfake content online.


In a paper published this week on the preprint server Arxiv.org, researchers from Google and the University of California at Berkeley demonstrate that even the best forensic classifiers — AI systems trained to distinguish between real and synthetic content — are susceptible to adversarial attacks, or attacks leveraging inputs designed to cause mistakes in models. Their work follows that of a team of researchers at the University of California at San Diego, who recently demonstrated that it’s possible to bypass fake video detectors by adversarially modifying — specifically, by injecting information into each frame — videos synthesized using existing AI generation methods.


It’s a troubling, if not necessarily new, development for organizations attempting to productize fake media detectors, particularly considering the meteoric rise in deepfake content online. Fake media might be used to sway opinions during an election or implicate a person in a crime, and it’s already been abused to generate pornographic material of actors and defraud a major energy producer.


The researchers first tackled the simpler task of evaluating classifiers to which they had unfettered access. Using this “white-box” threat model and a data set of 94,036 sample images, they modified synthesized images so that they were misclassified as real and vice versa, applying various attacks — a distortion-minimizing attack, a universal adversarial-patch attack, and a universal latent-space attack — to a classifier taken from the academic literature.


The distortion-minimizing attack, which involved adding a small perturbation (i.e., modifying a subset of pixels) to a synthetically generated image, caused one classifier to misclassify 71.3% of images with only 2% pixel changes and 89.7% of images with 4% pixel changes. Perhaps more alarmingly, the model classified 50% of real images as fake after the researchers distorted under 7% of the images’ pixels.
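To make the idea of a distortion-minimizing attack concrete, here is a minimal sketch in Python. It is not the researchers' method: it assumes a toy linear detector (hypothetical `fake_score`, where a positive score means "fake") and a hypothetical greedy routine, `min_pixel_attack`, that overwrites as few pixels as possible until the score turns negative. Real forensic classifiers are deep networks, and attacks on them rely on gradient-based optimization instead.

```python
def fake_score(x, w, b):
    """Toy linear detector: a positive score means 'fake', a negative one 'real'."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def min_pixel_attack(x, w, b):
    """Overwrite as few pixels as possible until the detector says 'real'.

    Pixels are assumed to lie in [0, 1]. Returns the modified image, the
    indices of the changed pixels, and whether the attack succeeded.
    """
    score = fake_score(x, w, b)

    # For each pixel, find the extreme value that lowers the score the most
    # and record how much the score would drop (the "gain").
    candidates = []
    for i, (wi, xi) in enumerate(zip(w, x)):
        target = 0.0 if wi > 0 else 1.0
        gain = wi * (xi - target)  # always >= 0
        candidates.append((gain, i, target))
    candidates.sort(reverse=True)  # most helpful pixels first

    adv, changed = list(x), []
    for gain, i, target in candidates:
        if score < 0 or gain <= 0:
            break
        adv[i] = target
        score -= gain
        changed.append(i)
    return adv, changed, score < 0


# Made-up 4-pixel "image" and detector weights, for illustration only.
image = [0.9, 0.8, 0.1, 0.5]
weights = [2.0, 0.5, -1.0, 0.1]
adv, changed, success = min_pixel_attack(image, weights, -0.5)
print(changed, success)
```

For a linear detector, picking the most helpful pixels first yields the smallest possible number of changed pixels, which is why changing a small fraction of an image can be enough to flip the verdict.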


A second attack, a loss-maximizing attack that fixed the image distortion to be less than a specified threshold, reduced the classifier’s accuracy from 96.6% to 27%. The universal adversarial-patch attack was even more effective: a visible noise pattern overlaid on two fake images spurred the model to classify them as real with likelihoods of 98% and 86%. And the final attack, the universal latent-space attack, in which the team modified the underlying representation leveraged by an image-generating model to yield an adversarial image, reduced classification accuracy from 99% to 17%.
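An attack that keeps the distortion under a fixed threshold can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's implementation: the detector is a toy logistic model (hypothetical `fake_probability`), the distortion budget `eps` is taken as a per-pixel bound (the article does not say which distortion measure the researchers used), and `bounded_attack_step` is a hypothetical single signed-gradient step.

```python
import math


def fake_probability(x, w, b):
    """Toy logistic detector: probability that image x is fake."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def bounded_attack_step(x, w, eps):
    """Shift every pixel by at most eps in the direction that makes a fake
    image look more 'real', raising the detector's loss on it.

    For a linear model the gradient of the logit with respect to pixel i is
    w[i], so stepping against the sign of w[i] lowers the fake probability.
    """
    adv = []
    for xi, wi in zip(x, w):
        direction = -1.0 if wi > 0 else (1.0 if wi < 0 else 0.0)
        # Keep pixels inside the valid [0, 1] range.
        adv.append(min(1.0, max(0.0, xi + eps * direction)))
    return adv


# Made-up 4-pixel "image" and detector weights, for illustration only.
image = [0.9, 0.8, 0.1, 0.5]
weights = [2.0, 0.5, -1.0, 0.1]
adv = bounded_attack_step(image, weights, eps=0.1)
print(fake_probability(image, weights, -0.5), fake_probability(adv, weights, -0.5))
```

The design choice here is the opposite of the distortion-minimizing approach: the amount of change is capped up front, and the attacker does as much damage as possible within that cap.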




The researchers next investigated a black-box attack, in which the inner workings of the target classifier were unknown to them. They built their own source classifier by collecting one million images synthesized by an AI model and one million real images on which that model was trained, then training a separate system to classify images as fake or real. They then generated white-box adversarial examples on this source classifier using a distortion-minimizing attack. They report that this reduced their own classifier’s accuracy from 85% to 0.03% and that, when applied to a popular third-party classifier, it reduced that classifier’s accuracy from 96% to 22%.
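The transfer idea behind this black-box attack can be sketched as follows. Everything in the sketch is a toy assumption rather than the researchers' setup: images are 8-pixel lists, the hidden third-party detector (`target_says_fake`) is a simple brightness rule, the attacker's own classifier is a nearest-centroid model instead of a trained neural network, and a bounded per-pixel step stands in for the distortion-minimizing attack. All function names are hypothetical.

```python
import random

DIM = 8  # pixels per toy "image"


def sample_images(rng, n, lo, hi):
    """Draw n toy images with every pixel uniform in [lo, hi]."""
    return [[rng.uniform(lo, hi) for _ in range(DIM)] for _ in range(n)]


def target_says_fake(x):
    """Hidden third-party detector; the attacker never inspects this rule."""
    return sum(x) / len(x) > 0.5


def train_surrogate(fakes, reals):
    """Attacker's own detector: a nearest-centroid linear classifier."""
    mean_f = [sum(img[i] for img in fakes) / len(fakes) for i in range(DIM)]
    mean_r = [sum(img[i] for img in reals) / len(reals) for i in range(DIM)]
    w = [f - r for f, r in zip(mean_f, mean_r)]
    b = -sum(wi * (f + r) / 2 for wi, f, r in zip(w, mean_f, mean_r))
    return w, b


def craft_on_surrogate(x, w, eps):
    """White-box step that uses only the surrogate's weights."""
    adv = []
    for xi, wi in zip(x, w):
        step = -eps if wi > 0 else (eps if wi < 0 else 0.0)
        adv.append(min(1.0, max(0.0, xi + step)))
    return adv


def fake_detection_rate(images):
    """Fraction of images the hidden target detector flags as fake."""
    return sum(target_says_fake(img) for img in images) / len(images)


rng = random.Random(0)
# Toy data: "fake" images are slightly brighter than "real" ones.
w, b = train_surrogate(sample_images(rng, 200, 0.5, 0.6),
                       sample_images(rng, 200, 0.4, 0.5))
held_out_fakes = sample_images(rng, 50, 0.5, 0.6)
transferred = [craft_on_surrogate(img, w, eps=0.15) for img in held_out_fakes]
print(fake_detection_rate(held_out_fakes), fake_detection_rate(transferred))
```

In this toy setting the target flags every untouched fake and none of the modified ones, because the surrogate learns the same "brighter means fake" cue the target relies on. Real detectors share cues far less cleanly, which is consistent with the partial transfer the researchers report.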

To the extent that synthesized or manipulated content is used for nefarious purposes, the problem of detecting this content is inherently adversarial. We argue, therefore, that forensic classifiers need to build an adversarial model into their defences.

- Researchers


 

 


Demonstrating attacks on sensitive systems is not something that should be taken lightly, or done simply for sport. However, if such forensic classifiers are currently deployed, the false sense of security they provide may be worse than if they were not deployed at all — not only would a fake profile picture appear authentic, now it would be given additional credibility by a forensic classifier. Even if forensic classifiers are eventually defeated by a committed adversary, these classifiers are still valuable in that they make it more difficult and time-consuming to create a convincing fake.

- Researchers


Fortunately, a number of companies have published corpora in the hopes that the research community will pioneer new detection methods. To accelerate such efforts, Facebook — along with Amazon Web Services (AWS), the Partnership on AI, and academics from a number of universities — is spearheading the Deepfake Detection Challenge. The Challenge includes a data set of video samples labeled to indicate which were manipulated with AI. In September 2019, Google released a collection of visual deepfakes as part of the FaceForensics benchmark, which was cocreated by the Technical University of Munich and the University Federico II of Naples. More recently, researchers from SenseTime partnered with Nanyang Technological University in Singapore to design DeeperForensics-1.0, a data set for face forgery detection that they claim is the largest of its kind.




