Artificial intelligence is the central theme of this year's European Cyber Week from 19-21 November in Rennes, Brittany. In a challenge organised to coincide with the event by France's Defence Innovation Agency (AID), Thales teams have successfully developed a metamodel for detecting AI-generated images. As the use of AI technologies gains traction, and at a time when disinformation is becoming increasingly prevalent in the media and impacting every sector of the economy, the deepfake detection metamodel offers a way to combat image manipulation in a wide range of use cases, such as the fight against identity fraud.
AI-generated images are created using AI platforms such as Midjourney, Dall-E and Firefly. Some studies have predicted that within a few years the use of deepfakes for identity theft and fraud could cause huge financial losses. Gartner has estimated that around 20% of cyberattacks in 2023 likely included deepfake content as part of disinformation and manipulation campaigns. Their report1 highlights the growing use of deepfakes in financial fraud and advanced phishing attacks.
The Thales metamodel uses machine learning techniques, decision trees and evaluations of the strengths and weaknesses of each model to analyse the authenticity of an image. It combines various models, including:
The CLIP method (Contrastive Language-Image Pre-training) involves connecting image and text by learning common representations. To detect deepfakes, the CLIP method analyses images and compares them with their textual descriptions to identify inconsistencies and visual artefacts.
The DNF (Diffusion Noise Feature) method uses current image-generation architectures (called diffusion models) to detect deepfakes. Diffusion models are based on an estimate of the amount of noise to be added to an image to cause a "hallucination", which creates content out of nothing, and this estimate can be used in turn to detect whether an image has been generated by AI.
The DCT (Discrete Cosine Transform) method of deepfake detection analyses the spatial frequencies of an image to spot hidden artefacts. By transforming an image from the spatial domain (pixels) to the frequency domain, DCT can detect subtle anomalies in the image structure, which occur when deepfakes are generated and are often invisible to the naked eye.
The Thales team behind the invention is part of cortAIx, the Group's AI accelerator, which has over 600 AI researchers and engineers, 150 of whom are based at the Saclay research and technology cluster south of Paris and work on mission-critical systems. The Friendly Hackers team has developed a toolbox called BattleBox to help assess the robustness of AI-enabled systems against attacks designed to exploit the intrinsic vulnerabilities of different AI models (including Large Language Models), such as adversarial attacks and attempts to extract sensitive information. To counter these attacks, the team develops advanced countermeasures such as unlearning, federated learning, model watermarking and model hardening.