Synthetic Realities and Data in Biometric Analysis and Security
Fadi Boutros · Eduarda Caldeira · Laura Cassani · Naser Damer · Marija Ivanovska · Vishal Patel · Ajita Rattani · Anderson Rocha · Matthew Stamm · Vitomir Štruc
Abstract
Recent advancements in generative models have revolutionized the way researchers approach data-driven tasks. The advent of sophisticated generative models, such as Generative Adversarial Networks, Variational Autoencoders, and Diffusion Models, has empowered practitioners to create partially or fully synthetic data that closely reflects real-world scenarios. The significance of these generative models lies in their ability to produce remarkably realistic data, thus mitigating challenges associated with data scarcity. As a result, the use of synthetic data has become increasingly prevalent across research domains, offering a versatile and ethical alternative for training and testing machine learning algorithms. However, the very realism that makes synthetic data valuable also blurs the line between authentic and manipulated content, resulting in datasets that can be used to mislead, manipulate, or even harm individuals when deployed unethically. The SynRDinBAS Workshop \& Challenge aims to explore the diverse applications of synthetic realities and data in biometric analysis, while addressing critical security issues such as data privacy and the ethical concerns of data manipulation. Participants will examine how synthetic datasets have been instrumental in training systems for facial recognition, emotion detection, gesture recognition, and related tasks. The workshop will showcase exemplary use cases demonstrating how synthetic data not only overcomes the limitations of real-world datasets but also fosters the development of more robust and accurate models. Additionally, the potential risks and ethical dilemmas that arise from manipulating data will be discussed, ensuring that our approaches prioritize privacy and integrity in biometric applications.
The hosted competition will focus on bridging the research gap associated with the detection of partially synthetic data, as localized changes (such as adding or removing objects or subtly altering faces) are more difficult to detect and more likely to deceive viewers. **Challenge Design:** We will host a script-based challenge, ensuring open access and reproducibility for the research community. At least one of the tasks will focus on detecting images generated by state-of-the-art models, but the central novelty will be the inclusion of more nuanced manipulation scenarios. Tasks include detecting and localizing object additions/removals, identifying in-painted or otherwise altered regions, and distinguishing between fully synthetic, partially synthetic, and pristine content. By anchoring the challenge in these more subtle forms of manipulation, we aim to stimulate new methods that move beyond binary classification toward a fine-grained understanding of image authenticity. New datasets will be generated specifically for this effort, with a substantial portion synthetically manipulated at varying levels of granularity and annotated for both detection and localization. Only small pilot task samples will be released in advance, enabling participants to calibrate their approaches while ensuring that the test sets remain robust for evaluation. Submitted models will be evaluated on sequestered test benchmarks; the organizers will not have access to the individual models, protecting participants' intellectual property. A dedicated submission platform will be provided for the competition. Submissions will be assessed using a combination of standard detection metrics (accuracy, balanced accuracy, and AUC) and localization-specific metrics such as IoU and pixel-level F1 scores. \textbf{Note}: To encourage participation and reward innovation, top performers may be eligible for research grants of up to \$250,000, and travel stipends will be provided for invited teams.
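To make the scoring criteria concrete, the sketch below gives minimal, dependency-free Python implementations of the named metrics: balanced accuracy for image-level detection, and IoU and pixel-level F1 for localization over flattened binary masks. This is an illustrative reference only, not the official evaluation code; the function names and input formats are assumptions for the example.

```python
def iou(pred_mask, gt_mask):
    """Intersection-over-Union for flattened binary (0/1) masks."""
    inter = sum(p & g for p, g in zip(pred_mask, gt_mask))
    union = sum(p | g for p, g in zip(pred_mask, gt_mask))
    # Both masks empty: treat as a perfect match.
    return inter / union if union else 1.0

def pixel_f1(pred_mask, gt_mask):
    """Pixel-level F1 score: harmonic mean of precision and recall."""
    tp = sum(p & g for p, g in zip(pred_mask, gt_mask))
    fp = sum(p & (1 - g) for p, g in zip(pred_mask, gt_mask))
    fn = sum((1 - p) & g for p, g in zip(pred_mask, gt_mask))
    return 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls; robust to imbalanced label counts."""
    recalls = []
    for c in set(y_true):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)
```

Balanced accuracy matters here because pristine images typically far outnumber manipulated ones, so plain accuracy would reward a detector that labels everything pristine.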