A. Dörsch · M. Grimmer · L. J. Gonzalez-Soler · R. Casula · G. L. Marcialis · C. Busch · C. Rathgeb
This is the official repository of the paper: FaceSpoofLDM: Language-guided synthesis of face presentation attacks based on latent diffusion
- Trained model checkpoint and configuration are available upon request
This research contributes to the development of fairer and more secure biometric systems by introducing FaceSpoofLDM, a latent diffusion model (LDM) for language-guided image synthesis that generates synthetic face PAs and non-attacks across various demographic groups.
Presentation Attacks (PAs) pose a serious threat to face recognition (FR) systems. These attacks cover a broad range of scenarios, including images replayed on various devices, printed photographs, or more sophisticated approaches such as 3D masks used to impersonate another identity. Recent advances in deep neural networks have led to an increasing number of face presentation attack detection (PAD) methods, replacing traditional approaches with great success. However, these methods are highly data-intensive and require large amounts of training data for reliable decision-making. Although several face PAD datasets have been introduced, they often come with restricted usage, limited subject and attack diversity and privacy or legal constraints. In this work, we introduce FaceSpoofLDM, a latent diffusion model (LDM) for language-guided image synthesis to generate synthetic face PAs and non-attacks across various demographic groups. Our approach reduces the need for manually crafting physical presentation attack instruments (PAI) while increasing scalability and attack diversity. Extensive experiments demonstrate the effectiveness of our model and show that incorporating synthetic PAIs, on average, enhances security against PAs.
FaceSpoofLDM builds upon the official implementation of Latent Diffusion Models (LDM) by Rombach et al. Please clone the official CompVis Latent Diffusion repository and follow its setup instructions. Additionally, download the required pre-trained VQ4 autoencoder model as described in the LDM repository.
Download the FaceSpoofLDM checkpoint file and the associated YAML configuration file.
Note: The FaceSpoofLDM checkpoint and YAML configuration file are available upon request.
Requests must be made from an institutional email address. Use of the model is permitted only to researchers for non-commercial academic purposes.
Please contact André Dörsch (andre.doersch -at- h-da.de) to request access.
Inference can be performed by adapting the original Latent Diffusion txt2img.py script to load the FaceSpoofLDM checkpoint and configuration:
...
# Paths to the requested FaceSpoofLDM checkpoint and YAML configuration
model_path = "path/to/facespoofldm_model.ckpt"
model_config_path = "path/to/facespoofldm_config.yaml"
# OmegaConf and load_model_from_config are already imported/defined in txt2img.py
config = OmegaConf.load(model_config_path)
model = load_model_from_config(config, model_path)
...
# FaceSpoofLDM latent shape
shape = [3, 64, 64]
FaceSpoofLDM was trained on a fixed prompt template and does not support arbitrary text input. Text prompts must match the prompt template used during model training.
One of the following (PAI species or non-attack) prefixes must be included in the input text prompt:
A real face image of a live person.
A face spoof attack replayed on a Samsung device screen.
A face spoof attack replayed on an iPad screen.
A face spoof attack replayed via a webcam.
A face spoof attack using a printed photograph.
A face spoof attack using a printed image on a t-shirt.
Arbitrary prefixes (unsupported PAI species) are beyond the model's generation capabilities.
Each prompt prefix must be followed by this template:
The person appears to be {age} years old, {gender}, and of {ethnicity} ethnicity.
Supported values:
- age: integer value within the synthetic training domain
- gender: male | female
- ethnicity: white | black | asian
A face spoof attack replayed via a webcam. The person appears to be 23 years old, female, and of white ethnicity.
A real face image of a live person. The person appears to be 54 years old, female, and of asian ethnicity.
A face spoof attack using a printed photograph. The person appears to be 55 years old, male, and of white ethnicity.
A face spoof attack replayed on a Samsung device screen. The person appears to be 33 years old, female, and of black ethnicity.
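Because prompts must follow the template exactly, it can help to assemble them programmatically rather than by hand. The following is a minimal sketch, not part of the official code: the helper names `build_prompt` and `prompts_for_age` and the dictionary keys (`bona_fide`, `replay_webcam`, ...) are hypothetical labels chosen for illustration, while the prefix strings, the template, and the gender and ethnicity values are copied from the lists above. The supported age range is not stated here, so the sketch only checks that the age is a positive integer.

```python
from itertools import product

# Prefix strings copied from the supported list above.
# The dictionary keys are hypothetical short labels, not names used by the model.
PREFIXES = {
    "bona_fide": "A real face image of a live person.",
    "replay_samsung": "A face spoof attack replayed on a Samsung device screen.",
    "replay_ipad": "A face spoof attack replayed on an iPad screen.",
    "replay_webcam": "A face spoof attack replayed via a webcam.",
    "print_photo": "A face spoof attack using a printed photograph.",
    "print_tshirt": "A face spoof attack using a printed image on a t-shirt.",
}
GENDERS = ("male", "female")
ETHNICITIES = ("white", "black", "asian")
TEMPLATE = (
    "The person appears to be {age} years old, {gender}, "
    "and of {ethnicity} ethnicity."
)


def build_prompt(kind, age, gender, ethnicity):
    """Return a prompt in the fixed training template, or raise ValueError."""
    if kind not in PREFIXES:
        raise ValueError(f"unsupported prefix label: {kind!r}")
    # Assumption: only positivity is checked, since the valid age range
    # of the training domain is not documented in this README.
    if isinstance(age, bool) or not isinstance(age, int) or age <= 0:
        raise ValueError(f"age must be a positive integer, got {age!r}")
    if gender not in GENDERS:
        raise ValueError(f"unsupported gender: {gender!r}")
    if ethnicity not in ETHNICITIES:
        raise ValueError(f"unsupported ethnicity: {ethnicity!r}")
    suffix = TEMPLATE.format(age=age, gender=gender, ethnicity=ethnicity)
    return f"{PREFIXES[kind]} {suffix}"


def prompts_for_age(age):
    """Enumerate every prefix/gender/ethnicity combination for one age."""
    return [
        build_prompt(kind, age, gender, ethnicity)
        for kind, gender, ethnicity in product(PREFIXES, GENDERS, ETHNICITIES)
    ]
```

For example, `build_prompt("replay_webcam", 23, "female", "white")` reproduces the first example prompt above, and `prompts_for_age(30)` yields one prompt per supported combination, which may be convenient when sampling a demographically balanced set.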
This repository and the associated model files are provided exclusively for academic research use. The authors disclaim any responsibility for misuse of the generated images. Details regarding model limitations and usage constraints can be found in the official paper.
If you use this repository and find it useful for your research, please consider citing this paper:
@article{Doersch-FaceSpoofLDM-2026,
author={A. Dörsch and M. Grimmer and L. J. Gonzalez-Soler and R. Casula and G. L. Marcialis and C. Busch and C. Rathgeb},
journal={IEEE Access},
title={FaceSpoofLDM: Language-guided synthesis of face presentation attacks based on latent diffusion},
year={2026},
doi={10.1109/ACCESS.2026.3651853}}
As FaceSpoofLDM is based on Latent Diffusion, please cite:
@article{Rombach-LDM-CVF-2021,
title={{High-Resolution Image Synthesis with Latent Diffusion Models}},
author={R. Rombach and A. Blattmann and D. Lorenz and P. Esser and B. Ommer},
journal={2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2022},
pages={10674-10685},
}
As FaceSpoofLDM was exclusively trained on synthetic identities from SynthASpoof and TFPA, please further cite:
@InProceedings{Fang-SynthASpoof-CVPR-2023,
author = {M. Fang and M. Huber and N. Damer},
title = {{SynthASpoof: Developing Face Presentation Attack Detection Based on Privacy-Friendly Synthetic Data}},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
year = {2023},
pages = {1061-1070}
}
@article{Ibsen-AttackingFaceRecognitionWithTshirts-IEEEAccess-2023,
author = {M. Ibsen and C. Rathgeb and F. Brechtel and R. Klepp and K. P\"oppelmann and A. George and S. Marcel and C. Busch},
journal = {{IEEE} Access},
pages = {57867--57879},
title = {{Attacking Face Recognition With T-Shirts: Database, Vulnerability Assessment, and Detection}},
volume = {11},
year = {2023}
}

