FaceSpoofLDM: Language-guided synthesis of face presentation attacks based on latent diffusion

A. Dörsch · M. Grimmer · L. J. Gonzalez-Soler · R. Casula · G. L. Marcialis · C. Busch · C. Rathgeb

IEEE Access 2026

This is the official repository of the paper: FaceSpoofLDM: Language-guided synthesis of face presentation attacks based on latent diffusion

News

January 2026

  • Trained model checkpoint and configuration are available upon request

Overview

This research work contributes to the development of fairer and more secure biometric systems by introducing FaceSpoofLDM, a latent diffusion model (LDM) for language-guided image synthesis that generates synthetic face PAs and non-attacks across various demographic groups.

Abstract

Presentation Attacks (PAs) pose a serious threat to face recognition (FR) systems. These attacks cover a broad range of scenarios, including images replayed on various devices, printed photographs, or more sophisticated approaches such as 3D masks used to impersonate another identity. Recent advances in deep neural networks have led to an increasing number of face presentation attack detection (PAD) methods, replacing traditional approaches with great success. However, these methods are highly data-intensive and require large amounts of training data for reliable decision-making. Although several face PAD datasets have been introduced, they often come with restricted usage, limited subject and attack diversity, and privacy or legal constraints. In this work, we introduce FaceSpoofLDM, a latent diffusion model (LDM) for language-guided image synthesis to generate synthetic face PAs and non-attacks across various demographic groups. Our approach reduces the need for manually crafting physical presentation attack instruments (PAIs) while increasing scalability and attack diversity. Extensive experiments demonstrate the effectiveness of our model and show that incorporating synthetic PAIs, on average, enhances security against PAs.

Inference Setup

1. Install the Latent Diffusion Model repository:

As FaceSpoofLDM builds upon the official implementation of Latent Diffusion Models (LDM) by Rombach et al., please clone the official CompVis Latent Diffusion repository and follow its setup instructions. Additionally, download the required pre-trained VQ4 autoencoder model as described in the LDM repository.
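A minimal setup sketch (the repository URL, environment name, and download script follow the CompVis latent-diffusion README; verify against that README before use):

```shell
# Clone the official CompVis Latent Diffusion repository
git clone https://github.com/CompVis/latent-diffusion.git
cd latent-diffusion

# Create and activate the conda environment shipped with the repository
conda env create -f environment.yaml
conda activate ldm

# Download the pre-trained first-stage (autoencoder) models,
# including the VQ4 model required by FaceSpoofLDM
bash scripts/download_first_stages.sh
```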


2. Download FaceSpoofLDM model and configuration

Download the FaceSpoofLDM checkpoint file and the associated YAML configuration file.

Note: The FaceSpoofLDM checkpoint and YAML configuration file are available upon request.

Requests must be made from an institutional email address. Use of the model is permitted only to researchers for non-commercial academic purposes.

Contact


Please contact André Dörsch (andre.doersch -at- h-da.de) to request access.

Usage

Running inference

Inference can be performed by adapting the original Latent Diffusion txt2img.py script to load the FaceSpoofLDM checkpoint and configuration:

...
from omegaconf import OmegaConf

model_path = "path/to/facespoofldm_model.ckpt"
model_config_path = "path/to/facespoofldm_config.yaml"

# Load the FaceSpoofLDM configuration and checkpoint
config = OmegaConf.load(model_config_path)
model = load_model_from_config(config, model_path)
...
# FaceSpoofLDM latent shape (channels, height, width)
shape = [3, 64, 64]

Prompt constraints:

FaceSpoofLDM was trained on a fixed prompt template and does not support arbitrary text input. Text prompts must match the prompt template used during model training.

1. Supported prompt prefixes (PAI species or non-attack)

One of the following (PAI species or non-attack) prefixes must be included in the input text prompt:

A real face image of a live person.
A face spoof attack replayed on a Samsung device screen.
A face spoof attack replayed on an iPad screen.
A face spoof attack replayed via a webcam.
A face spoof attack using a printed photograph. 
A face spoof attack using a printed image on a t-shirt.

Arbitrary prefixes (unsupported PAI species) are beyond the model's generation capabilities.

2. Soft-biometric characteristics

Each prompt prefix must be followed by this template:

The person appears to be {age} years old, {gender}, and of {ethnicity} ethnicity.

Supported values:

  • age: integer value within the synthetic training domain
  • gender: male | female
  • ethnicity: white | black | asian
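The constraints above can be enforced with a small helper that assembles a valid prompt from the fixed template (a sketch; `build_prompt` and the `SUPPORTED_*` names are illustrative, not part of this repository):

```python
# Sketch of a prompt builder enforcing the FaceSpoofLDM training template.
SUPPORTED_PREFIXES = [
    "A real face image of a live person.",
    "A face spoof attack replayed on a Samsung device screen.",
    "A face spoof attack replayed on an iPad screen.",
    "A face spoof attack replayed via a webcam.",
    "A face spoof attack using a printed photograph.",
    "A face spoof attack using a printed image on a t-shirt.",
]
SUPPORTED_GENDERS = {"male", "female"}
SUPPORTED_ETHNICITIES = {"white", "black", "asian"}


def build_prompt(prefix: str, age: int, gender: str, ethnicity: str) -> str:
    """Assemble a prompt matching the fixed template used during training."""
    if prefix not in SUPPORTED_PREFIXES:
        raise ValueError(f"Unsupported prefix: {prefix!r}")
    if gender not in SUPPORTED_GENDERS:
        raise ValueError(f"Unsupported gender: {gender!r}")
    if ethnicity not in SUPPORTED_ETHNICITIES:
        raise ValueError(f"Unsupported ethnicity: {ethnicity!r}")
    return (
        f"{prefix} The person appears to be {age} years old, "
        f"{gender}, and of {ethnicity} ethnicity."
    )


print(build_prompt(SUPPORTED_PREFIXES[3], 23, "female", "white"))
# → A face spoof attack replayed via a webcam. The person appears to be 23 years old, female, and of white ethnicity.
```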

3. Examples:

A face spoof attack replayed via a webcam. The person appears to be 23 years old, female, and of white ethnicity.

A real face image of a live person. The person appears to be 54 years old, female, and of asian ethnicity.

A face spoof attack using a printed photograph. The person appears to be 55 years old, male, and of white ethnicity.

A face spoof attack replayed on a Samsung device screen. The person appears to be 33 years old, female, and of black ethnicity.

Disclaimer

This repository and associated model files are provided exclusively for academic research use. The authors disclaim any responsibility for misuse of the generated images. Details regarding model limitations and usage constraints can be found in the official paper.

Citation

If you use this repository and find it useful for your research, please consider citing this paper:

@article{Doersch-FaceSpoofLDM-2026,
  author={A. Dörsch and M. Grimmer and L. J. Gonzalez-Soler and R. Casula and G. L. Marcialis and C. Busch and C. Rathgeb},
  journal={IEEE Access}, 
  title={FaceSpoofLDM: Language-guided synthesis of face presentation attacks based on latent diffusion}, 
  year={2026},
  doi={10.1109/ACCESS.2026.3651853}}


As FaceSpoofLDM is based on Latent Diffusion, please cite:

@article{Rombach-LDM-CVF-2021,
    title={{High-Resolution Image Synthesis with Latent Diffusion Models}},
    author={R. Rombach and A. Blattmann and D. Lorenz and P. Esser and B. Ommer},
    journal={2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2022},
    pages={10674-10685},
}

As FaceSpoofLDM was exclusively trained on synthetic identities from SynthASpoof and TFPA, please further cite:

@InProceedings{Fang-SynthASpoof-CVPR-2023,
    author    = {M. Fang and M. Huber and N. Damer},
    title     = {{SynthASpoof: Developing Face Presentation Attack Detection Based on Privacy-Friendly Synthetic Data}},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    year      = {2023},
    pages     = {1061-1070}
}

@article{Ibsen-AttackingFaceRecognitionWithTshirts-IEEEAccess-2023,
    author = {M. Ibsen and C. Rathgeb and F. Brechtel and R. Klepp and K. P\"oppelmann and A. George and S. Marcel and C. Busch},
    journal = {{IEEE} Access},
    pages = {57867--57879},
    title = {{Attacking Face Recognition With T-Shirts: Database, Vulnerability Assessment, and Detection}},
    volume = {11},
    year = {2023}
}
