Watermarking Language Models through Language Models

🔍 Project Overview

This repository presents our approach to watermarking language model outputs: a Prompting LM dynamically constructs system instructions that steer generation. The figure below illustrates our full pipeline, highlighting the interaction between user requests, system instructions, the Marking LM, and the Detecting LM.
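The data flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in the notebooks: the class `WatermarkPipeline`, the three `toy_*` functions, and the "Indeed" marker are hypothetical stand-ins chosen so the example runs without any model. It also assumes the Marking LM receives both the system instruction and the user request.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class WatermarkPipeline:
    """Hypothetical wiring of the three roles named in the overview."""
    prompting_lm: Callable[[str], str]      # user request -> system instruction
    marking_lm: Callable[[str, str], str]   # (system instruction, user request) -> response
    detecting_lm: Callable[[str], bool]     # text -> is a watermark present?

    def generate(self, user_request: str) -> str:
        # The system instruction is built per request, then handed to the Marking LM.
        system_instruction = self.prompting_lm(user_request)
        return self.marking_lm(system_instruction, user_request)

    def detect(self, text: str) -> bool:
        return self.detecting_lm(text)


# Toy stand-ins (assumptions): real models would replace these functions.
def toy_prompting_lm(user_request: str) -> str:
    return "Begin the answer with the word 'Indeed'."


def toy_marking_lm(system_instruction: str, user_request: str) -> str:
    body = f"here is a reply to: {user_request}"
    # Follow the instruction only if it asks for the toy marker.
    if "Indeed" in system_instruction:
        return "Indeed, " + body
    return body


def toy_detecting_lm(text: str) -> bool:
    return text.startswith("Indeed")


pipeline = WatermarkPipeline(toy_prompting_lm, toy_marking_lm, toy_detecting_lm)
marked = pipeline.generate("Explain photosynthesis.")
```

Keeping the three roles as plain callables makes it easy to swap a toy function for a real model call without changing the surrounding flow.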

Overview of the proposed scheme:

💬 Fixed Prompt for the Prompting Language Model:

The fixed prompt used to guide the Prompting LM in generating system instructions is shown below. This template ensures consistent instruction generation while allowing for diverse lexical, semantic, or structural watermarking strategies.
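As a rough illustration of how a fixed template keeps instruction generation consistent, the sketch below fills a single slot with the user request. The template wording, the name `FIXED_PROMPT`, and the helper `build_prompting_input` are all hypothetical; the repository's actual fixed prompt is not reproduced here.

```python
from string import Template

# Hypothetical wording (assumption): this is NOT the repository's fixed prompt.
FIXED_PROMPT = Template(
    "You write system instructions for another language model.\n"
    "Given the user request below, write one instruction that embeds a\n"
    "lexical, semantic, or structural watermark in the response.\n"
    "User request: $user_request"
)


def build_prompting_input(user_request: str) -> str:
    """Fill the single slot of the fixed template with the user request."""
    return FIXED_PROMPT.substitute(user_request=user_request)


first = build_prompting_input("Explain photosynthesis.")
second = build_prompting_input("Summarize the French Revolution.")
```

Only the request changes between `first` and `second`; the guidance text before it is identical, which is the property a fixed prompt is meant to provide.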

📊 Results Snapshot:

The following screenshot summarizes our main quantitative findings, including detection accuracy under fine-tuning and distillation. The watermark remains detectable under these model transformations, demonstrating the reliability and adaptability of the watermark signal.
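For readers unfamiliar with the metric, detection accuracy can be computed as the fraction of texts whose predicted label matches the true label. The function name `detection_accuracy` and the placeholder labels are assumptions for illustration only; they are not results from this work.

```python
from typing import Sequence


def detection_accuracy(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Fraction of texts whose predicted watermark label matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one example is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Placeholder labels (assumption): True means "watermarked".
toy_predicted = [True, False, True, True]
toy_actual = [True, False, False, True]
toy_accuracy = detection_accuracy(toy_predicted, toy_actual)
```

The same function can be applied separately to outputs of the original, fine-tuned, and distilled models to compare how well the signal survives each transformation.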

📚 Citation:

@ARTICLE{11146861,
  author={Dasgupta, Agnibh and Tanvir, Abdullah All and Zhong, Xin},
  journal={IEEE Transactions on Artificial Intelligence},
  title={Watermarking Language Models through Language Models},
  year={2025},
  volume={},
  number={},
  pages={1-10},
  keywords={Watermarking;Adaptation models;Robustness;Training;Large language models;Intellectual property;Closed box;Tuning;Context modeling;Codes;Content authentication;instruction control;large language models;prompt engineering;robust watermarking},
  doi={10.1109/TAI.2025.3605117}
}

📌 A. Dasgupta, A. A. Tanvir and X. Zhong, "Watermarking Language Models through Language Models," in IEEE Transactions on Artificial Intelligence, doi: 10.1109/TAI.2025.3605117.

Repository files:

- DLM.ipynb
- LICENSE
- PMLM.ipynb
- PMLM_distill.ipynb
- PMLM_finetune.ipynb
- README.md
- SI_similarity.ipynb
- attack.ipynb
- generate prompts.ipynb
- requirements.txt

Repository: cent664/LLMWM
