This repository contains an implementation of Direct Preference Optimization (DPO), a method that optimizes a language-model policy directly on human preference data, without an explicit reward model or a reinforcement-learning loop.
The repository provides the implementation itself, experiments comparing DPO with Supervised Fine-Tuning (SFT), and a full theoretical and experimental report on the DPO method.
The report, titled Direct Preference Optimization: A Theoretical and Experimental Analysis, gives a complete mathematical review of the method, covering the theoretical framework from the Plackett-Luce model of preference probabilities to recent advances in Direct Preference Optimization.
It includes:
- A step-by-step derivation of the DPO loss function from the Bradley-Terry and Plackett-Luce preference models (the key equations are reproduced after this list).
- A reformulation of the KL-constrained reward-maximization objective into a closed-form expression for the optimal policy, leading to a simpler training procedure.
- Detailed theoretical insights, including proofs showing that language models can be seen as implicitly trained reward models.
- A comparison of DPO with PPO-based RLHF, highlighting DPO's computational simplicity and training stability.
- Experimental results obtained with GPT-2-medium on preference datasets, showing that DPO is effective even with small models.
- Evaluation using Sentence-BERT to approximate win rates from semantic similarity (see the evaluation sketch below).
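
For reference, the two central results derived in the report (and in the original DPO paper) are the closed-form solution of the KL-constrained reward-maximization objective and the DPO loss that follows from it:

$$
\pi^*(y \mid x) = \frac{1}{Z(x)}\, \pi_{\mathrm{ref}}(y \mid x) \exp\!\left(\frac{1}{\beta}\, r(x, y)\right)
$$

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]
$$

where $y_w$ and $y_l$ are the preferred and dispreferred completions, $\sigma$ is the logistic sigmoid, $\beta$ controls the deviation from the reference policy $\pi_{\mathrm{ref}}$, and $Z(x)$ is the partition function.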
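To make the loss concrete, here is a minimal PyTorch sketch of the DPO objective. It illustrates the formula above rather than reproducing this repository's exact code, and it assumes the summed per-sequence log-probabilities have already been computed (i.e., token log-probabilities of the completion summed with prompt tokens masked out).

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Compute the DPO loss from summed per-sequence log-probabilities.

    Each argument is a tensor of shape (batch,) holding log pi(y | x)
    for the chosen (preferred) or rejected completion, under either the
    policy being trained or the frozen reference model.
    """
    # Implicit rewards: beta times the log-ratio of policy to reference.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)

    # Negative log-sigmoid of the reward margin (Bradley-Terry likelihood).
    loss = -F.logsigmoid(chosen_rewards - rejected_rewards)
    return loss.mean()
```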
The full report can be found here; it provides an accessible yet rigorous mathematical treatment of the method, its motivation, and its implications.
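The Sentence-BERT win-rate evaluation can be approximated as in the sketch below. It uses the `sentence-transformers` package; the checkpoint name and the `win_rate` helper are illustrative choices, not necessarily the repository's exact setup.

```python
from sentence_transformers import SentenceTransformer, util

# Any pretrained Sentence-BERT checkpoint works; this one is a common default.
encoder = SentenceTransformer("all-MiniLM-L6-v2")

def win_rate(dpo_outputs, sft_outputs, references):
    """Fraction of prompts where the DPO completion is semantically
    closer to the preferred reference completion than the SFT one."""
    dpo_emb = encoder.encode(dpo_outputs, convert_to_tensor=True)
    sft_emb = encoder.encode(sft_outputs, convert_to_tensor=True)
    ref_emb = encoder.encode(references, convert_to_tensor=True)

    # Cosine similarity of each output to its own reference (diagonal terms).
    dpo_sim = util.cos_sim(dpo_emb, ref_emb).diagonal()
    sft_sim = util.cos_sim(sft_emb, ref_emb).diagonal()
    return (dpo_sim > sft_sim).float().mean().item()
```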
Ensure you have Python 3.8+ installed. Then, clone the repository and install the dependencies:
```bash
git clone https://github.com/giovanni-br/DPO-Implementation.git
cd DPO-Implementation
pip install -r requirements.txt
```