ObjectRL is a deep reinforcement learning library designed for research and rapid prototyping. It focuses on deep actor-critic algorithms for continuous control tasks such as those in the MuJoCo environment suite, while providing a flexible object-oriented architecture that supports future extensions to value-based and discrete-action methods.
- Object-oriented design for easy experimentation (see the sketch after this list)
- Implements popular deep RL algorithms for continuous control
- Includes experimental implementations of Bayesian and value-based methods
- Supports easy configuration via CLI and YAML files
- Rich examples and tutorials for customization and advanced use cases
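As a taste of what the object-oriented design enables, the sketch below subclasses a SAC-style agent and swaps out a single component while inheriting everything else. The class and method names (`SoftActorCritic`, `critic_loss`) are hypothetical placeholders, not ObjectRL's actual API; consult the documentation for the real class hierarchy.

```python
# Illustrative sketch only: hypothetical class names, NOT ObjectRL's actual API.
# It shows the pattern an object-oriented codebase makes easy: override one
# piece of an algorithm in a subclass and reuse the rest unchanged.
import torch
import torch.nn.functional as F


class SoftActorCritic:
    """Stand-in for a library-provided SAC-style agent."""

    def critic_loss(self, q_pred: torch.Tensor, q_target: torch.Tensor) -> torch.Tensor:
        # Default choice: mean-squared TD error.
        return F.mse_loss(q_pred, q_target)


class HuberSAC(SoftActorCritic):
    """An experiment: change only the critic objective, keep the rest of the agent."""

    def critic_loss(self, q_pred: torch.Tensor, q_target: torch.Tensor) -> torch.Tensor:
        # Huber loss is less sensitive to outlier TD errors.
        return F.smooth_l1_loss(q_pred, q_target)
```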
ObjectRL currently provides the following algorithms:
- DDPG (Deep Deterministic Policy Gradient)
- TD3 (Twin Delayed DDPG)
- SAC (Soft Actor-Critic)
- PPO (Proximal Policy Optimization)
- REDQ (Randomized Ensembled Double Q-Learning)
- DRND (Distributional Random Network Distillation)
- OAC (Optimistic Actor-Critic)
- PBAC (PAC-Bayesian Actor-Critic)
- BNN-SAC (Bayesian Neural Network SAC) — experimental, in examples
- DQN (Deep Q-Network) — experimental, in examples
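With the exception of DQN, the methods above are deep actor-critic algorithms and share much of their machinery. As a rough illustration of one shared ingredient, the snippet below computes the clipped double-Q bootstrap target used (in some variant) by TD3, SAC, and REDQ; SAC additionally adds an entropy bonus and REDQ takes the minimum over a random subset of a critic ensemble, both omitted here. This is a generic PyTorch sketch, not ObjectRL's implementation.

```python
# Generic sketch of a clipped double-Q (TD3/SAC-style) bootstrap target.
# Not ObjectRL code; shown only to illustrate what the listed methods share.
import torch


def double_q_target(
    reward: torch.Tensor,    # (batch,) rewards r_t
    not_done: torch.Tensor,  # (batch,) 1.0 if the episode continues, else 0.0
    q1_next: torch.Tensor,   # (batch,) first target critic at the next state-action
    q2_next: torch.Tensor,   # (batch,) second target critic at the next state-action
    gamma: float = 0.99,     # discount factor
) -> torch.Tensor:
    """Bellman target y = r + gamma * min(Q1', Q2') for non-terminal transitions."""
    min_q_next = torch.minimum(q1_next, q2_next)  # the "clipping" that curbs overestimation
    return reward + gamma * not_done * min_q_next


# Example with dummy tensors:
y = double_q_target(torch.zeros(4), torch.ones(4), torch.rand(4), torch.rand(4))
```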
Create and activate a conda environment:

```bash
conda create -n objectrl python=3.12 -y
conda activate objectrl
```

Install the released package from PyPI:

```bash
pip install objectrl
```

Or install from source:

```bash
git clone https://github.com/adinlab/objectrl.git
cd objectrl
pip install -e .
```

To enable additional features such as documentation generation:

```bash
pip install objectrl[docs]
```

Run your first experiment using Soft Actor-Critic (SAC) on the default cheetah environment:
If installed from PyPI:
```bash
python -m objectrl.main --model.name sac
```

If running from a cloned repo:

```bash
python objectrl/main.py --model.name sac
```

The examples below assume you are running from a cloned repository.
Run DDPG on the hopper environment:
```bash
python objectrl/main.py --model.name ddpg --env.name hopper
```

Train SAC for 100,000 steps and evaluate every 5 episodes:

```bash
python objectrl/main.py --model.name sac --env.name hopper --training.max_steps 100000 --training.eval_episodes 5
```

For more complex or reproducible setups, create YAML config files in `objectrl/config/model_yamls/` and specify them at runtime:

```bash
python objectrl/main.py --config objectrl/config/model_yamls/ppo.yaml
```

Example `ppo.yaml`:
```yaml
model:
  name: ppo
training:
  warmup_steps: 0
  learn_frequency: 2048
  batch_size: 64
  n_epochs: 10
```
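For intuition about how such a file relates to the dot-notation CLI flags shown earlier, the sketch below overlays a nested YAML config onto a set of defaults; each nested key corresponds to a flag such as `--training.batch_size`. This is an illustrative snippet, not ObjectRL's actual configuration loader, and the default values are made up for the example.

```python
# Illustrative only: NOT ObjectRL's config loader. Shows how a nested YAML file
# like ppo.yaml overlays a set of defaults, mirroring dot-notation CLI flags
# such as --training.batch_size. The default values below are invented.
import yaml  # pip install pyyaml


def merge(base: dict, extra: dict) -> dict:
    """Recursively overlay `extra` onto `base` without mutating either."""
    out = dict(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge(out[key], value)
        else:
            out[key] = value
    return out


defaults = {  # hypothetical defaults, for illustration only
    "model": {"name": "sac"},
    "training": {"warmup_steps": 10_000, "batch_size": 256, "n_epochs": 1},
}

with open("objectrl/config/model_yamls/ppo.yaml") as f:
    config = merge(defaults, yaml.safe_load(f))

print(config["training"]["batch_size"])  # -> 64, taken from ppo.yaml
```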
If you encounter common issues or errors during installation or usage, please see the Issues guide for solutions and tips.
For other questions or to report bugs, visit our GitHub Issues page.
Explore detailed documentation, tutorials, and API references at: https://objectrl.readthedocs.io
If you use ObjectRL in your research, please cite:
```bibtex
@article{baykal2025objectrl,
  title   = {ObjectRL: An Object-Oriented Reinforcement Learning Codebase},
  author  = {Baykal, Gulcin and Akg{\"u}l, Abdullah and Haussmann, Manuel and Tasdighi, Bahareh and Werge, Nicklas and Wu, Yi-Shan and Kandemir, Melih},
  year    = {2025},
  journal = {arXiv preprint arXiv:2507.03487}
}
```