Making Multi-Agent Reinforcement Learning accessible to everyone - from students to researchers
📖 View Official Documentation | 🚀 Quick Start Guide | 🧮 Algorithm Library
EasyMARL is a comprehensive, beginner-friendly framework for multi-agent reinforcement learning that bridges the gap between educational simplicity and research-ready performance. Whether you're learning MARL for the first time or conducting cutting-edge research, EasyMARL provides the tools you need.
- Beginner-Friendly Design: Clear, well-documented code structure
- Interactive GUI: Web-based interface requiring no coding experience
- Step-by-Step Learning: Detailed tutorials and examples
- Algorithm Comparisons: Side-by-side performance analysis
- 🧠 21+ MARL Algorithms: All algorithms comprehensively implemented in a single unified framework
- 📚 Single Algorithm Library: All state-of-the-art algorithms organized by taxonomy in algorithms/__init__.py
- 🖥️ Full CLI Support: Complete command-line interface for training, evaluation, and management
- 🧪 Comprehensive Testing: Extensive test suite ensuring reliability and correctness
- 🧠 Custom Neural Networks: 4 built-in architectures (FeedForward, CNN, Attention, Residual) + custom network support
- 🔬 Enhanced Performance: Vectorized environments and optional JIT compilation
- 📈 Scalable Architecture: Handle complex multi-agent scenarios
- 📊 Professional Logging: Weights & Biases integration
- MultiGrid Environments: 12+ cooperative and competitive scenarios
- Custom Environment API: Easy integration of new environments
- Real-time Visualization: Watch agents learn in interactive environments
- Real-time Monitoring: Live training graphs and metrics
- Experiment Tracking: Comprehensive experiment management
- Performance Profiling: Memory and compute usage analysis
- Video Generation: Create videos of trained agent behavior
EasyMARL offers 4 primary deployment methods to suit different needs:
# Method 1: Command Line Interface (30 seconds)
pip install easymarl && easymarl-train --algorithm qmix --env MultiGrid-Empty-6x6-v0
# Method 2: Python Library (30 seconds)
pip install easymarl && python -c "import easymarl; easymarl.launch_gui()"
# Method 3: GitHub Codespaces (30 seconds)
# Click: https://codespaces.new/shreyanmitra/EasyMARL
# Method 4: Local Web Interface (5 minutes)
git clone https://github.com/shreyanmitra/EasyMARL.git
cd EasyMARL && ./tools/start-easymarl.sh

Best for: Researchers, batch experiments, automated workflows
# Install EasyMARL
pip install easymarl
# Train agents with CLI
easymarl-train --algorithm qmix --env MultiGrid-Empty-6x6-v0 --episodes 1000
# Launch GUI
easymarl-gui
# Run demos
easymarl-demo --example comparison --quick
# Manage environments
easymarl-env --list
easymarl-env --info MultiGrid-Empty-6x6-v0
# Algorithm information
easymarl-algo --list
easymarl-algo --info qmix
easymarl-algo --compare qmix vdn ippo
# Weights & Biases integration
easymarl-wandb --login
easymarl-wandb --project my_research

Features:
- ✅ Complete CLI suite with 6 main commands
- ✅ Batch experiment support for research workflows
- ✅ Algorithm comparison tools built-in
- ✅ Environment management and validation
- ✅ Weights & Biases integration for experiment tracking
- ✅ Comprehensive help system and documentation
Best for: Full-featured development, complete control, local resources
# Install EasyMARL
pip install easymarl
# Clone repository for web interface
git clone https://github.com/shreyanmitra/EasyMARL.git
cd EasyMARL
# Start both frontend and backend
./tools/start-easymarl.sh
# Access web interface:
# Frontend: http://localhost:3000
# Backend API: http://localhost:5000/api

Features:
- ✅ Complete web interface with real-time training monitoring
- ✅ REST API for programmatic access
- ✅ All algorithms and environments available
- ✅ Experiment tracking with Weights & Biases
Best for: Students, no-setup experience, cloud development
# 1. Click "Open in GitHub Codespaces" above
# 2. Wait 3-5 minutes for automatic environment setup
# 3. Services start automatically on creation
# Access your cloud environment:
# Frontend: https://CODESPACE-NAME-3000.app.github.dev
# Backend: https://CODESPACE-NAME-5000.app.github.dev/api

Benefits:
- ✅ FREE with GitHub Student Pack (180 hours/month)
- ✅ Zero installation - works in your browser
- ✅ Full ML environment with GPU support
- ✅ 8GB RAM + 4 CPU cores
Best for: Quick experimentation, Jupyter notebooks, research workflows
# Install and use directly in Python
pip install easymarl
import easymarl
# Launch Gradio web interface (one line!)
easymarl.launch_gui()
# Or use programmatic API
env = easymarl.make_env("MultiGrid-Empty-6x6-v0")
controller = easymarl.UnifiedMultiAgentController(
env=env,
algorithm="qmix",
educational_mode=True # Detailed explanations for learning
)
# Train agents with progress tracking
controller.train(episodes=1000)
results = controller.evaluate()
print(f"Average reward: {results['avg_reward']:.2f}")Advantages:
- ✅ Simplest setup - just one pip install
- ✅ Jupyter notebook friendly
- ✅ Educational mode with detailed explanations
- ✅ Self-contained - no separate backend needed
EasyMARL provides a comprehensive CLI for all operations:
# Training and evaluation
easymarl-train --algorithm qmix --env MultiGrid-Empty-6x6-v0 --episodes 1000
easymarl-train --algorithm ippo --vectorized --n_envs 8
# GUI and demos
easymarl-gui --interface gradio
easymarl-demo --example comparison --quick
# Environment management
easymarl-env --list
easymarl-env --info MultiGrid-Empty-6x6-v0
easymarl-env --create my_env --template cluttered
# Algorithm information and comparison
easymarl-algo --list
easymarl-algo --info qmix
easymarl-algo --compare qmix vdn ippo --episodes 200
easymarl-algo --benchmark --env MultiGrid-Empty-6x6-v0
# Weights & Biases integration
easymarl-wandb --login
easymarl-wandb --project my_research --view

CLI Features:
- ✅ 6 main command groups covering all functionality
- ✅ Comprehensive help system with examples
- ✅ Algorithm comparison tools for research
- ✅ Environment validation and testing
- ✅ Experiment management with W&B integration
- ✅ Batch processing support for large experiments
📖 Complete CLI Documentation - Detailed CLI reference and examples
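Because every command above is a plain console script, batch experiments can be driven from a short driver script. Below is a minimal sketch that loops easymarl-train over several algorithms via Python's subprocess; the flags are the ones shown above, while the algorithm list and episode count are illustrative values, not recommendations.

```python
import subprocess

# Sketch of a batch workflow: train several algorithms on the same environment
# by invoking the easymarl-train console script shown above.
ALGORITHMS = ["qmix", "vdn", "ippo"]   # names as used by easymarl-algo --compare
ENV_ID = "MultiGrid-Empty-6x6-v0"

for algo in ALGORITHMS:
    subprocess.run(
        [
            "easymarl-train",
            "--algorithm", algo,
            "--env", ENV_ID,
            "--episodes", "200",       # short run for illustration
        ],
        check=True,                    # stop the batch if a run fails
    )
```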
🎯 Unique Feature: All MARL algorithms are implemented in a single unified framework with consistent APIs, comprehensive documentation, and taxonomical organization in
algorithms/__init__.py
What Makes Our Algorithm Library Special:
- ✅ Unified Implementation: All algorithms share the same base architecture
- ✅ Taxonomical Organization: Algorithms organized by their fundamental principles
- ✅ Comprehensive Comments: Every single line of algorithm code is documented
- ✅ Educational Progression: Beginner → Intermediate → Advanced learning path
- ✅ Research Ready: Production-quality implementations with latest optimizations
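Concretely, the unified API means switching algorithms is a one-string change. A minimal sketch using the programmatic interface from the Quick Start above (any constructor arguments beyond those shown there are omitted):

```python
import easymarl

# The same controller call works for every algorithm name; only the string changes.
for algorithm in ["qmix", "vdn", "ippo"]:
    env = easymarl.make_env("MultiGrid-Empty-6x6-v0")
    controller = easymarl.UnifiedMultiAgentController(env=env, algorithm=algorithm)
    controller.train(episodes=200)                 # short run for illustration
    results = controller.evaluate()
    print(f"{algorithm}: average reward {results['avg_reward']:.2f}")
```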
| Algorithm | Paper | Description | Best For |
|---|---|---|---|
| QMIX | Rashid et al., 2018 | Monotonic value function factorization | Cooperative tasks with partial observability |
| VDN | Sunehag et al., 2017 | Simple value decomposition | Basic cooperative learning |
| IQL | Independent Q-Learning | Each agent learns independently | Baseline for comparison |
| QTRAN | Son et al., 2019 | General value decomposition | Complex cooperative scenarios |
| Algorithm | Paper | Description | Best For |
|---|---|---|---|
| IPPO | Independent PPO | Independent policy optimization | Continuous action spaces |
| MAPPO | Yu et al., 2021 | Multi-agent PPO with centralized training | Large-scale cooperation |
| MADDPG | Lowe et al., 2017 | Multi-agent DDPG | Continuous control tasks |
| MADDPG+Comm | MADDPG with communication | Communication-enabled MADDPG | Coordination requiring communication |
| Algorithm | Paper | Description | Best For |
|---|---|---|---|
| COMA | Foerster et al., 2018 | Counterfactual multi-agent policy gradients | Credit assignment problems |
| COMA+Comm | COMA with communication | Communication-enabled COMA | Complex coordination |
| MAACC | Multi-Agent Actor-Critic-Critic | Advanced actor-critic architecture | Challenging cooperative tasks |
| DCG | Böhmer et al., 2020 | Deep coordination graphs | Structured multi-agent problems |
| MAVEN | Mahajan et al., 2019 | Multi-agent variational exploration | Exploration-heavy environments |
| Algorithm | Paper | Description | Best For |
|---|---|---|---|
| NFSP | Heinrich & Silver, 2016 | Neural fictitious self-play | Two-player competitive games |
| MINIMAX-Q | Littman, 1994 | Minimax Q-learning | Zero-sum games |
| WoLF-PHC | Bowling & Veloso, 2002 | Win-or-learn-fast policy hill climbing | Mixed-motive games |
| Algorithm | Paper | Description | Best For |
|---|---|---|---|
| HQL | Hierarchical Q-Learning | Multi-level decision making | Complex task decomposition |
| LQL | Layered Q-Learning | Structured hierarchical learning | Hierarchical environments |
| Algorithm | Paper | Description | Best For |
|---|---|---|---|
| MFQ | Yang et al., 2018 | Mean field Q-learning | Large population games |
| Environment | Size | Agents | Difficulty | Description |
|---|---|---|---|---|
| Empty | 6x6, 8x8, 16x16 | 2-4 | ⭐ | Basic navigation and coordination |
| FourRooms | 19x19 | 2-4 | ⭐⭐ | Navigation through connected rooms |
| DoorKey | 6x6, 8x8 | 2-4 | ⭐⭐⭐ | Coordination to unlock doors |
| Cluttered | 6x6, 8x8 | 2-4 | ⭐⭐ | Navigation with obstacles |
| Maze | 6x6, 8x8 | 2-4 | ⭐⭐⭐ | Complex maze navigation |
| CoinGame | Variable | 2 | ⭐⭐⭐⭐ | Competitive coin collection |
| Gather | Variable | 2-8 | ⭐⭐⭐ | Resource gathering cooperation |
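Environments are created by their registered ID with easymarl.make_env, as in the Quick Start. The sketch below assumes the Gym-style reset/step interface that the environment tests exercise; the exact observation, action, and done structures are environment-specific, so treat it as illustrative only.

```python
import easymarl

# Build one of the MultiGrid scenarios listed above by its registered ID.
env = easymarl.make_env("MultiGrid-Empty-6x6-v0")

# Minimal random-action rollout, assuming a Gym-style interface.
obs = env.reset()
for _ in range(20):
    actions = env.action_space.sample()        # assumption: a sample-able joint action space
    obs, rewards, done, info = env.step(actions)
    if done:                                   # note: done may be reported per-agent in some scenarios
        obs = env.reset()
```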
EasyMARL supports flexible custom neural network architectures that work with all algorithms:
| Network Type | Best For | Key Features |
|---|---|---|
| FeedForward | Baselines, simple tasks | Multi-layer perceptron, configurable depth |
| Convolutional | Visual environments | CNN layers, spatial processing |
| Attention | Coordination, complex reasoning | Self-attention, positional encoding |
| Residual | Deep learning, stability | Skip connections, batch normalization |
# Use convolutional network for visual tasks
config = {
'network_type': 'convolutional',
'conv_layers': [32, 64, 128],
'hidden_dim': 256
}
# Use attention network for coordination
config = {
'network_type': 'attention',
'num_heads': 8,
'attention_dim': 128
}
# Create custom network
config = {
'network_type': 'my_networks.CustomNetwork',
'custom_param': 'value'
}

📖 Full Custom Networks Guide | 🔧 Example Script
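The 'network_type': 'my_networks.CustomNetwork' pattern above points at a user-supplied module. The sketch below shows what such a module might look like as a plain PyTorch nn.Module; the actual base-class contract lives in networks/base_network.py, so check the Custom Networks Guide for the exact constructor and forward signature EasyMARL expects.

```python
# my_networks.py -- hypothetical module referenced by 'network_type': 'my_networks.CustomNetwork'
import torch.nn as nn


class CustomNetwork(nn.Module):
    """Small MLP stand-in for a custom architecture (illustrative only)."""

    def __init__(self, input_dim: int, hidden_dim: int = 128, custom_param: str = "value"):
        super().__init__()
        self.custom_param = custom_param
        self.body = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
        )

    def forward(self, obs):
        # obs: flattened per-agent observation tensor -> feature vector
        return self.body(obs)
```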
# Educational design with clear step-by-step learning
controller = easymarl.SimpleMultiAgentController(
env=env,
algorithm="qmix",
config=config,
use_enhanced_features=True # Optional performance boost
)

# Research-ready with advanced features
controller = easymarl.ModernMultiAgentController(
env=env,
algorithm="qmix",
config=config,
experiment_name="research_exp_1",
enable_advanced_tracking=True,
enable_performance_monitoring=True
)

# Maximum performance optimization
controller = easymarl.VectorizedController(
env_fn=lambda: easymarl.make_env("MultiGrid-Empty-6x6-v0"),
num_envs=8, # Parallel environments
algorithm="qmix",
use_enhanced_vectorization=True # JIT compilation
)

# Enable enhanced features for maximum performance
controller = easymarl.SimpleMultiAgentController(
env=env,
algorithm="qmix",
use_enhanced_features=True
)
# Or use vectorized controller for parallel training
controller = easymarl.VectorizedController(
env_fn=lambda: easymarl.make_env("MultiGrid-Empty-6x6-v0"),
num_envs=16, # 16 parallel environments
use_enhanced_vectorization=True
)

| Environment | Algorithm | Vectorized Envs | Performance Benefit |
|---|---|---|---|
| MultiGrid-Empty-6x6 | QMIX | 8 parallel | ~8x faster sampling |
| MultiGrid-Empty-8x8 | VDN | 8 parallel | ~8x faster sampling |
| MultiGrid-DoorKey-6x6 | COMA | 8 parallel | ~8x faster sampling |
| MultiGrid-Maze-8x8 | MADDPG | 8 parallel | ~8x faster sampling |
Speedup from running 8 environments in parallel vs single environment
pip install easymarl
pip install easymarl[enhanced]
pip install easymarl[gui]
pip install easymarl[all]

git clone https://github.com/shreyanmitra/EasyMARL.git
cd EasyMARL
pip install -e .[all]

EasyMARL follows a clean, modular package structure designed for both ease of use and professional development:
EasyMARL/
├── core/ # Core MARL functionality
│ ├── utils/ # Consolidated utilities
│ │ ├── base.py # Core utilities
│ │ ├── advanced.py # Advanced features
│ │ └── enhanced.py # Performance optimization
│ ├── config_manager.py # Configuration management
│ └── research_interface.py # Research tools
├── algorithms/ # 21+ MARL algorithms (ALL IMPLEMENTED)
│ ├── model_free/ # Model-free algorithms
│ │ ├── value_based/ # QMIX, VDN, QTRAN, IQL, MFQ + tabular
│ │ ├── policy_based/ # IPPO, MAPPO, MADDPG variants
│ │ └── actor_critic/ # COMA, MAVEN, DCG, NFSP, MAACC
│ ├── model_based/ # Model-based approaches (future)
│ ├── base.py # Algorithm base classes
│ └── taxonomy.py # Algorithm classification
├── environments/ # Environment management
│ ├── vectorized_env.py # Vectorized environments
│ └── gym_multigrid/ # MultiGrid environments
├── controllers/ # Training controllers
│ └── unified_multiagent_controller.py # Unified controller (IMPLEMENTED)
├── networks/ # Neural network architectures
│ ├── base_network.py # Base network classes
│ ├── custom_networks.py # 4 built-in architectures
│ └── multigrid_network.py # MultiGrid-specific networks
├── cli/ # Command Line Interface (NEW!)
│ ├── main_cli.py # Training commands (easymarl-train)
│ ├── wandb_cli.py # W&B integration (easymarl-wandb)
│ ├── env_cli.py # Environment management (easymarl-env)
│ └── algo_cli.py # Algorithm info (easymarl-algo)
├── tests/ # Comprehensive Test Suite (NEW!)
│ ├── test_algorithms.py # Algorithm functionality tests
│ ├── test_environments.py # Environment validation tests
│ ├── test_controllers.py # Controller integration tests
│ ├── test_utils.py # Utility function tests
│ ├── test_integration.py # End-to-end integration tests
│ └── __main__.py # CLI test runner
├── docs/ # Documentation (NEW!)
│ ├── CLI.md # Complete CLI documentation
│ ├── API.md # API reference documentation
│ └── TESTING.md # Testing framework documentation
├── api/ # Web API and deployment
│ ├── flask_backend.py # Main API server
│ ├── minimal.py # Lightweight deployment
│ └── deployment/ # Deployment configs
├── gui/ # User interfaces
│ ├── gradio_interface.py # Web-based GUI
│ └── react-frontend/ # React components
├── config/ # Configuration templates
│ ├── default.yaml # Default settings
│ ├── domain/ # Environment configs
│ ├── mode/ # Algorithm configs
│ └── templates/ # Config templates
├── examples/ # Tutorial examples
├── tools/ # Utility scripts
│ ├── start-easymarl.sh # Start both frontend and backend
│ ├── start-backend.sh # Start Flask backend only
│ └── start-frontend.sh # Start React frontend only
├── main.py # Main CLI entry point
├── setup.py # Package configuration
└── requirements.txt # Dependencies
- CLI: easymarl-train --algorithm qmix or python main.py
- GUI: easymarl-gui or python -c "import easymarl; easymarl.launch_gui()"
- API: Start with ./tools/start-easymarl.sh or individual scripts
- Tests: python -m tests or python -m tests --quick
- Import: import easymarl (for Python integration)
EasyMARL includes a comprehensive test suite to ensure reliability:
# Run all tests
python -m tests
# Run quick validation tests
python -m tests --quick
# Run specific test suite
python -m tests --suite algorithms
python -m tests --suite environments
# List available test suites
python -m tests --list
# Verbose test output
python -m tests --verbose

Test Coverage:
- ✅ Algorithm Tests: Import, creation, basic training functionality
- ✅ Environment Tests: Creation, reset/step, vectorization
- ✅ Controller Tests: Unified controller integration
- ✅ Utility Tests: Configuration, advanced features, research tools
- ✅ Integration Tests: End-to-end pipeline, CLI, API, GUI
- ✅ CLI Test Runner: Easy test management and reporting
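To give a flavour of what these suites check, here is a self-contained, illustrative example in the style of the environment tests. It is not copied from the actual suite in tests/, which has its own base classes and CLI runner; only make_env and the reset call it relies on are taken from this README.

```python
import unittest

import easymarl


class TestEnvironmentCreation(unittest.TestCase):
    """Illustrative check only -- the real tests live in tests/."""

    def test_empty_6x6_creates_and_resets(self):
        env = easymarl.make_env("MultiGrid-Empty-6x6-v0")
        obs = env.reset()
        self.assertIsNotNone(obs)


if __name__ == "__main__":
    unittest.main()
```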
📖 Complete Testing Documentation - Detailed testing guide and best practices
Launch the professional web interface:
easymarl.launch_gui()

- Interactive algorithm descriptions
- Performance comparisons
- Hyperparameter explanations
- Real-time algorithm switching
- Intuitive parameter tuning
- Real-time validation
- Configuration presets
- Export/import configurations
- One-click training start
- Real-time progress monitoring
- Live performance graphs
- Training status updates
- Comprehensive performance analysis
- Training curve visualization
- Statistical summaries
- Data export capabilities
- Watch trained agents in action
- Environment interaction videos
- Agent behavior analysis
- Custom rendering options
We welcome contributions! See our Contributing Guide for details.
git clone https://github.com/shreyanmitra/EasyMARL.git
cd EasyMARL
# Create development environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install in development mode
pip install -e .[all]
# Run tests
python -m tests
python -m tests --quick
# Run specific test suites
python -m tests --suite algorithms
python -m tests --suite integration
# Run linting (if available)
flake8 algorithms/ controllers/ core/ cli/ api/ gui/
black algorithms/ controllers/ core/ cli/ api/ gui/

- Python 3.11+ (Required)
- Node.js 16+ (For React frontend only)
- Git (For cloning and development)
System Requirements:
- Memory: 4GB+ RAM recommended
- Storage: 2GB+ free space
- GPU: Optional but recommended for large experiments
If you use EasyMARL in your research, please cite:
@software{easymarl2024,
title={EasyMARL: Educational Multi-Agent Reinforcement Learning Framework},
author={Mitra, Shreyan},
year={2024},
url={https://github.com/shreyanmitra/EasyMARL}
}

- Zero barriers to entry: Start learning MARL in minutes
- Interactive learning: Visual feedback and real-time monitoring
- Comprehensive coverage: All major MARL paradigms included
- Educational design: Code structure mirrors textbook concepts
- Research ready: Scale from prototype to publication
- Comprehensive algorithms: 21+ state-of-the-art implementations
- CLI for automation: Complete command-line interface for batch experiments
- Comprehensive testing: Extensive test suite ensuring reliability
- Experiment management: Professional tracking and analysis
- Extensible framework: Easy to add new algorithms and environments
- Classroom ready: GUI requires no programming experience
- Comparative analysis: Easy algorithm comparisons
- Visual learning: Rich visualizations and animations
- Local development: Optimized for GitHub Codespaces and local environments
Choose the deployment method that best fits your needs:
| Method | Best For | Setup Time | Features | Cost |
|---|---|---|---|---|
| CLI Commands | Research, automation, batch experiments | 30 seconds | Complete CLI suite, algorithm comparison | Free |
| React + Flask | Full development, local control | 5 minutes | Complete web interface, all features | Free |
| GitHub Codespaces | Students, zero-setup, cloud | 30 seconds | Browser-based, GPU support | Free* |
| Gradio (Python) | Quick experiments, notebooks | 30 seconds | Self-contained, educational mode | Free |
*Free with GitHub Student Pack (180 hours/month)
# Method 1: CLI Commands (NEW!)
pip install easymarl
easymarl-train --algorithm qmix --env MultiGrid-Empty-6x6-v0
# Method 2: Local React + Flask
git clone https://github.com/shreyanmitra/EasyMARL.git
cd EasyMARL && ./tools/start-easymarl.sh
# Method 3: GitHub Codespaces
# Click: https://codespaces.new/shreyanmitra/EasyMARL
# Method 4: Gradio (Python)
pip install easymarl && python -c "import easymarl; easymarl.launch_gui()"

- 📧 Email: shreyan.m.mitra@gmail.com
- 💬 Discussions: GitHub Discussions
- 🐛 Bug Reports: GitHub Issues
- 📖 Documentation: Official Docs
This project is licensed under the MIT License - see the LICENSE file for details.
- Natasha Jaques - Original metacontroller framework inspiration
- DeepMind - QMIX and related algorithm implementations
- OpenAI - Multi-agent environment design principles
- The MARL Community - Continuous feedback and contributions