A high-performance parallel matrix multiplication implementation demonstrating the effectiveness of Java's Fork/Join framework for divide-and-conquer parallelism. Features comprehensive benchmarking, interactive GUI, and automated testing.
This project implements and compares sequential and parallel matrix multiplication algorithms, showcasing:
- Parallel Computing: Fork/Join framework for efficient task decomposition
- Performance Analysis: Comprehensive benchmarking with threshold tuning
- Clean Architecture: MVC pattern with clear separation of concerns
- Interactive Tools: GUI application with real-time progress tracking
- Speedup: Up to 4-5x on multi-core systems
- Scalability: Tested on matrices up to 2048×2048
- Correctness: 100% test pass rate with edge case coverage
| Feature | Description |
|---|---|
| Sequential Algorithm | Classic O(n³) triple-loop implementation |
| Parallel Algorithm | Fork/Join framework with row-range decomposition |
| Threshold Control | Configurable granularity for optimal performance |
| Thread Safety | Immutable matrices, atomic counters, exclusive writes |
| Error Handling | Dimension validation with descriptive messages |
- Automated Test Suite - Correctness validation (sequential vs parallel)
- Edge Case Testing - 1×1, 1×N, N×1 matrices
- Threshold Tuning - Find optimal parallelization granularity
- Performance Metrics - Speedup calculation, efficiency analysis
- Benchmark Averaging - 5 iterations per test for reliability
- Interactive GUI - Swing-based with MVC architecture
- Live Progress - Real-time updates during execution
- Results Table - Accumulate and compare multiple runs
- Input Validation - User-friendly error messages
┌──────────────────────┐
│ MatrixMultiplier │ (Interface)
│ + multiply() │
└──────────────────────┘
▲
│ implements
┏─────┴─────────────┓
│ │
┌───┴─────────────┐ ┌───┴──────────────────┐
│ Sequential │ │ ForkJoinMatrix │
│ Multiplier │ │ Multiplier │
└─────────────────┘ └──────────────────────┘
Row-Range Decomposition with Divide-and-Conquer:
[Matrix: Rows 0-1023]
/ \
[Rows 0-511] [Rows 512-1023]
/ \ / \
[0-255] [256-511] [512-767] [768-1023]
| | | |
(Direct Compute when rowCount ≤ threshold)
Task Execution Pattern:
if (rowCount <= threshold) {
computeDirectly(); // Base case
} else {
leftTask.fork(); // Async execution
rightTask.compute(); // Current thread
leftTask.join(); // Synchronize
}- Java: JDK 8 or higher
- OS: Windows, macOS, or Linux
- CPU: Multi-core processor (for meaningful speedup)
-
Clone the repository:
git clone https://github.com/DORMODO/ParallelMatrixMultiplication cd ParallelMatrixMultiplication -
Compile all source files:
cd src javac model/*.java ui/*.java test/*.java
-
Verify compilation:
# Run tests java test.MatrixTest
java app.MainThis will sequentially execute:
- Correctness Tests - Validate implementations
- Performance Benchmarks - Measure speedup
- GUI Application - Interactive interface
Tests Only:
java test.MatrixTestBenchmarks Only:
java benchmark.MatrixBenchmarkGUI Only:
java app.Main --gui-only- Optimal Threshold: 32-64 rows provides best balance
- Scalability: Larger matrices achieve better speedup
- Efficiency: 40-70% parallel efficiency on 6-core CPU
- Amdahl's Law: Overhead prevents linear scaling
parallelMatrixMultiplication/
│
├── src/
│ ├── Main.java # Entry point (For GUI)
│ ├── RunAll.java # Run All 3 Mains Sequentially
│ │
│ ├── model/
│ │ ├── Matrix.java # Immutable matrix data structure
│ │ ├── MatrixMultiplier.java # Strategy interface
│ │ ├── SequentialMatrixMultiplier.java
│ │ ├── ForkJoinMatrixMultiplier.java
│ │ ├── MatrixUtils.java # Utilities (random, measure, validate)
│ │ └── MatrixBenchmark.java # Performance benchmarks
│ │
│ ├── ui/
│ │ ├── MatrixGUI.java # View (Swing components)
│ │ └── MatrixGUIController.java # Controller (event handlers)
│ │
│ └── test/
│ └── MatrixTest.java # Correctness tests
│
│
│
├── docs/
│ ├── Report.pdf # Detailed analysis
│ └── ParallelProcessing_ProjectIdeas_Fall2025.pdf # Design documentation
│
└── README.md
| Implementation | Time Complexity |
|---|---|
| Sequential | O(m × n × p) |
| Parallel | O(m × n × p / T) |
Where:
- m = rows in matrix A
- n = columns in A / rows in B
- p = columns in matrix B
- T = number of threads
- Immutable Input: Matrix class uses defensive copying
- Exclusive Writes: Each task writes to distinct row ranges
- Atomic Counter:
AtomicIntegerfor progress tracking - Read-Only Access: Input matrices never modified
- Oracle - Fork/Join Framework Guide
- Baeldung - Guide to Fork/Join Framework
- GeeksforGeeks - Fork/Join vs ExecutorService
Academic Project for CS471 - Concurrent Programming
Institution: Helwan University
Course: Parallel and Distributed Computing
Semester: Fall 2025
This project is submitted as coursework and follows university academic integrity policies.
- Student: Youssef Badr, Youssef Assem, Youssef Ezzat and Omar Bashari
- Course Instructor/Assistant: Abd Allah Essam
For questions or feedback:
- GitHub: DORMODO
- University: Computer Science Department, Helwan University
⭐ If you found this project helpful, please consider giving it a star! ⭐
Made for CS471 - Concurrent Programming
