This project focuses on developing an accurate time series sales forecasting model for Leveraging machine learning techniques, the project aims to predict future sales trends based on historical data to assist in strategic decision-making, inventory management, and revenue optimization.
-
Sales Trend Prediction: Develop robust forecasting models to project future sales across various time horizons.
-
Pattern Recognition: Identify seasonal trends, cyclic patterns, and key factors influencing sales.
-
Data-Driven Decision Making: Equip stakeholders with actionable insights to optimise sales operations.
-
Model Evaluation and Optimization: Compare multiple forecasting techniques to ensure the most accurate predictions.
Companies faces challenges in predicting sales trends, resulting in inventory inefficiencies and missed revenue opportunities. The goal of this project is to provide accurate forecasts that support business growth by implementing reliable forecasting models that can aid the company in improving:
- Demand planning and inventory management.
- Revenue growth through accurate forecasting.
- Resource allocation and budgeting.
The dataset consists of historical sales records containing various attributes essential for analysis.
| Column Name | Description |
|---|---|
| Order Date | Date and time of the sales transaction |
| Region | Geographical sales region |
| Customer Segment | Customer type (e.g., Consumer, Small Business) |
| Product Category | Category of product sold |
| Units Sold | Number of units sold |
| Unit Price | Price per unit |
| Revenue | Total revenue generated |
| Profit Margin | Profit margin percentage |
- Data cleaning: Handling missing values and duplicate records.
- Feature engineering: Creating time-based features (e.g., month, year, quarter).
- Sales trends analysis.
- Seasonal decomposition to identify underlying patterns.
- Correlation analysis to determine factors affecting sales.
The project explores multiple time series forecasting models, including:
Traditional Models:
- Autoregressive Integrated Moving Average (ARIMA)
- Exponential Smoothing (Holt-Winters)
Machine Learning Approaches:
- Facebook Prophet (for weekly and monthly sales forecasting)
- Long Short-Term Memory (LSTM) Networks
Performance metrics used to assess model accuracy:
- Mean Absolute Error (MAE): Measures average error in prediction.
- Root Mean Squared Error (RMSE): Penalizes large errors more than MAE.
- Mean Absolute Percentage Error (MAPE): Evaluates the percentage error relative to actual values.
- Sales trend visualisation with line and bar charts.
- Interactive dashboards showcasing sales performance.
- Heatmaps to detect regional and category-wise sales trends.
Our analysis shows that LSTM Weekly Metrics provides optimal performance for short-term forecasting. The model achieved consistent MAPE (Mean Absolute Percentage Error) values across categories:
- Electronics: 17.13% MAPE
- Furniture: 14.65% MAPE
- Office Supplies: 15.17% MAPE
For long-term predictions, Exponential Smoothing demonstrated superior performance with low MAPE values:
- Electronics: 16.66% MAPE
- Office Supplies: 14.47% MAPE
- Furniture: 23.88% MAPE
MAPE (Mean Absolute Percentage Error) was chosen as the primary evaluation metric for several reasons:
- Scale Independence: Enables meaningful comparisons across different product categories
- Error Weighting: Effectively penalizes large prediction deviations
- Industry Standard: Widely adopted in demand forecasting applications
-
Prophet
- Weekly Forecasts: 15.35% MAPE
- Monthly Forecasts: 24.20% MAPE
- Verdict: Moderate performance for weekly, poor for monthly forecasts
-
ARIMA
- Weekly MAPE Range: 25.66%
- Monthly Mape Range: 23.29%
- Verdict: Struggles with seasonal patterns, higher error rates
Selected Model: LSTM Weekly Metrics
Key Benefits:
- Lowest MAPE values across all product categories
- Strong pattern recognition capabilities
- Robust generalization performance
Selected Model: Exponential Smoothing (ES) Monthly Metrics
Key Benefits:
- Consistent performance across major categories
- Computationally efficient
- Excellent handling of seasonal patterns
- Improved Accuracy: Enhanced demand planning and inventory optimization
- Granular Insights:
- Weekly trends through LSTM
- Monthly patterns through ES
- Efficient Implementation:
- LSTM for complex pattern recognition
- ES for straightforward deployment
- Data-Driven Decisions: Informed choices for pricing and promotions
- Resource Optimization: Better allocation of inventory and staff
- Risk Management: Reduced forecasting errors across categories
- Incorporation of external economic factors for better forecasting.
- Deployment of models via cloud platforms (AWS, Azure).
- Development of an automated retraining pipeline.
