Sales Decline Forecasting is a system for analyzing and forecasting sales of alcoholic beverages in retail stores. It combines several deep learning models and is designed for datasets that cover multiple stores.
- Manufacturers and Distributors (e.g., Sazerac):
  - Forecast demand for their products across different stores and regions.
  - Optimize logistics and inventory.
  - React quickly to sales declines and identify causes (seasonality, competition, assortment changes).
  - Plan marketing campaigns and promotions.
- Retail Chains and Stores:
  - Manage inventory to avoid shortages or overstock.
  - Analyze which categories and brands are gaining or losing popularity.
  - Evaluate the effectiveness of working with specific suppliers.
  - Make informed decisions about assortment expansion or reduction.
- Analysts and Sales Departments:
  - Receive automated forecasts and sales dynamics reports.
  - Quickly detect anomalies and trends.
  - Assess the impact of external factors (holidays, promotions, weather) on sales.
- Company Management:
  - Make strategic decisions based on data-driven forecasts.
  - Evaluate business performance by region, store, and product category.
  - Plan budgets and investments.
- Models are trained on the full dataset, including all stores, enabling forecasts for any store and brand present in the database.
- Uses modern architectures: LSTM with attention, Temporal Fusion Transformer (TFT), and Chronos Bolt for forecasting, and an LLM (GPT-4o mini) for generating explanations.
- Works with time series data, store embeddings, and feature scaling, and accounts for seasonality and holidays.
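
The seasonality, holiday, and scaling handling mentioned above can be illustrated with a minimal, dependency-free sketch. It is not the project's actual feature pipeline: the function names, the cyclical sine/cosine encoding, and the three-date holiday list are assumptions chosen for illustration.

```python
import math
from datetime import date

# Hypothetical fixed-date holiday list (month, day); a real pipeline
# would likely use a fuller holiday calendar.
FIXED_HOLIDAYS = {(1, 1), (7, 4), (12, 25)}


def calendar_features(day: date) -> dict:
    """Encode weekly and yearly seasonality plus a holiday flag for one date."""
    dow = day.weekday()              # 0 = Monday ... 6 = Sunday
    doy = day.timetuple().tm_yday    # 1 ... 366
    return {
        "dow_sin": math.sin(2 * math.pi * dow / 7),
        "dow_cos": math.cos(2 * math.pi * dow / 7),
        "doy_sin": math.sin(2 * math.pi * doy / 365.25),
        "doy_cos": math.cos(2 * math.pi * doy / 365.25),
        "is_holiday": int((day.month, day.day) in FIXED_HOLIDAYS),
    }


def standardize(values: list[float]) -> list[float]:
    """Scale values to zero mean and unit variance (population std)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0      # constant series: avoid division by zero
    return [(v - mean) / std for v in values]


features = calendar_features(date(2024, 7, 4))
scaled = standardize([100.0, 200.0, 300.0])
```

The sine/cosine pairs keep neighboring days close to each other in feature space (Sunday next to Monday, December 31 next to January 1), which a plain integer day index does not.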
- The system provides a Streamlit interface where users can:
  - Select a model from a dropdown menu.
  - Select a store ID from a dropdown menu.
  - Receive a sales forecast for the next 30 days.
- The user selects a model and a store from the list.
- The system generates a sales forecast for the chosen store.
- Users can compare forecasts for different stores.
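
The select-a-store, get-a-forecast flow above can be sketched without any of the project's models. The example below uses a seasonal-naive baseline (repeat the last observed week) purely as a stand-in. The function name, the store IDs, and the sales figures are made up for illustration.

```python
from datetime import date, timedelta

HORIZON_DAYS = 30  # forecast length stated in this README


def seasonal_naive_forecast(history, last_day, horizon=HORIZON_DAYS, season=7):
    """Repeat the last `season` observations to cover `horizon` future days.

    Stand-in baseline only; this is not one of the project's models.
    """
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    last_season = history[-season:]
    return [
        (last_day + timedelta(days=step + 1), last_season[step % season])
        for step in range(horizon)
    ]


# Hypothetical daily sales (USD) keyed by hypothetical store IDs.
daily_sales = {
    1001: [110.0, 95.0, 120.0, 130.0, 210.0, 260.0, 90.0],
    1002: [80.0, 70.0, 85.0, 90.0, 150.0, 170.0, 60.0],
}
last_observed = date(2025, 5, 31)

# One forecast per store, so that stores can be compared side by side.
forecasts = {
    store_id: seasonal_naive_forecast(series, last_observed)
    for store_id, series in daily_sales.items()
}
```

A baseline like this is also a useful reference point: a trained model that cannot beat "repeat last week" on held-out data is probably not adding value for that store.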
- Modular architecture, easily extensible and scalable.
- All data processing and training steps are logged.
- Uses Airflow for orchestration and TensorBoard for experiment tracking.
- Integration with external services via API is supported.
- Clone the repository:

      git clone https://github.com/AAN-innopolis/Sales_Decline_Forecasting.git
      cd Sales_Decline_Forecasting

- Install dependencies:

      uv sync

- Prepare your data:
  - Place your dataset in the `data/raw/` directory. Ensure it contains the required columns (see Data Description).
- Run the pipeline:
  - Use the provided scripts or Airflow DAGs to preprocess data, train models, and generate forecasts.
- Launch the interactive interface:
  - Start the Streamlit app to interact with the system and visualize forecasts.
- Source: Iowa Department of Revenue, Alcoholic Beverages Division (Commerce)
- License: Creative Commons Zero (CC0)
- Coverage:
  - Location: Iowa, USA
  - Start Date: 2012-01-01
- Updates: Data is updated monthly, typically available on the first day of each month.
- Rows: 31.6 million+ (as of May 2025)
- Columns: 24
- Each row: An individual product purchase at the store level. Stores hold a Class E liquor license (grocery stores, liquor stores, convenience stores, etc.) and sell for off-premises consumption.
- Topics: Sales & Distribution (liquor sales, spirit sales, store sales, liquor licensees)
The dataset should include, at minimum, the following columns (see the official data portal for full details):
- `invoice_line_no`: Unique identifier for the individual liquor product in the store order
- `date`: Date of order
- `store`: Unique number assigned to the store
- `name`: Name of the store
- `address`: Address of the store
- `city`: City where the store is located
- `zipcode`: Zip code of the store
- `store_location`: Geographical location (point)
- `county`: County name
- `category`: Category code of the liquor ordered
- `category_name`: Category name of the liquor
- `itemno`: Item number
- `im_desc`: Item description
- `pack`: Number of bottles in a case
- `bottle_volume_ml`: Volume of each bottle (ml)
- `state_bottle_cost`: Cost per bottle (wholesale)
- `sale_bottles`: Number of bottles sold
- `sale_dollars`: Total sales amount
- `sale_liters`: Total volume sold (liters)
Note: For best results, ensure your data is as complete as possible and matches the above schema. The system is designed to handle large-scale, real-world retail sales data.
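
Before running the pipeline, it can help to check a file against the schema above and to collapse invoice lines into one value per store per day. The sketch below uses only the standard library. The helper names are hypothetical, the sample rows are made up, and it assumes the order date column is named `date` and holds ISO-formatted strings.

```python
import csv
import io
from collections import defaultdict

# Column names as listed in the schema above.
REQUIRED_COLUMNS = [
    "invoice_line_no", "date", "store", "name", "address", "city", "zipcode",
    "store_location", "county", "category", "category_name", "itemno",
    "im_desc", "pack", "bottle_volume_ml", "state_bottle_cost",
    "sale_bottles", "sale_dollars", "sale_liters",
]


def missing_columns(fieldnames):
    """Return the required columns that are absent from a CSV header."""
    present = set(fieldnames or [])
    return [col for col in REQUIRED_COLUMNS if col not in present]


def daily_store_sales(rows):
    """Sum sale_dollars per (store, date) across individual invoice lines."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["store"], row["date"])] += float(row["sale_dollars"])
    return dict(totals)


# Tiny made-up sample standing in for a file under data/raw/.
sample_rows = [
    {"invoice_line_no": "INV-1", "date": "2025-05-01", "store": "1001", "sale_dollars": "25.50"},
    {"invoice_line_no": "INV-2", "date": "2025-05-01", "store": "1001", "sale_dollars": "10.00"},
    {"invoice_line_no": "INV-3", "date": "2025-05-02", "store": "1002", "sale_dollars": "7.25"},
]
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=REQUIRED_COLUMNS, restval="")
writer.writeheader()
writer.writerows(sample_rows)
buffer.seek(0)

reader = csv.DictReader(buffer)
missing = missing_columns(reader.fieldnames)   # empty list when the header is complete
totals = daily_store_sales(reader)
```

For the full dataset (tens of millions of rows), the same check and aggregation would more realistically be done in chunks or with a dataframe library. The logic stays the same.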
Contributions are welcome! Please see CONTRIBUTIONS.md for guidelines.
This project is licensed under the terms set out in the LICENSE file.