An NLP-driven framework for classifying unstructured industrial safety reports.
This project automates the risk assessment of "Hiyari-Hatto" (Near-Miss) incidents in construction environments. By using Natural Language Processing (NLP) and rule-based logic, it converts unstructured worker reports into structured safety data, enabling site managers to predict and prevent accidents before they occur.
(Figure 1: Real-time analysis interface showing risk classification and keyword extraction)
- Natural Language Processing (NLP):
  - Analyzes raw text using `TextBlob` and `NLTK`.
  - Extracts safety-critical keywords (e.g., "leak," "spark," "voltage").
- Automated Risk Classification:
  - Categorizes incidents into High (Critical), Medium (Caution), and Low (Monitor) based on industry safety standards.
- Real-Time Visualization:
  - Interactive dashboard built with Flask and Bootstrap 5.
  - Visualizes risk trends using Chart.js.
- Industrial Applicability:
  - Designed to handle noisy, non-technical language often found in on-site worker logs.
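
The keyword extraction and rule-based classification described above can be illustrated with a minimal sketch. This is not the project's actual rule set: the keyword lists, the `Assessment` record, and the `extract_keywords` and `classify` functions are hypothetical names chosen for illustration, and the sketch uses only the standard library with exact word matching rather than TextBlob or NLTK.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword lists for illustration only; the project's real
# rules are not shown in this README.
HIGH_RISK = {"leak", "spark", "voltage", "fire", "collapse"}
MEDIUM_RISK = {"slip", "trip", "loose", "crack", "unstable"}


@dataclass
class Assessment:
    level: str           # "High", "Medium", or "Low"
    keywords: list[str]  # matched keywords, in order of first appearance


def extract_keywords(report: str) -> list[str]:
    """Lowercase the report, split it into words, keep known risk keywords."""
    tokens = re.findall(r"[a-z]+", report.lower())
    known = HIGH_RISK | MEDIUM_RISK
    found: list[str] = []
    for token in tokens:
        # Exact match only: "sparks" would not match "spark" here.
        # A stemmer or lemmatizer would be needed to handle inflections.
        if token in known and token not in found:
            found.append(token)
    return found


def classify(report: str) -> Assessment:
    """Assign the highest risk level triggered by any matched keyword."""
    keywords = extract_keywords(report)
    if any(k in HIGH_RISK for k in keywords):
        level = "High"
    elif keywords:
        level = "Medium"
    else:
        level = "Low"
    return Assessment(level, keywords)
```

For example, `classify("Saw a spark near the panel, voltage seemed high")` yields level `High` with keywords `["spark", "voltage"]`, while a report with no matched keyword falls through to `Low`. Taking the highest triggered level is a conservative design choice: a single critical keyword is enough to escalate a report.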
| Component | Technology |
|---|---|
| Backend | Python, Flask (Web Server) |
| NLP Engine | NLTK, TextBlob, Pandas |
| Frontend | HTML5, CSS3, Bootstrap 5 |
| Data Analysis | Scikit-learn, NumPy |

Project structure:

    AI_Hazard_Risk_Analyzer/
    ├── data/               # Sample Hiyari-Hatto datasets (CSV)
    ├── src/
    │   ├── app.py          # Main Flask application logic
    │   └── templates/      # UI Dashboard (HTML/Jinja2)
    ├── requirements.txt    # Research dependencies
    └── README.md           # Project documentation

Installation:

    # Create Virtual Environment (Sandbox)
    python -m venv venv
    # Activate (Windows)
    .\venv\Scripts\activate
    # Install Dependencies
    pip install -r requirements.txt
    # Download required NLTK data (run once)
    python -m textblob.download_corpora

Usage:

    # Navigate to the Source Folder
    cd src
    # Start the Analysis Engine
    python app.py

Access the dashboard at: http://127.0.0.1:5001
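
The dashboard charts risk trends with Chart.js, which expects a `data` object made of `labels` and `datasets`. The sketch below shows one way classified reports could be aggregated into that shape. It is an assumption-laden illustration, not the code in `app.py`: the `risk_counts_by_date` function and the `date` and `risk_level` column names are hypothetical, and the sample rows are made up purely to show the output shape.

```python
import csv
import io
from collections import Counter

# Fixed order so each risk level always maps to the same chart series.
LEVELS = ["High", "Medium", "Low"]


def risk_counts_by_date(csv_text: str) -> dict:
    """Count reports per (date, risk level) and shape them like Chart.js data."""
    counts: Counter = Counter()
    dates: list[str] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[(row["date"], row["risk_level"])] += 1
        if row["date"] not in dates:
            dates.append(row["date"])
    dates.sort()  # ISO dates (YYYY-MM-DD) sort correctly as plain strings
    return {
        "labels": dates,
        "datasets": [
            # Counter returns 0 for date/level pairs with no reports.
            {"label": level, "data": [counts[(d, level)] for d in dates]}
            for level in LEVELS
        ],
    }


# Made-up rows, only to demonstrate the shape of the result.
SAMPLE = """date,risk_level
2024-01-01,High
2024-01-01,Low
2024-01-02,Medium
2024-01-02,Medium
"""

chart_data = risk_counts_by_date(SAMPLE)
```

A Flask view could serialize the returned dictionary to JSON and pass it to a Chart.js line or bar chart, with one series per risk level.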
If you see a "running scripts is disabled" error when trying to activate the environment, run this command in PowerShell to allow script execution:

    Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope Process

Then try running `.\venv\Scripts\activate` again.
This tool was developed to address the "Data Silo" problem in industrial safety. While large construction firms collect thousands of Near-Miss reports, they are often stored as text logs that are difficult to analyze at scale.
Future Scope:
- Integration with Formal Verification methods to model safety state transitions.
- Expansion of the dataset to support multilingual reports (Hindi/Japanese).
Author: Gaurav Dev
- Research Interests: Industrial Safety Systems, Formal Methods, NLP.