This repository showcases the work I completed during my 4-week AI Product Development Internship at a reputable AI consulting firm. The focus was on building and applying data science solutions across real-world domains using Python, machine learning, and statistics.
🗓️ Duration: May 19 – June 13, 2025
🏢 Domain: Applied AI, Machine Learning, Data Analytics
🧠 Role: Intern – AI & Data Science
📍 Mode: Remote / Hybrid
Throughout this internship, I worked on 7 hands-on projects, each designed to simulate real-world challenges. These projects covered data preprocessing, exploratory analysis, statistical inference, machine learning, and business optimization techniques.
| Folder | Project Title | Brief Description |
|---|---|---|
| PROJECT1 | Exploratory Data Analysis (EDA) | Performed in-depth EDA on a structured dataset to identify patterns, correlations, and distributions. Created visualizations using Seaborn and Matplotlib. |
| PROJECT2 | Classification on Business Data | Built supervised ML models (Logistic Regression, Decision Trees) to classify business outcomes. Evaluated model accuracy, precision, recall, and ROC AUC. |
| PROJECT3 | Customer Segmentation using Clustering | Used K-Means and hierarchical clustering to segment users based on behavior and demographics. Interpreted clusters for business personalization. |
| PROJECT4 | Predictive Modeling with Regression | Developed regression models to forecast numerical values (e.g., sales, revenue, demand). Applied feature engineering and hyperparameter tuning. |
| PROJECT5 | A/B Testing for Optimization | Conducted hypothesis testing and A/B experiments to assess the impact of design or product changes on user conversion rates. |
| PROJECT6 | Dashboarding & KPIs | Created insightful dashboards (with Matplotlib or Power BI) to visualize KPIs and track business metrics effectively. |
| PROJECT7 | Domain-Specific Project (e.g., Healthcare / Finance) | Applied machine learning and data analysis in a specialized context, such as survival analysis in clinical trials or time-series forecasting in finance. |
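
To illustrate the evaluation step mentioned for PROJECT2, here is a minimal sketch of how accuracy, precision, and recall follow from the confusion counts of a binary classifier. It is not the code from the project folder: the function name `binary_metrics`, the `BinaryMetrics` container, and the sample labels are all hypothetical, and it uses only the standard library rather than Scikit-learn so it can run anywhere.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    """Hypothetical container for three common classification metrics."""
    accuracy: float
    precision: float
    recall: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Compute accuracy, precision, and recall for 0/1 labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Confusion counts: true/false positives and negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when nothing is predicted or labeled positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return BinaryMetrics(accuracy, precision, recall)


# Made-up labels purely for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In practice, Scikit-learn's `accuracy_score`, `precision_score`, and `recall_score` cover the same ground; the hand-rolled version is only meant to show what those numbers measure.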
- Languages: Python, Markdown
- Libraries: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, Statsmodels
- Techniques:
  - Data Cleaning & Preprocessing
  - Exploratory Data Analysis (EDA)
  - Supervised & Unsupervised Learning
  - Statistical Testing (t-test, chi-square, ANOVA)
  - Feature Engineering
  - Regression & Classification
  - Clustering & Segmentation
  - Data Visualization & Dashboarding
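
As a small illustration of the statistical testing listed above, the sketch below runs a chi-square test of independence on a 2x2 table of conversions, the kind of comparison an A/B experiment produces. It is an assumption-laden example rather than the analysis from the repository: the function name `chi_square_2x2` and the visitor counts are hypothetical, no continuity correction is applied, and the p-value uses the closed form for one degree of freedom so that only the standard library is needed.

```python
import math
from typing import Tuple


def chi_square_2x2(conv_a: int, n_a: int, conv_b: int, n_b: int) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for two conversion rates.

    conv_a / n_a: conversions and visitors in variant A
    conv_b / n_b: conversions and visitors in variant B
    Returns (chi2 statistic, p-value) with 1 degree of freedom.
    """
    observed = [
        [conv_a, n_a - conv_a],  # variant A: converted, not converted
        [conv_b, n_b - conv_b],  # variant B: converted, not converted
    ]
    row_totals = [n_a, n_b]
    col_totals = [conv_a + conv_b, (n_a - conv_a) + (n_b - conv_b)]
    total = n_a + n_b

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under the null hypothesis of equal conversion rates.
            expected = row_totals[i] * col_totals[j] / total
            if expected == 0:
                raise ValueError("expected count of zero; the test is undefined")
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up counts: 100/1000 conversions for A versus 130/1000 for B.
chi2, p_value = chi_square_2x2(100, 1000, 130, 1000)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

For real analyses, a library routine such as SciPy's `chi2_contingency` is usually the better choice, since it handles larger tables and optional continuity correction.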
✔️ Improved understanding of full ML pipelines
✔️ Ability to design, test, and evaluate real-world data science models
✔️ Hands-on experience with domain-specific problem solving
✔️ Ability to communicate insights through visual storytelling
Grateful to the mentors and team at Samatrix Consulting Pvt. Ltd. for providing a structured and impactful learning experience.
Feel free to connect, collaborate, or explore these projects further:
🔗 LinkedIn – Om Choksi
📧 omchoksiii@outlook.com
🌐 Portfolio