yievia/README.md

Hi there, I'm Xin Yie 👋

🎓 Aspiring Data Professional
🌏 Based in Malaysia
📊 Passionate about using data to solve real-world problems


🚀 About Me

With a background in biotechnology and scientific reporting, my goal is to bridge scientific rigor and analytical creativity, using data to explain why things happen and predict what happens next.

I've built and analyzed models across industries, from customer churn and insurance claims to content engagement and agricultural yield. I'm driven by curiosity, structure, and the impact data can make when translated into real-world decisions.


🧠 Core Skills

  • Languages & Tools: Python, R, SQL, Excel, Git, Power BI
  • Libraries: pandas, NumPy, scikit-learn, Matplotlib, Seaborn
  • Techniques: EDA, Classification, Clustering, Regression, Hypothesis Testing, PCA
  • Other: Agile, Google-Certified Project Management, Scientific Documentation

📌 Featured Projects

🔹 Logistics Inventory Data Analysis (SQL + Power BI)

SQL and Power BI analysis of shipment lead times, delay rates, inventory days, and SKU performance for a retail logistics context.

🔹 Recipe Site Traffic Prediction (Machine Learning + KPI)

Classified high-traffic recipes using Logistic Regression, Decision Tree, and Random Forest. Defined a business KPI, High Traffic Conversion Rate (HTCR), to align model precision with strategy.
→ Best Model: Logistic Regression (Precision = 0.88, HTCR = 7.13)

🔹 Telecom Customer Churn Analysis

Predictive model to identify customers at risk of churn using billing and usage patterns.
F1 Score: 0.85 | Key tools: Python, Random Forest, Seaborn
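
For readers less familiar with the metric: F1 is the harmonic mean of precision and recall. Here is a minimal sketch of that calculation. The function name and the confusion-matrix counts are hypothetical, chosen only for illustration, and are not the project's actual results.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts: 80 true positives, 20 false positives, 10 false negatives.
precision, recall, f1 = precision_recall_f1(tp=80, fp=20, fn=10)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 drops when either precision or recall is low, it is often preferred over plain accuracy when one class (such as churners) is rare.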

🔹 Insurance Claim Outcome Modeling

Built classifiers to predict insurance claims and explored risk segmentation.
Accuracy > 75% | SMOTE for class balancing

🔹 Netflix Content Trends

EDA and visualization of global trends across genres, ratings, and durations.
Clear dashboards to support content strategy decisions

🔹 Penguin Clustering (PCA + K-Means)

Unsupervised learning project to group penguins into candidate species clusters based on biometric traits.

🔹 Crop Yield Prediction (Regression)

Modeled yield based on environmental factors to support precision agriculture.

🧠 More projects are available under Repositories


📈 Currently Exploring

  • Streamlit & dashboard deployment
  • Data storytelling with Tableau and Power BI
  • Model interpretability (SHAP, feature importance)
  • Data pipelines & workflow automation

📬 Get in Touch


Thanks for visiting! Let's connect and collaborate on impactful data projects!

Pinned

  1. telecom-customer-churn (Public)

    This project applies machine learning to real-world telecom customer data to predict churn behavior. It includes data cleaning, exploratory data analysis (EDA), feature engineering, model training …

    Jupyter Notebook

  2. car-insurance-claim-predictor (Public)

    Predicting car insurance claims using single-feature logistic regression, identifying the most informative predictor to build simple, production-ready models.

    Jupyter Notebook

  3. penguin-species-clustering (Public)

    Identify natural groupings of Antarctic penguins using k-means clustering based on physical measurements.

    Jupyter Notebook

  4. sowing-success-crop-prediction (Public)

    Predicting optimal crops using logistic regression on soil features, identifying the most informative single metric for farmers.

    Jupyter Notebook
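
The single-feature approach mentioned for the car-insurance and crop-prediction projects can be sketched roughly as follows: fit one logistic regression per candidate feature, score each, and keep the best. This is only an illustration under stated assumptions. The feature names, toy data, and helper functions are hypothetical, and the model is written with the standard library alone so it runs anywhere. The notebooks themselves may well use scikit-learn and score on held-out data instead.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_single_feature(xs, ys, lr=0.5, epochs=500):
    """Fit y ~ sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def accuracy(xs, ys, w, b):
    """Fraction of rows classified correctly at a 0.5 threshold."""
    preds = [1 if sigmoid(w * x + b) >= 0.5 else 0 for x in xs]
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)


# Hypothetical toy data: one feature tracks the target, the other does not.
features = {
    "informative": [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0],
    "noise": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
}
target = [0, 0, 0, 0, 1, 1, 1, 1]

# Train one model per feature and record its (training) accuracy.
scores = {}
for name, xs in features.items():
    w, b = fit_single_feature(xs, target)
    scores[name] = accuracy(xs, target, w, b)

best_feature = max(scores, key=scores.get)
print(best_feature, scores)
```

Scoring on the training rows keeps the sketch short. For a real comparison, a train/test split or cross-validation would give a fairer ranking of the features.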