Explainable AI: A Portfolio of Techniques and Applications

Overview

This portfolio showcases my work for the Explainable AI (XAI) course, where I explored a spectrum of explainability techniques across different types of machine learning models—from interpretable classical models to complex neural networks.

Through five hands-on projects and a detailed blog post, I investigated:

  • Interpretable Machine Learning
  • Explainable Techniques for Complex Models
  • Explainable Deep Learning
  • Mechanistic Interpretability
  • XAI for Large Language Models (LLMs)
  • SHAP Values for Model-Agnostic Interpretability

Each project focuses on making AI models more transparent, understandable, and accountable—addressing the growing need for explainability in real-world AI applications.


Portfolio Summary

Project 1: Interpretable Machine Learning (Customer Churn Prediction)

  • Techniques: Logistic Regression, Generalized Additive Models (GAM), Linear Regression

  • Explainability Tools:

    • Coefficient interpretation
    • Partial Dependence Plots (PDP)
    • Individual Conditional Expectation (ICE) plots
    • Accumulated Local Effects (ALE) plots
  • Key Learning: Compared global vs. local feature importance to understand the drivers of customer churn in a telecommunications dataset. Provided actionable business recommendations based on the interpretable models (a code sketch of this workflow follows below).
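
Below is a minimal sketch of this kind of workflow: fit an interpretable classifier, read its coefficients as a global explanation, and overlay per-customer ICE curves on a partial dependence plot. It assumes scikit-learn (1.0+) is available; the synthetic data and the feature names tenure_months and monthly_charges are illustrative placeholders, not the project's telecom dataset.

```python
# Minimal sketch of global vs. local explanations for a churn-style classifier.
# Synthetic data and feature names are placeholders for illustration only.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.inspection import PartialDependenceDisplay

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "tenure_months": rng.integers(1, 72, size=500),
    "monthly_charges": rng.uniform(20, 120, size=500),
})
# Toy churn label: short tenure and high charges make churn more likely.
y = ((X["monthly_charges"] / 120 - X["tenure_months"] / 72
      + rng.normal(0, 0.3, size=500)) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# Coefficients give a global, directional summary of each feature's effect.
print(dict(zip(X.columns, model.coef_[0])))

# kind="both" overlays the average effect (PDP) on per-customer ICE curves.
PartialDependenceDisplay.from_estimator(
    model, X, features=["tenure_months", "monthly_charges"], kind="both"
)
```

Setting kind="both" is what makes the global vs. local comparison visible: the PDP shows the average effect, while the individual ICE curves reveal how much single customers deviate from it.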


Project 2: Explainable Techniques for Complex Models

  • Techniques: Comparative model evaluation using interpretability tools

  • Explainability Tools:

    • Partial Dependence Plots (PDP)
    • Individual Conditional Expectation (ICE) plots
    • Accumulated Local Effects (ALE) plots
  • Key Learning: Used multiple visualization techniques to reveal both global trends and individual variability in feature effects. Highlighted the importance of combining different XAI tools for a holistic understanding of model behavior (an ALE sketch follows below).
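
ALE plots are less standardized across libraries than PDP and ICE; one common option is the alibi package, sketched below under the assumption that alibi is installed. The model, synthetic data, and feature names f0/f1/f2 are placeholders rather than the ones used in the project.

```python
# Minimal ALE sketch using the alibi library (one possible implementation;
# the project may have used a different ALE package). Data and model are placeholders.
import numpy as np
from alibi.explainers import ALE, plot_ale
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(0, 0.2, size=500) > 0).astype(int)

model = GradientBoostingClassifier().fit(X, y)

# ALE is computed on a prediction function; here, the positive-class probability.
predict_fn = lambda x: model.predict_proba(x)[:, 1]
ale = ALE(predict_fn, feature_names=["f0", "f1", "f2"])
exp = ale.explain(X)

# One panel per feature: the accumulated local effect on the predicted probability.
plot_ale(exp)
```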


Project 3: Explainable Deep Learning (Saliency Maps on Adversarial Examples)

  • Techniques: CNN classification on MNIST, FGSM (Fast Gradient Sign Method) adversarial attacks

  • Explainability Tools:

    • Saliency Maps
  • Hypothesis Tested: Do adversarial examples significantly change feature importance?

  • Key Learning: Found that, in this case, adversarial perturbations did not significantly alter the model’s focus (feature importance), demonstrating robustness. Used saliency maps to validate model stability under perturbations (a sketch of both steps follows below).
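
The sketch below shows how the two pieces can be computed and compared in PyTorch, assuming a trained CNN named model that takes (N, 1, 28, 28) MNIST tensors with pixel values in [0, 1]; the epsilon value and variable names are illustrative, not the project's exact settings.

```python
# Sketch of FGSM and gradient-based saliency for an MNIST-style CNN in PyTorch.
# Assumes a trained classifier `model`; epsilon and variable names are illustrative.
import torch
import torch.nn.functional as F

def saliency_map(model, x, target):
    """Per-pixel absolute gradient of the target-class score."""
    x = x.clone().detach().requires_grad_(True)
    score = model(x).gather(1, target.view(-1, 1)).sum()
    score.backward()
    return x.grad.abs().amax(dim=1)  # collapse the channel dimension

def fgsm_attack(model, x, target, epsilon=0.1):
    """One-step FGSM: nudge each pixel in the direction that increases the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), target)
    loss.backward()
    return (x + epsilon * x.grad.sign()).clamp(0, 1).detach()

# Usage (assuming `model`, `images`, `labels` already exist):
#   adv = fgsm_attack(model, images, labels, epsilon=0.25)
#   clean_sal = saliency_map(model, images, labels)
#   adv_sal = saliency_map(model, adv, labels)
# Comparing clean_sal and adv_sal shows whether the attack shifts feature importance.
```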


Project 4: Mechanistic Interpretability (Superposition Toy Models)

  • Techniques: Toy models of superposition, Sparse Coding, Dictionary Learning

  • Explainability Focus:

    • Visualizing compressed feature representations
    • Measuring reconstruction accuracy of entangled features
  • Key Learning: Demonstrated how neural networks can store more features than neurons via superposition. Recovered features using sparse coding, providing insight into how models might efficiently compress and retrieve information (a toy sketch follows below).
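
As a rough illustration of the recovery step, the sketch below packs more sparse features than dimensions into a toy representation and then tries to recover them with scikit-learn's DictionaryLearning. The sizes, sparsity level, and hyperparameters are assumptions for illustration, not the project's exact setup.

```python
# Toy superposition sketch: more sparse features than dimensions, recovered with
# dictionary learning. Sizes, sparsity, and hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(2)
n_features, n_dims, n_samples = 20, 8, 1000

# Ground-truth feature directions packed into a lower-dimensional space (superposition).
true_directions = rng.normal(size=(n_features, n_dims))
true_directions /= np.linalg.norm(true_directions, axis=1, keepdims=True)

# Sparse activations: each sample turns on only ~10% of the 20 features.
activations = rng.random((n_samples, n_features)) * (rng.random((n_samples, n_features)) < 0.1)
X = activations @ true_directions  # observed 8-dimensional representations

# Attempt to recover the packed feature directions from the data alone.
dl = DictionaryLearning(n_components=n_features, alpha=0.1,
                        transform_algorithm="lasso_lars", random_state=0)
codes = dl.fit_transform(X)
atoms = dl.components_  # learned dictionary atoms

print("reconstruction MSE:", np.mean((codes @ atoms - X) ** 2))
# Each true direction should be close to some learned atom (up to sign).
print("mean best-match |cosine|:", np.abs(true_directions @ atoms.T).max(axis=1).mean())
```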


Project 5: XAI for Large Language Models (Sentence Embedding Visualizations)

  • Techniques: Sentence embeddings using all-MiniLM-L6-v2 from Hugging Face

  • Explainability Tools:

    • PCA, t-SNE, UMAP for dimensionality reduction
  • Key Learning: Visualized how Large Language Models semantically organize sentences in embedding space. Used dimensionality reduction to make high-dimensional representations interpretable and to uncover latent clustering behavior (a sketch follows below).
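
A minimal version of this pipeline is sketched below, assuming sentence-transformers and scikit-learn are installed; the example sentences are placeholders, and UMAP (via the umap-learn package) could be substituted for either projection in the same way.

```python
# Sketch: embed sentences with all-MiniLM-L6-v2, then project to 2-D for inspection.
# The example sentences are placeholders; requires sentence-transformers and scikit-learn.
from sentence_transformers import SentenceTransformer
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

sentences = [
    "The bank raised interest rates.",
    "Stock markets fell sharply today.",
    "The cat chased the laser pointer.",
    "My dog loves long walks in the park.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(sentences)  # shape: (n_sentences, 384)

# Two different projections of the same embeddings; nearby points should be
# semantically similar (e.g. the two finance sentences vs. the two pet sentences).
pca_2d = PCA(n_components=2).fit_transform(embeddings)
tsne_2d = TSNE(n_components=2, perplexity=2, random_state=0).fit_transform(embeddings)
print(pca_2d, tsne_2d, sep="\n")
```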


Blog Post: SHAP Values for Model-Agnostic Interpretability

  • Concept: Explained how SHAP (SHapley Additive exPlanations) values provide individualized, mathematically grounded explanations for machine learning predictions.
  • Key Learning: Discussed why SHAP is critical for trust, fairness, and accountability in AI. Provided real-world examples from healthcare, finance, and human resources to illustrate its impact (a minimal code example follows below).
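
For context, the snippet below is a minimal example of computing SHAP values for a single prediction with the shap library; the scikit-learn model and dataset are placeholders, not the healthcare or finance examples discussed in the blog post.

```python
# Minimal SHAP sketch: individualized, additive explanations for a single prediction.
# The model and dataset are placeholders, not the examples from the blog post.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# shap.Explainer dispatches to an appropriate algorithm (a tree explainer here).
explainer = shap.Explainer(model, X)
shap_values = explainer(X.iloc[:100])

# Each SHAP value is one feature's additive contribution to this prediction,
# relative to the model's expected output over the background data.
shap.plots.waterfall(shap_values[0])
```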

Concepts Explored Across the Portfolio

Concept                          Example Project
Interpretable ML                 Churn Prediction
Explainable Techniques           PDP, ICE, ALE
Explainable Deep Learning        Saliency Maps on Adversarial Examples
Mechanistic Interpretability     Superposition Toy Models
XAI in LLMs                      Sentence Embedding Visualizations
SHAP Values                      SHAP Blog Post

Why This Matters

As AI systems become increasingly embedded in decision-making processes, explainability is no longer optional.

  • Trust: Users need to understand how predictions are made.
  • Fairness: Exposing feature contributions can reveal biases.
  • Accountability: Transparent models enable responsible deployment.
  • Improvement: Explainability helps developers debug and refine models.

This portfolio demonstrates a comprehensive understanding of XAI, spanning from simple interpretable models to complex deep learning systems, and illustrates how explainability can be achieved across the AI spectrum.


How to Explore

Each project is housed in a separate GitHub repository (links provided above).

Within each repository, you will find:

  • Jupyter Notebooks with detailed, step-by-step workflows
  • Visualizations (heatmaps, saliency maps, embedding plots)
  • Summary reports of findings and key reflections

Contact

For questions, feedback, or collaboration, feel free to reach out:
