gkhnelbstn/README.md

Gökhan Elbistan 👨‍💻

Data Scientist • MLOps Engineer • Database Architect

"I live with data, for data, because everything is about data."

🎯 Core Competencies

mindmap
  root((Data Expert))
    Data Science & ML
      CRISP-DM
      Statistical Analysis
      Deep Learning
      MLOps & LLMOps
    Database Architecture
      PostgreSQL
      MongoDB
      DynamoDB
      Vector DBs
    MLOps Infrastructure
      Airflow
      MLflow
      Docker
      Monitoring
    Best Practices
      SOLID Principles
      Clean Code
      12-Factor Apps
      Documentation

💾 Database Architecture & Management

Primary Databases

| Category | Technologies & Expertise |
| --- | --- |
| Relational | PostgreSQL, MSSQL |
| NoSQL | MongoDB, DynamoDB |
| Vector | ChromaDB, Pinecone |
| Cache/Queue | Redis |
| Data Warehouses | Snowflake, BigQuery, Redshift |

Database Tools & Management

| Category | Tools |
| --- | --- |
| Primary UI Tools | DataSpell DB Tools, pgAdmin |
| Management Tools | MongoDB Compass, SSMS |
| Cloud Management | AWS Console, RDS |
| CLI Tools | psql, AWS CLI |

Learning & Exploration

| Category | Technologies |
| --- | --- |
| Time Series | TimescaleDB, InfluxDB |
| Graph | Neo4j |
| Search & Analytics | Elasticsearch, ClickHouse |
| Distributed | Cassandra, CockroachDB |

Database selection is always problem-driven, with a primary focus on data science and analytics use cases.

📊 Data Science & Analytics Stack

Data Processing & Analysis

| Category | Technologies |
| --- | --- |
| Core Processing | Pandas, NumPy, SciPy, Polars |
| Big Data | Spark, Dask, Vaex |
| Machine Learning | scikit-learn, XGBoost, LightGBM |
| Deep Learning | TensorFlow, PyTorch |
| Statistical Analysis | statsmodels, scipy.stats, pingouin |
| Time Series | Prophet, pmdarima |
| Survival Analysis | lifelines |

Data Visualization & Applications

| Category | Technologies |
| --- | --- |
| Interactive Viz | Plotly, Dash |
| Web Applications | Streamlit, Gradio |
| Static Plotting | Matplotlib, Seaborn |
| Business Intelligence | Grafana, Metabase |

🛠️ MLOps & Infrastructure

Model Development & Experimentation

| Category | Technologies & Status |
| --- | --- |
| Experiment Tracking | MLflow, W&B |
| Hyperparameter Tuning | Optuna, Hyperopt |
| Version Control | DVC, Git LFS |
| Model Registry | MLflow Registry |

Deployment & Serving

| Category | Technologies & Status |
| --- | --- |
| API Development | FastAPI, Flask, Pydantic |
| Containerization | Docker, Docker Compose |
| Model Serving | BentoML, TorchServe |
| Cloud Deployment | AWS Lambda, ECS, Step Functions |

LLMOps & AI Infrastructure

| Category | Technologies & Status |
| --- | --- |
| LLM Monitoring | Langfuse, LangSmith |
| LLM Frameworks | LangChain, Haystack |
| Vector Databases | ChromaDB, Pinecone |
| Model Hosting | Hugging Face, Ollama |

Monitoring & Observability

| Category | Technologies & Status |
| --- | --- |
| Metrics & Dashboards | Grafana, Prometheus |
| System Monitoring | Node Exporter, cAdvisor |
| Application Monitoring | MLflow, Custom Metrics |
| Tracing | OpenTelemetry, Jaeger |
| Alerting | Grafana Alerting, Slack Integration |

Workflow Orchestration

| Category | Technologies & Status |
| --- | --- |
| Primary Orchestrator | Airflow |
| Learning/Exploring | Prefect, Dagster |
| Low-Code Solutions | n8n, AWS Step Functions |
| Schedulers | cron, AWS EventBridge |

☁️ Cloud & Infrastructure

Cloud Platforms

| Category | Technologies & Status |
| --- | --- |
| Primary Cloud | AWS |
| Secondary Cloud | GCP |
| Self-Hosted | Docker, Portainer |

Programming Languages

| Language | Proficiency | Use Cases |
| --- | --- | --- |
| Python | Expert | Data Science, MLOps, Backend APIs |
| SQL | Expert | Database queries, analytics, ETL |
| Go | Beginner | Microservices, CLI tools |
| R | Intermediate | Statistical analysis (rarely used) |

💡 Development & Statistical Philosophy

Statistical Approach
class StatisticalPhilosophy:
    """Bayesian thinking, frequentist validation."""
    
    def __init__(self):
        self.statistical_practices = {
            "hypothesis_testing": {
                "approach": "Bayesian-first, frequentist validation",
                "tools": ["scipy.stats", "statsmodels", "pymc"],
                "principles": [
                    "Effect size over p-values",
                    "Confidence intervals",
                    "Power analysis",
                    "Multiple testing correction"
                ]
            },
            "model_evaluation": {
                "cross_validation": ["time-series-split", "nested-cv"],
                "metrics": ["business-aligned", "statistical-rigor"],
                "validation": ["out-of-time", "out-of-sample"]
            },
            "experimental_design": {
                "methods": [
                    "A/B Testing",
                    "Multi-armed bandits",
                    "Factorial designs"
                ],
                "considerations": [
                    "Sample size calculation",
                    "Randomization",
                    "Control groups"
                ]
            }
        }
        
    def favorite_template(self):
        return "cookiecutter-data-science by @drivendataorg"
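
Two of the principles listed in the class above, "Effect size over p-values" and "Multiple testing correction", can be illustrated with a short standard-library sketch. The function names `cohens_d` and `benjamini_hochberg` are illustrative choices, not part of the original profile, and Cohen's d and the Benjamini-Hochberg procedure are assumed here as one common way to apply each principle; in practice `statsmodels` or `scipy.stats` would likely be used instead.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Standardized mean difference between two samples, using the pooled SD."""
    n_a, n_b = len(a), len(b)
    # statistics.variance returns the sample variance (n - 1 denominator).
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the null is rejected
    while controlling the false discovery rate at `alpha`."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for idx in order[:max_rank]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    print(cohens_d([2, 4, 6], [1, 3, 5]))
    print(benjamini_hochberg([0.01, 0.02, 0.03, 0.5]))
```

Reporting the effect size alongside the corrected decisions keeps the focus on how large a difference is, not only on whether it clears a significance threshold.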
Project Structure Philosophy
📁 project_name/
├── 📁 data/               # Data files (git-ignored, DVC-tracked)
│   ├── 📁 raw/           # Immutable raw data
│   ├── 📁 processed/     # Cleaned, transformed data
│   └── 📁 features/      # Feature engineering outputs
├── 📁 notebooks/         # Jupyter notebooks (EDA, experiments)
│   ├── 📝 00_eda.ipynb
│   └── 📝 01_modeling.ipynb
├── 📁 src/               # Source code
│   ├── 📁 data/         # Data processing
│   ├── 📁 features/     # Feature engineering
│   ├── 📁 models/       # Model training and inference
│   └── 📁 visualization/ # Plotting and dashboards
├── 📁 tests/            # Test files
├── 📁 configs/          # Configuration files
├── 📁 docs/             # Documentation
├── 📁 monitoring/       # Grafana dashboards, alerts
├── 📁 deployment/       # Docker, K8s, cloud configs
├── 📄 .env.example      # Environment variables template
├── 📄 .gitignore
├── 📄 pyproject.toml    # Project metadata and dependencies
├── 📄 README.md         # Project documentation
├── 📄 Dockerfile        # Container definition
└── 📄 docker-compose.yml # Local development stack
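
As a rough illustration, the layout above could be scaffolded with a few lines of `pathlib`. This is a minimal sketch, not the cookiecutter template itself: the `scaffold` helper and the `DIRECTORIES` and `FILES` constants are hypothetical names, and the files are created empty because their real contents are project-specific.

```python
import tempfile
from pathlib import Path

# Directory and file names are copied from the tree above.
DIRECTORIES = [
    "data/raw", "data/processed", "data/features",
    "notebooks",
    "src/data", "src/features", "src/models", "src/visualization",
    "tests", "configs", "docs", "monitoring", "deployment",
]
FILES = [
    ".env.example", ".gitignore", "pyproject.toml",
    "README.md", "Dockerfile", "docker-compose.yml",
]


def scaffold(root: Path) -> Path:
    """Create the directory skeleton and empty top-level files under `root`."""
    root.mkdir(parents=True, exist_ok=True)
    for rel in DIRECTORIES:
        (root / rel).mkdir(parents=True, exist_ok=True)
    for name in FILES:
        (root / name).touch(exist_ok=True)
    return root


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        project = scaffold(Path(tmp) / "project_name")
        for path in sorted(project.rglob("*")):
            print(path.relative_to(project))
```

Because every call uses `exist_ok=True`, running `scaffold` again on an existing project leaves current files untouched.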
Code Quality Standards

| Category | Tools & Practices |
| --- | --- |
| Linting | Black, isort, flake8 |
| Type Checking | mypy, Pydantic |
| Testing | pytest, pytest-cov |
| Documentation | Sphinx, MkDocs |
| Pre-commit | pre-commit |

🎮 Fun Projects & Interests

Cookiecutter DS

My go-to template for structured data science projects!

🎲 Gaming Database Project

My favorite game is... building data pipelines so powerful that even the final boss (your data chaos) gets defeated before the first turn!
Currently working on a comprehensive video games database with an ML-powered recommendation system!

🕹️ Video Games Database

Current Features:

  • Comprehensive game metadata collection
  • User rating prediction models
  • Recommendation engine using collaborative filtering
  • Real-time data pipeline with Airflow
  • Interactive Streamlit dashboard
  • Self-hosted MongoDB cluster
  • Grafana monitoring for data quality

Tech Stack: Python, MongoDB, Airflow, MLflow, Streamlit, Docker
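
The feature list mentions a recommendation engine using collaborative filtering without saying which variant. As a hedged illustration only, the sketch below assumes a user-based approach with cosine similarity; the `RATINGS` data, user names, and the `cosine` and `recommend` helpers are all hypothetical and are not taken from the project.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {game: rating}); not data from the project.
RATINGS = {
    "ana": {"Hades": 5, "Celeste": 4, "Doom": 1},
    "ben": {"Hades": 4, "Celeste": 5},
    "cem": {"Doom": 5, "Quake": 4},
    "dana": {"Hades": 5, "Doom": 2, "Quake": 1},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors stored as dicts."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def recommend(user, ratings, top_n=3):
    """Predict ratings for games the user has not rated, as the
    similarity-weighted average of other users' ratings."""
    seen = ratings[user]
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(seen, other_ratings)
        if sim <= 0:
            continue  # users with no overlap contribute nothing
        for game, rating in other_ratings.items():
            if game in seen:
                continue
            scores[game] = scores.get(game, 0.0) + sim * rating
            weights[game] = weights.get(game, 0.0) + sim
    predicted = {game: scores[game] / weights[game] for game in scores}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


if __name__ == "__main__":
    for game, score in recommend("ben", RATINGS):
        print(f"{game}: {score:.2f}")
```

A production system would more likely rely on a matrix-factorization or implicit-feedback library, but the weighting logic above is the core idea behind neighborhood-based collaborative filtering.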


🤝 Let's Connect & Collaborate


🎯 2025 Goals & Progress
  • Complete IBM Data Scientist Certification ✅
  • Deploy 5 production ML models with full monitoring
  • Master Grafana & Prometheus for ML observability
  • Contribute to 3+ open-source MLOps projects
  • Complete comprehensive video games database project
  • Learn Go language fundamentals
  • Implement end-to-end LLMOps pipeline
  • Write 10 technical blog posts about MLOps

Current Focus: Building robust, self-hosted MLOps infrastructure and mastering LLMOps practices.


Last updated: 2025-08-01 13:27:52 UTC by @gkhnelbstn
✨ Always learning, always building, always optimizing ✨


Pinned

  1. IBM-Data-Science-Professional-Certificate-Course-Notes Public

    I will upload my notes from the IBM Data Science course here. This repo will include my exams, projects, notes, etc.

    Jupyter Notebook 2 1

  2. ibmcapstoneproject Public

    This repository contains files for the end-to-end IBM Data Science Professional Certificate Capstone Project.

    Jupyter Notebook 2

  3. interviewqs-solutions Public

    Python solutions for InterviewQs' competitions.

    Jupyter Notebook 1

  4. kaggle-competitions Public

    This repository contains notebooks with my solutions to Kaggle competitions (not the private ones, though).

    Jupyter Notebook 1

  5. end-to-end-video-game-ml Public

    End-to-end video game ML project for predicting Metacritic scores. The project will use FastAPI for model prediction, possibly Flask for the backend, and Streamlit for the frontend.

    Python 1