- This project is an Exploratory Data Analysis (EDA) of the Netflix Movies and TV Shows dataset.
- The goal is to explore the Netflix catalog and identify trends across years, countries, genres, ratings, and content duration.
- Python 3.11
- Pandas
- NumPy: data analysis and cleaning
- Matplotlib
- Seaborn: data visualization
- Load and inspect the dataset
- Data cleaning and preprocessing (duplicates, missing values, parsing dates and duration)
Exploration of:
- Movies vs TV Shows distribution
- Release years vs years added to Netflix
- Movie durations and number of TV show seasons
- Top countries by content production
- Genres and categories distribution
- Age ratings distribution
- Top directors and actors
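
The cleaning and preprocessing step listed above (duplicates, missing values, parsing dates and duration) could look roughly like the sketch below. It is a minimal illustration on a few made-up rows, not the notebook's actual code. The column names (`title`, `type`, `date_added`, `duration`, `country`), the `"Unknown"` placeholder, and the date format are assumptions about the CSV.

```python
import pandas as pd

# Toy rows standing in for the real CSV; column names and values are assumed.
raw = pd.DataFrame(
    {
        "title": ["A", "A", "B", "C"],
        "type": ["Movie", "Movie", "TV Show", "Movie"],
        "date_added": ["September 25, 2021", "September 25, 2021", "July 1, 2019", None],
        "duration": ["90 min", "90 min", "2 Seasons", "104 min"],
        "country": ["India", "India", None, "United States"],
    }
)

# Drop exact duplicate rows and renumber the index.
df = raw.drop_duplicates().reset_index(drop=True)

# Replace missing countries with an explicit placeholder (an assumed choice).
df["country"] = df["country"].fillna("Unknown")

# Parse "Month day, year" strings; values that cannot be parsed become NaT.
df["date_added"] = pd.to_datetime(
    df["date_added"].str.strip(), format="%B %d, %Y", errors="coerce"
)
df["year_added"] = df["date_added"].dt.year

# Split "90 min" / "2 Seasons" into a number and a normalized unit.
parts = df["duration"].str.extract(r"(?P<value>\d+)\s*(?P<unit>\w+)")
df["duration_value"] = pd.to_numeric(parts["value"])
df["duration_unit"] = parts["unit"].str.lower().str.rstrip("s")

print(df[["title", "year_added", "duration_value", "duration_unit"]])
```

Keeping the numeric value and the unit in separate columns makes it possible to analyze movie runtimes (minutes) and TV show lengths (seasons) independently.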
- Netflix rapidly expanded its library between 2015 and 2020.
- Most movies are 80 to 120 minutes long.
- The majority of TV shows have only 1 season.
- The USA and India dominate Netflix content production.
- The most common categories include International Movies, Dramas, and Comedies.
- A large share of Netflix content targets mature audiences (TV-MA, TV-14).
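
Findings about top countries and categories depend on columns that can hold several comma-separated values per title. The sketch below shows one way such counts might be computed. The toy data, the column names (`type`, `country`, `listed_in`), and the helper `count_values` are hypothetical and only illustrate the technique; they do not reproduce the results above.

```python
import pandas as pd

# Made-up rows; the real dataset is assumed to use the same column layout.
titles = pd.DataFrame(
    {
        "type": ["Movie", "Movie", "TV Show", "TV Show"],
        "country": ["United States, India", "India", "United States", None],
        "listed_in": [
            "Dramas, International Movies",
            "Comedies, Dramas",
            "TV Dramas",
            "Kids' TV",
        ],
    }
)


def count_values(frame, column):
    """Count each comma-separated value in `column`, ignoring missing cells."""
    return (
        frame[column]
        .dropna()
        .str.split(",")
        .explode()      # one row per individual value
        .str.strip()    # remove spaces left over from the split
        .value_counts()
    )


country_counts = count_values(titles, "country")
genre_counts = count_values(titles, "listed_in")

# Share of movies vs TV shows as fractions that sum to 1.
type_share = titles["type"].value_counts(normalize=True)

print(country_counts.head(10))
print(genre_counts.head(10))
print(type_share)
```

Counting after `explode` credits a co-production to every listed country, so the country totals can exceed the number of titles.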
netflix_eda_project/
├── data/               # dataset (optional, can be downloaded separately)
├── netflix_eda.ipynb   # Jupyter Notebook with analysis
├── README.md           # project description
├── requirements.txt    # dependencies
└── .gitignore          # ignore rules for Git
The dataset is available on Kaggle:
- Understanding Netflix's content distribution helps identify strategic markets, content preferences, and opportunities for localized production.
- This EDA can guide decisions for media acquisition and audience targeting.