MarcoSutti/ml_projects

Machine Learning Projects

This page collects some Machine Learning (ML) projects I have recently been working on. These projects originated from the Data Science in Python course that I have been following on Dataquest. Information about the data used and the project descriptions can be found in the corresponding Jupyter Notebooks. Please feel free to get in touch with me if you notice any errors. Suggestions are welcome!

This repository is a work in progress, and the projects themselves may change over time. For the moment, there are seven projects in the repository:

1. Predicting car prices

We use the k-nearest neighbors algorithm to predict a car's market price from its attributes. We explore some basic data cleaning techniques, univariate and multivariate models, and hyperparameter tuning.
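
As a rough illustration of the idea (not the notebook's actual code), the sketch below implements k-nearest neighbors regression in plain Python and picks the number of neighbors `k` by comparing the error on a held-out set. The feature values, prices, and function names are hypothetical toy choices.

```python
import math


def knn_predict(train_X, train_y, query, k):
    """Predict a target as the mean target of the k nearest training rows."""
    # Pair every training target with its Euclidean distance to the query.
    by_distance = sorted(
        (math.dist(row, query), target) for row, target in zip(train_X, train_y)
    )
    nearest = [target for _, target in by_distance[:k]]
    return sum(nearest) / len(nearest)


def rmse(train_X, train_y, test_X, test_y, k):
    """Root mean squared error of the k-NN predictions on a test set."""
    squared = [
        (knn_predict(train_X, train_y, x, k) - y) ** 2
        for x, y in zip(test_X, test_y)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy data: two features rescaled to [0, 1], prices in dollars.
train_X = [(0.1, 0.2), (0.2, 0.1), (0.5, 0.5), (0.6, 0.4), (0.9, 0.8), (0.8, 0.9)]
train_y = [6000, 6500, 12000, 13000, 30000, 28000]
test_X = [(0.12, 0.18), (0.88, 0.82)]
test_y = [6200, 29000]

# Hyperparameter tuning: try several values of k and keep the lowest error.
scores = {k: rmse(train_X, train_y, test_X, test_y, k) for k in (1, 2, 3)}
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

Rescaling the features matters here because Euclidean distance would otherwise be dominated by whichever attribute has the largest numeric range.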

2. Predicting house sale prices

We use linear regression to predict house sale prices in Ames, Iowa. We explore feature engineering, feature selection, and some validation techniques: simple train/test validation, holdout validation, and k-fold cross-validation.
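
To make the validation step concrete, here is a minimal sketch of k-fold cross-validation around a one-feature least-squares fit, written in plain Python. The data, the interleaved fold assignment, and the function names are assumptions made for illustration; the notebook may split and fit differently.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_rmse(xs, ys, k):
    """Average test RMSE over k folds; every row is held out exactly once."""
    n = len(xs)
    fold_rmses = []
    for fold in range(k):
        # Interleaved folds: rows fold, fold + k, fold + 2k, ... form the test set.
        test_idx = set(range(fold, n, k))
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i not in test_idx]
        test = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i in test_idx]
        slope, intercept = fit_line(*zip(*train))
        mse = sum((slope * x + intercept - y) ** 2 for x, y in test) / len(test)
        fold_rmses.append(math.sqrt(mse))
    return sum(fold_rmses) / k


# Hypothetical data: living area in square feet vs. sale price in dollars.
area = [850, 900, 1200, 1500, 1700, 2000, 2200, 2600]
price = [100_000, 105_000, 135_000, 160_000, 185_000, 210_000, 228_000, 270_000]
print(k_fold_rmse(area, price, k=4))
```

Holdout validation corresponds to the special case of two folds that swap roles; with real data the rows would normally be shuffled before being assigned to folds.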

3. Building a handwritten digits classifier

We use a deep, feedforward neural network to classify handwritten digits. We compare its results with those obtained with a k-nearest neighbors algorithm. We explore how the number of neurons and the number of hidden layers affect the accuracy scores, and use some validation techniques to quantitatively evaluate overfitting.
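
The comparison can be sketched with scikit-learn as follows. This assumes the digits come from `sklearn.datasets.load_digits` and uses arbitrary layer sizes; both are illustrative choices rather than the notebook's confirmed setup. Reporting train and test accuracy side by side gives a simple view of overfitting: a large gap between the two suggests the model is memorizing the training set.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

# Assumed dataset: scikit-learn's bundled 8x8 handwritten digits.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

results = {}

# Baseline: k-nearest neighbors.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
results["knn"] = (knn.score(X_train, y_train), knn.score(X_test, y_test))

# Feedforward networks with a growing number of hidden layers (sizes are arbitrary).
for layers in [(32,), (64, 64), (64, 64, 64)]:
    mlp = MLPClassifier(hidden_layer_sizes=layers, max_iter=1000, random_state=0)
    mlp.fit(X_train, y_train)
    results[layers] = (mlp.score(X_train, y_train), mlp.score(X_test, y_test))

for name, (train_acc, test_acc) in results.items():
    print(name, round(train_acc, 3), round(test_acc, 3))
```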

4. Predicting bike rentals

We try to predict the total number of bikes people rented in a given hour. To accomplish this, we create a few different machine learning models (linear regression, decision trees, and random forests), and evaluate their performance.
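
A minimal sketch of such a comparison is shown below, using scikit-learn on synthetic data. The features (`hour`, `temp`), the formula that generates the rental counts, and the model settings are all made up for illustration and are not taken from the notebook.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for hourly rental data: demand peaks around commute hours,
# so the relationship with the hour of day is strongly non-linear.
rng = np.random.default_rng(0)
hour = rng.integers(0, 24, size=2000)
temp = rng.uniform(0, 35, size=2000)
cnt = (
    50
    + 150 * np.exp(-((hour - 8) ** 2) / 8)
    + 200 * np.exp(-((hour - 18) ** 2) / 8)
    + 3 * temp
    + rng.normal(0, 10, size=2000)
)
X = np.column_stack([hour, temp])
X_train, X_test, y_train, y_test = train_test_split(
    X, cnt, test_size=0.2, random_state=0
)

models = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(min_samples_leaf=5, random_state=0),
    "random_forest": RandomForestRegressor(
        n_estimators=50, min_samples_leaf=5, random_state=0
    ),
}

# Fit each model on the same split and record its test RMSE.
rmse = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    rmse[name] = mean_squared_error(y_test, model.predict(X_test)) ** 0.5

print(rmse)
```

On data like this, the tree-based models can capture the two daily peaks, which a straight-line fit on the raw hour cannot.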

5. Predicting the stock market

We use pandas time series tools to generate new indicators in our dataframe, and then we train a linear regression model to make predictions about the future prices of the S&P 500 Index.
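
The sketch below shows one way to build such indicators with pandas rolling windows. The price series is a synthetic random walk, and the column names (`Close`, `avg_5`, `avg_30`, `std_5`) and the split date are hypothetical. The important detail is the `shift(1)`: each row's indicators are computed from previous days only, so the model never sees the price it is asked to predict.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for daily closing prices (a random walk on business days).
dates = pd.date_range("2015-01-01", periods=300, freq="B")
rng = np.random.default_rng(0)
close = 2000 + np.cumsum(rng.normal(0, 10, size=len(dates)))
df = pd.DataFrame({"Close": close}, index=dates)

# Shift by one day so the rolling windows cover only past prices.
past = df["Close"].shift(1)
df["avg_5"] = past.rolling(5).mean()
df["avg_30"] = past.rolling(30).mean()
df["std_5"] = past.rolling(5).std()
df = df.dropna()  # the first 30 rows lack enough history

# Split chronologically: train on earlier dates, test on later ones.
split = pd.Timestamp("2016-01-01")
train = df[df.index < split]
test = df[df.index >= split]

features = ["avg_5", "avg_30", "std_5"]
model = LinearRegression().fit(train[features], train["Close"])
pred = model.predict(test[features])
mae = np.mean(np.abs(pred - test["Close"].to_numpy()))
print(len(train), len(test), mae)
```

A random shuffle would be the wrong split for this kind of data, because it would let the model train on days that come after the ones it is tested on.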

6. Spam filter for SMS messages

We use the naive Bayes algorithm to build a spam filter for SMS messages.
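
For reference, a multinomial naive Bayes classifier with Laplace smoothing can be sketched in plain Python as follows. The four training messages, the tokenizer, and the function names are toy assumptions, not the notebook's data or code.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train(messages):
    """messages: list of (label, text). Returns class counts, word counts, vocabulary."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    for label, text in messages:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return class_counts, word_counts, vocab


def classify(text, class_counts, word_counts, vocab, alpha=1.0):
    """Return the label with the highest log posterior (up to a constant)."""
    total_messages = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        n_words = sum(word_counts[label].values())
        log_prob = math.log(class_counts[label] / total_messages)  # prior
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen during training
            # Laplace smoothing keeps unseen (word, label) pairs from zeroing the product.
            likelihood = (word_counts[label][word] + alpha) / (
                n_words + alpha * len(vocab)
            )
            log_prob += math.log(likelihood)
        scores[label] = log_prob
    return max(scores, key=scores.get)


# Hypothetical toy training set.
messages = [
    ("spam", "WINNER! Claim your free prize now"),
    ("spam", "Free entry: text WIN to claim cash"),
    ("ham", "Are we still meeting for lunch today"),
    ("ham", "I will call you when I get home"),
]
model = train(messages)
print(classify("claim your free cash prize", *model))
print(classify("are we meeting for lunch", *model))
```

Working with sums of logarithms instead of products of probabilities avoids numerical underflow on longer messages.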

7. Clustering NBA players

We use the k-means clustering algorithm to explore patterns in a dataset of NBA player statistics.
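
The core loop of k-means (assign each point to its nearest centroid, then move each centroid to the mean of its points) can be sketched in plain Python as below. The two features and the six data points are invented for illustration and do not come from the NBA dataset.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster points into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from k distinct data points
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster
            else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    labels = [
        min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
    ]
    return centroids, labels


# Hypothetical player stats: (points per game, assist-to-turnover ratio).
players = [(8.0, 3.1), (9.5, 2.8), (7.2, 3.4), (24.0, 1.5), (26.5, 1.2), (22.8, 1.7)]
centroids, labels = kmeans(players, k=2)
print(centroids, labels)
```

Because the result depends on the random initial centroids, it is common to run the algorithm several times with different seeds and keep the clustering with the smallest total within-cluster distance.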
