CryptoClustering Challenge

DataClass Module 19 – Unsupervised Learning
EdX/UT Data Analytics and Visualization Bootcamp
Cohort UTA-VIRT-DATA-PT-11-2024-U-LOLC
Author: Neel Agarwal

Project Overview

CryptoClustering uses unsupervised learning to find structure in cryptocurrency market data. This homework assignment implements clustering algorithms (K-means and hierarchical clustering) in a Jupyter Notebook and factors repetitive steps into functions to minimize boilerplate code.

Key objectives:

- Preprocess crypto-market data to normalize and prepare it for clustering.
- Apply unsupervised learning algorithms to identify market segments.
- Evaluate clustering results based on the provided rubric criteria.
- Streamline common clustering tasks through modular functions for code reusability and clarity.

Installation & Setup

Clone the Repository

git clone https://github.com/Neelka96/CryptoClustering.git
cd CryptoClustering

Set Up Virtual Environment (Recommended)

python -m venv venv
# On Unix/MacOS:
source venv/bin/activate
# On Windows:
venv\Scripts\activate

Install Required Dependencies
```
pip install -r requirements.txt
```
Run the Jupyter Notebook

Launch the notebook environment to execute the unsupervised learning experiments:
```
jupyter notebook Crypto_Clustering.ipynb
```

Note

Make sure all required Python packages (such as pandas, numpy, scikit-learn, and matplotlib) are installed, as listed in requirements.txt.

Unsupervised Learning Methodology

Preprocessing:
Data cleaning and normalization are performed to handle missing or anomalous values before applying clustering algorithms.
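
The preprocessing step can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `scale_features`, the column names, and the coin ids are hypothetical, and dropping rows with missing values is only one possible cleaning strategy.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler


def scale_features(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows with missing values, then z-score every column."""
    clean = df.dropna()
    scaled = StandardScaler().fit_transform(clean)
    # Rebuild a DataFrame so the index and column names survive scaling.
    return pd.DataFrame(scaled, index=clean.index, columns=clean.columns)


# Hypothetical toy data standing in for Resources/crypto_market_data.csv.
raw = pd.DataFrame(
    {
        "price_change_24h": [1.2, -0.5, 3.1, np.nan, 0.4],
        "price_change_7d": [7.6, 10.2, -4.0, 2.2, 0.9],
    },
    index=pd.Index(["coin_a", "coin_b", "coin_c", "coin_d", "coin_e"], name="coin_id"),
)

scaled_df = scale_features(raw)
print(scaled_df.round(3))
```

After scaling, each column has mean 0 and unit (population) standard deviation, so no single feature dominates the distance calculations used by the clustering algorithms.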
Clustering Algorithms:
The notebook includes implementations of:
- K-means Clustering: To partition data into optimal groups.
- Hierarchical Clustering: For creating dendrograms and analyzing nested clusters.
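
As a rough sketch of the two algorithms listed above, the snippet below runs both on synthetic data. The blob centers, the range of k values, and the choice of Ward linkage are assumptions made for illustration; the notebook may use different settings.

```python
from scipy.cluster.hierarchy import linkage
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the scaled market data (assumption: three separated groups).
X, _ = make_blobs(
    n_samples=60,
    centers=[[0, 0], [5, 5], [-5, 5]],
    cluster_std=0.5,
    random_state=0,
)

# K-means: record the inertia for each candidate k (the data for an elbow curve).
inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

# Hierarchical: the Ward linkage matrix can be passed to
# scipy.cluster.hierarchy.dendrogram for plotting.
Z = linkage(X, method="ward")
hier_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)
```

Plotting `inertias` against k gives the usual elbow curve, while `Z` holds the merge history that a dendrogram visualizes.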
Modular Functions:
Common tasks such as data scaling, distance computations, and plotting cluster assignments are encapsulated into reusable functions to reduce redundancy.
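
One possible shape for such a helper is shown below. The name `add_cluster_labels` and its signature are hypothetical and are not taken from lib.py; the point is only that fitting and labeling can live in a single function that is reused for each dataset.

```python
import pandas as pd
from sklearn.cluster import KMeans


def add_cluster_labels(df: pd.DataFrame, k: int, seed: int = 0) -> pd.DataFrame:
    """Fit K-means on df and return a copy with a 'cluster' column appended."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(df)
    out = df.copy()  # leave the caller's DataFrame untouched
    out["cluster"] = labels
    return out


# Hypothetical toy features with two obvious groups.
features = pd.DataFrame(
    {"x": [0.0, 0.1, 0.2, 5.0, 5.1, 5.2], "y": [0.0, 0.2, 0.1, 5.0, 5.2, 5.1]},
    index=["a", "b", "c", "d", "e", "f"],
)
labeled = add_cluster_labels(features, k=2)
print(labeled)
```

Returning a copy keeps the helper free of side effects, which makes it safe to call repeatedly with different values of k.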
Evaluation:
Clustering quality is assessed using metrics (e.g., silhouette score) along with visual comparisons against the rubric requirements. Detailed evaluation criteria are outlined in the attached rubric.
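
A silhouette-based comparison might look like the sketch below. The synthetic data and the range of k values are assumptions for illustration and do not reproduce the notebook's results.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the scaled market data (assumption: three separated groups).
X, _ = make_blobs(
    n_samples=90,
    centers=[[0, 0], [6, 6], [-6, 6]],
    cluster_std=0.5,
    random_state=1,
)

# Silhouette score needs at least two clusters, so start at k=2.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=1).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The k with the highest mean silhouette is one candidate for the "best" k.
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

Scores closer to 1 indicate compact, well-separated clusters; values near 0 suggest overlapping clusters.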

Directory Structure

The repository is organized as follows:

CryptoClustering/
├── .gitignore
├── Crypto_Clustering.ipynb
├── README.md
├── Resources/
│   └── crypto_market_data.csv
├── __init__.py
└── lib.py

Rubric & Evaluation

The project was evaluated against the provided rubric. Key criteria include:

- Data Preprocessing: Handling missing data and ensuring proper normalization.
- Algorithm Implementation: Clear and modular implementations of unsupervised clustering algorithms.
- Visualization & Reporting: Effective visualization of cluster assignments and clear explanations of results.
- Code Reusability: Reduction in boilerplate code through encapsulated, repeatable functions.
- Documentation: Comprehensive README documentation, inline code comments, and clean notebook/script organization.

This implementation addresses each rubric criterion through modular functions and detailed methodological explanations.

Usage Notes & Limitations

Usage Notes:
- Run the main notebook (Crypto_Clustering.ipynb) to see interactive clustering analyses.
- Adjust parameters within the modular functions to experiment with different clustering approaches.
- Ensure that the crypto_market_data.csv file is located in the Resources folder to avoid errors.
Limitations:
- The current dataset is limited to the provided CSV file. Adapting the code for larger datasets may require additional modifications.
- Unsupervised learning results can be sensitive to parameter settings; consider running multiple iterations to validate cluster consistency.
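
One way to check cluster consistency across runs is to refit with several random seeds and compare the resulting labelings with the adjusted Rand index (ARI). The sketch below uses synthetic data; the number of seeds and the use of ARI as the agreement measure are assumptions, not something the notebook necessarily does.

```python
from itertools import combinations

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

# Synthetic stand-in for the scaled market data (assumption: three separated groups).
X, _ = make_blobs(
    n_samples=90,
    centers=[[0, 0], [6, 6], [-6, 6]],
    cluster_std=0.5,
    random_state=2,
)

# Refit K-means with five different seeds.
runs = [
    KMeans(n_clusters=3, n_init=10, random_state=seed).fit_predict(X)
    for seed in range(5)
]

# ARI ignores label permutations, so it compares partitions rather than label ids.
pairwise_ari = [adjusted_rand_score(a, b) for a, b in combinations(runs, 2)]
mean_ari = sum(pairwise_ari) / len(pairwise_ari)
print(round(mean_ari, 3))
```

A mean ARI near 1 suggests the clustering is stable across initializations; noticeably lower values indicate that the result depends on the random seed.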

Credits & References

- pandas Documentation
- scikit-learn Documentation
- Matplotlib Documentation
- Unsupervised learning theories and best practices from core machine learning literature.
- Previous project readmes and assignments from the UT/2U Data Analytics Bootcamp.
