ML module Week 2 Topic 2: Overview of scikit-learn ecosystem #48

@EricThomson

Learn the basics of the scikit-learn package and its API
Give a sense of the scikit-learn ecosystem. This is what you will use whenever you do ML that is not deep learning.

This is one of the best-written and best-maintained packages in the open source scientific ecosystem. Let's give them a good sense of this, e.g., the docs and the range of tools available for the things in the intro section: supervised learning (classification, regression), unsupervised learning (clustering), and other useful stuff.

There really is a rich ecosystem, and we're just scratching the surface. For instance, scikit-learn provides a pipeline system that is really useful.
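To give a flavor of the pipeline system, here is a minimal sketch that chains a preprocessing step and a classifier into a single estimator. The dataset (`load_wine`), the scaler, and the classifier are assumptions picked for illustration, not choices made in this issue.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative dataset (an assumption): 178 wines, 13 features, 3 classes.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Chain scaling and classification into one estimator.
# fit() fits the scaler and the classifier on the training data;
# score() and predict() reuse the fitted scaler on new data.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipe.fit(X_train, y_train)

pipe_accuracy = pipe.score(X_test, y_test)
print(f"Pipeline test accuracy: {pipe_accuracy:.2f}")
```

The pipeline exposes the same `fit`/`predict`/`score` methods as a single model, so the preprocessing cannot accidentally be fit on the test data.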

The basic API functionality
Second, their basic API functionality: you create a model instance, and then you fit the model with data. Then you predict outcomes for new data outside the training data. This is the basic pattern we want them to understand. (There is a lot more to it we could cover, but that's the basic pattern.)
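The create/fit/predict pattern could be sketched as follows. The iris dataset and `LogisticRegression` are assumptions chosen to keep the example short; any scikit-learn estimator follows the same three steps.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Illustrative dataset (an assumption): 150 flowers, 4 features, 3 species.
X, y = load_iris(return_X_y=True)

# Hold out data the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 1. Create a model instance.
model = LogisticRegression(max_iter=1000)

# 2. Fit the model with training data.
model.fit(X_train, y_train)

# 3. Predict outcomes for new data outside the training set.
y_pred = model.predict(X_test)

# score() reports mean accuracy on the held-out data.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Swapping in a different model (say, `KNeighborsClassifier`) only changes step 1; the `fit` and `predict` calls stay the same.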

A simple clustering demo
Third, let's demo this with clustering (k-means), which is an example of unsupervised machine learning. We are not going to cover this in any detail in this class, so let's just do it here to demo how things work. We could adapt code from a simple demo like this
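As a possible starting point, here is a minimal k-means sketch on synthetic data. It is not the demo referred to above; the data generator (`make_blobs`), the number of clusters, and the query points are all assumptions for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic 2D data with three blobs (an assumption for illustration).
# The true blob labels are discarded: clustering is unsupervised.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=42)

# Same pattern as before: create the model, then fit it.
# Note that fit() takes only X, with no target labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
kmeans.fit(X)

labels = kmeans.labels_            # cluster index assigned to each sample
centers = kmeans.cluster_centers_  # coordinates of the three centroids

# predict() assigns new points to the nearest learned centroid.
new_points = [[0.0, 0.0], [5.0, 5.0]]
new_labels = kmeans.predict(new_points)
print(centers)
print(new_labels)
```

The cluster indices are arbitrary, so they will not necessarily match the order of the blobs that generated the data.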
