This repo contains a minimal working example of how to process text data, transform it into vectors (using TF-IDF), and then build a simple clustering model (KMeans) on those vectors. It also includes a demo script that makes a word cloud of the top terms in the corpus.
The selected corpus is the lyrics of the songs Taylor Swift performed on her "Eras" Tour, encompassing over 40 songs across 11 albums. Songs were selected based on this article. Lyrics were taken directly from Google search results.
You can see the resulting vector file here.