A bunch of scripts I made to help with clipping streams. They should work as long as yt-dlp and chat-downloader support whatever website you wish to use.
- install Python and pip
- run pip install -r requirements.txt; you will probably need ffmpeg as well
- use the scripts
- the links and info go into info.json
- run chatScrape.py to generate a graph of chat frequency; the graphs, JSON, and CSV output will be in the ./working directory
- run videoScrape.py to download time periods with high chat frequency; output videos will be in the output directory
- run subtitles.py to autogenerate SRT subtitles with Whisper (OpenAI); you will probably need a decent GPU for this and might also need to change the model. SRT files are written to the same output directory as the videos, with the same file names
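To illustrate the idea behind "time periods with high chat frequency", here is a minimal sketch, not the actual code in chatScrape.py. It assumes chat messages are available as timestamps in seconds from the start of the stream and that they are counted in one-minute buckets; the function names and the bucket size are hypothetical.

```python
from collections import Counter


def chat_frequency(timestamps, bucket_seconds=60):
    """Count messages per bucket; keys are bucket start times in seconds.

    The one-minute bucket size is an assumption for illustration.
    """
    counts = Counter()
    for t in timestamps:
        bucket = int(t // bucket_seconds) * bucket_seconds
        counts[bucket] += 1
    return counts


def peak_bucket(counts):
    """Return the start time of the busiest bucket (earliest wins on ties)."""
    return min(counts, key=lambda b: (-counts[b], b))


# Made-up message times, in seconds from the start of the stream.
timestamps = [5, 30, 61, 62, 63, 70, 119, 125, 300]
counts = chat_frequency(timestamps)
peak = peak_bucket(counts)
print(dict(counts), peak)
```

In this made-up data the bucket starting at 60 seconds holds the most messages, so that is the point a clip would be centred on.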
TO-DOs
- make a PyQt UI so that videoScrape.py is actually user controllable, instead of always downloading a fixed window (currently from 3 min before to 5 min after the peak chat frequency)
- fix the UI not updating properly after downloading from yt-dlp
- maybe just replace with a TUI of some sort
- make diarization less gimmicky
- possibly use a different, less scattered library that isn't just for speech recognition
- just ended up using whisperx because why reinvent the wheel
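
As a rough sketch of what a user-controllable download window could look like, the fixed window from the TO-DOs (3 min before to 5 min after the peak) can be turned into parameters. This is an assumption about one possible design, not how videoScrape.py works today; clip_window, hhmmss, and their arguments are hypothetical names.

```python
def clip_window(peak_seconds, before=180, after=300, duration=None):
    """Return (start, end) in seconds around a chat peak.

    Defaults mirror the current fixed window: 3 min before, 5 min after.
    The start is clamped to 0 and, if the stream duration is known,
    the end is clamped to it.
    """
    start = max(0, peak_seconds - before)
    end = peak_seconds + after
    if duration is not None:
        end = min(end, duration)
    return start, end


def hhmmss(seconds):
    """Format a number of seconds as HH:MM:SS."""
    seconds = int(seconds)
    return f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"


start, end = clip_window(600)
print(hhmmss(start), hhmmss(end))
```

A UI or TUI would then only need to expose the before and after values per peak.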