RIT is home to the first and largest technological college in the world for students who are deaf or hard of hearing. Because of this, whenever a large event or presentation is given, there is always an interpreter and closed captioning. But reading alone cannot convey the tone or pitch of a voice, so a lot of information is lost in closed captioning. Our project aims to help the deaf community understand and communicate better by giving them more context about the emotion and tone a speaker might have.
Mood Captioning detects the emotion of the speaker and changes the subtitle color based on that emotion.
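As a rough illustration of that idea, the mapping from detected emotion to subtitle color might look like the sketch below; the specific emotion labels and hex colors are assumptions, not the exact palette we used.

```python
# Hypothetical mapping from detected emotion to a subtitle color (hex).
# Labels and colors are illustrative only.
MOOD_COLORS = {
    "happy": "#FFD700",    # gold
    "sad": "#4169E1",      # royal blue
    "angry": "#DC143C",    # crimson
    "fearful": "#9370DB",  # medium purple
    "neutral": "#FFFFFF",  # white
}

def color_for_mood(mood: str) -> str:
    """Fall back to white when an emotion is not in the map."""
    return MOOD_COLORS.get(mood.lower(), "#FFFFFF")
```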
The front end was built in Flutter and is designed to receive a JSON file containing the subtitles and their information: the text, starting timestamp, ending timestamp, and mood.
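A subtitle entry in that JSON might look like the following sketch; the exact field names are assumptions, but the fields themselves (text, start, end, mood) are the ones listed above.

```python
import json

# Hypothetical shape of the subtitle JSON handed to the Flutter front end;
# the field names are assumptions, the fields themselves are listed above.
sample = [
    {"text": "Welcome, everyone!", "start": 0.0, "end": 1.8, "mood": "happy"},
    {"text": "Unfortunately, the keynote is delayed.", "start": 1.8, "end": 4.2, "mood": "sad"},
]

with open("subtitles.json", "w") as f:
    json.dump(sample, f, indent=2)

# The front end reads this file and shows each entry between its start and
# end timestamps, colored according to its mood.
with open("subtitles.json") as f:
    for entry in json.load(f):
        print(f'{entry["start"]:>4.1f}-{entry["end"]:<4.1f} [{entry["mood"]}] {entry["text"]}')
```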
We used an emotion recognition library to recognize 9 different emotions from a WAV audio file, found the time offset for each sentence in the video, and generated the transcript for the subtitles. We then parsed the FLAC audio files and the transcript to create a JSON file with a mood and timestamp (the mood maps to a subtitle color), which is passed to the front-end Flutter app.
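A minimal sketch of that assembly step, assuming the transcript sentences and their offsets have already been extracted; `recognize_emotion` here is a placeholder standing in for the emotion recognition library, not its real API.

```python
import json

def recognize_emotion(wav_path, start, end):
    """Stand-in for the emotion recognition library: given an audio file and a
    sentence's time range, return one of the nine emotion labels."""
    return "neutral"  # placeholder so the sketch runs

def build_caption_json(sentences, wav_path, out_path="subtitles.json"):
    """Combine transcript sentences (text plus start/end offsets) with a
    detected mood into the JSON the Flutter front end consumes."""
    entries = []
    for s in sentences:
        entries.append({
            "text": s["text"],
            "start": s["start"],
            "end": s["end"],
            "mood": recognize_emotion(wav_path, s["start"], s["end"]),
        })
    with open(out_path, "w") as f:
        json.dump(entries, f, indent=2)
    return entries

build_caption_json(
    [{"text": "Welcome, everyone!", "start": 0.0, "end": 1.8}],
    "talk.wav",
)
```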
- Syncing up the subtitles and the video and starting them at the same time. For this demo, we fell back on a last-minute fix of simply refreshing the subtitles every 2 seconds.
- Connecting the two sides of the application. Currently, the front end just reads the JSON file generated by the back end.
- Getting timestamps on the audio transcripts in order to sync the video, subtitles, and emotions.
- Converting files between the MP4, WAV, FLAC, and JSON formats (see the sketch after this list).
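For the file-conversion challenge, one approach (a sketch, not necessarily the exact commands we used) is to shell out to ffmpeg from Python; the 16 kHz mono settings are assumptions chosen to match what speech APIs typically expect.

```python
import subprocess

def extract_flac(mp4_path: str, flac_path: str) -> None:
    """Pull the audio track out of an MP4 as 16 kHz mono FLAC, a format that
    Google Cloud Speech-to-Text accepts."""
    subprocess.run(
        [
            "ffmpeg",
            "-i", mp4_path,   # input video
            "-vn",            # drop the video stream
            "-ac", "1",       # downmix to mono
            "-ar", "16000",   # resample to 16 kHz (assumed rate)
            flac_path,
        ],
        check=True,
    )

# Example: extract_flac("talk.mp4", "talk.flac")
```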
Building a web application using Flutter from scratch! I've only recently started poking around with Flutter, so this is my very first project in it.
My first time working with Google Cloud Speech-to-Text to create timestamps that correlate to when each word was being said.
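The word-level timestamps come from enabling word time offsets in the recognition request. A minimal sketch with the Google Cloud Speech-to-Text Python client is below; the bucket URI and sample rate are placeholders, not our actual values.

```python
from google.cloud import speech

client = speech.SpeechClient()

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.FLAC,
    sample_rate_hertz=16000,          # assumed sample rate
    language_code="en-US",
    enable_word_time_offsets=True,    # ask for per-word start/end times
)
audio = speech.RecognitionAudio(uri="gs://your-bucket/talk.flac")  # placeholder URI

# long_running_recognize handles audio longer than about a minute
response = client.long_running_recognize(config=config, audio=audio).result()

for result in response.results:
    for word in result.alternatives[0].words:
        start = word.start_time.total_seconds()
        end = word.end_time.total_seconds()
        print(f"{start:.2f}-{end:.2f}: {word.word}")
```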