Added Model Selection for Elevenlabs _ Added Voice Styles for Elevenlabs, Azure, and OpenAI. Minor UI bugfixes #263

mitcheb0219 · 2025-12-31T18:40:10Z

Added the ability to select a voice model for ElevenLabs.

Added the ability to use voice styles for the ElevenLabs and Azure backends.

ElevenLabs: Voice styles are user-generated. Users can click any tag to copy it to the clipboard for pasting into FFXIV chat.
(Note: Voice tags require ElevenLabs' V3 voice model. Any message containing a style tag (e.g. [whispering]) is automatically forced onto the V3 model for that Say request.)
Azure: Voice styles are limited in scope and depend on the selected voice. Users can click any tag to copy it to the clipboard for pasting into FFXIV chat. Any mismatching voice tags are ignored by the speech synthesizer.

Added command "/tttstyles" which will open the "Voice Styles" widget for the currently active backend. This widget leverages the existing backend-switching architecture.

These features allow the user to use more granular expressiveness with their TTS messages. Common use-cases would be for role-players or users who are non-verbal/shy.

Updated SSML generation to ignore styling tags if using SYSTEM backend. Tags will still be stripped from the message.

src/TextToTalk/TextToTalk.csproj

karashiiro · 2026-01-01T05:13:37Z

src/TextToTalk/Backends/ElevenLabs/ElevenLabsClient.cs

            throw new ElevenLabsMissingCredentialsException("No ElevenLabs authorization keys have been configured.");
        }
-
+        var modelId = (text.Contains("[") && text.Contains("]")) ? "eleven_v3" : model; //if user uses SSML tags, force eleven_v3 model


This should be reflected in the configuration window somehow - it'd be poor UX to select a model and implicitly always use eleven_v3 just because SSML tags are being used. At the very least, it should visibly lock the model selector in the config window.

More importantly, this seems likely to hit non-SSML messages, too - [] aren't particularly uncommon characters to use in macros and such.

Maybe a tooltip in the styles widget would help communicate that. The swap to V3 if using a tag would only apply for that specific message. Any other messages sent without a tag would use the model selected in the preset.

As far as identifying the tags goes, agreed that [] could often come up in unintended ways. I may switch it to ~~tag~~.

Do you think it would be better to have a toggle in the Elevenlabs backend that would say "enable voice styles" which would then lock the preset to V3?

The original design was meant to keep users from being locked into the more expensive model, so they could use it only when a voice style was wanted.

> Do you think it would be better to have a toggle in the Elevenlabs backend that would say "enable voice styles" which would then lock the preset to V3?

I think this is a good idea, and we should have a tooltip, too. The tooltip can just be next to (or maybe on) that setting. We can also use ImRaii.Disabled() to disable interacting with the model selector directly.

Alternatively, we can go the other way around and have the checkbox be disabled (and non-interactable) unless the model is set to V3, with the checkbox tooltip stating that output styles are disabled unless V3 is selected.
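To make the trade-off in this thread concrete, here is a minimal sketch of per-message model selection gated by an opt-in toggle. It is an illustration, not the PR's code: `ModelPickerSketch`, `PickModel`, the `stylesEnabled` flag, and the doubled-bracket delimiter are all assumptions. Only the `eleven_v3` model ID comes from the diff above.

```csharp
using System.Text.RegularExpressions;

public static class ModelPickerSketch
{
    // Assumed delimiter: [[style]]. Single brackets such as "[Lv50]" do not match,
    // so ordinary macro text is not pushed onto the more expensive model.
    private static readonly Regex StyleTag = new Regex(@"\[\[[^\[\]]+\]\]");

    // presetModel is whatever model the active preset selected.
    // stylesEnabled stands in for a hypothetical "enable voice styles" checkbox.
    public static string PickModel(string text, string presetModel, bool stylesEnabled)
    {
        return stylesEnabled && StyleTag.IsMatch(text) ? "eleven_v3" : presetModel;
    }
}
```

With this shape, a message like `[[whisper]] hello` switches to V3 only when the toggle is on, and every other message keeps the preset's model.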

src/TextToTalk/UI/Windows/StylesWindow.cs

src/TextToTalk/CommandModules/MainCommandModule.cs

karashiiro · 2026-01-01T05:25:48Z

src/TextToTalk.Lexicons/LexiconManager.cs

+            // This regex captures the style name in group 1 and the text in group 2.
+            // It replaces the whole match with the SSML tag, effectively removing 
+            // the [styleName] text from the spoken output.
+            text = Regex.Replace(text, @"\[([^\]]+)\]([^\[]*)", m =>


Noted elsewhere that this is too general, but also, where exactly are these style tags added to the text in the first place? I think this is perhaps not the right place for this code to live, but I'm not sure where the style tags come from, so I'm not sure yet.

In the code so far there are two scenarios:

Elevenlabs uses user-generated tags that their AI tries to meaningfully interpret. This is why the Elevenlabs styles widget depends on user input. Any styles can be added/removed, are listed alphabetically, and allow the user to easily copy a desired tag to clipboard before beginning their chat message.

Azure, on the other hand, has very restrictive tags that are specific to the selected voice (e.g. Jenny Neural). Some voices have no style options at all. Luckily, the available styles for a voice can be captured in the API call that returns the voices for the backend's dropdown box. So as the user changes voices in the currently active preset, the list of available styles in the widget updates as well. Unlike ElevenLabs, the user cannot make their own tags.

The user would add the tag to their message in the message body.

For example:

[whisper] I may have let him die on purpose.

The whisper tag would be interpreted but not actually read out in the resulting TTS.

Here's a small demo of it:

https://youtu.be/mLjpBeBzDTY?si=N9XJy952BTRH_lFP
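For readers following along, here is a rough sketch of how a tag-then-text pattern like the one in the diff could be turned into Azure SSML. It reuses the regex from the hunk above and Azure's `mstts:express-as` element. The class name, the `supportedStyles` parameter, and the choice to strip unsupported tags on the client side are assumptions for illustration and may differ from what the PR does.

```csharp
using System.Collections.Generic;
using System.Security;
using System.Text.RegularExpressions;

public static class StyleTagSketch
{
    // Group 1 is the style name, group 2 is the text up to the next '['.
    private static readonly Regex TagAndText = new Regex(@"\[([^\]]+)\]([^\[]*)");

    // supportedStyles is a hypothetical set of styles reported for the selected voice.
    public static string ToExpressAs(string text, ISet<string> supportedStyles)
    {
        return TagAndText.Replace(text, m =>
        {
            var style = m.Groups[1].Value.Trim();
            // Escape XML-reserved characters before embedding the text in SSML.
            var body = SecurityElement.Escape(m.Groups[2].Value.Trim()) ?? string.Empty;

            // Unknown style: drop the tag so it is never read aloud, keep the text.
            if (!supportedStyles.Contains(style)) return body;

            var safeStyle = SecurityElement.Escape(style) ?? string.Empty;
            return "<mstts:express-as style=\"" + safeStyle + "\">" + body + "</mstts:express-as>";
        });
    }
}
```

Note that text before the first tag is not matched by the pattern and passes through unchanged, so a real implementation would need to escape it separately.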

Okay, I think I may have misunderstood this from the beginning - this is actually more complicated than I initially recognized. I thought this was something that would apply to all messages, but it only applies to people following a particular convention in their messages.

In that case, we probably need this to be configurable, unless this is just a convention that ~everyone already follows. I would think people would want to define the patterns used for extracting styles, but we would have some sort of reasonable defaults ([] might be fine). That would be something along the lines of how inclusion/exclusion rules are configured, but we might be able to simplify things for this.

One of the main things I'm thinking about is how this works for detecting styles from other people's messages - presumably people have pre-existing conventions for how they communicate styles, and I would expect there to be multiple conventions floating around, which is why some sort of configurable patterns makes sense to me. You might want to, say, plug in a handful of patterns that match the conventions of people you interact with regularly, for example.
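As a sketch of what configurable patterns might look like: each convention is a regex with a named `style` group, tried in order, and the first match supplies the style and is removed from the spoken text. Everything here is hypothetical, including the `*style*` convention, which is only an example of a second pattern a user might plug in.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class StylePatternSketch
{
    // Illustrative defaults; in practice these would come from user configuration.
    public static readonly IReadOnlyList<Regex> DefaultPatterns = new[]
    {
        new Regex(@"^\s*\[(?<style>[^\[\]]+)\]\s*"), // [whisper] text
        new Regex(@"^\s*\*(?<style>[^*]+)\*\s*"),    // *whisper* text (made-up convention)
    };

    // Returns true and fills style/remainder when any pattern matches;
    // otherwise returns false and passes the text through untouched.
    public static bool TryExtract(string text, IEnumerable<Regex> patterns,
        out string style, out string remainder)
    {
        foreach (var pattern in patterns)
        {
            var m = pattern.Match(text);
            if (m.Success && m.Groups["style"].Success)
            {
                style = m.Groups["style"].Value.Trim();
                remainder = text.Remove(m.Index, m.Length);
                return true;
            }
        }

        style = string.Empty;
        remainder = text;
        return false;
    }
}
```

Requiring a named group keeps user-supplied patterns simple to validate, since a pattern without a `style` group can be rejected when it is saved.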

- [Backend]VoiceStylesUI.cs added for backends that will have style tags enabled.
- Added a "Voice Styles" button to each backend supporting styles to make discovery easier for the user.
- Changed the tag delimiter to "[[ ]]" to reduce the risk of accidental usage.
- Fixed a bug with deleting voice presets. This bug is present in the current release (1.34.0.0). Please see the adjustments in ``BackendUI.cs``.
- Fixed odd behavior when switching to the ElevenLabs backend after the last used preset has been deleted. The preset dropdown was not rendering, which forced the user to create a new preset in order to select any other existing preset.

src/TextToTalk.Lexicons/LexiconManager.cs

src/TextToTalk/Backends/BackendUI.cs

src/TextToTalk/Backends/ElevenLabs/ElevenLabsBackendUI.cs

src/TextToTalk/Backends/OpenAI/OpenAiClient.cs

The bug caused odd behavior when deleting a voice preset from the Azure, System, ElevenLabs, Kokoro, Uberduck, and GoogleCloud backends. Added logic to select the first preset in the list if the list contains any presets and no preset is already selected.
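The fallback described here can be sketched roughly as follows. The type and method names are hypothetical and are not taken from `BackendUI.cs`. The sketch also treats a selection that points at a deleted preset as "not selected", which is an assumption about the intended behavior.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the plugin's preset type.
public sealed class PresetSketch
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public static class PresetFallbackSketch
{
    // Keeps the current selection when it still exists; otherwise falls back
    // to the first preset in the list, or null when the list is empty.
    public static int? ResolvePresetId(IReadOnlyList<PresetSketch> presets, int? currentId)
    {
        if (currentId.HasValue && presets.Any(p => p.Id == currentId.Value))
        {
            return currentId;
        }

        return presets.Count > 0 ? (int?)presets[0].Id : null;
    }
}
```

Running this resolution before the dropdown is drawn would mean the widget always has either a valid selection or an empty list to render.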

mitcheb0219 and others added 4 commits December 31, 2025 13:29

Merge branch 'karashiiro:main' into main

c5f697e

System Backend Update

45aa1b6

Updated SSML generation to ignore styling tags if using SYSTEM backend. Tags will still be stripped from the message.

Merge branch 'main' of https://github.com/mitcheb0219/TextToTalk_Voic…

ca44634

…e_Preset_OutofZone_Fix

mitcheb0219 changed the title ~~Added Model Selection for Elevenlabs _ Added Voice Styles for Elevenlabs and Azure~~ Added Model Selection for Elevenlabs _ Added Voice Styles for Elevenlabs, Azure, and OpenAI. Minor UI bugfixes Jan 2, 2026

mitcheb0219 and others added 3 commits January 1, 2026 22:14

Update SystemSoundQueue.cs

eeab659

Merge branch 'main' into main

9df42cd
