
Conversation

kiancn commented Apr 17, 2023

  1. This pull adds a function that filters out conversational words and needless special characters.
  2. This pull replaces the direct 'placement' of the image into the chat with a link, but only when saving the generated image is selected. It isn't perfect, but the links do not clutter the chat with large numbers of tokens the way an inline `<img ...>` tag does.

I had to update the old, outdated code (compared to the version in oobabooga's repo), but this enhancement works the same nonetheless.

I recommend updating the code in this repo to at least match the one in oobabooga. Include my enhancement or don't, but please update the code in the repo marked as the experimental one, or encourage updates directly in the main oobabooga repo.
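The filtering idea in point 1 could look roughly like this. This is a minimal sketch, not the PR's actual code: the function name matches the one described in the commits, but the stop-word list and regex are illustrative.

```python
import re

# Illustrative stop-word list; the actual PR presumably uses a much longer one.
CONVERSATIONAL_WORDS = {
    "i", "you", "he", "she", "it", "we", "they", "am", "is", "are",
    "was", "were", "a", "an", "the", "and", "but", "or", "so", "well",
    "oh", "hmm", "okay", "ok", "yes", "no", "please", "thanks",
}

def filter_out_conversational_words(text: str) -> str:
    """Keep only keyword-like tokens, joined by commas (danbooru-style)."""
    # Replace special characters with spaces, split on whitespace,
    # then drop the conversational stop words.
    tokens = re.sub(r"[^a-zA-Z0-9\s-]", " ", text).split()
    kept = [t for t in tokens if t.lower() not in CONVERSATIONAL_WORDS]
    return ", ".join(kept)
```

With this sketch, a chatty reply like "Oh well, it is a red fox in the snow" collapses into a comma-separated keyword soup that is passed to the SD API while the original sentence stays visible in the chat.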

kiancn and others added 5 commits April 15, 2023 17:05
1) Added the function filter_out_conversational_words, which filters out common English conversational words that tend to confuse SD models.

2) Added a call to filter_out_conversational_words in the get_SD_pictures function, so that the response from the chatbot is filtered when passed to AUTOMATIC1111's SD API, but still presented unfiltered to the user of oobabooga.
I am submitting this to the last published version of the script on a repo that seems outdated compared to oobabooga's. However, in oobabooga's repo this is mentioned as the experimental branch, so, to not mess things up, the changes here (originally made to the updated script in the oobabooga repo) are incorporated in the old script present in the 'Brawlence/SD_api_pics' repo.
Removed description of purpose of fork.
…en taken up by resulting images.

This push fixes the problem of SD-generation results taking up too many tokens in the chat history.
The solution is very simple: provide a link to the image (a link to the file cached in the browser, not to one saved to the hard drive).
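The link idea can be sketched as follows. Function names and the label are mine, not the PR's; the point is only the size difference between the two representations that end up in the chat history.

```python
def image_as_link(file_url: str, label: str = "generated image") -> str:
    # A plain anchor costs a handful of tokens in the history.
    return f'<a href="{file_url}">{label}</a>'

def image_as_inline(base64_png: str) -> str:
    # The old behaviour: the entire base64 payload lands in the chat log,
    # where it can be tokenized along with the rest of the conversation.
    return f'<img src="data:image/png;base64,{base64_png}">'
```

For a typical 512x512 PNG the base64 payload runs to hundreds of kilobytes, while the anchor stays a few dozen characters regardless of image size.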

kiancn (Author) commented Apr 17, 2023

Sorry about the typos in the descriptions :/ Just ask if something is incomprehensible :)

Brawlence (Owner) commented

The idea has merit.

First, I wanted to argue that repeated substring searches are not performant, but... I'm kinda surprised by the benchmark results.
Can't say it manages to achieve good outputs, though: TESTRUN.

The output string is a mess of words broken by commas, sometimes the things left are outright non-descriptive.

Second, I don't believe the header statement; after all, (natural description, picture) pairs are what CLIP was trained on. Stable Diffusion checkpoints include CLIP, so they should perform well. Well, maybe if one uses NovelAI-based checkpoints -- then yes, those are less tolerant of natural language descriptions and respond better to danbooru-style tags instead.

I can't decide yet if the PR is worth it. Check this out in the meantime: there was another PR in the main repo which tried to improve prompt tags in another way.


As for the tokens, this fixes a problem that shouldn't even exist in the first place (if it even exists, since I know for a fact that ooba keeps two histories: one visible in the UI and a hidden one for the model itself) — non-text inputs should not be forwarded to the model at all. This is relevant not only to SD_api_pics, but also to TTS and similar extensions that use embeddings.
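The two-history approach described above could be sketched like this. These names are hypothetical, not ooba's actual internals: the point is that anything non-text is stripped before a turn is appended to the history the model actually sees.

```python
import re

def strip_non_text(message: str) -> str:
    """Remove img tags and anchors so they never reach the model's context."""
    message = re.sub(r"<img[^>]*>", "", message)
    message = re.sub(r"<a [^>]*>.*?</a>", "", message)
    return message.strip()

def append_turn(visible_history: list, internal_history: list, reply: str) -> None:
    visible_history.append(reply)                    # shown to the user as-is
    internal_history.append(strip_non_text(reply))   # fed back to the model
```

Under this scheme the link-vs-inline-image question becomes purely cosmetic, because neither representation would ever cost the model any tokens.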

@Brawlence Brawlence added enhancement New feature or request elaborate Further information is requested labels Apr 17, 2023
kiancn (Author) commented Apr 17, 2023

Regarding the jumbled collection of words (mess) resulting from filter_out_conversational_words:

Yes, the output string is jumbled, but from a decent range of tests before implementing the filter and more than 100 tests after, I can safely postulate that a lot of SD models like the jumbled strings better (they give results closer to the intent of the unjumbled text; and the unjumbled text is still displayed to the user).
Most SD models I know of have training content tagged with danbooru tags - I believe this is why it works better with only 'keywords'. It works for me, for the models I use, and for a lot of others - check, for example, the tags on images posted on civitai.com; every single one I have seen is basically word soup like the one you are getting from the filter_out_conversational_words function.
I assure you that a lot of people will find these prompts more desirable than full sentences; perhaps we could make it an option in the settings (danbooru-style/deepbooru-style).

Regarding the speed of the function:

Yes, the function is relatively slow: 0.74 secs on my PC (running your test code). That is pretty bad, but for this type of application I believe 0.74 seconds for 1000 requests is acceptable. However, by removing the second deletion pass over the substrings_to_remove elements in the string variable, the test time goes down to 0.43 secs, and the quality of the resulting string is not reduced significantly.
Also, you are doing 1000 reps in the test, which means that doing something still insane but possible, like evaluating 5 strings one after another, would take only 0.0037 secs ((0.74/1000)*5).
I don't believe the speed of the code is an issue.
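The timing argument above can be reproduced along these lines (a sketch; the function under test and sample string are placeholders, and absolute times will differ per machine):

```python
import timeit

def benchmark(filter_fn, sample: str, reps: int = 1000) -> float:
    """Total seconds for `reps` calls, mirroring the 1000-rep test discussed."""
    return timeit.timeit(lambda: filter_fn(sample), number=reps)

def batch_cost(total_seconds: float, reps: int, batch: int) -> float:
    """Per-batch cost derived from a benchmark run, e.g. (0.74/1000)*5."""
    return (total_seconds / reps) * batch
```

Plugging in the numbers from the comment: batch_cost(0.74, 1000, 5) gives the quoted 0.0037 seconds for five consecutive evaluations.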

The token thing:

Yeah, it is really weird. I tested a bit before deciding on the link solution. I regret posting it now, because it is not satisfactory; we really want the images to appear on the same page. But it does 'solve' the problem in a way that is understandable to non-tech users. (And since most users have GPUs with less than 12/16 GB of VRAM, removing these tokens really is a significant optimization.)
[Edit: I'm really curious as to why an img with a path source takes up tokens... Does it try to read the image? Base64 I get, but an img with a plain source clogging the history is... funny? Investigating...]

kiancn and others added 2 commits April 22, 2023 16:35
…meters Section

Also converted previous enhancement suggestions to the newly updated version of the script.
@kiancn kiancn changed the title Current chatbots are bad at creating SD prompts. This pull remedies that to a large extent by filtering common conversational words and characters. 1) Filtering common conversational words and characters, 2) 'solving' the expensive token problem, 3) get models from SD-server and let user choose Apr 22, 2023
@kiancn kiancn changed the title 1) Filtering common conversational words and characters, 2) 'solving' the expensive token problem, 3) get models from SD-server and let user choose 1) Filter common conversational words and characters, 2) make output imgs cheap by linking, 3) get models from SD-server and let user choose Apr 22, 2023
kiancn added 6 commits April 22, 2023 17:14
Before, it displayed the full list (which, admittedly, had only one member).
It (initialization of the name displayed in the dropdown) still does not work properly - initially no string is shown. The reason is not yet understood by me.
Commented with wrong method name corrected.
Remove unused function
* Refactored: removed the sd_model_current variable; for the same purposes now using params['SD_model'] instead.
* Added a row in the ui for the models dropdown.
kiancn (Author) commented Apr 23, 2023

Model selection now completely functional in the UI

It seemed like something that was missing.

  1. It works by getting sdapi/v1/sd_models and filtering the results into a list (called sd_models);
  2. then the currently selected model is filtered from the response from sdapi/v1/options and saved to params['SD_model'];
  3. when selecting a model from the dropdown in the UI, a POST request is sent to sdapi/v1/options with the name of the selected model, and params['SD_model'] is updated.
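The three steps above could be sketched against AUTOMATIC1111's API like this. The helper names are mine, the default port is A1111's standard 7860 rather than the PR's configured one, and note that current A1111 builds spell the models endpoint with a hyphen (sd-models); "sd_model_checkpoint" is the options key A1111 uses for the active checkpoint.

```python
import requests

BASE = "http://127.0.0.1:7860"  # assumed default A1111 address

def titles_from_models(payload: list) -> list:
    # /sdapi/v1/sd-models returns a list of objects with a "title" field.
    return [m["title"] for m in payload]

def list_sd_models(base_url: str = BASE) -> list:
    # Step 1: fetch available checkpoints and reduce them to a title list.
    resp = requests.get(f"{base_url}/sdapi/v1/sd-models")
    resp.raise_for_status()
    return titles_from_models(resp.json())

def current_model(base_url: str = BASE) -> str:
    # Step 2: read the currently loaded checkpoint from the options.
    resp = requests.get(f"{base_url}/sdapi/v1/options")
    resp.raise_for_status()
    return resp.json()["sd_model_checkpoint"]

def select_model(title: str, base_url: str = BASE) -> None:
    # Step 3: POST the chosen title back; A1111 swaps the checkpoint.
    resp = requests.post(f"{base_url}/sdapi/v1/options",
                         json={"sd_model_checkpoint": title})
    resp.raise_for_status()
```

The parsing is kept separate from the network calls so the dropdown population logic can be tested without a running SD server.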

Brawlence (Owner) commented

The model selection is currently breaking VRAM management options. Gotta think what's up with that

kiancn (Author) commented Apr 23, 2023

That is funky...
What happens - error message, nothing?

However, for me, VRAM management never worked. I got responses like this one with every request the extension sent from the give_VRAM_priority function:

```
[...]
File "C:\aitools\oobabooga-windows\text-generation-webui\extensions\sd_api_pictures\script.py", line 66, in give_VRAM_priority
    response.raise_for_status()
File "C:\aitools\oobabooga-windows\installer_files\env\lib\site-packages\requests\models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http://127.0.0.1:1048/sdapi/v1/unload-checkpoint
```

I had lazily assumed that my slightly outdated version of AUTOMATIC1111's UI does not have the endpoints reload-checkpoint and unload-checkpoint, and I just verified that this is the case.

I'll fix my install so I can at least experience the same error as the up-to-date user :)
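One defensive option for the 404 above (my suggestion, not part of the PR): treat a 404 from the checkpoint endpoints as "this A1111 build predates them" and skip VRAM management instead of crashing. The `post` parameter is injectable purely so the error path can be exercised without a server.

```python
import requests

def try_vacate_vram(base_url: str, post=requests.post) -> bool:
    """Ask A1111 to unload its checkpoint; return False if unsupported."""
    try:
        resp = post(f"{base_url}/sdapi/v1/unload-checkpoint")
        resp.raise_for_status()
        return True
    except requests.HTTPError as err:
        if err.response is not None and err.response.status_code == 404:
            # Older A1111 builds lack reload-/unload-checkpoint entirely.
            return False
        raise
```

give_VRAM_priority could then fall back to its old behaviour when this returns False, rather than surfacing a traceback to the user.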

kiancn (Author) commented Apr 24, 2023

OK. I updated the relevant parts of the code (in the AUTOMATIC1111 codebase) to be up to date, and now VRAM management works for me. Which puzzles me.
I do get a few messages about deprecations in 'torch.py', '_utils.py', and 'storage.py'. But it works - the following message is shown exactly once:
```
Prompting the image generator via the API on http://127.0.0.1:1048...
Requesting Auto1111 to vacate VRAM...
Loading mayaeary_pygmalion-6b-4bit-128g...
Found the following quantized model: models\mayaeary_pygmalion-6b-4bit-128g\pygmalion-6b-4bit-128g.safetensors
Loading model ...
C:\aitools\oobabooga-windows\installer_files\env\lib\site-packages\safetensors\torch.py:99: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  with safe_open(filename, framework="pt", device=device) as f:
C:\aitools\oobabooga-windows\installer_files\env\lib\site-packages\torch\_utils.py:776: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
C:\aitools\oobabooga-windows\installer_files\env\lib\site-packages\torch\storage.py:899: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  storage = cls(wrap_storage=untyped_storage)
Done. Loaded the model in 2.13 seconds.
```

I played around a bit and didn't get the extension to fail.

@Brawlence Could you describe the error you experience?
