diff --git a/Appendix/Moly Kit Integration/1 - Embedding an LLM Chat/README.md b/Appendix/Moly Kit Integration/1 - Embedding an LLM Chat/README.md deleted file mode 100644 index 0d93028..0000000 --- a/Appendix/Moly Kit Integration/1 - Embedding an LLM Chat/README.md +++ /dev/null @@ -1,356 +0,0 @@ -# 1 - Embedding an LLM Chat - -## Introduction - -In this lesson we will look at how to add and configure an LLM chat into our image -viewer slideshow. - -At the end of this lesson, the chat will be fully functional but lack integration -with the current image being displayed, as we will explore that in the next parts -of the tutorial. - -## Screenshots - -![screenshot](./screenshot_001_001.png) - -## Steps - -### Overview - -To accomplish this, we will simply follow [Moly Kit's recipe](https://moxin-org.github.io/moly/basics.html). - -This can be broken into 3 parts: - -1. Add Moly Kit dependency and use a compatible Makepad version. -2. Register Moly Kit widgets the Makepad way. -3. Create a `BotContext` and give it to the `Chat` widget (we will -explain this later). - -We will also do a 4th step to make the UI look and feel better. - -### 1. Adding Moly Kit to our project - -Basically, your dependencies in `Cargo.toml` should look like the following: - -```toml -[dependencies] -makepad-widgets = { git = "https://github.com/wyeworks/makepad", branch = "moly" } -moly-kit = { git = "https://github.com/moxin-org/moly.git", features = ["full"], branch = "main" } -``` - -We include Moly Kit from GitHub as it's not on crates.io yet. The `full` feature -flag gives the easiest setup possible. - -We will need to use a patched version of Makepad as well. This is because -currently Moly Kit's focus is to be used inside Moly, an LLM app that moves fast -and usually requires specific Makepad patches to keep working. 
- -These requirements will disappear once Moly Kit is moved to its own repo with a -more stable branch, but for now we will work with these dependencies for this -tutorial. - -### 2. Registering Moly Kit widgets - -Similar to how we registered the core Makepad widgets in our app's `live_register` -function, we should also register the widgets from the `moly_kit` crate. - -You should modify `live_register` to look like the following: - -```rust -impl LiveRegister for App { - fn live_register(cx: &mut Cx) { - makepad_widgets::live_design(cx); - moly_kit::live_design(cx); // Added this line. - } -} -``` - -Then, in the DSL, we should include the `Chat` widget by adding: - -```rust -use moly_kit::widgets::chat::Chat; -``` - -And for this tutorial, we will add a chat to the right of our slideshow, so -let's modify our previous DSL like this: - - -```rust -Slideshow = <View> { - // We wrapped what we had with this view to get a horizontal layout. - <View> { - flow: Overlay, - - image = <Image> { - width: Fill, - height: Fill, - fit: Biggest, - source: (PLACEHOLDER) - } - - overlay = <SlideshowOverlay> {} - } - - // We added the `Chat` widget. - chat = <Chat> { - padding: 10, - width: 300, - // We hide the chat by default. We will make it programmatically visible - // once models are ready inside the `BotContext` (next step). - visible: false, - draw_bg: { - border_radius: 0.0, - color: #fff - } - } -} -``` - -### 3. Configuring the `Chat` widget - -Moly Kit's `Chat` is designed so that it works out of the box like any other -LLM chat application. You won't need to handle button presses yourself, but -you will need to configure it to use a proper API and LLM model. - -For our purposes of interacting with a conversational model under an OpenAI -compatible endpoint, the following steps should be done: -1. Create an `OpenAIClient`, pointing to the proper API URL. -2. Configure the client with the proper API key. -3. Generate a `BotContext` from our single client. -4. Set this `BotContext` as the one to be used in `Chat`. -5. 
Spawn an async task to bootstrap (load) the `BotContext`. - -As you may expect, `OpenAIClient` is what defines how to hit OpenAI's -conversational endpoint. But what is a `BotContext`, and why do we need to -"load" it asynchronously? - -In Moly Kit terms, a "Bot" is anything automated we can talk to, like a pure LLM -model or an agentic workflow. A `BotContext` is basically a synchronous -container, with models preloaded, that is passed down the widget tree of `Chat` -so all widgets have synchronous access to the list of available models. It's -a key part of integrating the asynchronous, streaming Moly Kit clients with -the synchronous Makepad widgets. - -Currently, `Chat` doesn't handle bot loading automatically, so we need to -trigger and wait for the load ourselves before displaying the widget. There are plans -to eliminate this requirement for simple cases like ours, but for now it's necessary. - -Okay, so let's start writing the code. - -You will want to prepare the following aliases: - -```rust -use moly_kit::{ - ChatWidgetRefExt, OpenAIClient, protocol::*, utils::asynchronous::spawn, -}; -``` - -- `moly_kit::protocol::*` is an important one, as it contains the base types -and traits to work with Moly Kit. - `moly_kit::OpenAIClient` is the client implementation we will use to talk to -an OpenAI-compatible endpoint. - `moly_kit::utils::asynchronous::spawn` is what we will use to spawn our future -when loading the `BotContext`. - `ChatWidgetRefExt` is a Makepad autogenerated extension trait that lets us fetch -the `Chat` widget we defined in the DSL. - -Now, we will also use Makepad's `UiRunner` as a concise and pragmatic alternative -to raw `Cx::post_action` usage. A `UiRunner` allows us to send boxed closures -from any thread or async task back to our app/widget scope for execution. The -only requirement to enable its usage is to add the following line to -your `handle_event`. 
- -```rust -fn handle_event(&mut self, cx: &mut Cx, event: &Event) { - self.ui_runner().handle(cx, event, &mut Scope::empty(), self); - // ... everything we had before ... -} -``` - -> [!note] -> -> We are passing `&mut Scope::empty()` because we are in the root of the app. -> If we were in a normal widget, where we receive a `scope` parameter, then you -> should pass that so received closures gain access to it. - -As for the `BotContext` and other fancy details setup, let's group everything -into a separate `configure_slideshow_chat_context` function. I will document -what each part does inline with the code: - -```rust -fn configure_slideshow_chat_context(&mut self, cx: &mut Cx) { - // Get the api url and api key from environment variables. - // Feel free to hard code them if you want. - let url = std::env::var("API_URL").unwrap_or_default(); - let key = std::env::var("API_KEY").unwrap_or_default(); - - // Create and configure a client to talk with the proper url and key. - let mut client = OpenAIClient::new(url); - client.set_key(&key).unwrap(); - - // Create a `BotContext` configured to use our client. - let mut bot_context = BotContext::from(client); - - // Give the `BotContext` to the `Chat` widget. - let mut chat = self.ui.chat(id!(slideshow.chat)); - chat.write().set_bot_context(cx, Some(bot_context.clone())); - - // Obtain a `UiRunner` instance (which is `Copy`), so we can execute things - // in the scope of our app from any thread or async task easily. - let ui = self.ui_runner(); - - // Use Moly Kit cross platform, Tokio enabled, web compatible, `spawn` - // function to run our `BotContext` bootstrapping code. - spawn(async move { - // Load the `BotContext` asynchronously and turn its result into a - // vector of errors so we can report them if any. - let errors = bot_context.load().await.into_errors(); - - // Use our `UiRunner` so we can "go back" to our app's scope. - // We use `me` as `self`, as it's a reserved name. 
- ui.defer(move |me, cx, _scope| { - // We create a new reference to the `Chat` widget in this scope. - let mut chat = me.ui.chat(id!(slideshow.chat)); - - // We also get a reference to the internal `Messages` widget which - // is a child of `Chat` responsible for holding the messages vector - // and displaying it. - let mut messages = chat.read().messages_ref(); - - // We insert "error messages" into our chat to report problems that - // happened during load. - // - // Note: You could also just "print" them, but having errors in our - // UI will look better. - for error in errors { - messages.write().messages.push(Message::app_error(error)); - } - - // We search for a `Bot`, whose raw provider-side id is the one we - // specified in our environment variable. - // - // Note: A `BotId` is composed of more information, but for this - // tutorial, we only care about the id as it's defined in our - // OpenAI compatible provider. That's what `.id()` returns. - let model_id = std::env::var("MODEL_ID").unwrap_or_default(); - let bot = bot_context - .bots() - .into_iter() - .find(|b| b.id.id() == model_id); - - // If we found the desired `Bot`, let's set it in the `Chat` as the - // one to use. - // - // Insert another error message back to the chat otherwise. - if let Some(bot) = bot { - chat.write().set_bot_id(cx, Some(bot.id)); - } else { - messages.write().messages.push(Message::app_error( - format!("Model ID '{}' not found", model_id), - )); - } - - // We finished our setup, so let's make the chat visible. - chat.write().visible = true; - me.ui.redraw(cx); - }); - }); -} -``` - -> [!info] -> -> Moly Kit avoids duplicating methods across `Chat` and Makepad's autogenerated -> `ChatRef` by giving you the `.write()` and `.read()` methods which are roughly -> shorthands to doing `.borrow_mut().unwrap()`. -> -> `ChatRef` doesn't replicate any other internal methods. Same goes for -> `MessagesRef` and other public widgets of Moly Kit. 
-> -> The benefits of this approach are: -> - Moly Kit developers don't have to replicate methods, doing `.clone()`s or -> `RefMut::map` hacks internally. -> - You will never experience an accidental method being defined on a ref, but -> not in its inner widget. -> - You will never hit the inconsistency of having methods that are exposed in -> the ref version, while others require you to do `.borrow_mut().unwrap()`. - -We can now call our function in `after_new_from_doc` where we also initialized -the images list in previous tutorial lessons. - -```rust -fn after_new_from_doc(&mut self, cx: &mut Cx) { - // ... previous initialization code ... - self.configure_slideshow_chat(cx); -} -``` - -We made an intermediate function `configure_slideshow_chat` that we will use -later but for now it should simply contain our -`configure_slideshow_chat_context` call. - - -```rust -fn configure_slideshow_chat(&mut self, cx: &mut Cx) { - self.configure_slideshow_chat_context(cx); -} -``` - -### 4. Nice UI extras. - -It would be nice if we can clear the message list every time the image in the -slideshow changes and when the slideshow itself is opened. - -Let's define the following utility function: - -```rust -fn clear_slideshow_chat_messages(&self) { - self.ui - .chat(id!(slideshow.chat)) - .read() - .messages_ref() - .write() - .messages - .retain(|m| m.from == EntityId::App); -} -``` - -> [!info] -> -> We are using `retain()` instead of just `clear()` to preserve custom app -> messages, as they may be the error messages we inserted during the `BotContext` -> `load()` handling. - -Then, search where the slideshow opening event is handled and add the -corresponding call: - -```rust -if self.ui.button(id!(button)).clicked(&actions) { - self.clear_slideshow_chat_messages(); // Add this line. - page_flip.set_active_page(cx, live_id!(slideshow)); -} -``` - -And also call this when the image changes: - -```rust -fn set_current_image(&mut self, cx: &mut Cx, image_idx: usize) { - // ... 
other code ... - - self.clear_slideshow_chat_messages(); // Add this line. - self.ui.redraw(cx); -} -``` - -## What we did - -Now, if you run your app and go to the slideshow, you will have the chat we -embedded, and it will be working (independently of our app). - -## What's Next - -In the [next lesson](../2%20-%20Current%20Image%20as%20Conversation%20Context/README.md), -we will hook into our `Chat`'s pipeline to inject the current image being viewed -in the slideshow as additional context. This way we will be able to ask things -about our image. \ No newline at end of file diff --git a/Appendix/Moly Kit Integration/1 - Embedding an LLM Chat/screenshot_001_001.png b/Appendix/Moly Kit Integration/1 - Embedding an LLM Chat/screenshot_001_001.png deleted file mode 100644 index 805475f..0000000 Binary files a/Appendix/Moly Kit Integration/1 - Embedding an LLM Chat/screenshot_001_001.png and /dev/null differ diff --git a/Appendix/Moly Kit Integration/2 - Current Image as Conversation Context/README.md b/Appendix/Moly Kit Integration/2 - Current Image as Conversation Context/README.md deleted file mode 100644 index 60aaf7a..0000000 --- a/Appendix/Moly Kit Integration/2 - Current Image as Conversation Context/README.md +++ /dev/null @@ -1,371 +0,0 @@ -# 2 - Current Image as Conversation Context - -## Introduction - -In the previous lesson we embedded Moly Kit's `Chat` widget into our slideshow -screen, which once configured, worked automatically to talk to LLM models. - -However, to make this a real integration, we would like to let the LLM model -"see" the current image in the slideshow, so we can ask questions about it. - -Don't be fooled, even if Moly Kit `Chat` has a default behavior, it doesn't mean -we can't change it when we really need to. To understand how, I recommend reading the official [Integrate and customize behavior](https://moxin-org.github.io/moly/integrate.html) -Moly Kit guide. But to keep knowledge here, let me try to summarize it next. 
- -## Screenshots - -![screenshot](./screenshot_002_001.png) - -## Theory - -### Hooks and `ChatTask` - -Every important behavior in `Chat` (like updating messages, sending them, -copying to clipboard, etc.) is identified by an enum called `ChatTask`. -When `Chat` is about to do something, it emits a `ChatTask` that, when received -back, performs the action for real. - -However, `Chat` allows us to "hook" into that "send and receive" flow, -giving us the chance to modify those tasks before they are performed. To do so, -we configure a subscriber closure we call the "hook", using the -`set_hook_before` function. The closure will receive a **vector** of tasks that -were originally dispatched together. As we said before, modifying any of the -tasks in the vector will impact the final result once performed. Additionally, -you can inject new tasks into the vector, or simply `clear()` it to cancel all -default behaviors and handle everything on your own. - -If you are cancelling (clearing the vector), then you may be interested in the -`.perform()` and `.dispatch()` methods of `Chat`, which allow you to -programmatically trigger those behaviors yourself. The only difference between -the two is that `perform` bypasses the hook, while `dispatch` causes it to be -triggered. - -For our purposes in this tutorial, one option would be to use a hook that waits for -`ChatTask::Send` and inserts a message with the image into the chat before it's -sent. But that would not look clean. It would be better if the image could be -"injected silently" without touching the chat history. To do so, let me also -introduce you to making a custom `BotClient`. - -### `BotClient` (custom) - -`BotClient` is a trait implemented by all clients that talk to an LLM in Moly Kit. We -saw `OpenAIClient` before, which is a built-in implementation of it. - -But we can also make our own, tailored to our needs. 
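Before touching Moly Kit's actual types, it helps to see the wrapping pattern we are about to apply in plain Rust: hold an inner client, delegate most calls, and intercept the one we care about to inject extra context. The trait and names below are illustrative stand-ins, not the real `BotClient` API:

```rust
// A stand-in trait for illustration; the real Moly Kit trait is `BotClient`
// and its `send` works with richer message and streaming types.
trait Client {
    fn send(&mut self, messages: &[String]) -> Vec<String>;
}

// Pretend this is the built-in client that actually hits the network.
struct InnerClient;

impl Client for InnerClient {
    fn send(&mut self, messages: &[String]) -> Vec<String> {
        // For the sketch, just echo back what would be sent.
        messages.to_vec()
    }
}

// The wrapper owns the inner client and silently prepends context on `send`,
// without the caller (or the chat history) ever seeing the injected message.
struct Wrapper {
    context: Option<String>,
    inner: InnerClient,
}

impl Client for Wrapper {
    fn send(&mut self, messages: &[String]) -> Vec<String> {
        let mut messages = messages.to_vec();
        if let Some(ctx) = &self.context {
            // Inject the context at the front of the outgoing messages.
            messages.insert(0, ctx.clone());
        }
        // Delegate everything else to the inner client.
        self.inner.send(&messages)
    }
}

fn main() {
    let mut client = Wrapper {
        context: Some("describe the current image".to_string()),
        inner: InnerClient,
    };
    // The user only typed one message; the wrapper sends two.
    let sent = client.send(&["what do you see?".to_string()]);
    println!("{sent:?}");
}
```

The real `SlideshowClient` below follows exactly this shape, with `Arc<Mutex<…>>` added so the app and the `BotContext` can share one instance.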
The -[Implement your own client](https://moxin-org.github.io/moly/custom-client.html) -guide in Moly Kit covers this trait and how to implement it, but it's considered -"advanced", and we don't care about many of these details. - -What we will want is to make our own `SlideshowClient` that simply wraps the -existing `OpenAIClient`, delegating most of its implementation, but customizing -the `send()` implementation to inject a message with the image attached as -context. - -## Steps - -### Overview - -Okay, now we have the required theory, let's put this into practice. To allow -the `Chat` to "see" our current image we will do something like the following: - -1. Implement a wrapper `SlideshowClient` that simply wraps `OpenAIClient` -to add the image as a message while sending it to the LLM. -2. Set the hook that will: - 1. Wait for the `ChatTask::Send` event. - 2. Filter that task (leaving others untouched) so message updates are still - performed automatically, but we can manually trigger our own "send - mechanism". - 3. Define our custom send mechanism, where we will read the current image in - the slideshow (handling the filename and mimetype as well). - 4. Give the read message to the client. - 5. Trigger a `ChatTask::Send` manually. -3. As an extra, we will adjust a little bit the DSL of the `Chat`. - -### 1. Implementing the wrapper client - -As we mentioned, the purpose of this wrapper client will be to take the current -image, and insert it in the message history that is sent to remote LLMs. - -The implementation is simple, so let's show it and explain it inline: - -```rust -use moly_kit::{OpenAIClient, protocol::*}; -use std::sync::{Arc, Mutex}; - -// Here, we will hold the attachment to be sent to the LLM and the `OpenAIClient` -// to which we will delegate most of the behavior. -struct SlideshowClientInner { - // An `Attachment` is how Moly Kit represents all kinds of files that are - // exchanged with LLMs. This will be our image when set. 
- attachment: Option<Attachment>, - openai_client: OpenAIClient, -} - -// This is the public client, which is reference counted so we can give a copy -// of it to `BotContext`, while also preserving it in our `App` for setting the -// attachment. -pub struct SlideshowClient(Arc<Mutex<SlideshowClientInner>>); - -// Let's simply define a method to wrap the `OpenAIClient` we already had -// from the previous tutorial chapter. -impl From<OpenAIClient> for SlideshowClient { - fn from(openai_client: OpenAIClient) -> Self { - SlideshowClient(Arc::new(Mutex::new(SlideshowClientInner { - attachment: None, - openai_client, - }))) - } -} - -impl Clone for SlideshowClient { - fn clone(&self) -> Self { - SlideshowClient(Arc::clone(&self.0)) - } -} - -impl BotClient for SlideshowClient { - // Simply delegate this impl to the `OpenAIClient`. - fn bots(&self) -> BoxPlatformSendFuture<'static, ClientResult<Vec<Bot>>> { - self.0.lock().unwrap().openai_client.bots() - } - - // The only method we are truly working with. - // This method takes a list of messages and sends it to the given bot. - fn send( - &mut self, - bot_id: &BotId, - messages: &[Message], - tools: &[Tool], - ) -> BoxPlatformSendStream<'static, ClientResult<MessageContent>> { - // Let's turn the immutable slice into a vec we can modify. - let mut messages = messages.to_vec(); - - // Let's insert the image as a message at the beginning (if any). - if let Some(attachment) = &self.0.lock().unwrap().attachment { - messages.insert( - 0, - Message { - content: MessageContent { - attachments: vec![attachment.clone()], - ..Default::default() - }, - from: EntityId::User, - ..Default::default() - }, - ); - } - - // Now, simply call the delegated method in `OpenAIClient` with the - // modified messages list. - self.0 - .lock() - .unwrap() - .openai_client - .send(bot_id, &messages, tools) - } - - // Required for cloning across dynamic dispatch. - fn clone_box(&self) -> Box<dyn BotClient> { - Box::new(self.clone()) - } -} - -impl SlideshowClient { - // We will use this method to set an `Attachment` from our `App`. 
- pub fn set_attachment(&self, attachment: Option<Attachment>) { - self.0.lock().unwrap().attachment = attachment; - } -} -``` - -> [!info] -> -> Please note we are only really working with the `send()` method, and even so, we -> don't need to understand every type involved thanks to our reliance on the -> already implemented `OpenAIClient`. - -We will update our `App` widget to hold the copy of the client we mentioned: - -```rust -#[derive(Live)] -struct App { - // ...other fields... - #[rust] - slideshow_client: Option<SlideshowClient>, -} -``` - -And update our `configure_slideshow_chat_context` to simply wrap our already -existing `OpenAIClient` from the previous tutorial chapter: - -```rust -fn configure_slideshow_chat_context(&mut self, cx: &mut Cx) { - // ...existing code... - - // The client we already had from before, unmodified. - let mut client = OpenAIClient::new(url); - client.set_key(&key).unwrap(); - - // The only code we are inserting: it wraps the client from before and - // saves a copy of itself to `App`. - let client = SlideshowClient::from(client); - self.slideshow_client = Some(client.clone()); - - // The context from before, unmodified. - let mut bot_context = BotContext::from(client); - - // ...existing code... -} -``` - -> [!info] -> -> Please note we simply inserted two lines in the middle of what we already had. - -### 2. The hook - -We will update our `configure_slideshow_chat` method from the previous tutorial -chapter to add a `configure_slideshow_chat_before_hook` method call. 
- -```rust -fn configure_slideshow_chat(&mut self, cx: &mut Cx) { - self.configure_slideshow_chat_context(cx); - self.configure_slideshow_chat_before_hook(cx); -} -``` - -And we will implement it as the following: - -```rust -fn configure_slideshow_chat_before_hook(&mut self, _cx: &mut Cx) { - let ui = self.ui_runner(); - let mut chat = self.ui.chat(id!(slideshow.chat)); - - // Here, our hook is receiving the (grouped) list of tasks that our `Chat` - // emits for us when doing something important. - chat.write().set_hook_before(move |task_group, _chat, _cx| { - let before_len = task_group.len(); - - // We delete any `ChatTask::Send` from the group so, whatever - // happens, it will not cause an automatic send, but other behaviors - // are still performed automatically. - task_group.retain(|task| *task != ChatTask::Send); - - // If there was a `ChatTask::Send`, let's handle the send ourselves by - // calling a `perform_chat_send` method (we will define it next). - if task_group.len() != before_len { - // `defer` in Makepad's `UiRunner` will be executed later at - // `handle_event`, so other tasks not erased from the vector will be - // already applied by then. - ui.defer(move |me, cx, _scope| { - me.perform_chat_send(cx); - }); - } - }); -} -``` - -Then, we will need to implement `perform_chat_send`. This is where the -integration is truly completed. It will: - -- Get the current image. -- Try to infer its mime type (required to build the `Attachment`). -- Extract the filename. -- Read the file bytes to memory. -- Construct the `Attachment` and set it in our `SlideshowClient` we stored in -`App`. -- Trigger `ChatTask::Send` to allow the normal send flow to happen. - -The code: - -```rust -fn perform_chat_send(&mut self, cx: &mut Cx) { - let Some(client) = self.slideshow_client.as_mut() else { - return; - }; - - // Get the current image. 
- let path = - self.state.image_paths[self.state.current_image_idx].as_path(); - - // Try to infer the mime type by just looking at the extension. This is a - // naive implementation. To do this in a serious app, you may want to use a - // crate like `mime_guess`, or one that sniffs the real type from the - // binary content. But this is enough for our tutorial use cases. - let extension = path.extension().and_then(|e| e.to_str()); - let mime = extension.map(|e| match e { - "jpg" | "jpeg" => "image/jpeg".to_string(), - "png" => "image/png".to_string(), - e => format!("image/{e}"), - }); - - // Extract the filename. - let filename = path - .file_name() - .and_then(|f| f.to_str()) - .unwrap_or_default(); - - let mut chat = self.ui.chat(id!(slideshow.chat)); - - // Try reading the file's content synchronously from the filesystem. - match std::fs::read(path) { - Ok(bytes) => { - // Build the attachment from the information we collected. - let attachment = - Attachment::from_bytes(filename.to_string(), mime, &bytes); - - // Set the attachment in our client. - client.set_attachment(Some(attachment)); - - // Trigger the natural send mechanism of `Chat` that we aborted - // earlier. - chat.write().perform(cx, &[ChatTask::Send]); - } - Err(e) => { - // Just some nice error reporting, but could be a simple "print" if - // you want. - chat.read() - .messages_ref() - .write() - .messages - .push(Message::app_error(e)); - } - } -} -``` - -### 3. UI details - -This is optional but, the "attach file" button on the left side of the prompt -input of the chat is not something important for our app. We can hide it by -overriding the Makepad DSL to hide the left side of the prompt input like this: - -```rust -chat = { - // ...other overrides... - - prompt = { - persistent = { - center = { - left = { - visible: false - } - - // ...other overrides... - } - } - } -} -``` - -## What we did - -Now, we have a chat in slideshow that can "see" our current image! 
Try asking -it some questions like "what do you see?" to test it. - -## What's Next - -What we did until now is already complete. The [next lesson](../3%20-%20Generating%20Images%20to%20the%20Grid/README.md) -will use the knowledge we gained to create, configure and integrate a new -separate "chat", that will be put in the image grid screen to generate new -images at runtime! \ No newline at end of file diff --git a/Appendix/Moly Kit Integration/2 - Current Image as Conversation Context/screenshot_002_001.png b/Appendix/Moly Kit Integration/2 - Current Image as Conversation Context/screenshot_002_001.png deleted file mode 100644 index 546ce60..0000000 Binary files a/Appendix/Moly Kit Integration/2 - Current Image as Conversation Context/screenshot_002_001.png and /dev/null differ diff --git a/Appendix/Moly Kit Integration/3 - Generating Images to the Grid/README.md b/Appendix/Moly Kit Integration/3 - Generating Images to the Grid/README.md deleted file mode 100644 index 1907e94..0000000 --- a/Appendix/Moly Kit Integration/3 - Generating Images to the Grid/README.md +++ /dev/null @@ -1,316 +0,0 @@ -# 3 - Generating Images to the Grid - -## Introduction - -In this additional lesson, we will simply use what we already learned from -the previous Moly Kit lessons. It's assumed you already know how to configure -the `BotContext` required by a `Chat` and what a "chat hook" is. We will go fast -to reach our goal, but the principles are the same as previous lessons. - -For this lesson, what we will do is add a new "chat" to the image grid, with the -purpose of generating new images at runtime! - -## Screenshots - -![screenshot](./screenshot_003_001.png) -![screenshot](./screenshot_003_002.png) - -## Steps - -### Overview - -Knowing what we already know, we can achieve this following this recipe. - -1. Put a new chat in the DSL so it appears at the bottom of the image grid. 
We -will override its DSL to hide the messages list as it will not be used, leaving -only the prompt input visible. -2. Configure the `BotContext` for this chat using the `OpenAIImageClient` as -the base. This client can talk to models like `dalle-3` or `gpt-image-1` to -produce images. -3. Configure a hook, that simply aborts all default behaviors, taking full -control of what happens when something happens in the `Chat`. - -> [!tip] -> -> Although these lessons focus on integrating the `Chat` widget, you may find -> it cleaner to make your own UI to avoid the needs of hooking. If you prefer to -> do so, you can still leverage a lot of the pure Moly Kit abstractions, like -> the built-in implemented clients, attachments abstractions, async utilities, -> etc. This way, you will not need to deal with SSE, handling JSON format -> inconsistencies, fight some web compatibility issues, etc. that are already -> solved by these built-in implementations. - -### 1. Adding a new chat - -Update the DSL to include the chat below the image grid, inside the image -browser. - -```rust -ImageBrowser = { - flow: Down, - menu_bar = {} - image_grid = {} - - // We added this. - chat = { - height: Fit, - padding: 10, - - // Make the chat invisible until it loads. - visible: false, - - // Let's hide the messages list as it will never be used. - messages = { - visible: false - } - prompt = { - persistent = { - center = { - left = { - // Optionally, as in lesson 2, remove the "attach file" - // button from the left side of the prompt input to - // achieve a cleaner UI, although it may actually be - // useful to produce images from references if you don't - // want to hide it. - visible: false - } - text_input = { - empty_text: "Describe an image to generate..." - } - } - } - } - } -} -``` - -2. Configuring the `BotContext`. 
- -We will then configure the `BotContext` as in previous lessons, but with the -image generation client: - -```rust -impl LiveHook for App { - fn after_new_from_doc(&mut self, cx: &mut Cx) { - // ...other initialization calls... - - self.configure_image_browser_chat(cx); - } -} - -fn configure_image_browser_chat(&mut self, cx: &mut Cx) { - self.configure_image_browser_chat_context(cx); -} - -fn configure_image_browser_chat_context(&mut self, cx: &mut Cx) { - let url = std::env::var("API_URL").unwrap_or_default(); - let key = std::env::var("API_KEY").unwrap_or_default(); - - // This client knows how to use OpenAI models for image generation. - let mut client = OpenAIImageClient::new(url); - client.set_key(&key).unwrap(); - - // Generate the `BotContext` from it. - let mut bot_context = BotContext::from(client); - - // Set the `BotContext` for the `Chat`. - let mut chat = self.ui.chat(id!(image_browser.chat)); - chat.write().set_bot_context(cx, Some(bot_context.clone())); - - // Bootstrap the `BotContext` by loading it asynchronously. - let ui = self.ui_runner(); - spawn(async move { - // Do the async loading and collect the errors to report them. - let errors = bot_context.load().await.into_errors(); - - // As we don't have the message list visible, let's just print the - // errors. - for error in errors { - eprintln!("Error: {error}"); - } - - ui.defer(move |me, cx, _scope| { - let mut chat = me.ui.chat(id!(image_browser.chat)); - - let model_id = - std::env::var("IMAGE_MODEL_ID").unwrap_or_default(); - - // Search for the `Bot` whose provider-side id matches the model - // we want to use for image generation. - let bot = bot_context - .bots() - .into_iter() - .find(|b| b.id.id() == model_id); - - if let Some(bot) = bot { - // Set this as the bot to use by this chat. - chat.write().set_bot_id(cx, Some(bot.id)); - } else { - eprintln!("Error: Image Model ID '{}' not found", model_id); - } - - chat.write().visible = true; - me.ui.redraw(cx); - }); - }); -} -``` - -3. 
The hook - -We will use a hook as before, to detect the response from the image generation -model, which should come with an `Attachment`, to save it to our filesystem and add -it to our grid. - -Hooks are very flexible, and the approach we will take here is slightly -different to the ones in the previous lessons. Inside the hook, we will: - -1. Clear the tasks vector, essentially, preventing implicit default behavior from -executing. -2. Let some relevant tasks pass-through by performing them manually. -3. Hook into message insertions to always leave the chat with exactly 2 -messages, the user request, and the loading AI message. -4. Hook into message updates, trying to identify the final task notifying us -with the image generation. -5. Write that image to disk, alongside other images of the grid. -6. Add the image to the grid. - -The hook code is long because it does more than in the previous lessons, and -would end up looking like this: - -```rust -fn configure_image_browser_chat(&mut self, cx: &mut Cx) { - self.configure_image_browser_chat_context(cx); - - // Added this line. - self.configure_image_browser_chat_before_hook(cx); -} - -fn configure_image_browser_chat_before_hook(&mut self, _cx: &mut Cx) { - let ui = self.ui_runner(); - self.ui - .chat(id!(image_browser.chat)) - .write() - .set_hook_before(move |task_group, chat, cx| { - // Clear the task group to take full control as mentioned before. - let aborted_tasks = std::mem::take(task_group); - - for task in aborted_tasks { - match task { - // Let this task pass-through. - ChatTask::Send => { - chat.perform(cx, &[ChatTask::Send]); - } - // Let this task pass-through. - ChatTask::ClearPrompt => { - chat.perform(cx, &[ChatTask::ClearPrompt]); - } - // Handle messages insertions, expecting only two messages - // in perfect indexes. 
                    ChatTask::InsertMessage(_, message) => {
                        match &message.from {
                            EntityId::User => {
                                chat.perform(
                                    cx,
                                    &[ChatTask::InsertMessage(0, message)],
                                );
                            }
                            EntityId::Bot(_) => {
                                chat.perform(
                                    cx,
                                    &[ChatTask::InsertMessage(1, message)],
                                );
                            }
                            _ => {}
                        }
                    }
                    // The important task to handle, which contains more
                    // complex code. Here we will try to detect when the
                    // generated image arrives and is ready.
                    ChatTask::UpdateMessage(_, message) => {
                        // A trick to handle this kind of task only once.
                        // We assume only one message update will happen
                        // before the message is marked as ready, and that it
                        // will contain our image. This is the case for the
                        // `OpenAIImageClient`.
                        if !message.metadata.is_writing {
                            continue;
                        }

                        // See if there is an attachment in this message.
                        let attachment = message
                            .content
                            .attachments
                            .first()
                            .cloned();

                        // Do not continue if this message contains no
                        // attachment.
                        let Some(attachment) = attachment else {
                            return;
                        };

                        // We will want to read the attachment to write it to
                        // disk. `read()` is async, so we will use `spawn`
                        // again here.
                        spawn(async move {
                            // Read the attachment's byte content.
                            match attachment.read().await {
                                Ok(bytes) => {
                                    // Use the current time as a "unique" filename.
                                    // Note: we are assuming generations are slow
                                    // and that no parallel app is running alongside
                                    // this one. In a serious app you should check
                                    // the filesystem to negotiate a unique
                                    // filename, or generate something with a lot
                                    // of entropy (like a UUID v4/v7).
                                    let now = std::time::SystemTime::now()
                                        .duration_since(std::time::UNIX_EPOCH)
                                        .unwrap()
                                        .as_secs();

                                    // Compute the filename, which should always
                                    // be a `.png` because that's what
                                    // `OpenAIImageClient` always requests.
                                    let filename =
                                        format!("generated_image_{now}.png");

                                    // Take the directory where images are stored
                                    // and build the path of the file to write.
                                    let path =
                                        Path::new(IMAGES_PATH).join(&filename);

                                    println!("Saving generated image to {path:?}");

                                    // Write the file to disk.
                                    if let Err(e) = std::fs::write(&path, &bytes) {
                                        eprintln!(
                                            "Error saving generated image to {path:?}: {e}"
                                        );
                                    }

                                    // Add the image path to the image grid and
                                    // display it.
                                    ui.defer(move |me, cx, _scope| {
                                        me.state.image_paths.push(path);
                                        me.ui.redraw(cx);
                                    });
                                }
                                Err(e) => {
                                    eprintln!("Error reading image generation: {e}");
                                }
                            }
                        });

                        // After getting what we want, clear the messages so
                        // the next generation is not affected by this one.
                        chat.messages_ref().write().messages.clear();
                    }
                    _ => {}
                }
            }
        });
}
```

As you can see, even if it's a bit trickier to do, `Chat` lets us take as much
control as we need.

## What we did

You should now be able to generate images that get added to the image grid,
leveraging what we already learned in previous lessons.
\ No newline at end of file
diff --git a/Appendix/Moly Kit Integration/3 - Generating Images to the Grid/screenshot_003_001.png b/Appendix/Moly Kit Integration/3 - Generating Images to the Grid/screenshot_003_001.png
deleted file mode 100644
index 50d1f74..0000000
Binary files a/Appendix/Moly Kit Integration/3 - Generating Images to the Grid/screenshot_003_001.png and /dev/null differ
diff --git a/Appendix/Moly Kit Integration/3 - Generating Images to the Grid/screenshot_003_002.png b/Appendix/Moly Kit Integration/3 - Generating Images to the Grid/screenshot_003_002.png
deleted file mode 100644
index 7b72dc5..0000000
Binary files a/Appendix/Moly Kit Integration/3 - Generating Images to the Grid/screenshot_003_002.png and /dev/null differ
diff --git a/Appendix/Moly Kit Integration/README.md b/Appendix/Moly Kit Integration/README.md
deleted file mode 100644
index 14f363e..0000000
--- a/Appendix/Moly Kit Integration/README.md
+++ /dev/null
@@ -1,74 +0,0 @@
# Moly Kit Integration

## Introduction

This optional
appendix section of the tutorial explores the integration of
[Moly Kit](https://github.com/moxin-org/moly/tree/main/moly-kit), a crate
containing abstractions, implementations, and widgets for embedding LLMs in
Makepad apps.

We will build on the [last part of the base image viewer tutorial](https://publish.obsidian.md/makepad-docs/Tutorials/Image+Viewer/7+-+Adding+Animations),
progressively integrating a couple of LLM chats that can interact with our
existing image viewer.

We will break this into 3 fully functional lessons:
1. Embed a fully functional LLM chat inside the slideshow screen.
2. Give the current image in the slideshow as context to that chat.
3. Add a separate prompt input to the grid screen, with image generation
capabilities.

Lessons 1 and 2 are connected, while lesson 3 uses what we learned to do
something entirely different on a different screen.

## Screenshots

![lesson 2](./2%20-%20Current%20Image%20as%20Conversation%20Context/screenshot_002_001.png)
![lesson 3](./3%20-%20Generating%20Images%20to%20the%20Grid/screenshot_003_002.png)


## Requirements

- It is assumed you have the latest version of the base image viewer working.
- You should have access to an OpenAI-compatible service, with its respective
API key and support for vision and image generation models.

> [!tip]
>
> If you don't have an OpenAI key, you can technically use any local
> model for lesson 1. For lesson 2 you will need a model with vision support.
> These kinds of models may be accessible through something like Ollama.
>
> However, if you are going to do lesson 3, you will need an OpenAI-compatible
> image generation endpoint. That's trickier to mimic locally.

## Environment variables

You will eventually need to configure the following environment variables.
```shell
export API_URL="https://api.openai.com/v1" # Or compatible
export API_KEY=""
export MODEL_ID="gpt-5-nano"
export IMAGE_MODEL_ID="dall-e-3"
```

> [!info]
>
> The `IMAGE_MODEL_ID` is only needed if you are doing lesson 3.

> [!info]
>
> You can replace `gpt-5-nano` and `dall-e-3` with the models you prefer.
> These are simply suggested because they don't require having your identity
> verified and they are relatively cheap.

## Useful links

- [The official Moly Kit guide](https://moxin-org.github.io/moly/basics.html)
- [Moly Kit crate documentation](https://moxin-org.github.io/moly/docs/moly_kit)

## Overview

- [1 - Embedding an LLM Chat](./1%20-%20Embedding%20an%20LLM%20Chat/README.md)
- [2 - Current Image as Conversation Context](./2%20-%20Current%20Image%20as%20Conversation%20Context/README.md)
- [3 - Generating Images to the Grid](./3%20-%20Generating%20Images%20to%20the%20Grid/README.md)
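
The lessons read these environment variables at startup with `std::env::var`,
defaulting to an empty string when one is missing. As a quick sanity check
before running, something like the following sketch can be used. Note that
`read_env` is a hypothetical helper for illustration only, not part of the
tutorial code, which simply calls `std::env::var("API_URL").unwrap_or_default()`:

```rust
// Hypothetical helper (not part of the tutorial code): read one of the
// variables above, warning loudly when it is missing or empty, since an
// empty API URL or key will make every request fail silently later.
fn read_env(name: &str) -> String {
    match std::env::var(name) {
        Ok(value) if !value.is_empty() => value,
        // Covers both an unset variable and one exported as "".
        _ => {
            eprintln!("Warning: `{name}` is not set; API calls will likely fail.");
            String::new()
        }
    }
}

fn main() {
    let api_url = read_env("API_URL");
    let api_key = read_env("API_KEY");
    let model_id = read_env("MODEL_ID");

    println!("Using model `{model_id}` at `{api_url}`");
    println!("API key configured: {}", !api_key.is_empty());
}
```

Running this before starting the app makes a missing variable obvious up
front, instead of surfacing later as an opaque request error inside the chat.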