
Add "Stop" button during LLM generation #1145

@curran

Description

During an AI edit, the user may notice that it's doing the wrong thing, and might want to stop it.

The StopSVG icon is exported from src/client/Icons/index.tsx but is not used in any component.

Let's use it in a new button that says "Stop", which, when clicked, will stop receiving LLM streaming updates and reset the chat history to the way it was before the latest message was sent. It should also restore the chat input field to contain the original message whose response was stopped, so the user can easily refine the prompt.

Some additional details of the plan:

You can implement a clean, reliable “Stop” in three parts:

  1. a real network abort (so tokens stop billing + streaming halts immediately),
  2. a shared “stop requested” flag (so your own loop cooperates), and
  3. graceful teardown (so the UI/state isn’t left half-baked).

Below is a drop-in pattern for your codebase.


1) Wire up an AbortController per chat

Create a tiny registry so the UI (or another server handler) can abort an in-flight generation by chatId.

// generationControl.ts
import { VizChatId } from '@vizhub/viz-types';

const controllers = new Map<VizChatId, AbortController>();

export const registerController = (chatId: VizChatId, controller: AbortController) => {
  controllers.set(chatId, controller);
};

export const deregisterController = (chatId: VizChatId) => {
  controllers.delete(chatId);
};

export const stopGenerationNow = (chatId: VizChatId) => {
  const ctrl = controllers.get(chatId);
  if (ctrl) ctrl.abort(); // Triggers AbortError in your streaming loop
  controllers.delete(chatId);
};

Expose a server API route (or a socket event) that calls stopGenerationNow(chatId) when the user hits the Stop button.


2) Add a cooperative “stop requested” flag in ShareDB (optional but nice)

This lets your loop exit even if the SDK didn’t honor the abort yet, and gives your UI a reactive way to show “Stopping…”.

You already have streaming status helpers; add these two alongside them:

// chatStopFlag.ts
import { ShareDBDoc } from '../../types.js';
import { VizChatId } from '@vizhub/viz-types';

export const setStopRequested = (
  shareDBDoc: ShareDBDoc<any>,
  chatId: VizChatId,
  value: boolean,
) => {
  // Store under your existing chat state, e.g. data.chats[chatId].stopRequested.
  // json0 requires `od` to match the current value exactly, so omit it when
  // the field does not exist yet; otherwise the op would be rejected.
  const current = shareDBDoc.data.chats?.[chatId]?.stopRequested;
  shareDBDoc.submitOp([
    current === undefined
      ? { p: ['chats', chatId, 'stopRequested'], oi: value }
      : { p: ['chats', chatId, 'stopRequested'], od: current, oi: value },
  ]);
};

export const isStopRequested = (shareDBDoc: ShareDBDoc<any>, chatId: VizChatId) =>
  !!shareDBDoc.data.chats?.[chatId]?.stopRequested;

Your Stop endpoint can set both: setStopRequested(..., true) and stopGenerationNow(chatId).
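
For example, the endpoint could look like the following sketch. It assumes an Express server and a hypothetical getShareDBDoc(chatId) helper for resolving the chat's ShareDB document; adapt both to however your server already routes requests and accesses docs.

// stopRoute.ts — illustrative sketch, not a drop-in file
import express from 'express';
import { stopGenerationNow } from './generationControl.js';
import { setStopRequested } from './chatStopFlag.js';
// Hypothetical helper: resolves the ShareDB doc holding this chat's state.
import { getShareDBDoc } from './shareDBAccess.js';

export const stopRouter = express.Router();

stopRouter.post('/api/ai/stop', (req, res) => {
  const chatId = req.query.chatId as any; // VizChatId in practice
  if (!chatId) {
    return res.status(400).json({ error: 'chatId is required' });
  }
  // Cooperative flag first (lets the loop show "Stopping…"),
  // then the hard network abort.
  setStopRequested(getShareDBDoc(chatId), chatId, true);
  stopGenerationNow(chatId);
  res.json({ stopped: true });
});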


3) Integrate into your LLM runner

Key changes:

  • Create an AbortController, register it.
  • Pass the signal to the OpenAI/OpenRouter request.
  • Inside the stream loop, periodically check isStopRequested(...).
  • On abort/stop, gracefully bail: emit any pending UI events you want, mark status as stopped, skip or apply partial edits per your policy, finalize the streaming message consistently, and always clean up the controller.

// In your existing LLM runner file. Helpers referenced below
// (createStreamingAIMessage, updateStreamingStatus, addStreamingEvent,
// updateAIScratchpad, finalizeStreamingMessage, StreamingMarkdownParser,
// updateFiles, mergeFileChanges, parseMarkdownFiles, generateRunId, diff,
// DEBUG, slowMode) are assumed to already exist in this file.
import OpenAI from 'openai';
import { registerController, deregisterController } from './generationControl.js';
import { setStopRequested, isStopRequested } from './chatStopFlag.js';

const APPLY_PARTIAL_EDITS_ON_STOP = false; // choose policy: false = discard partial edits

export const createLLMFunction = ({
  shareDBDoc,
  chatId,
  enableReasoningTokens = false,
  model,
  aiRequestOptions,
}) => {
  return async (fullPrompt: string) => {
    const openRouterClient = new OpenAI({
      apiKey: process.env.VZCODE_EDIT_WITH_AI_API_KEY,
      baseURL: process.env.VZCODE_EDIT_WITH_AI_BASE_URL || 'https://openrouter.ai/api/v1',
      defaultHeaders: {
        'HTTP-Referer': 'https://vizhub.com',
        'X-Title': 'VizHub',
      },
    });

    // Reset any previous stop requests for this chat
    setStopRequested(shareDBDoc, chatId, false);

    const abortController = new AbortController();
    registerController(chatId, abortController);

    let fullContent = '';
    let generationId = '';
    let currentEditingFileName: string | null = null;
    let accumulatedTextChunk = '';
    let currentFileContent = '';
    let stopped = false;

    createStreamingAIMessage(shareDBDoc, chatId);
    updateStreamingStatus(shareDBDoc, chatId, 'Formulating a plan...');

    const getOriginalFileContent = (fileName: string) => {
      const files = shareDBDoc.data.files;
      for (const file of Object.values(files)) {
        if ((file as any).name === fileName) return (file as any).text || '';
      }
      return '';
    };

    const emitTextChunk = async () => {
      if (accumulatedTextChunk.trim()) {
        await addStreamingEvent(shareDBDoc, chatId, {
          type: 'text_chunk',
          content: accumulatedTextChunk,
          timestamp: Date.now(),
        });
        accumulatedTextChunk = '';
      }
    };

    const completeFileEditing = async (fileName: string) => {
      if (fileName && currentFileContent) {
        await addStreamingEvent(shareDBDoc, chatId, {
          type: 'file_complete',
          fileName,
          beforeContent: getOriginalFileContent(fileName),
          afterContent: currentFileContent,
          timestamp: Date.now(),
        });
        currentFileContent = '';
      }
    };

    const callbacks = {
      onFileNameChange: async (fileName: string, _format: string) => {
        await emitTextChunk();
        if (currentEditingFileName) {
          await completeFileEditing(currentEditingFileName);
        }
        currentEditingFileName = fileName;
        currentFileContent = '';
        await addStreamingEvent(shareDBDoc, chatId, {
          type: 'file_start',
          fileName,
          timestamp: Date.now(),
        });
        updateStreamingStatus(shareDBDoc, chatId, `Editing ${fileName}...`);
      },
      onCodeLine: async (line: string) => {
        currentFileContent += line + '\n';
      },
      onNonCodeLine: async (line: string) => {
        if (line.trim() !== '') {
          accumulatedTextChunk += line + '\n';
          if (firstNonCodeChunkProcessed) {
            updateStreamingStatus(shareDBDoc, chatId, 'Describing changes...');
          } else {
            firstNonCodeChunkProcessed = true;
          }
        }
      },
    };

    const parser = new StreamingMarkdownParser(callbacks);

    const chunks: string[] = [];
    let reasoningContent = '';
    const modelName =
      model || process.env.VZCODE_EDIT_WITH_AI_MODEL_NAME || 'anthropic/claude-3.5-sonnet';

    const requestConfig: any = {
      model: modelName,
      messages: [{ role: 'user', content: fullPrompt }],
      usage: { include: true },
      stream: true,
      ...aiRequestOptions,
    };
    if (enableReasoningTokens) {
      requestConfig.reasoning = { effort: 'low', exclude: false };
    }

    let reasoningStarted = false;
    let contentStarted = false;
    let firstNonCodeChunkProcessed = false;

    try {
      const stream = await (openRouterClient.chat.completions.create as any)(
        requestConfig,
        // IMPORTANT: the OpenAI SDK takes the abort signal as a request
        // option (the second argument), not as a body parameter, so pass
        // it here to stop the HTTP stream immediately on abort.
        { signal: abortController.signal },
      );

      for await (const chunk of stream) {
        // Early out if UI requested stop (cooperative cancel)
        if (isStopRequested(shareDBDoc, chatId)) {
          stopped = true;
          abortController.abort(); // ensure the SDK/network stops too
          break;
        }

        if (slowMode) await new Promise((r) => setTimeout(r, 500));

        const delta = chunk.choices[0]?.delta as any;

        if (delta?.reasoning && enableReasoningTokens) {
          if (!reasoningStarted) {
            reasoningStarted = true;
            updateStreamingStatus(shareDBDoc, chatId, 'Thinking...');
          }
          reasoningContent += delta.reasoning;
          updateAIScratchpad(shareDBDoc, chatId, reasoningContent);
        } else if (delta?.content) {
          if (!contentStarted) {
            contentStarted = true;
            if (reasoningStarted) updateAIScratchpad(shareDBDoc, chatId, '');
          }
          const chunkContent = delta.content;
          chunks.push(chunkContent);
          await parser.processChunk(chunkContent);
          fullContent += chunkContent;
        } else if (chunk.usage) {
          DEBUG && console.log('Usage:', chunk.usage);
        }
        if (!generationId && chunk.id) generationId = chunk.id;
      }

      // Flush any parser buffers if we didn’t hard abort
      if (!abortController.signal.aborted) {
        await parser.flushRemaining();
        await emitTextChunk();
        if (currentEditingFileName) {
          await completeFileEditing(currentEditingFileName);
        }
      }
    } catch (err: any) {
      // A user-initiated abort surfaces as an AbortError (or the OpenAI
      // SDK's APIUserAbortError, depending on SDK version), so also check
      // the signal itself.
      if (
        abortController.signal.aborted ||
        err?.name === 'AbortError' ||
        err?.name === 'APIUserAbortError'
      ) {
        stopped = true;
      } else {
        // Real error
        updateStreamingStatus(shareDBDoc, chatId, 'Generation failed.');
        await addStreamingEvent(shareDBDoc, chatId, {
          type: 'error',
          message: String(err?.message || err),
          timestamp: Date.now(),
        });
        // (Controller cleanup happens in the `finally` block below.)
        // Do not apply edits; finalize message as failed
        await finalizeStreamingMessage(shareDBDoc, chatId);
        throw err;
      }
    } finally {
      // Always cleanup controller
      deregisterController(chatId);
      setStopRequested(shareDBDoc, chatId, false);
    }

    // If user stopped: decide policy (apply partial or discard)
    if (stopped) {
      updateStreamingStatus(shareDBDoc, chatId, 'Stopped by user.');
      await addStreamingEvent(shareDBDoc, chatId, {
        type: 'stopped',
        timestamp: Date.now(),
      });

      if (APPLY_PARTIAL_EDITS_ON_STOP && fullContent.trim()) {
        // (optional) apply partial changes — your call
        updateFiles(
          shareDBDoc,
          mergeFileChanges(
            shareDBDoc.data.files,
            parseMarkdownFiles(fullContent, 'bold').files,
          ),
        );
      }

      await finalizeStreamingMessage(shareDBDoc, chatId);
      return { content: fullContent, generationId };
    }

    // Normal completion: apply edits, finalize, kick runId
    updateFiles(
      shareDBDoc,
      mergeFileChanges(
        shareDBDoc.data.files,
        parseMarkdownFiles(fullContent, 'bold').files,
      ),
    );

    await finalizeStreamingMessage(shareDBDoc, chatId);

    const newRunId = generateRunId();
    const runIdOp = diff(shareDBDoc.data, { ...shareDBDoc.data, runId: newRunId });
    shareDBDoc.submitOp(runIdOp, (error) => {
      if (error) {
        console.warn('Error setting runId after AI editing:', error);
      } else {
        DEBUG && console.log('Set new runId after AI editing:', newRunId);
      }
    });

    return { content: fullContent, generationId };
  };
};
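
For reference, the runner is invoked the same way as before; with the changes above, a user-initiated stop resolves with the partial content and a 'stopped' event rather than throwing:

// Usage sketch (the surrounding wiring is hypothetical):
const runLLM = createLLMFunction({
  shareDBDoc,
  chatId,
  enableReasoningTokens: true,
  model: undefined, // falls back to VZCODE_EDIT_WITH_AI_MODEL_NAME
  aiRequestOptions: {},
});
const { content, generationId } = await runLLM(fullPrompt);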

4) Minimal UI hook

  • Add a Stop button shown while a generation is in progress (a sketch follows this list).

  • On click: call your server route /api/ai/stop?chatId=... that does:

    setStopRequested(shareDBDoc, chatId, true);
    stopGenerationNow(chatId);
  • Update the status line from “Thinking…” or “Editing …” to “Stopping…” and then “Stopped by user.”

  • Optionally, offer a “Resume” that simply re-runs createLLMFunction(...)(samePrompt).
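
A minimal sketch of the button, assuming a React client, that the surrounding chat component already tracks isGenerating and the last submitted prompt (so the input can be restored per the issue description), and that the route matches the one sketched in section 2. The props and import path are assumptions:

// StopButton.tsx — illustrative sketch
import { StopSVG } from '../Icons'; // exported from src/client/Icons/index.tsx

export const StopButton = ({
  chatId,
  isGenerating,
  lastUserPrompt,
  setChatInput,
}: {
  chatId: string;
  isGenerating: boolean;
  lastUserPrompt: string;
  setChatInput: (value: string) => void;
}) => {
  if (!isGenerating) return null;
  const handleStop = async () => {
    await fetch(`/api/ai/stop?chatId=${encodeURIComponent(chatId)}`, {
      method: 'POST',
    });
    // Restore the original prompt so the user can refine it.
    setChatInput(lastUserPrompt);
  };
  return (
    <button onClick={handleStop} aria-label="Stop generation">
      <StopSVG /> Stop
    </button>
  );
};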


5) Behavior choices (recommended defaults)

  • Discard partial edits on stop (default): keeps diffs consistent and avoids surprise half-edits. (APPLY_PARTIAL_EDITS_ON_STOP = false)
  • Bill-aware stop: using AbortController ensures the HTTP stream closes; most providers stop generating further tokens when the connection is aborted.
  • Idempotence: guard that only one generation per chatId runs at a time (e.g., store isGenerating in chat state and ignore new starts while true); a minimal guard sketch follows this list.
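
One way to get this guard almost for free is to reuse the controller registry from section 1: a registered controller means a generation is in flight. A minimal sketch:

// In generationControl.ts, add:
export const isGenerating = (chatId: VizChatId): boolean =>
  controllers.has(chatId);

// Then, at the start of the generation handler:
// if (isGenerating(chatId)) return; // or respond with 409 Conflict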

6) Edge cases handled

  • User hammers Stop multiple times → controller abort is idempotent; registry cleanup prevents leaks.
  • Stop during “reasoning” phase → clears scratchpad on real content start; on stop you finalize with a stopped event and empty/partial content depending on policy.
  • Stop between chunks → cooperative check + abort covers both.

That’s it — with the abort signal, a small controller registry, and a consistent stop event/status, you’ll have a snappy, user-friendly Stop that’s safe for your state and wallet.
