@mlikasam-askui mlikasam-askui commented Jan 9, 2026

Refactor: Tools Auto-Inject Agent OS

Overview

This PR refactors the tool system to automatically inject agent_os instances into tools through a centralized ToolCollection, eliminating the need to manually pass agent_os to each tool during initialization. The refactoring also splits the monolithic computer.py tool file into individual, focused tool files for better maintainability.

Key Changes

1. Tool Architecture Refactoring

Split Monolithic Computer Tool

  • Before: Single Computer20250124Tool class handling all computer interactions
  • After: Individual tool classes for each action:
    • ComputerScreenshotTool
    • ComputerMouseClickTool
    • ComputerMoveMouseTool
    • ComputerKeyboardTapTool
    • ComputerTypeTool
    • ComputerListDisplaysTool
    • ComputerSetActiveDisplayTool
    • ComputerRetrieveActiveDisplayTool
    • ComputerMouseHoldDownTool
    • ComputerMouseReleaseTool
    • ComputerMouseScrollTool
    • ComputerKeyboardPressedTool
    • ComputerKeyboardReleaseTool
    • ComputerGetMousePositionTool
    • ComputerReconnectTool
    • ComputerDisconnectTool

All tools are now located in the src/askui/tools/computer/ directory, with a dedicated file per tool.

Base Tool Classes

  • Added ComputerBaseTool (src/askui/models/shared/computer_base_tool.py) - base class for all computer tools
  • Added AndroidBaseTool (src/askui/models/shared/android_base_tool.py) - base class for all Android tools
  • Both inherit from ToolWithAgentOS and provide type-safe access to their respective agent_os implementations

2. Automatic Agent OS Injection

ToolCollection Enhancement

  • ToolCollection now maintains a list of agent_os instances via add_agent_os()
  • Tools automatically receive their agent_os based on matching required_tags during initialization
  • Tools can specify required_tags (e.g., ["computer", "agent_os_facade"]) to match the appropriate agent_os
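The tag-matching idea described above can be sketched roughly as follows. This is a minimal illustration using stand-in classes, not the real askui implementation: FakeAgentOs, FakeTool, FakeScreenshotTool, and SketchToolCollection are hypothetical names, and the choice to inject at the moment add_agent_os() is called is an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeAgentOs:
    """Hypothetical stand-in for an agent_os instance (name plus tags only)."""

    name: str
    tags: list = field(default_factory=list)


class FakeTool:
    """Hypothetical stand-in for a tool that declares the agent_os it needs."""

    required_tags: list = []

    def __init__(self) -> None:
        # Left empty at construction time; the collection fills it in later.
        self.agent_os: Optional[FakeAgentOs] = None


class FakeScreenshotTool(FakeTool):
    required_tags = ["computer", "agent_os_facade"]


class SketchToolCollection:
    """Illustrative only; not the real ToolCollection."""

    def __init__(self, tools) -> None:
        self.tools = list(tools)
        self._agent_os_list = []

    def add_agent_os(self, agent_os: FakeAgentOs) -> None:
        # Assumption: injection happens as soon as an agent_os is registered.
        self._agent_os_list.append(agent_os)
        for tool in self.tools:
            if tool.agent_os is None and self._matches(agent_os, tool.required_tags):
                tool.agent_os = agent_os

    @staticmethod
    def _matches(agent_os: FakeAgentOs, required_tags) -> bool:
        # An agent_os matches when it carries every tag the tool requires.
        return all(tag in agent_os.tags for tag in required_tags)


collection = SketchToolCollection([FakeScreenshotTool()])
facade = FakeAgentOs("facade", ["computer", "agent_os_facade"])
collection.add_agent_os(FakeAgentOs("android", ["android"]))  # no match, tool stays unbound
collection.add_agent_os(facade)  # match, tool receives the facade
```

In this sketch the tool is constructed without any agent_os argument, which is the boilerplate reduction the PR is after.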

Agent OS Facade

  • Introduced ComputerAgentOsFacade (src/askui/tools/computer_agent_os_facade.py)
  • Wraps the underlying AgentOs to provide coordinate scaling functionality
  • Automatically scales coordinates between target resolution (1024x768) and real screen resolution
  • Used by all computer tools to ensure consistent coordinate handling
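As a rough illustration of the coordinate scaling mentioned above, the sketch below maps points between the 1024x768 target resolution and a real screen resolution. It assumes plain per-axis linear scaling; the actual ComputerAgentOsFacade may handle aspect ratio or padding differently, and the function names scale_to_real and scale_to_target are hypothetical.

```python
TARGET_RESOLUTION = (1024, 768)  # target resolution named in the PR description


def scale_to_real(x, y, real_resolution, target_resolution=TARGET_RESOLUTION):
    """Map a point from target-resolution space to real screen pixels."""
    real_w, real_h = real_resolution
    target_w, target_h = target_resolution
    # Assumption: independent linear scaling per axis, rounded to whole pixels.
    return round(x * real_w / target_w), round(y * real_h / target_h)


def scale_to_target(x, y, real_resolution, target_resolution=TARGET_RESOLUTION):
    """Map a point from real screen pixels back to target-resolution space."""
    real_w, real_h = real_resolution
    target_w, target_h = target_resolution
    return round(x * target_w / real_w), round(y * target_h / real_h)


# The centre of the target space lands on the centre of a 2048x1536 screen.
centre_on_screen = scale_to_real(512, 384, (2048, 1536))
```

Keeping both directions in one place is what lets every tool share the same conversion instead of re-implementing it.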

3. Agent Base Refactoring

Simplified Tool Management

  • Removed _get_default_tools_for_act(), _get_default_settings_for_act(), and _get_default_caching_settings_for_act() methods
  • Replaced with instance attributes:
    • self.act_tool_collection - pre-initialized tool collection
    • self.act_settings - default act settings
    • self.caching_settings - default caching settings
  • Agent OS instances are now added to the collection via add_agent_os() after initialization

Updated Tool Building Logic

  • _build_tools() now works with ToolCollection instances directly
  • Supports appending tools to existing collection or merging collections
  • Agent OS is automatically injected during tool collection initialization

4. Vision Agent Changes

Tool Initialization

  • Removed dependency on Computer20250124Tool and beta flags
  • Explicitly initializes individual computer tools in the tools list
  • Creates ComputerAgentOsFacade and adds it to act_tool_collection via add_agent_os()
  • Sets default act_settings with system prompt and thinking configuration

Removed Beta Flag Logic

  • Removed _get_default_settings_for_act() method that handled different beta flags for different Claude model versions
  • Simplified settings initialization

5. Android Agent Changes

Tool Initialization

  • Android tools no longer require agent_os parameter in constructor
  • Tools are initialized without agent_os and receive it automatically via ToolCollection
  • act_agent_os_facade is added to act_tool_collection after initialization
  • Default act_settings are set with Android-specific system prompt

6. MCP Server Refactoring

Computer MCP Server (src/askui/chat/api/mcp_servers/computer.py)

  • Replaced single computer() tool with individual tool functions
  • Each tool is now a separate MCP tool (e.g., computer_screenshot, computer_mouse_click)
  • Tools are registered using mcp.add_tool() with proper tags
  • Removed global active_display variable and resolution constants
  • Uses shared ComputerAgentOsFacade instance for all tools

Assistant Seeds Update

  • Updated COMPUTER_AGENT_V1 assistant configuration to use new individual tool names
  • Replaced ["computer", "list_displays", "set_active_display", "retrieve_active_display"] with the full list of 16 individual computer tools

7. Type System Reorganization

Geometry Types

  • Moved Point and PointList from askui.models.models to askui.models.types.geometry
  • Updated all imports across the codebase to use the new location
  • Improves type organization and separation of concerns

8. File Structure Changes

New Files

  • src/askui/tools/computer/__init__.py - exports all computer tools
  • src/askui/tools/computer/*_tool.py - individual tool implementations (16 files)
  • src/askui/tools/computer_agent_os_facade.py - coordinate scaling facade
  • src/askui/models/shared/computer_base_tool.py - base class for computer tools
  • src/askui/models/shared/android_base_tool.py - base class for Android tools
  • src/askui/models/types/geometry.py - geometry type definitions

Removed Files

  • src/askui/tools/computer.py - replaced by individual tool files
  • src/askui/tools/list_displays_tool.py - moved to computer/list_displays_tool.py
  • src/askui/tools/retrieve_active_display_tool.py - moved to computer/retrieve_active_display_tool.py

Moved Files

  • src/askui/tools/set_active_display_tool.py → src/askui/tools/computer/set_active_display_tool.py

9. Minor Changes

  • Added timeout_graceful_shutdown=5 to FastAPI uvicorn configuration in chat/__main__.py
  • Updated Android tools to remove agent_os parameter from constructors
  • Updated test files to reflect new tool initialization patterns

Benefits

  1. Better Maintainability: Each tool is in its own file, making it easier to understand and modify individual tools
  2. Automatic Dependency Injection: Tools automatically receive their agent_os based on tags, reducing boilerplate
  3. Type Safety: Base tool classes provide type-safe access to platform-specific agent_os implementations
  4. Consistent Coordinate Scaling: ComputerAgentOsFacade ensures all computer tools handle coordinate scaling consistently
  5. Simplified Agent Initialization: Agents no longer need to manage beta flags or complex tool initialization logic
  6. Better Organization: Geometry types are now properly organized in a dedicated module

Migration Notes

  • Tools that previously required agent_os in constructor can now be initialized without it
  • The agent_os will be automatically injected by ToolCollection based on required_tags
  • Custom tools should inherit from ComputerBaseTool or AndroidBaseTool and specify appropriate required_tags
  • Import paths for Point and PointList have changed to askui.models.types.geometry
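For code that has to run against both the old and the new layout during migration, one possible approach is a small import fallback. The helper below is a hypothetical sketch (import_attr is not part of askui); only the two module paths in the usage note come from this PR description.

```python
import importlib


def import_attr(module_paths, name):
    """Return attribute `name` from the first importable module that defines it."""
    for module_path in module_paths:
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            continue  # module missing in this version; try the next candidate
        if hasattr(module, name):
            return getattr(module, name)
    msg = f"{name} not found in any of: {', '.join(module_paths)}"
    raise ImportError(msg)
```

Usage would look like `Point = import_attr(["askui.models.types.geometry", "askui.models.models"], "Point")`, preferring the new location and falling back to the old one. Once the old path no longer needs support, a direct `from askui.models.types.geometry import Point, PointList` is simpler.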

Testing

  • Updated existing tests to work with new tool initialization
  • All tools maintain the same external API, ensuring backward compatibility for tool usage

- Add ToolWithAgentOS base class for tools requiring AgentOS access
- Create ComputerBaseTool and AndroidBaseTool with platform-specific typing
- Add ToolCollection.add_agent_os() for automatic AgentOS injection
- Modularize computer tools: split computer.py into individual modules
- Refactor Android tools to use AndroidBaseTool
- Add ComputerAgentOsFacade for coordinate scaling functionality
- Add geometry types module for better type safety
- Update all agent classes (VisionAgent, AndroidAgent, WebAgent) to use
  new tool structure with auto-injection
- Update MCP server implementations for new tool architecture

This refactoring enables automatic AgentOS injection into tools through
base classes, eliminating manual agent_os parameter passing and improving
code maintainability and tool composition.
@mlikasam-askui mlikasam-askui self-assigned this Jan 9, 2026
@mlikasam-askui mlikasam-askui marked this pull request as draft January 9, 2026 16:19
@mlikasam-askui mlikasam-askui force-pushed the feat/refactor-tools-auto-inject-agent-os branch from 17a5546 to 3b263d3 Compare January 9, 2026 17:07
@mlikasam-askui mlikasam-askui marked this pull request as ready for review January 12, 2026 09:52
@mlikasam-askui mlikasam-askui marked this pull request as draft January 12, 2026 10:39
@mlikasam-askui mlikasam-askui marked this pull request as ready for review January 12, 2026 17:30
    if all(tag in agent_os.tags for tag in tags):
        return agent_os
msg = f"Agent OS with tags [{', '.join(tags)}] not found"
raise ValueError(msg)

We might want to consider defining a custom "AskUIException" or "AskUIAgentOSException" for cases like this.
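A minimal sketch of what that suggestion could look like, assuming the exception names from the comment above. Neither class is confirmed to exist in askui, and find_agent_os is a hypothetical free function standing in for the reviewed lookup.

```python
from types import SimpleNamespace


class AskUIException(Exception):
    """Hypothetical base class for library-specific errors."""


class AskUIAgentOSException(AskUIException, ValueError):
    """Hypothetical error for a failed agent_os lookup.

    Also deriving from ValueError keeps existing `except ValueError` callers working.
    """

    def __init__(self, tags):
        self.tags = list(tags)
        super().__init__(f"Agent OS with tags [{', '.join(self.tags)}] not found")


def find_agent_os(agent_os_list, tags):
    """Return the first agent_os carrying all requested tags, else raise."""
    for agent_os in agent_os_list:
        if all(tag in agent_os.tags for tag in tags):
            return agent_os
    raise AskUIAgentOSException(tags)


# Stand-in object with a `tags` attribute, for illustration only.
computer_os = SimpleNamespace(tags=["computer", "agent_os_facade"])
found = find_agent_os([computer_os], ["computer"])
```

Carrying the requested tags on the exception lets callers react programmatically instead of parsing the message.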

@philipph-askui philipph-askui left a comment

Overall, I think my comments on the code are only minor. I am a great fan of the new abstraction of ComputerTools, AndroidTools, and the AgentOSFacade. Besides my comments on the code, in my opinion there are a few things that need to be changed or done before we can merge this into main:

  • Docs: we definitely need to explain the abstractions and concepts in the docs. In my opinion, three major concepts introduced here need to be explained: 1) the agent_os_facade, especially with a focus on the different coordinate systems, 2) the tags that can/should be used for the tools (I saw "android", "computer", and "universal", which make sense, and "agent_os_facade", which makes less sense to me), and 3) the tool_store, with a focus on how people can import tools from there and what it is for.
  • Benchmarking: We need to ensure that overall performance does not get worse when switching from a "one-tool architecture" to a "multi-tool architecture" and removing the beta flags.
  • Testing: I think there should be new (unit and e2e) tests for the new Computer/Android tools, and especially for the tools in the new tools_store.

@mlikasam-askui mlikasam-askui left a comment

.

@mlikasam-askui mlikasam-askui force-pushed the feat/refactor-tools-auto-inject-agent-os branch from 68e22d8 to a97517b Compare January 14, 2026 13:04
@mlikasam-askui mlikasam-askui force-pushed the feat/refactor-tools-auto-inject-agent-os branch from a97517b to 20bffd9 Compare January 14, 2026 13:07
@mlikasam-askui mlikasam-askui merged commit 483bc4a into main Jan 15, 2026
1 check passed
@mlikasam-askui mlikasam-askui deleted the feat/refactor-tools-auto-inject-agent-os branch January 15, 2026 10:06