2 changes: 1 addition & 1 deletion 01-tutorials/01-your-first-agent.mdx
@@ -93,7 +93,7 @@ You'll create an agent that:
```python
with VisionAgent(reporters=[SimpleHtmlReporter()]) as agent:
```
- Creates a vision agent that can see and interact with your screen
- Creates an agent using the Python SDK that can see and interact with your screen
- Enables debug logging to see what's happening
- Sets up HTML reporting to review the automation later

@@ -5,7 +5,7 @@ description: Solutions for runtime errors and environment configuration issues

This guide helps you resolve runtime errors and environment-related issues when using AskUI.

## Python Vision Agent Errors
## Python SDK Errors

### Session Info Doesn't Match Error

2 changes: 1 addition & 1 deletion 02-how-to-guides/04-troubleshooting/08-known-issues.mdx
@@ -40,7 +40,7 @@ For solutions to common problems, see our [Troubleshooting Guides](/02-how-to-gu
```
Then restart your system.

## Python Vision Agent
## Python SDK

### Session Management
- **Issue**: "Session info doesn't match" errors
@@ -60,4 +60,4 @@ For detailed information about method parameters, return types, and advanced usa
- [`act()`](/04-reference/01-agent-frameworks/02-python/02-vision-agent-api/agent#act) - Autonomous goal-oriented actions
- [`type()`](/04-reference/01-agent-frameworks/02-python/02-vision-agent-api/agent#type) - Type text input
- [`keyboard()`](/04-reference/01-agent-frameworks/02-python/02-vision-agent-api/agent#keyboard) - Send keyboard input
- [Full Vision Agent API](/04-reference/01-agent-frameworks/02-python/02-vision-agent-api/agent) - Complete reference
- [Full Python SDK API](/04-reference/01-agent-frameworks/02-python/02-vision-agent-api/agent) - Complete reference
2 changes: 1 addition & 1 deletion 02-how-to-guides/05-build-ai-agents/02-select-elements.mdx
@@ -3,7 +3,7 @@ title: Select UI Elements
description: How to select and interact with UI elements using AskUI
---

This guide shows you how to select and interact with UI elements using AskUI's Vision Agent.
This guide shows you how to select and interact with UI elements using AskUI's Python SDK.

## Natural Language Selection

2 changes: 1 addition & 1 deletion 03-explanation/01-foundations/foundations-overview.mdx
@@ -31,7 +31,7 @@ Key features of AskUI include:
- Flexible model use (hot swap of models) and infrastructure for reteaching of models (available on-premise)
- **Secure deployment** of agents in enterprise environments

| Feature | AskUI Vision Agent | Computer Use by Anthropic | Operator by OpenAI | Browser Use | Custom (VLM \+ PyAutoGUI \+ Playwright) |
| Feature | AskUI Python SDK | Computer Use by Anthropic | Operator by OpenAI | Browser Use | Custom (VLM \+ PyAutoGUI \+ Playwright) |
| ---------------------------------- | ------------------ | ------------------------- | ------------------ | ----------- | --------------------------------------- |
| Browser Use | ✅ | ✅ | ✅ | ✅ | ✅ |
| DOM Support | ❌ | ❌ | ✅ | ✅ | ✅ |
@@ -60,7 +60,7 @@ This ensures that your automation script adapts dynamically to real-time applica

### **5. Use Multiple Approaches for Actions**

Flexibility is key when automating tasks, especially for repetitive actions like deleting text. The AskUI Vision Agent allows you to use multiple approaches for the same action, ensuring compatibility across different scenarios.
Flexibility is key when automating tasks, especially for repetitive actions like deleting text. The AskUI Python SDK allows you to use multiple approaches for the same action, ensuring compatibility across different scenarios.

Example:

4 changes: 2 additions & 2 deletions 03-explanation/05-glossary.mdx
@@ -6,7 +6,7 @@ description: 'This page provides a glossary of key AskUI terms and definitions t
| **Term** | **Meaning** |
| :-------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **access token** | Gives you access to the AskUI inference in combination with your workspace ID. Every access token has an expiry date and is assigned to exactly one workspace. See [Managing Access Tokens](/02-how-to-guides/01-account-management/04-tokens). |
| **action** | A method in the AskUI Control Client API that describes an action to be taken against the operating system, e.g., `click()`, `type()`. See [Vision Agent API Actions](/04-reference/01-agent-frameworks/02-python/02-vision-agent-api/agent). |
| **action** | A method in the AskUI Control Client API that describes an action to be taken against the operating system, e.g., `click()`, `type()`. See [Python SDK API Actions](/04-reference/01-agent-frameworks/02-python/02-vision-agent-api/agent). |
| **annotation** | Marked area around an element with metadata including name, text, and bounding box coordinates. |
| **automation** | A system of multiple connected workflows. |
| **bounding box** | A rectangle described by coordinates that define an element’s location, visually displayed as a red rectangle. |
@@ -16,7 +16,7 @@ description: 'This page provides a glossary of key AskUI terms and definitions t
| **inference server** | The backend system that performs inference (UI analysis and annotation). |
| **instruction** | A single AskUI directive, usually consisting of **action + (optional) locator**. |
| **interactive annotation** | Exploring the annotations of a user interface through an annotated screenshot. |
| **locator** | A description of an UI *element* that is used to (re-)locate (find) the element on the screen, e.g., when trying to click on it; can be a simple textual description like `"login button"` or more complex, potentially multi-modal, e.g., `loc.Image("path/to/image.png").below_of(loc.Text("Login"))`. See [Vision Agent API Locators](/04-reference/01-agent-frameworks/02-python/02-vision-agent-api/locators) and [Element Selection Guide](/03-explanation/02-best-practices/01-element-selection/01-element-selection). |
| **locator** | A description of an UI *element* that is used to (re-)locate (find) the element on the screen, e.g., when trying to click on it; can be a simple textual description like `"login button"` or more complex, potentially multi-modal, e.g., `loc.Image("path/to/image.png").below_of(loc.Text("Login"))`. See [Python SDK API Locators](/04-reference/01-agent-frameworks/02-python/02-vision-agent-api/locators) and [Element Selection Guide](/03-explanation/02-best-practices/01-element-selection/01-element-selection). |
| **UI Controller (legacy)** | A service that controls inputs and observes visuals on the operating system. |
| **AskUI Controller** | The updated service that controls inputs and observes visuals on the operating system. |
| **UI Control Client** | Retrieves annotations from the inference server and executes input instructions on the operating system via the AskUI Controller. |
@@ -11,9 +11,9 @@ It handles switching the RDP session to console mode.

### PARAMETERS

- `-ControllerType` | _<String>_ - The type of the controller to recover. (Default: RemoteDeviceController for Python Vision Agent)
- `-ControllerType` | _<String>_ - The type of the controller to recover. (Default: RemoteDeviceController for Python SDK)
Valid values are: 'UIController', 'RemoteDeviceController'
Example: 'UIController' for Typescript ADK (NodeJS), 'RemoteDeviceController' for Python Vision Agent.
Example: 'UIController' for Typescript ADK (NodeJS), 'RemoteDeviceController' for Python SDK.

### NOTES

@@ -23,7 +23,7 @@ This Commandlet is only available on Windows AMD64.

#### EXAMPLE 1

This command adds the AskUI Controller Auto Recover Service for the Python Vision Agent.
This command adds the AskUI Controller Auto Recover Service for the Python SDK.

```powershell
AskUI-AddAskUIControllerAutoRecoverService
@@ -2,24 +2,24 @@
title: AskUI-SetFirstAskUIRunner
---


### SYNOPSIS

Configures the runner environment.
It tries to create the "AskUIRunnerUser" user and disables the screensaver for it.
It sets the AskUI Suite as global installation.
It adds and starts the AskUI Controller Auto Recover Service.

Configures the runner environment.
It tries to create the "AskUIRunnerUser" user and disables the screensaver for it.
It sets the AskUI Suite as global installation.
It adds and starts the AskUI Controller Auto Recover Service.
### PARAMETERS

- `-ControllerType` | _<String>_ - The type of the controller to recover. (Default: RemoteDeviceController for Python Vision Agent)
- `-ControllerType` | _<String>_ - The type of the controller to recover. (Default: RemoteDeviceController for Python SDK)
Valid values are: 'UIController', 'RemoteDeviceController'
Example: 'UIController' for Typescript ADK (NodeJS), 'RemoteDeviceController' for Python Vision Agent.

Example: 'UIController' for Typescript ADK (NodeJS), 'RemoteDeviceController' for Python SDK.
### NOTES

This Commandlet is only available on Windows AMD64.

This Commandlet is only available on Windows AMD64.
### EXAMPLES

#### EXAMPLE 1
8 changes: 4 additions & 4 deletions 04-reference/environment-variables.mdx
@@ -19,13 +19,13 @@ description: "These are the environment variables for AskUI"

---

### Vision Agent:
### Python SDK:

- **ASKUI_INFERENCE_ENDPOINT**: The URL of the Vision Agent's inference endpoint, which handles image and text analysis.
- **ASKUI_INFERENCE_ENDPOINT**: The URL of the Python SDK's inference endpoint, which handles image and text analysis.
- **ASKUI_CONTROLLER_PATH**: The file path to the controller's executable or binary file.
- **ASKUI__VA__TELEMETRY__ENABLED**: A boolean flag to enable or disable the recording of usage data for the Vision Agent. Setting this variable to False will disable telemetry.
- **ASKUI__VA__TELEMETRY__ENABLED**: A boolean flag to enable or disable the recording of usage data for the Python SDK. Setting this variable to False will disable telemetry.
- **ASKUI_CONTROLLER_ARGS**: Command-line arguments for the AskUI Remote Device Controller. See details in [ASKUI_CONTROLLER_ARGS](#askui_controller_args).
- **ASKUI_CONTROLLER_CLIENT_SERVER_AUTOSTART**: Boolean flag to enable or disable the automatic startup of AskUI Remote Device Controller Client Server by the Client in vision agent. Default is `true`.
- **ASKUI_CONTROLLER_CLIENT_SERVER_AUTOSTART**: Boolean flag to enable or disable the automatic startup of AskUI Remote Device Controller Client Server by the Client in Python SDK. Default is `true`.
- **ASKUI_CONTROLLER_CLIENT_SERVER_ADDRESS**: The address of the AskUI Remote Device Controller Client Server. Default is `localhost:23000`.
- **ASKUI__AUTHORIZATION**: Overwrites the HTTP Request Authorization Header with the value. (Optional)
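
As an illustration of the defaults listed above, here is a minimal sketch of how a script could read the controller-related variables itself. It uses only the Python standard library. The helper names `parse_bool` and `read_controller_settings` are hypothetical, and the accepted true/false spellings are an assumption, since the documentation above only says `true` and `False`; the SDK's own parsing may differ.

```python
import os
from typing import Mapping, Optional


def parse_bool(raw: Optional[str], default: bool) -> bool:
    """Interpret common true/false spellings (an assumption, not the SDK's rule).

    Falls back to the default when the variable is unset or empty.
    """
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def read_controller_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect controller-related variables, applying the documented defaults."""
    return {
        # Documented default: true
        "autostart": parse_bool(
            env.get("ASKUI_CONTROLLER_CLIENT_SERVER_AUTOSTART"), default=True
        ),
        # Documented default: localhost:23000
        "address": env.get(
            "ASKUI_CONTROLLER_CLIENT_SERVER_ADDRESS", "localhost:23000"
        ),
        # No default is documented, so this stays None when unset
        "controller_path": env.get("ASKUI_CONTROLLER_PATH"),
    }


if __name__ == "__main__":
    print(read_controller_settings())
```

Passing a plain dictionary instead of `os.environ` makes it possible to check the fallback behaviour without touching the real environment.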

@@ -49,9 +49,9 @@ To update your [AskUI Typescript ADK](https://github.com/askui/askui) to the new
2. Enter the AskUI Shell by `askui-shell`
3. Update AskUI by `npm install --dev askui@latest`

### AskUI Vision Agent (Python)
### AskUI Python SDK

To update your [AskUI Vision Agent](https://github.com/askui/vision-agent) to the new version:
To update your [AskUI Python SDK](https://github.com/askui/vision-agent) to the new version:

1. Open your project in VSCode
2. Enable your virtual environment.
@@ -94,7 +94,7 @@ Nothing has changed
| ADE.PluginManager | 0.1.0 | |
| ADE.EnvironmentManager | 0.1.0 | |
| AskUI Typescript ADK | 0.26.0 | [Link](https://github.com/askui/askui/releases/tag/v0.26.0) |
| Python Vision Agent | 0.5.3 | [Link](https://github.com/askui/vision-agent/releases/tag/v0.5.3) |
| Python SDK | 0.5.3 | [Link](https://github.com/askui/vision-agent/releases/tag/v0.5.3) |
| VSCode | 1.98 | |
| Mesa Driver | 25.2.0 | |

@@ -49,9 +49,9 @@ To update your [AskUI Typescript ADK](https://github.com/askui/askui) to the new
2. Enter the AskUI Shell by `askui-shell`
3. Update AskUI by `npm install --dev askui@latest`

### AskUI Vision Agent (Python)
### AskUI Python SDK

To update your [AskUI Vision Agent](https://github.com/askui/vision-agent) to the new version:
To update your [AskUI Python SDK](https://github.com/askui/vision-agent) to the new version:

1. Open your project in VSCode
2. Enable your virtual environment.
@@ -97,7 +97,7 @@ To update your [AskUI Vision Agent](https://github.com/askui/vision-agent) to th
| ADE.PluginManager | 0.1.0 | |
| ADE.EnvironmentManager | 0.1.0 | |
| AskUI Typescript ADK | 0.26.0 | [Link](https://github.com/askui/askui/releases/tag/v0.26.0) |
| Python Vision Agent | 0.5.3 | [Link](https://github.com/askui/vision-agent/releases/tag/v0.5.3) |
| Python SDK | 0.5.3 | [Link](https://github.com/askui/vision-agent/releases/tag/v0.5.3) |
| VSCode | 1.98 | |
| Mesa Driver | 25.2.0 | |

@@ -6,7 +6,7 @@ The AskUI Suite is a comprehensive bundle that includes all necessary dependenci

This release focus on:
1. Enhance On-Boarding experience for Users with the new [Caesr](https://app.caesr.ai/)
2. Support [Python Vision Agent](https://github.com/askui/vision-agent)
2. Support [Python Python SDK](https://github.com/askui/vision-agent)
3. Improve ADE
4. Improve Proxy Handling and Autodetection

@@ -186,7 +186,7 @@ Info: Response description: OK

### ADE: Windows Tools: Long Path Tools

On Windows, the maximum file path length is 260 characters. Exceeding this limit results in a "Destination Path Too Long" error. This tool helps users check and enable Windows Long Path support. This helps to avoid errors with Python Vision Agent.
On Windows, the maximum file path length is 260 characters. Exceeding this limit results in a "Destination Path Too Long" error. This tool helps users check and enable Windows Long Path support. This helps to avoid errors with Python Python SDK.

<details>
<summary><b>Startup Check</b> Warns Long Path Support is not enabled - Displays a Long Path not enabled warning when starting the <code>askui-shell</code></summary>
@@ -288,7 +288,7 @@ Info: Plugin with name 'MyPlugin' has been removed.

### ADE: Python Environment Manager

The Python Environment Manager helps manage virtual environments, dependencies, and package installations, ensuring consistency across projects. It prevents conflicts and allows seamless switching between different Python versions and environments. It is used in combination with AskUI Agents from the [Caesr](https://app.caesr.ai/) and the [Python Vision Agent Libarary](https://github.com/askui/vision-agent)
The Python Environment Manager helps manage virtual environments, dependencies, and package installations, ensuring consistency across projects. It prevents conflicts and allows seamless switching between different Python versions and environments. It is used in combination with AskUI Agents from the [Caesr](https://app.caesr.ai/) and the [Python SDK](https://github.com/askui/vision-agent)

<details>
<summary><code>AskUI-EnablePythonEnvironment</code> - Activates a Python virtual environment. - <a href="https://askui.mintlify.app/02-api-reference/02-askui-suite/02-askui-suite/Python/Public/AskUI-EnablePythonEnvironment">docs</a></summary>
@@ -50,9 +50,9 @@ To update your [AskUI Typescipt ADK ](https://github.com/askui/askui) to the new
3. Enter the AskUI Shell by `askui-shell`
4. Update AskUI by `npm install --dev askui@0.23.1`

### AskUI Vision Agent (Python)
### AskUI Python SDK

To update your [AskUI Vision Agent](https://github.com/askui/vision-agent) to the new version:
To update your [AskUI Python SDK](https://github.com/askui/vision-agent) to the new version:
1. Open your project in VSode
5. Enable your virtual environment.
6. Enter `pip install askui==0.2.4`
@@ -229,7 +229,7 @@ await aui.click().element().matching('a black sneaker shoe').exec(); -> await au
| ADE.PluginManager | 0.1.0 | |
| ADE.EnvironmentManager | 0.1.0 | |
| AskUI Typescript ADK | 0.23.1 | [Link](https://github.com/askui/askui/releases/tag/v0.23.1) |
| Python Vision Agent | 0.2.4 | [Link](https://github.com/askui/vision-agent/releases/tag/v0.2.4) |
| Python SDK | 0.2.4 | [Link](https://github.com/askui/vision-agent/releases/tag/v0.2.4) |


