@ForLoopCodes ForLoopCodes commented Jan 8, 2026

Adds a browser automation toolkit to the packages/opencode package, similar to Anthropic's Claude in Chrome and Cursor 2.0. The feature is opt-in: enable it by setting OPENCODE_ENABLE_BROWSER=true in the terminal or by setting browser: true in opencode.json before running opencode.
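As a rough illustration of the two switches described above, the check could look something like the sketch below. The `OpencodeConfig` shape and the `isBrowserEnabled` helper are hypothetical names introduced here; the PR description does not say how the environment flag and the config key are combined, so this sketch assumes either one is sufficient.

```typescript
// Hypothetical shape of the relevant part of opencode.json (assumption).
interface OpencodeConfig {
  browser?: boolean;
}

// Hypothetical helper: treats the env flag and the config key as alternatives.
// The real precedence rules in the PR are not shown, so this is an assumption.
function isBrowserEnabled(
  env: Record<string, string | undefined>,
  config: OpencodeConfig,
): boolean {
  const flag = (env["OPENCODE_ENABLE_BROWSER"] ?? "").trim().toLowerCase();
  return flag === "true" || config.browser === true;
}

// In a real process the first argument would be process.env.
const enabledByEnv = isBrowserEnabled({ OPENCODE_ENABLE_BROWSER: "true" }, {});
const enabledByConfig = isBrowserEnabled({}, { browser: true });
const disabledByDefault = isBrowserEnabled({}, {});
```

Passing the environment in as a parameter keeps the check easy to test without touching the real process environment.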

This approach does not use MCP; the browser tools are integrated directly into opencode. Direct integration leaves room for things the extension + MCP approach does not cover, such as opening devtools (not implemented yet, but possible), resizing windows, and more.
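To make the "direct integration" idea concrete, here is a minimal sketch of a tool that calls into a browser session in-process rather than through an MCP server. The `BrowserSession` and `BrowserTool` interfaces are hypothetical stand-ins, not the PR's actual types; only the tool name `browser_resize` and its description come from the tool list.

```typescript
// Hypothetical in-process handle to the browser; not the PR's real interface.
interface BrowserSession {
  resize(width: number, height: number): Promise<void>;
}

// Hypothetical tool shape: an id, a description for the model, and an executor.
interface BrowserTool<Args> {
  id: string;
  description: string;
  execute(args: Args, session: BrowserSession): Promise<string>;
}

const resizeTool: BrowserTool<{ width: number; height: number }> = {
  id: "browser_resize",
  description: "Resize browser window",
  async execute(args, session) {
    if (args.width <= 0 || args.height <= 0) {
      throw new Error("width and height must be positive");
    }
    // Direct call into the session: no MCP server or extension in between.
    await session.resize(args.width, args.height);
    return `Resized window to ${args.width}x${args.height}`;
  },
};

// A fake session that records calls, so the sketch runs without a browser.
const recordedCalls: string[] = [];
const fakeSession: BrowserSession = {
  resize(width, height) {
    recordedCalls.push(`resize:${width}x${height}`);
    return Promise.resolve();
  },
};

const pendingResize = resizeTool.execute({ width: 1280, height: 720 }, fakeSession);
pendingResize.then((message) => console.log(message));
```

Because the tool receives the session as an argument, the same definition can be exercised against a fake in tests and against a real browser at runtime.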

Dependencies:

Working video:

opencode.-.added.browser.tools.mp4

(Speed depends on network latency; unfortunately mine is high.)

Tools:

| File | Tool Name | Description |
| --- | --- | --- |
| navigate.ts | browser_navigate | Navigate to URLs with optional content return |
| navigate-back.ts | browser_navigate_back | Go back in browser history |
| navigate-back.ts | browser_navigate_forward | Go forward in browser history |
| wait.ts | browser_wait | Wait for conditions (time, text, elements) |
| init.ts | browser_init | Initialize browser instance with optional direct URL navigation |
| close.ts | browser_close | Close browser instance |
| close-page.ts | browser_close_page | Close current page while keeping browser running |
| click.ts | browser_click | Click elements with fuzzy search and auto-scroll |
| hover.ts | browser_hover | Hover over elements |
| type.ts | browser_type | Type text into inputs with submit option |
| search.ts | browser_search | Find elements using fuzzy search |
| drag.ts | browser_drag | Drag and drop between elements |
| scroll.ts | browser_scroll | Scroll page with content return |
| press-key.ts | browser_press_key | Press keyboard keys |
| fill-form.ts | browser_fill_form | Fill multiple form fields at once |
| select-option.ts | browser_select_option | Select dropdown options |
| check-element.ts | browser_check | Check/uncheck checkboxes and radio buttons |
| file-upload.ts | browser_file_upload | Upload files to file inputs |
| content.ts | browser_content | Get page content (text, links, inputs, structured) |
| screenshot.ts | browser_screenshot | Take screenshots (viewport or full page) |
| snapshot.ts | browser_snapshot | Get accessibility snapshot for actions |
| evaluate.ts | browser_evaluate | Evaluate JavaScript expressions |
| run-code.ts | browser_run_code | Run Playwright code snippets |
| console-messages.ts | browser_console_messages | Get browser console messages |
| network-requests.ts | browser_network_requests | List network requests |
| handle-dialog.ts | browser_handle_dialog | Handle browser dialogs (alerts, confirms) |
| tabs.ts | browser_tabs | Manage browser tabs (list, create, close, select) |
| resize.ts | browser_resize | Resize browser window |
| get-page.ts | browser_get_page | Get current page information (URL, title, scroll position) |
| get-element-at.ts | browser_get_element_at | Get element at specific coordinates (x, y) |
| get-element-bounds.ts | browser_get_element_bounds | Get bounding box of element by selector/ref |
| verify-element-visible.ts | browser_verify_element_visible | Verify elements are visible |
| verify-text-visible.ts | browser_verify_text_visible | Verify text appears on page |
| generate-locator.ts | browser_generate_locator | Generate robust selectors for tests |
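Several tools in the list (browser_click, browser_search) mention fuzzy search. The PR does not describe the matching algorithm, so the following is only one plausible sketch: it scores candidate elements by token overlap between a query and each element's label. `ElementCandidate`, `scoreLabel`, and `findBestMatch` are hypothetical names, and the scoring weights are arbitrary assumptions.

```typescript
// Hypothetical element summary; the PR's real element representation is not shown.
interface ElementCandidate {
  ref: string;   // assumed opaque handle for the element
  label: string; // visible text or accessible name
}

// Split text into lowercase alphanumeric words.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Score in [0, 1]: 1 for an exact match after normalization, otherwise
// 0.9 times the fraction of query tokens that appear in the label.
function scoreLabel(query: string, label: string): number {
  const queryTokens = tokenize(query);
  const labelTokens = tokenize(label);
  if (queryTokens.length === 0) return 0;
  if (queryTokens.join(" ") === labelTokens.join(" ")) return 1;
  const hits = queryTokens.filter((t) => labelTokens.indexOf(t) !== -1).length;
  return (hits / queryTokens.length) * 0.9;
}

// Return the highest-scoring candidate, or undefined if nothing reaches minScore.
function findBestMatch(
  query: string,
  candidates: ElementCandidate[],
  minScore: number = 0.5,
): ElementCandidate | undefined {
  let best: ElementCandidate | undefined = undefined;
  let bestScore = 0;
  for (const candidate of candidates) {
    const score = scoreLabel(query, candidate.label);
    if (score > bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return bestScore >= minScore ? best : undefined;
}

const sampleElements: ElementCandidate[] = [
  { ref: "e1", label: "Sign up" },
  { ref: "e2", label: "Sign in" },
  { ref: "e3", label: "Settings" },
];
const signInMatch = findBestMatch("sign-in", sampleElements);
const noMatch = findBestMatch("checkout", sampleElements);
```

A threshold such as `minScore` lets a tool report "no element found" instead of clicking a poor match, which matters when the caller is a model acting on the result.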

I've designed these tools to be extensible and to cover a wide range of web automation tasks while keeping tool call overhead and context bloat to a minimum.

Fixes:

Known issues:

  • You need to run npx playwright install chromium before getting started.

Notes:

  • This is my first PR on the opencode repository; I would love to hear from the community.
  • Developers/reviewers, please tell me whatever I could add or fix in this feature (I'm sorry if something broke) <3

Copilot AI review requested due to automatic review settings January 8, 2026 08:49

github-actions bot commented Jan 8, 2026

The following comment was made by an LLM and may be inaccurate:

No duplicate PRs found

Copilot AI left a comment

Pull request overview

This PR adds a comprehensive browser automation toolkit to OpenCode using Playwright, providing 28 tools for web automation tasks. The implementation uses a Node.js HTTP server bridge (browser-server.js) to manage Chromium instances, as Playwright doesn't work natively with Bun. The feature is gated behind the OPENCODE_ENABLE_BROWSER environment flag.
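The bridge described in this overview implies some request format between the Bun process and the Node.js server. The actual protocol of browser-server.js is not shown here, so the sketch below assumes a simple JSON body with an `action` and optional `args`, and shows only the validation and dispatch step; the HTTP transport and the Playwright calls are left out. `BridgeCommand`, `parseCommand`, `dispatch`, and the stub handler are all hypothetical.

```typescript
// Hypothetical wire format; the real browser-server.js protocol is not shown in the PR.
interface BridgeCommand {
  action: string;
  args: Record<string, unknown>;
}

type ParseResult =
  | { ok: true; command: BridgeCommand }
  | { ok: false; error: string };

// Validate a request body before it reaches any browser code.
function parseCommand(body: string): ParseResult {
  let value: unknown;
  try {
    value = JSON.parse(body);
  } catch {
    return { ok: false, error: "body is not valid JSON" };
  }
  if (typeof value !== "object" || value === null) {
    return { ok: false, error: "body must be a JSON object" };
  }
  const record = value as Record<string, unknown>;
  const action = record["action"];
  if (typeof action !== "string") {
    return { ok: false, error: "missing string field: action" };
  }
  const args = record["args"];
  if (args !== undefined && (typeof args !== "object" || args === null)) {
    return { ok: false, error: "args must be an object when present" };
  }
  return { ok: true, command: { action, args: (args ?? {}) as Record<string, unknown> } };
}

type Handler = (args: Record<string, unknown>) => Promise<unknown>;

// Map a request body to an HTTP-style status and JSON response body.
async function dispatch(
  body: string,
  handlers: Record<string, Handler>,
): Promise<{ status: number; body: string }> {
  const parsed = parseCommand(body);
  if (parsed.ok === false) {
    return { status: 400, body: JSON.stringify({ error: parsed.error }) };
  }
  const handler = handlers[parsed.command.action];
  if (handler === undefined) {
    return { status: 404, body: JSON.stringify({ error: "unknown action" }) };
  }
  try {
    const result = await handler(parsed.command.args);
    return { status: 200, body: JSON.stringify({ result }) };
  } catch (err) {
    return { status: 500, body: JSON.stringify({ error: String(err) }) };
  }
}

// Stub handler standing in for a real Playwright call (assumption).
const stubHandlers: Record<string, Handler> = {
  navigate: (args) => Promise.resolve({ navigatedTo: args["url"] }),
};
```

In a setup like the one the overview describes, the Node.js server would presumably feed each POST body through something like `dispatch` and write the returned status and body to the response, while the Bun side sends the JSON over HTTP.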

Key Changes:

  • Browser automation infrastructure: BrowserManager, HTTP bridge server, and 28 automation tools
  • Dependencies: Added playwright@1.57.0 and sharp@0.34.5
  • Configuration: TypeScript path aliases, new environment flags, and tool registry integration

Reviewed changes

Copilot reviewed 39 out of 41 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| tsconfig.json | Added esModuleInterop and path aliases for imports |
| packages/opencode/package.json | Added playwright and sharp dependencies |
| packages/opencode/src/flag/flag.ts | Added OPENCODE_ENABLE_BROWSER and OPENCODE_BROWSER_PROFILE_PATH flags |
| packages/opencode/src/tool/tool.ts | Added Tool.attachExecute helper for convenience execute method |
| packages/opencode/src/tool/registry.ts | Integrated BrowserTools into tool registry |
| packages/opencode/src/browser/manager.ts | Core BrowserManager with HTTP-based Playwright bridge |
| packages/opencode/src/browser/tools/*.ts | 28 individual browser automation tools |
| packages/opencode/browser-server.js | Node.js HTTP server managing Playwright browser instance |
| packages/opencode/src/browser/annotate.ts | Screenshot annotation utilities using sharp |
| packages/opencode/src/browser/README.md | Comprehensive documentation |
| bun.lock | Lock file updates for new dependencies |
