¹KAIST · ²Korea University · ³Yonsei University
*Equal contribution
project page · arXiv · dataset
Summary: This repository contains the experiment code accompanying the paper CANVAS: A Benchmark for Vision-Language Models on Tool-Based UI Design (AAAI 2026). CANVAS is designed to evaluate a VLM's capability to generate UI designs through tool invocations in two tasks: Design Replication and Design Modification.
We tested in the following environment:
- OS: Windows 11 (WSL2) and macOS
- Node.js: v18.20.8
- Python: 3.12.9
- Figma Desktop + plugin loaded from `manifest.json` (for WSL2, copy the `dist` directory to Windows and import it into Figma; see the sketch below)
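A minimal sketch of the WSL2 copy step; the Windows destination path is an assumption, so adjust it to your own user directory:

```bash
# Copy the built plugin out of WSL2 to a Windows-visible path (destination is hypothetical)
cp -r src/figma_plugin/dist /mnt/c/Users/<your-user>/canvas-figma-plugin
# Then, in Figma Desktop, import C:\Users\<your-user>\canvas-figma-plugin\manifest.json
```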
### 1. Socket Server

```bash
# (terminal 1)
cd src/socket_server
npm install && npm run build
npm run dev
```

### 2. Figma Plugin
```bash
# (terminal 2)
cd src/figma_plugin
npm install && npm run build
```

### 3. MCP Server
```bash
# (terminal 3)
cd src/mcp_server
npm install && npm run build
```

### 4. MCP Client
```bash
# (terminal 4)
cd src/mcp_client
npm install
npm run dev -- --port=3001
```

- Your MCP client GUI is available at `localhost:3001`.
- You can select a channel to connect to.
### 5. Load Figma Plugin

- Open Figma Desktop
- Go to: Figma logo → Plugins → Development
- Load the manifest: `src/figma_plugin/dist/manifest.json`
- Click Connect
- Make sure to choose the same channel as your MCP client.
### (Outdated) Debug MCP Server

```bash
cd src/mcp_server
npx @modelcontextprotocol/inspector dist/server.js
```

Currently, we provide the CANVAS dataset. If you use this dataset, please cite our BibTeX.

To follow the basic experiment, download the dataset and organize it as shown below (a download sketch follows the tree):
```
canvas
├── dataset                  # HERE!
│   ├── benchmarks
│   │   ├── modification_gt
│   │   └── replication_gt
├── evaluation
└── src
    ├── config
    ├── config.py
    ├── environment.yml
    ├── experiments
    ├── figma_plugin
    ├── mcp_client
    ├── mcp_server
    └── socket_server
```
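Since the dataset is hosted on Hugging Face (see the dataset link above), a download sketch might look like the following; the repository ID `<org>/canvas` is a placeholder, not the actual ID:

```bash
# Hypothetical fetch via the Hugging Face CLI; replace <org>/canvas with the real dataset ID
pip install -U huggingface_hub
huggingface-cli download <org>/canvas --repo-type dataset --local-dir dataset
```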
Required environment variables:

```bash
export OPENAI_API_KEY="your_openai_key"
```

Other API keys can be added in the same way, or defined directly in the `.env` file.
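For example, a `.env` might look like this; `OPENAI_API_KEY` comes from above, while the other variable names are illustrative assumptions that depend on how you wire in additional providers:

```bash
# .env: OPENAI_API_KEY is required; the other names below are assumptions
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key   # assumed name, for claude-* models
GOOGLE_API_KEY=your_google_key         # assumed name, for gemini-* models
```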
```bash
cd src
conda env create -f experiments/environment.yml
conda activate canvasbench-eval
```
Single Agent (Code):

```bash
python -m experiments.run_replication_code_experiment \
    --config-name single-code-replication \
    --model gpt-4.1 \
    --variants image_only \
    --channel channel_1 \
    --agent-type code_replication \
    --auto
```

Arguments (a combined example follows this list):
- `--config-name`: Experiment configuration name (e.g., `single-code-replication`, `multi-react-replication`)
- `--model`: Model to use. Options: `gpt-4o`, `gpt-4.1`, `gpt-4o-mini`, `o3`, `claude-3-5-sonnet`, `gemini-2.5-flash`, `gemini-2.5-pro`
- `--channel`: Channel name from `config.yaml`. Options: `channel_1` through `channel_7`. You need to change the `api_base_url` in the `src/config/expr/{your_expr}.yaml` file.
- `--agent-type`: (optional) Agent type. Options: `code_replication`, `single_replication`, `react_replication`, `single_modification`, `react_modification`
- `--auto`: (optional) Run in non-interactive auto-save mode (skips user prompts)
- `--batch-name`: (optional) Batch name to run a specific subset of samples (e.g., `batch_1`)
- `--batches-config-path`: (optional) Path to the `batches.yaml` file
- `--variants`: (optional, replication only) Comma-separated list of input variants. Options: `image_only`, `text_level_1`, `text_level_2`
- `--task`: (optional, modification only) Specific tasks to run (e.g., `task-1`, `task-2`). Runs all if not specified
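As an illustration of the optional flags used together, the following hypothetical invocation runs one batch of replication samples non-interactively; the `batches.yaml` path is an assumption:

```bash
# Hypothetical combined run: two input variants, one batch subset, auto-save mode
python -m experiments.run_replication_code_experiment \
    --config-name single-code-replication \
    --model gpt-4o \
    --variants image_only,text_level_1 \
    --channel channel_2 \
    --agent-type code_replication \
    --batch-name batch_1 \
    --batches-config-path experiments/batches.yaml \
    --auto
```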
Single Agent (Canvas):

```bash
python -m experiments.run_replication_canvas_experiment \
    --config-name single-canvas-replication \
    --model gpt-4.1 \
    --variants image_only \
    --channel channel_1 \
    --agent-type single_replication \
    --auto
```

Multi Agent (ReAct):
```bash
python -m experiments.run_replication_canvas_experiment \
    --config-name multi-react-replication \
    --model gpt-4.1 \
    --variants image_only \
    --channel channel_1 \
    --agent-type react_replication \
    --auto
```

- Tasks 1, 2, and 3 are available.
- For descriptions of each task, please refer to the paper and the Hugging Face repository.
Single Agent (Canvas):

```bash
python -m experiments.run_modification_experiment \
    --config-name single-canvas-modification \
    --model gemini-2.5-flash \
    --channel channel_1 \
    --task task-2 \
    --agent-type single_modification \
    --auto
```

Multi Agent (ReAct):
```bash
python -m experiments.run_modification_experiment \
    --config-name multi-react-modification \
    --model gemini-2.5-flash \
    --channel channel_1 \
    --task task-2 \
    --agent-type react_modification \
    --auto
```

Based on Talk-to-Figma, CANVAS adopts an independent MCP client rather than the existing Cursor client or LangChain integration. We also added more tools to conduct the experiments. The full list of implemented tools is provided below.
### Implemented Tool List

A complete list of tools supported in CanvasBench (Figma MCP):
- **Connection**
  - `get_channels` - Get available Figma channels for communication
  - `select_channel` - Select a specific Figma channel for communication
  - `check_connection_status` - Check the connection status with Figma

- **Inspection**
  - `get_page_info` - Get the information of the current page in Figma
  - `get_selection_info` - Get detailed information about the current selection in Figma, including all node details
  - `get_node_info` - Get detailed information about multiple nodes
  - `get_node_info_by_types` - Get detailed information about nodes with specific types
  - `get_result_image` - Get an image of the current Figma page
  - `get_page_structure` - Get the complete element structure of the current page

- **Creation**
  - `create_rectangle` - Create a new rectangle with position, size, and optional name
  - `create_frame` - Create a new frame with position, size, and optional name
  - `create_text` - Create a new text element with customizable font properties
  - `create_graphic` - Create vector graphics (e.g. icons) using SVG markup
  - `create_polygon` - Create a new polygon with a specified number of sides
  - `create_star` - Create a new star with customizable points and inner radius
  - `create_line` - Create a straight line between two points
  - `create_mask` - Turn a node into a mask and group it with other nodes to apply the mask

- **Operation**
  - `move_node` - Move a node to a new position
  - `clone_node` - Create a copy of an existing node with optional position offset
  - `resize_node` - Resize a node with new dimensions
  - `delete_node` - Delete nodes from Figma
  - `reorder_node` - Re-order a node within its parent's layer stack
  - `group_nodes` - Group multiple nodes into a single group
  - `ungroup_nodes` - Ungroup an existing GROUP node
  - `rename_node` - Rename a node
  - `rotate_node` - Rotate a node in Figma
  - `boolean_nodes` - Combine two or more shape/vector nodes with a boolean operation (UNION, SUBTRACT, INTERSECT, EXCLUDE)

- **Text**
  - `set_text_content` - Set text content for text nodes
  - `get_text_node_info` - Collect all text nodes within a specified node
  - `set_text_properties` - Set common text properties (size, line-height, letter-spacing, align) on one text node
  - `set_text_decoration` - Set underline/strikethrough/casing on one text node
  - `set_text_font` - Set the font of one text node (family & style)

- **Style**
  - `set_fill_color` - Set the fill color of a node (RGBA)
  - `set_corner_radius` - Set the corner radius of a node with optional per-corner control
  - `get_styles` - Get all styles from the current Figma document
  - `set_opacity` - Set the overall opacity of a node (0-1)
  - `set_stroke` - Set stroke color, weight, and alignment of a node
  - `set_fill_gradient` - Apply a simple gradient fill
  - `set_drop_shadow` - Add a drop-shadow effect
  - `set_inner_shadow` - Add an inner-shadow effect
  - `copy_style` - Copy one node's visual style to another
  - `set_blend_mode` - Set the blend mode of a node (e.g. MULTIPLY, SCREEN)

- **Layout**
  - `set_padding` - Set padding values for an auto-layout frame
  - `set_axis_align` - Set primary and counter axis alignment for auto-layout frames
  - `set_layout_sizing` - Set horizontal and vertical layout sizing (FIXED, HUG, FILL) for auto-layout frames
  - `set_item_spacing` - Set the distance between children in an auto-layout frame
  - `set_layout_mode` - Set the layout mode and wrap behavior of a frame
### Best Practices

When working with the CANVAS Figma MCP:

- **Connection Setup:**
  - Get available channels using `get_channels` first
  - Select a channel using `select_channel` before sending commands
  - Check connection status with `check_connection_status` if needed

- **Design Inspection:**
  - Get a page overview using `get_page_info` first
  - Check the current selection with `get_selection_info` before modifications
  - Use `get_node_info` for detailed single-node information
  - Use `get_nodes_info` for batch node information retrieval

- **Node Creation:**
  - Use appropriate creation tools based on needs:
    - `create_frame` for containers
    - `create_rectangle` for basic shapes
    - `create_text` for text elements
    - `create_graphic` for SVG-based vector graphics
    - `create_polygon` and `create_star` for geometric shapes
    - `create_line` for connecting elements

- **Node Operations:**
  - Use `move_node` for repositioning
  - Use `clone_node` for duplicating elements
  - Use `resize_node` for dimension changes
  - Use `group_nodes` and `ungroup_nodes` for organizing elements
  - Use `boolean_nodes` for combining shapes

- **Text Handling:**
  - Use `get_text_node_info` to scan for text nodes
  - Use `set_text_content` for content updates
  - Use `set_text_properties` for styling (font size, alignment, etc.)
  - Use `set_text_decoration` for underline/strikethrough effects
  - Use `set_text_font` for font family changes

- **Styling:**
  - Use `set_fill_color` for solid colors
  - Use `set_fill_gradient` for gradient effects
  - Use `set_stroke` for borders and outlines
  - Use `set_corner_radius` for rounded corners
  - Use `copy_style` to replicate styling across nodes
  - Use `set_opacity` and `set_blend_mode` for visual effects
  - Use `set_drop_shadow` and `set_inner_shadow` for depth

- **Layout Management:**
  - Use `set_layout_mode` to enable auto-layout
  - Use `set_padding` for internal spacing
  - Use `set_axis_align` for alignment control
  - Use `set_item_spacing` for spacing between elements
  - Use `set_layout_sizing` for responsive behavior

- **Error Handling and Validation:**
  - Verify changes using appropriate inspection tools
  - Handle errors appropriately, as all commands can throw exceptions
  - Use `rename_node` for better organization and identification

- **Performance Considerations:**
  - Use batch operations when available
  - Monitor the WebSocket connection status
  - Implement appropriate error handling and retries (a shell-level sketch follows this list)
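At the CLI level, a retry can be as simple as a shell loop; this is a sketch under our own assumptions, not part of the repository:

```bash
# Minimal retry sketch (assumed): rerun the experiment up to 3 times on non-zero exit,
# e.g. after a dropped WebSocket connection
for attempt in 1 2 3; do
  python -m experiments.run_modification_experiment \
      --config-name single-canvas-modification \
      --model gemini-2.5-flash \
      --channel channel_1 \
      --task task-2 \
      --agent-type single_modification \
      --auto && break
  echo "Attempt ${attempt} failed; retrying..." >&2
  sleep 5
done
```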
This code is mainly built upon the Talk-to-Figma repository, and upon the UEyes repository for the visual-saliency-based metric.
MIT
If you use the CANVAS repository or dataset, please cite:
```bibtex
@article{jeong2025canvas,
  title={CANVAS: A Benchmark for Vision-Language Models on Tool-Based User Interface Design},
  author={Daeheon Jeong and Seoyeon Byun and Kihoon Son and Dae Hyun Kim and Juho Kim},
  year={2025},
  eprint={2511.20737},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2511.20737},
}
```