
Conversation

@loci-dev

Mirrored from ggml-org/llama.cpp#17487


This PR adds a Model Context Protocol (MCP) client to the llama.cpp WebUI (TypeScript/Svelte), consisting of:

  • multi-transport MCP client
  • full agentic orchestrator
  • isolated, idempotent singleton initialization
  • typed SSE client
  • normalized tool-call accumulation pipeline (see the sketch after this list)
  • integrated reasoning, timings, previews, and turn-limit handling
  • complete UI section for MCP configuration
  • dedicated controls for relevant parameters
  • opt-in ChatService integration that does not interfere with existing flows
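
For context, the accumulation step conceptually merges partial tool-call deltas streamed over SSE into complete calls before they are dispatched to an MCP server. The sketch below is illustrative only: the type and class names (`ToolCallDelta`, `ToolCallAccumulator`, etc.) are assumptions, not the PR's actual code, and it presumes OpenAI-style streaming deltas.

```ts
// Illustrative sketch only: accumulate OpenAI-style streamed tool_call deltas
// into complete calls. Names are assumptions, not the PR's actual code.

interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface AccumulatedToolCall {
  id: string;
  name: string;
  arguments: string; // JSON string built up chunk by chunk
}

class ToolCallAccumulator {
  private calls = new Map<number, AccumulatedToolCall>();

  // Merge one streamed delta into the call at the same index.
  push(delta: ToolCallDelta): void {
    const call = this.calls.get(delta.index) ?? { id: '', name: '', arguments: '' };
    if (delta.id) call.id = delta.id;
    if (delta.function?.name) call.name += delta.function.name;
    if (delta.function?.arguments) call.arguments += delta.function.arguments;
    this.calls.set(delta.index, call);
  }

  // Return completed calls in index order with parsed arguments.
  finalize(): { id: string; name: string; args: unknown }[] {
    return [...this.calls.entries()]
      .sort(([a], [b]) => a - b)
      .map(([, c]) => ({ id: c.id, name: c.name, args: JSON.parse(c.arguments || '{}') }));
  }
}
```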

TODO: tighter integration with the UI for structured tool-call result rendering, including dedicated display components and support for sending out-of-context images (persistence/storage still to be defined).

@loci-agentic-ai

Explore the complete analysis inside the Version Insights

Performance Analysis Summary

Analysis Scope: PR #316 - MCP Client Integration for llama.cpp WebUI
Versions Compared: 930f177b-2868-453d-809a-8c06d2215f50 vs d55f4145-0a3a-4b89-9c31-ba206b13d74b


Summary

This PR introduces MCP client functionality exclusively in the WebUI frontend layer (TypeScript/Svelte). Analysis of the actual performance data shows zero measurable impact on core inference functions. All changes are isolated to browser-side JavaScript code with no modifications to the C++ inference engine. Power consumption measurements across all binaries show 0.0% change, confirming no performance regression in the compiled artifacts.

The code review identified 2,338 lines of new frontend code implementing agentic tool-calling workflows. The integration point in ChatService uses an opt-in pattern that bypasses the new code path when MCP is not configured, preserving existing behavior. No performance-critical functions from the project summary (llama_decode, llama_tokenize, llama_model_load_from_file, ggml_backend_graph_compute) were modified.
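
As a rough illustration of that opt-in shape (the method and field names below are assumptions, not the PR's actual API), the guard might look like:

```ts
// Illustrative sketch of the opt-in pattern; names are assumed, not taken from
// the PR. The MCP path is entered only when at least one MCP server is
// configured, so the existing flow is untouched otherwise.

interface McpConfig {
  servers: { name: string; url: string }[];
}

class ChatService {
  constructor(private mcpConfig?: McpConfig) {}

  async sendMessage(prompt: string): Promise<string> {
    // Opt-in guard: with no MCP servers configured, use the existing path.
    if (!this.mcpConfig || this.mcpConfig.servers.length === 0) {
      return this.sendPlainCompletion(prompt);
    }
    // New MCP-backed path: agentic loop with tool calls up to a turn limit.
    return this.runAgenticLoop(prompt);
  }

  private async sendPlainCompletion(prompt: string): Promise<string> {
    return `completion for: ${prompt}`; // existing behavior, stubbed here
  }

  private async runAgenticLoop(prompt: string): Promise<string> {
    return `agentic result for: ${prompt}`; // orchestrator, stubbed here
  }
}
```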

Function-level metrics for llama_decode show throughput of 69 ns in both versions with response time of 44,722,748 ns vs 44,722,492 ns (256 ns difference, 0.0006% change). The llama_tokenize function maintains 22 ns throughput with response time of 898,714 ns vs 898,716 ns (2 ns difference). These sub-microsecond variations are within measurement noise and indicate no functional changes to the inference pipeline.
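
For reference, the quoted percentage is consistent with the raw numbers:

```ts
// Sanity check of the quoted llama_decode delta (numbers from the report above).
const before = 44_722_748; // ns
const after  = 44_722_492; // ns
const deltaNs  = before - after;            // 256 ns
const deltaPct = (deltaNs / before) * 100;  // ≈ 0.00057 %, i.e. the quoted ~0.0006 %
console.log(deltaNs, deltaPct.toFixed(5));
```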


Tokens per Second Impact: None. No inference functions modified.

Power Consumption: All binaries show 0.0% change (libllama.so: 228,744 nJ both versions).

Conclusion: This PR adds optional frontend functionality with zero performance impact on core inference operations.

@loci-dev force-pushed the main branch 26 times, most recently from eec18ea to 7475023 on November 29, 2025 at 16:09
@loci-dev force-pushed the main branch 30 times, most recently from 6b83243 to fa01de0 on December 23, 2025 at 21:08