-
Notifications
You must be signed in to change notification settings - Fork 5
[Mirror] server: (preset) add unsafe-allow-api-override
#68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
WalkthroughThe changes introduce parameter override support for model loading via the server API, controlled by a preset-based whitelist. This includes new preset infrastructure for map-based loading, updated server model loading with override validation, and documentation for the new /models/load overrides field. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Server as Server<br/>/models/load
participant PresetCtx as Preset<br/>Context
participant ModelLoader as Model<br/>Loader
Client->>Server: POST /models/load<br/>(name, overrides)
Server->>PresetCtx: Load preset &<br/>whitelist
rect rgb(200, 220, 255)
Note over Server,PresetCtx: Validate overrides
Server->>PresetCtx: Check each override key<br/>against whitelist
PresetCtx-->>Server: Valid/Invalid result
end
alt Validation passes
Server->>ModelLoader: Apply overrides to<br/>preset copy
ModelLoader->>ModelLoader: Render model args
ModelLoader-->>Server: Return arg vector
Server-->>Client: 200 OK + args
else Validation fails
Server-->>Client: 400 Bad Request<br/>(invalid override)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
tools/server/server-models.cpp (1)
21-21: Incorrect header included.
#include <set>is added but the code usesstd::unordered_setat line 412. Either change to#include <unordered_set>or switch to usingstd::setfor theallowed_keyscontainer.🔎 Suggested fix
-#include <set> +#include <unordered_set>tools/server/README.md (1)
1575-1575: Clarify field description wording.Line 1575 reads: "list of preset parameter override (an object mapping string to string)". The opening phrase "list of" conflicts with the object-based structure. Consider: "preset parameter overrides (an object mapping string to string)".
- `overrides`: list of preset parameter override (an object mapping string to string). Parameters must be whitelisted via the `unsafe-allow-api-override` preset parameter. + `overrides`: preset parameter overrides (an object mapping string to string). Parameters must be whitelisted via the `unsafe-allow-api-override` preset parameter.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
common/arg.cppcommon/arg.hcommon/preset.cppcommon/preset.htools/server/README.mdtools/server/server-models.cpptools/server/server-models.h
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{cpp,h,hpp}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{cpp,h,hpp}: Always format C++ code usinggit clang-formatbefore committing, following .clang-format configuration (4-space indentation, 120 column limit, braces on same line for functions, pointer/reference alignment in middle)
Minimize external dependencies; avoid adding new external dependencies unless absolutely necessary
Files:
common/arg.hcommon/preset.hcommon/arg.cppcommon/preset.cpptools/server/server-models.htools/server/server-models.cpp
**/*.{cpp,h,hpp,py}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Ensure cross-platform compatibility by testing code changes on Linux, macOS, and Windows when possible
Files:
common/arg.hcommon/preset.hcommon/arg.cppcommon/preset.cpptools/server/server-models.htools/server/server-models.cpp
🧠 Learnings (1)
📓 Common learnings
Learnt from: ngxson
Repo: ngxson/llama.cpp PR: 0
File: :0-0
Timestamp: 2025-12-22T23:32:42.587Z
Learning: For mirrored PRs (those with [Mirror] in title or mirroring upstream), ngxson wants AI-assisted code review with these preferences: check the upstream PR URL for description, be nit-picky about obvious mistakes, don't tag contributors directly (inform ngxson instead), and alert about suspicious AI-generated code comments like "CRITICAL: ..." or "You are correct here...".
Learnt from: ngxson
Repo: ngxson/llama.cpp PR: 14
File: examples/lookahead/lookahead.cpp:97-98
Timestamp: 2025-03-13T22:56:47.417Z
Learning: ngxson prefers to prioritize simplification in some cases and defer adding error handling to a later time, as indicated by their response to a suggestion about adding error checks for llama_decode_ext calls.
🪛 Cppcheck (2.19.0)
tools/server/server-models.cpp
[error] 424-424: failed to evaluate #if condition, undefined function-like macro invocation
(syntaxError)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: build-cmake-pkg / linux
- GitHub Check: ubuntu-22-cmake-sycl
- GitHub Check: ggml-ci-x64-cpu-low-perf
- GitHub Check: ubuntu-22-cmake-sycl-fp16
- GitHub Check: ubuntu-22-cmake-hip
- GitHub Check: android-ndk-build (arm64-snapdragon, --preset arm64-android-snapdragon-release)
- GitHub Check: ggml-ci-x64-cpu-high-perf
- GitHub Check: android-ndk-build (arm64-cpu, -D ANDROID_ABI=arm64-v8a -D ANDROID_PLATFORM=android-31 -D CMAKE_TO...
- GitHub Check: ubuntu-22-cmake-musa
- GitHub Check: android-build
- GitHub Check: ubuntu-latest-cmake-cuda
- GitHub Check: macOS-latest-cmake-x64
- GitHub Check: ubuntu-cpu-cmake (x64, ubuntu-22.04)
- GitHub Check: ubuntu-latest-cmake-sanitizer (THREAD, Debug)
- GitHub Check: ubuntu-latest-cmake-sanitizer (UNDEFINED, Debug)
- GitHub Check: ubuntu-latest-cmake-sanitizer (ADDRESS, Debug)
- GitHub Check: ubuntu-latest-cmake-rpc
- GitHub Check: ubuntu-latest-llguidance
- GitHub Check: macOS-latest-cmake-arm64-webgpu
🔇 Additional comments (12)
common/arg.h (1)
12-13: LGTM!The new macro follows the existing naming convention and the alignment adjustment improves readability.
common/preset.h (1)
68-70: LGTM!The new
load_from_mapmethod declaration is well-documented and follows the existing patterns for preset loaders in this struct.common/preset.cpp (2)
239-259: LGTM!The
load_from_mapimplementation properly:
- Reuses existing helpers (
is_bool_arg,parse_bool_arg) for consistent boolean handling- Gracefully handles unknown options with a warning rather than failing
- Provides debug logging for traceability
265-273: LGTM!Good refactoring -
load_from_ininow delegates toload_from_mapfor each section, reducing code duplication and ensuring consistent option parsing behavior across different preset loading paths.tools/server/server-models.h (1)
117-118: LGTM!The updated signature appropriately:
- Returns the argument list for debugging/inspection purposes
- Accepts
override_paramsto support the new API override featurecommon/arg.cpp (1)
3521-3526: LGTM!The new
unsafe-allow-api-overridepreset option is well-designed:
- The "unsafe" prefix clearly signals security implications to users
- Properly marked as
preset_onlyto prevent CLI exposure- Follows the established pattern from the
load-on-startuppreset abovetools/server/server-models.cpp (4)
408-434: Override validation logic is secure and correctly implemented.The implementation properly:
- Defaults to an empty whitelist (secure by default - no overrides allowed unless explicitly configured)
- Validates each override key against the whitelist before applying
- Restores the original preset after rendering args, preventing permanent preset modifications
- Throws
std::invalid_argumentwith a clear error message for unauthorized overridesThe restore-after-render pattern at line 433 is important for maintaining preset integrity across multiple load calls with different overrides.
776-788: LGTM!Good input validation - ensuring all override values are strings before processing. The error response for non-string values prevents type confusion issues.
248-248: LGTM!Correctly updated to pass empty map
{}for the newoverride_paramsparameter.
598-598: LGTM!Correctly updated to pass empty map for
override_paramswhen ensuring a model is loaded.tools/server/README.md (2)
1489-1489: Verify against upstream PR for completeness.Documentation of
unsafe-allow-api-overridelooks solid with prominent security warning. However, per the learnings, confirm that the upstream PR (PR #18322) contains the same level of detail or if additional context about validation/error handling should be documented here (e.g., what happens if an override is requested but not whitelisted).
1577-1593: Confirm response format documentation.The
/models/loadresponse example shows only{"success": true}. The AI summary mentions the implementation "returns the computed args". Verify whether the response should document returned argument values or if that's only returned by other endpoints.
Mirror from upstream PR: ggml-org#18322
Note: @coderabbitai use my 'Mirror PR' preset for reviewing this.
Summary by CodeRabbit
Release Notes
New Features
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.