-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Shared: Prefer source/sink models with manual provenance over generated #21020
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shared: Prefer source/sink models with manual provenance over generated #21020
Conversation
4d27a73 to
e393034
Compare
e393034 to
0b00589
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR addresses duplicate flow paths in dataflow analysis by prioritizing manual source/sink models over auto-generated models, and filtering away generated flow summaries when a static call target exists in source code for Rust.
Key Changes:
- Introduced filtering logic to prefer manual models over generated models for sources and sinks
- Added similar filtering for Rust flow summaries to match existing C# and Java behavior
- Corrected a minor comment formatting issue in Java
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| shared/dataflow/codeql/dataflow/internal/FlowSummaryImpl.qll | Added isRelevantSource and isRelevantSink predicates to filter generated models when manual models exist for the same kind |
| rust/ql/lib/codeql/rust/dataflow/internal/DataFlowImpl.qll | Added logic to exclude generated flow summaries when static call target is in source code, aligning with Java/C# behavior |
| java/ql/lib/semmle/code/java/dataflow/internal/DataFlowDispatch.qll | Added backticks around applyGeneratedModel in comment for consistency |
| rust/ql/test/query-tests/security/CWE-825/AccessInvalidPointer.expected | Removed duplicate flow path results as expected |
| rust/ql/test/query-tests/security/CWE-312/CleartextLogging.expected | Removed duplicate flow path results and renumbered model references |
| rust/ql/test/library-tests/dataflow/sources/file/TaintSources.expected | Removed duplicate source detections |
| rust/ql/test/library-tests/dataflow/sources/file/InlineFlow.expected | Removed duplicate flow edges and model references, renumbered remaining ones |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
paldepind
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Nice to get this right.
I noticed on https://github.com/github/codeql/pull/20879/changes#diff-2dc9c76255b29ca51e9fe9d8861fe51534f0d72c032a32dbb2f554cbb7ffdf55 that we got some duplicate flow paths, which happens because
std::fs::readhas both a manual model and an auto-generated model. The fix is to always prioritize manual source/sink models over generated models, just like we do for flow summaries.This PR also filters away generated flow summaries when there is a static target in source code for Rust, similar to C# and Java.