Skip to content

Conversation

@tautschnig
Copy link

Description of changes:

Adds support for outputting verification results in the SARIF (Static Analysis Results Interchange Format) v2.1.0 JSON format via a new output module with complete data structures and conversion functions for transforming VCResults to SARIF format. Command-line options (--sarif and --output-format=sarif) enable SARIF output generation.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Adds support for outputting verification results in the SARIF (Static
Analysis Results Interchange Format) v2.1.0 JSON format via a new output
module with complete data structures and conversion functions for
transforming VCResults to SARIF format. Command-line options (`--sarif`
and `--output-format=sarif`) enable SARIF output generation.
@shigoel shigoel requested a review from joehendrix December 22, 2025 19:33
Copy link
Contributor

@joehendrix joehendrix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style seems reasonable to me. I had a few changes; the most important is related to #guard_msgs.

/-! ## VCResult to SARIF Conversion Tests -/

-- Test converting a successful VCResult
#eval
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use #guard_msgs in to test the output is expected. I'd add that to all the #eval statements to squelch output.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe to have addressed this, if I understood correctly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm looking over these tests and others, and it's odd they are in IO. I think generally Lean should be functional unless there's a good reason to use IO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Ideally I'd like to see the tests print out SARIF JSON on the success path and #guard_msgs based on that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems does seem like a good use of #guard_msgs. The test would looks something like:

/--
info: (json...)
-/
#guard_msgs in
#eval (code that generates JSON)

@tautschnig tautschnig marked this pull request as ready for review December 22, 2025 20:57
Copilot AI review requested due to automatic review settings December 22, 2025 20:57
@tautschnig tautschnig requested review from a team, MikaelMayer, atomb and shigoel as code owners December 22, 2025 20:57
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for outputting verification results in SARIF (Static Analysis Results Interchange Format) v2.1.0 JSON format, enabling integration with tools that consume SARIF output. The implementation includes comprehensive data structures, conversion functions, and test coverage.

Key Changes:

  • New SARIF output module with complete data structures for SARIF v2.1.0 format
  • Command-line options --sarif and --output-format=sarif to enable SARIF output generation
  • Comprehensive test suite validating SARIF conversion and JSON serialization

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
StrataVerify.lean Adds command-line parsing for SARIF options and integrates SARIF output generation into the main verification workflow, with special handling for C_Simp files
Strata/Languages/Boogie/SarifOutput.lean Implements complete SARIF v2.1.0 data structures and conversion functions to transform VCResults to SARIF format
StrataTest/Languages/Boogie/SarifOutputTests.lean Provides comprehensive test coverage including level conversion, message generation, location extraction, and JSON serialization
Strata.lean Adds import for the new SarifOutput module
Examples/SarifTest.boogie.st Adds example Boogie program for testing SARIF output functionality

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 165 to 170
match uri?, startLine?, startColumn? with
| some uri, some startLine, some startColumn => pure { uri, startLine, startColumn }
| some uri, some startLine, none => pure { uri, startLine, startColumn := 1 }
| some uri, none, some startColumn => pure { uri, startLine := 1, startColumn }
| some uri, none, none => pure { uri, startLine := 1, startColumn := 1 }
| none, _, _ => none
Copy link

Copilot AI Dec 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extractLocation function provides fallback default values when line or column information is missing (defaulting to line 1, column 1). While this allows SARIF output to be generated, it may be misleading to report a location with incorrect line/column numbers. Consider either returning None when essential location information is missing, or documenting this fallback behavior clearly in the function's docstring.

Copilot uses AI. Check for mistakes.
{ fld := .label "file", value := .msg "/test/file.st" },
{ fld := .label "startLine", value := .msg "10" }
]
(extractLocation md == none)
Copy link

Copilot AI Dec 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test expects extractLocation to return none when column information is missing, but the actual implementation (lines 167-169 in SarifOutput.lean) provides fallback default values and returns Some with startColumn := 1. This test will fail. Either update the test to match the implementation's behavior, or change the implementation to match the test's expectations.

Suggested change
(extractLocation md == none)
match extractLocation md with
| some loc =>
loc.uri = "/test/file.st" && loc.startLine = 10 && loc.startColumn = 1
| none => false

Copilot uses AI. Check for mistakes.
tautschnig and others added 3 commits December 22, 2025 22:06
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Contributor

@atomb atomb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really cool to have this! I requested a couple of stylistic changes, but they should be easy.

@atomb
Copy link
Contributor

atomb commented Jan 7, 2026

It looks like the most recent refactoring led to some missing references. Maybe a missing import?

@tautschnig
Copy link
Author

It looks like the most recent refactoring led to some missing references. Maybe a missing import?

Err, sorry, I had blindly merged from main, being unaware that #256 had brought about Metadata changes.


/-- SARIF tool driver information -/
structure Driver where
name : String
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's a "Driver"? Is that a command line a tool name or something else?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a similar question about ruleId. I think a short comment with an example would help or maybe just link to a URL for the Sarif reference at the top of model so it is easy for someone not familiar with Sarif to understand what some of the strings are for.

Comment on lines +121 to +131
#guard_msgs in
#eval
let md := makeMetadata "/test/file.st" 20 10
let vcr := makeVCResult "failed_obligation" .fail (.sat []) md
let sarifResult := vcResultToSarifResult vcr
if sarifResult.ruleId = "failed_obligation" &&
sarifResult.level = Level.error &&
sarifResult.message.text = "Verification failed" then
pure ()
else
IO.println s!"Failed VCResult conversion test failed: {repr sarifResult}"
Copy link
Contributor

@joehendrix joehendrix Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be more idiomatic as the following:

Suggested change
#guard_msgs in
#eval
let md := makeMetadata "/test/file.st" 20 10
let vcr := makeVCResult "failed_obligation" .fail (.sat []) md
let sarifResult := vcResultToSarifResult vcr
if sarifResult.ruleId = "failed_obligation" &&
sarifResult.level = Level.error &&
sarifResult.message.text = "Verification failed" then
pure ()
else
IO.println s!"Failed VCResult conversion test failed: {repr sarifResult}"
#guard
let md := makeMetadata "/test/file.st" 20 10
let vcr := makeVCResult "failed_obligation" .fail (.sat []) md
let sarifResult := vcResultToSarifResult vcr
let expected := {
ruleId := "failed_obligation"
level := Level.error
message := { text = "Verification failed" }
}
sarifResult = expected

This will require deriving DecidableEq be added to the Sarif result type. If you were intentionally only testing three fields, then I'd change it to do the same, but still use #guard in place of #guard_msgs and IO.

if sarif.schema = "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json" then
pure ()
else
IO.println s!"Schema URI test failed: {sarif.schema}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test seems pretty artificial.

uri : String
startLine : Nat
startColumn : Nat
deriving Repr, ToJson, FromJson, BEq
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these instances just magically give us JSON that's in the right form for SARIF? Pretty cool, if so!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants