Skip to content

tools: add helper tool about verify backports#3900

Open
ffromani wants to merge 1 commit into
mainfrom
tool-verify-backport
Open

tools: add helper tool about verify backports#3900
ffromani wants to merge 1 commit into
mainfrom
tool-verify-backport

Conversation

@ffromani

@ffromani ffromani commented May 4, 2026

Copy link
Copy Markdown
Member

Add an (AI-assisted) tool to help users verify a backport.
The tool scans all the commits in a backport branch,
summarize the changes, flags the language-agnostic mechanical changes
(e.g. file rename) and code movements, and helps prioritize
the review effort.

The tool intentionally builds on git tooling, acting
an orchestrator and custom-built porcelain layer.

The tool includes a LLM-optimized JSON output.
This can help LLM-driven review in the folling ways:

  1. Reducing orchestration (single command vs multi-step git pipelines)
  2. Returning pre-correlated commit relationships
  3. Providing stable machine-parseable output

Token savings are possible but scenario-dependent and are not guaranteed
for every run; benchmarking and measuring over time and multiple usages
are the only way to know for sure.

See the included README.md for the full details.

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 4, 2026
@openshift-ci openshift-ci Bot requested review from shajmakh and swatisehgal May 4, 2026 14:10
@openshift-ci

openshift-ci Bot commented May 4, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ffromani

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 4, 2026
@ffromani

ffromani commented May 4, 2026

Copy link
Copy Markdown
Member Author

/hold

self-review pending, reviewers please ignore till WIP is removed

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 4, 2026
@coderabbitai

coderabbitai Bot commented May 4, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@ffromani has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 47 minutes and 39 seconds before requesting another review.

To continue reviewing without waiting, purchase usage credits in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 87d13f3a-a021-467e-9cfa-9490d0bc7e00

📥 Commits

Reviewing files that changed from the base of the PR and between 582cd31 and ee0f822.

📒 Files selected for processing (13)
  • tools/verify-backport/README.md
  • tools/verify-backport/testdata/json_all_matched.golden.json
  • tools/verify-backport/testdata/json_githistory_matched.golden.json
  • tools/verify-backport/testdata/json_githistory_unmatched_closest.golden.json
  • tools/verify-backport/testdata/json_mismatch_message.golden.json
  • tools/verify-backport/testdata/json_missing_backports.golden.json
  • tools/verify-backport/testdata/json_missing_not_reported.golden.json
  • tools/verify-backport/testdata/json_mixed_statuses.golden.json
  • tools/verify-backport/testdata/json_unmatched_no_closest.golden.json
  • tools/verify-backport/verify-backport.go
  • tools/verify-backport/verify-backport_githistory_test.go
  • tools/verify-backport/verify-backport_json_test.go
  • tools/verify-backport/verify-backport_test.go
📝 Walkthrough

Walkthrough

Adds a new command-line tool verify-backport that locates and matches backport commits to main-branch originals using git metadata and patch-content comparison; includes JSON and human-readable reporting, extensive tests, golden test data, and a README.

Changes

Verify-Backport Tool

Layer / File(s) Summary
Data Models & Types
tools/verify-backport/verify-backport.go
Introduces MatchStatus, CommitInfo, MatchResult, ReportOptions, and Options types and simple helpers (ShortHash, Subject, status stringers).
CLI Setup & Entrypoint
tools/verify-backport/verify-backport.go
Adds main, flag parsing, run orchestration, option validation, and color initialization.
Option Validation
tools/verify-backport/verify-backport.go
Implements validateOptions enforcing mutually exclusive modes and required flags.
Git Metadata Retrieval
tools/verify-backport/verify-backport.go
Adds getCommitsFromRange, getHashes, getPatchIDs, and getMessages to assemble CommitInfo lists from git ranges.
Matching Logic
tools/verify-backport/verify-backport.go
Implements matchCommits to correlate backports to main via patch-id and message heuristics, producing MatchResult entries.
Patch Parsing & Comparison
tools/verify-backport/verify-backport.go
Adds getCommitPatch, parsePatchFiles, comparePatchContent, diffLinesSets, extractAddRemove, and related structs (filePatch, fileResult, patchComparison) to classify identical/moved/additive/edit changes.
Reporting & Output
tools/verify-backport/verify-backport.go
Adds human-readable and JSON reporting (printReport, printJSONReport, printPatchComparison, color helpers, and pretty-print utilities) and exit-code logic for unmatched/missing backports.
Tests — Unit & Integration
tools/verify-backport/verify-backport_test.go, tools/verify-backport/verify-backport_githistory_test.go, tools/verify-backport/verify-backport_json_test.go
Extensive unit tests for parsing, diffing, matching, utilities, JSON report generation, and git-history integration tests guarded by git availability.
Testdata / Golden Files
tools/verify-backport/testdata/*.golden.json
Adds multiple golden JSON files covering matched, unmatched, mismatch-message, missing-backports, mixed statuses, and git-history scenarios.
Documentation
tools/verify-backport/README.md
New README documenting problem, usage (branch/range modes), flags, output formats, classification, exit codes, build, and requirements.

Sequence Diagram(s)

sequenceDiagram
    participant User as User
    participant CLI as verify-backport
    participant Git as Git
    participant FS as FileSystem/Output

    User->>CLI: invoke with flags (branch/range, --json, --verbose)
    CLI->>Git: fetch commit hashes, patch-ids, messages for ranges
    Git-->>CLI: return commit metadata and patch diffs
    CLI->>CLI: parse patches, compute patch comparisons, match commits
    CLI->>FS: write JSON or print human-readable report (stdout)
    CLI-->>User: exit code (0/1) based on missing/unmatched
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'tools: add helper tool about verify backports' is somewhat vague and generic, using the phrase 'helper tool about' which lacks specificity about the tool's actual purpose or functionality. Consider revising to be more specific about the tool's purpose, such as 'tools: add verify-backport tool for backport validation' or similar.
✅ Passed checks (3 passed)
Check name Status Explanation
Description check ✅ Passed The description is well-related to the changeset, explaining the purpose of the tool, its capabilities (scanning commits, flagging changes, JSON output), and its intended use for LLM-driven review.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch tool-verify-backport

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ffromani ffromani force-pushed the tool-verify-backport branch from 3885a6d to 93ff134 Compare May 6, 2026 12:42
@ffromani ffromani changed the title WIP: tools: add helper tool about verify backports tools: add helper tool about verify backports May 6, 2026
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 6, 2026
@ffromani

ffromani commented May 6, 2026

Copy link
Copy Markdown
Member Author

example run:

$ ./bin/verify-backport -color=false --backport-branch ocpbug-84226-4.18 --base-branch release-4.18
UNMATCHED:         03451dc fixup: adapt cherry-picked commits for release-4.18
UNMATCHED:         607e248 tests: add unit and e2e tests for MachineConfigsState  (~>  bec6f8f)  [1 identical, 2 edits]
  identical (1):
    internal/controller/numaresourcesoperator_controller_test.go -> controllers/numaresourcesoperator_controller_test.go (renamed)
  edits (2):
    pkg/objectstate/rte/machineconfigpool_test.go (+3 only in backport, +3 only in main)
    test/e2e/serial/tests/configuration.go (+7 only in backport, +3 only in main)
UNMATCHED:         318dbcf status: add MachineConfigPoolPaused condition for observability  (~>  58bb44c)  [2 identical, 1 additive, 1 edits]
  identical (2):
    internal/controller/numaresourcesoperator_controller_test.go -> controllers/numaresourcesoperator_controller_test.go (renamed)
    pkg/status/status_test.go
  additive (1):
    pkg/status/status.go (+1 only in backport)
  edits (1):
    internal/controller/numaresourcesoperator_controller.go -> controllers/numaresourcesoperator_controller.go (+2 only in backport, +5 only in main)
UNMATCHED:         68a1dc0 objectstate/rte: skip paused MCPs during MachineConfig sync  (~>  b668e87)  [1 identical, 1 moved, 1 edits]
  identical (1):
    internal/controller/numaresourcesoperator_controller_test.go -> controllers/numaresourcesoperator_controller_test.go (renamed)
  moved (1):
    pkg/objectstate/rte/machineconfigpool.go -> pkg/objectstate/rte/rte.go (90% overlap)
  edits (1):
    internal/controller/numaresourcesoperator_controller.go -> controllers/numaresourcesoperator_controller.go (+1 only in backport, +2 only in main)
UNMATCHED:         2a3690a objectstate/rte: per-pool MachineConfig wait function  (~>  143e53c)  [2 identical, 1 moved]
  identical (2):
    internal/controller/numaresourcesoperator_controller.go -> controllers/numaresourcesoperator_controller.go (renamed)
    internal/controller/numaresourcesoperator_controller_test.go -> controllers/numaresourcesoperator_controller_test.go (renamed)
  moved (1):
    pkg/objectstate/rte/machineconfigpool.go -> pkg/objectstate/rte/rte.go (77% overlap)

summary: 0/5 backport commits matched

@ffromani ffromani force-pushed the tool-verify-backport branch from 93ff134 to 22618a5 Compare May 6, 2026 12:58
@ffromani

ffromani commented May 6, 2026

Copy link
Copy Markdown
Member Author

/hold cancel

@openshift-ci openshift-ci Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 6, 2026
@ffromani ffromani force-pushed the tool-verify-backport branch from 22618a5 to 46be65f Compare May 6, 2026 13:07

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
tools/verify-backport/verify-backport.go (1)

864-892: 💤 Low value

parsePatchFiles mishandles a few real-world diff inputs.

Two issues worth tightening before this is depended on:

  1. Filenames with spaces. strings.Fields(lines[0]) treats diff --git a/path with space b/path with space as five tokens; parts[1] becomes path (not the full b-path). Git quotes such names ("a/path\twith"), so consider parsing the header explicitly or stripping the a/ / b/ halves.
  2. Content lines starting with +++ / ---. strings.HasPrefix(line, "+++")/"---" was meant to skip the file-header markers, but it also drops legitimate added/removed content whose first chars are ++ or -- (think config files, banners, table separators). The header lines always come immediately after diff --git / index …; either match them strictly ("+++ " / "--- ") or track whether you have already seen the header pair for the current chunk.

Both gaps will silently undercount changes and skew the additive/edits classification.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/verify-backport/verify-backport.go` around lines 864 - 892, The
parsePatchFiles function misparses file paths with spaces and incorrectly drops
content lines that start with "+++"/"---"; update parsePatchFiles to (1) parse
the diff header line more robustly instead of using strings.Fields(lines[0]) —
locate the a/ and b/ path tokens (or use a regex) and extract the full b-path
into fp.path, stripping an optional leading "b/" and any surrounding
quotes/escaping so filenames with spaces are preserved (refer to lines[0] and
fp.path), and (2) only treat header markers as file headers when they are the
exact header form (e.g. start with "+++ " or "--- " including the trailing
space) or by tracking that the header pair has been consumed for the current
chunk; this ensures lines beginning with "+++" or "---" that are real content
still get recorded into addedLines/removedLines (refer to addedLines and
removedLines).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tools/verify-backport/README.md`:
- Around line 15-17: Update the README wording to avoid conflating whitespace
normalization with the stability behavior of git patch-id; specifically, change
the sentence about "`git patch-id --stable` produces the same id regardless of
whitespace differences" to state that whitespace is normalized by default in all
patch-id modes (except `--verbatim`) and that `--stable`'s primary purpose is to
ensure hash stability across file/diff reordering, and also note that there is
no separate `--ignore-whitespace` flag — whitespace handling is determined by
the mode (`--stable`, `--unstable`, `--verbatim`).

In `@tools/verify-backport/verify-backport.go`:
- Around line 240-244: The git invocations in getHashes, getMessages, and
getCommitPatch pass user-supplied revisions (gitRange and hash) directly to
exec.Command which allows values beginning with "-" to be interpreted as flags;
modify each exec.Command call that builds git log / git diff-tree to insert a
"--" separator immediately after the revision argument (e.g., pass "git", "log",
"--no-merges", "--format=%H", gitRange, "--") so git treats the value as a
revision, and add a simple validation to reject empty revisions or
whitespace-only values before invoking the command.
- Around line 355-423: The text report's matched metric (in printReport using
matchCount) only counts StatusMatch, but the JSON report's Summary.Matched and
the exit-code logic treat StatusMismatchMessage as matched — make the definition
consistent by changing the JSON/path that populates Summary.Matched
(printJSONReport or the code that builds the Summary) to count only results
where r.Status == StatusMatch (same as printReport's matchCount), and ensure any
exit decision or Summary construction (the code that sets Summary.Matched) uses
that same predicate so StatusMismatchMessage is not treated as matched.
- Around line 256-278: getPatchIDs currently builds a shell pipeline string and
calls exec.Command("sh", "-c", pipeline), which interpolates gitRange into a
shell command (unsafe and brittle); change it to run the two git commands
directly and wire their stdio with an io.Pipe (or use cmd.StdoutPipe +
cmd2.Stdin); specifically, replace the sh -c invocation with
exec.Command("git","log","-p","--no-merges", gitRange) and
exec.Command("git","patch-id","--stable"), connect the first command's Stdout to
the second command's Stdin, start both processes, wait for completion, capture
output/errors (use buffers for Stderr on each command and return a wrapped error
including stderr if either fails), and keep the rest of getPatchIDs logic
(parsing patch-id output) unchanged.

---

Nitpick comments:
In `@tools/verify-backport/verify-backport.go`:
- Around line 864-892: The parsePatchFiles function misparses file paths with
spaces and incorrectly drops content lines that start with "+++"/"---"; update
parsePatchFiles to (1) parse the diff header line more robustly instead of using
strings.Fields(lines[0]) — locate the a/ and b/ path tokens (or use a regex) and
extract the full b-path into fp.path, stripping an optional leading "b/" and any
surrounding quotes/escaping so filenames with spaces are preserved (refer to
lines[0] and fp.path), and (2) only treat header markers as file headers when
they are the exact header form (e.g. start with "+++ " or "--- " including the
trailing space) or by tracking that the header pair has been consumed for the
current chunk; this ensures lines beginning with "+++" or "---" that are real
content still get recorded into addedLines/removedLines (refer to addedLines and
removedLines).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: a6abdd72-2b7b-4d08-afe6-523afdb6ae29

📥 Commits

Reviewing files that changed from the base of the PR and between f8b1a65 and 22618a5.

📒 Files selected for processing (2)
  • tools/verify-backport/README.md
  • tools/verify-backport/verify-backport.go

Comment thread tools/verify-backport/README.md Outdated
Comment thread tools/verify-backport/verify-backport.go Outdated
Comment thread tools/verify-backport/verify-backport.go
Comment thread tools/verify-backport/verify-backport.go
@ffromani ffromani force-pushed the tool-verify-backport branch from 46be65f to 63f2d3e Compare May 6, 2026 13:11
@ffromani

ffromani commented May 6, 2026

Copy link
Copy Markdown
Member Author

/hold

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 6, 2026
@ffromani ffromani force-pushed the tool-verify-backport branch from 63f2d3e to 582cd31 Compare May 6, 2026 13:59

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (2)
tools/verify-backport/verify-backport.go (2)

360-364: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Align “matched” semantics across text summary, JSON summary, and exit code.

Text output and exit behavior treat MISMATCH-MESSAGE as not matched, but JSON currently counts it as matched. This can mislead automation consuming --json.

Proposed fix
 	switch r.Status {
-	case StatusMatch, StatusMismatchMessage:
+	case StatusMatch:
 		report.Summary.Matched++
 		jr.Main = &jsonCommit{Hash: r.MainMatch.Hash, Subject: r.MainMatch.Subject()}
+	case StatusMismatchMessage:
+		jr.Main = &jsonCommit{Hash: r.MainMatch.Hash, Subject: r.MainMatch.Subject()}
 	case StatusUnmatched:

Also applies to: 829-831

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/verify-backport/verify-backport.go` around lines 360 - 364, The JSON
summary currently treats MISMATCH-MESSAGE as a match while the text summary and
exit code do not; update the counting logic in the loop that iterates over
results (the matchCount increment inside the for _, r := range results switch on
r.Status) to only increment for the explicit StatusMatch value and not for
MISMATCH-MESSAGE or any other non-matching status, and apply the same change to
the analogous counting block referenced around the second occurrence (the other
loop handling results at the later location) so the JSON summary, text output,
and exit code share identical "matched" semantics.

239-240: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Harden git invocation paths: remove shell execution and terminate revision args safely.

gitRange is interpolated into sh -c (Line 256-259), which is command-injection prone; and revision/hash args are passed without --, so flag-like input can be parsed as git options (Line 240, Line 281, Line 864).

Proposed fix
 import (
+	"bytes"
 	"encoding/json"
 	"flag"
 	"fmt"
 	"os"
 	"os/exec"
@@
 	"golang.org/x/term"
 )
@@
+func validateRevisionArg(name, v string) error {
+	v = strings.TrimSpace(v)
+	if v == "" {
+		return fmt.Errorf("%s cannot be empty", name)
+	}
+	if strings.HasPrefix(v, "-") {
+		return fmt.Errorf("%s cannot start with '-': %q", name, v)
+	}
+	return nil
+}
+
 func getHashes(gitRange string) ([]string, error) {
-	cmd := exec.Command("git", "log", "--no-merges", "--format=%H", gitRange)
+	if err := validateRevisionArg("git range", gitRange); err != nil {
+		return nil, err
+	}
+	cmd := exec.Command("git", "log", "--no-merges", "--format=%H", gitRange, "--")
@@
 func getPatchIDs(gitRange string) (map[string]string, error) {
-	pipeline := fmt.Sprintf("git log -p --no-merges %s | git patch-id --stable", gitRange)
-	cmd := exec.Command("sh", "-c", pipeline)
-	out, err := cmd.Output()
-	if err != nil {
-		return nil, fmt.Errorf("patch-id pipeline: %w: %s", err, cmdStderr(err))
-	}
+	if err := validateRevisionArg("git range", gitRange); err != nil {
+		return nil, err
+	}
+	logCmd := exec.Command("git", "log", "-p", "--no-merges", gitRange, "--")
+	idCmd := exec.Command("git", "patch-id", "--stable")
+	pipe, err := logCmd.StdoutPipe()
+	if err != nil {
+		return nil, fmt.Errorf("stdout pipe: %w", err)
+	}
+	idCmd.Stdin = pipe
+	var out bytes.Buffer
+	var idErr bytes.Buffer
+	var logErr bytes.Buffer
+	idCmd.Stdout = &out
+	idCmd.Stderr = &idErr
+	logCmd.Stderr = &logErr
+	if err := logCmd.Start(); err != nil {
+		return nil, fmt.Errorf("git log: %w", err)
+	}
+	if err := idCmd.Run(); err != nil {
+		return nil, fmt.Errorf("git patch-id: %w: %s", err, strings.TrimSpace(idErr.String()))
+	}
+	if err := logCmd.Wait(); err != nil {
+		return nil, fmt.Errorf("git log: %w: %s", err, strings.TrimSpace(logErr.String()))
+	}
@@
 func getMessages(gitRange string) (map[string]string, error) {
-	cmd := exec.Command("git", "log", "--no-merges", "--format=%H%x00%B%x00", gitRange)
+	if err := validateRevisionArg("git range", gitRange); err != nil {
+		return nil, err
+	}
+	cmd := exec.Command("git", "log", "--no-merges", "--format=%H%x00%B%x00", gitRange, "--")
@@
 func getCommitPatch(hash string) (string, error) {
-	cmd := exec.Command("git", "diff-tree", "-p", hash)
+	if err := validateRevisionArg("commit hash", hash); err != nil {
+		return "", err
+	}
+	cmd := exec.Command("git", "diff-tree", "-p", hash, "--")
#!/bin/bash
set -euo pipefail

echo "1) Shell-based git command execution:"
rg -n 'exec\.Command\("sh",\s*"-c"' --type go

echo
echo "2) git command callsites using revision/hash args:"
rg -n 'exec\.Command\("git",\s*"(log|diff-tree)".*(gitRange|hash)' --type go

echo
echo "3) Current implementations around affected ranges:"
fd -i 'verify-backport.go' --exec sed -n '235,290p' {}
fd -i 'verify-backport.go' --exec sed -n '860,870p' {}

Also applies to: 256-259, 281-282, 863-865

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/verify-backport/verify-backport.go` around lines 239 - 240, The git
invocation is using shell execution and passing revision/hash args without a
"--", which allows command injection and lets inputs be interpreted as git
options; update getHashes and other callsites that use exec.Command("sh", "-c",
...) and exec.Command("git", ...) to call git directly with distinct args and
terminate revisions with "--". Concretely, stop using exec.Command("sh","-c",
...) and instead call exec.Command("git", "log", "--no-merges", "--format=%H",
"--", gitRange) in getHashes, and similarly add a "--" before any revision/hash
args in git diff-tree / other git invocations (e.g., where
exec.Command("git","diff-tree",..., hash) is used) so the revision is passed as
a single safe argument rather than parsed as an option.
🧹 Nitpick comments (1)
tools/verify-backport/verify-backport_test.go (1)

491-527: ⚡ Quick win

Add a regression case for main-only edit labels.

Given main-only unmatched entries are possible, include a fileResult{mainPath: "...", bpPath: ""} case to lock expected label output and prevent filename loss regressions.

Suggested test case
 		{
 			name: "edits both sides",
 			fr:   fileResult{bpPath: "foo.go", changeType: changeEdits, onlyBP: []string{"a"}, onlyMain: []string{"b", "c"}},
 			want: "foo.go (+1 only in backport, +2 only in main)",
 		},
+		{
+			name: "edits main-only path fallback",
+			fr:   fileResult{mainPath: "main_only.go", changeType: changeEdits, onlyMain: []string{"x"}},
+			want: "main_only.go (+0 only in backport, +1 only in main)",
+		},
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/verify-backport/verify-backport_test.go` around lines 491 - 527, Add a
regression test in TestFileResultLabel to cover main-only unmatched entries: add
a case named like "main-only edit" where fileResult has mainPath set (e.g.,
"old/foo.go"), bpPath empty, changeType set to changeEdits (or changeAdditive if
appropriate), and onlyMain populated (e.g., []string{"a"}); set want to the
expected label that preserves mainPath (e.g., "old/foo.go (+1 only in main)") to
prevent filename-loss regressions—look for TestFileResultLabel and the
fileResult fields mainPath, bpPath, changeType, onlyMain to place the new test
case.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tools/verify-backport/verify-backport_githistory_test.go`:
- Around line 169-216: The tests incorrectly assume matchCommits returns matched
results then unmatched main commits; instead matchCommits returns matched
results first and unmatched backport commits as the second return. Update the
two failing subtests to assert the second return value refers to unmatched
backport commits (not unmatched main) — e.g., check that unmatched
contains/backport hash commitMakefileHash or commitCherryPickHash as
appropriate, and adjust the Status expectations (use StatusUnmatched for results
where patch-ids differ and StatusMatch where patch-ids equal) while referencing
matchCommits, results, unmatched, StatusMatch, StatusUnmatched,
commitDockerfileHash, commitMakefileHash, commitCherryPickHash and
commitOriginalHash to locate and correct the assertions.

In `@tools/verify-backport/verify-backport.go`:
- Around line 559-570: The code currently builds bpByBase and mainByBase as
map[string]filePatch which causes basename collisions to overwrite candidates;
change these to map[string][]filePatch (populate from bpFiles and mainFiles) so
all same basenames are preserved, then in the rename-detection logic
(referencing bpByBase, mainByBase and filePatch) resolve basename candidate
lists by computing and preferring the best match (e.g., exact line-set equality
or highest overlap) instead of a single-value lookup; apply the same change to
the other similar maps created at the second location (the block around the
other bpByBase/mainByBase usage).
- Around line 483-490: The report path selection currently sets path :=
fr.bpPath and only handles RENAMED by combining fr.mainPath -> fr.bpPath, which
causes main-only entries (where fr.bpPath == "") to lose the filename; update
the logic in the block using fr.bpPath/fr.mainPath/fr.relation (the variable
named path near bpCount/mainCount) to prefer a non-empty fr.bpPath but fall back
to fr.mainPath when bpPath is empty, and still format RENAMED as "main -> bp"
only when both are present; apply the same fix to the analogous code around the
second occurrence that the reviewer noted.

---

Duplicate comments:
In `@tools/verify-backport/verify-backport.go`:
- Around line 360-364: The JSON summary currently treats MISMATCH-MESSAGE as a
match while the text summary and exit code do not; update the counting logic in
the loop that iterates over results (the matchCount increment inside the for _,
r := range results switch on r.Status) to only increment for the explicit
StatusMatch value and not for MISMATCH-MESSAGE or any other non-matching status,
and apply the same change to the analogous counting block referenced around the
second occurrence (the other loop handling results at the later location) so the
JSON summary, text output, and exit code share identical "matched" semantics.
- Around line 239-240: The git invocation is using shell execution and passing
revision/hash args without a "--", which allows command injection and lets
inputs be interpreted as git options; update getHashes and other callsites that
use exec.Command("sh", "-c", ...) and exec.Command("git", ...) to call git
directly with distinct args and terminate revisions with "--". Concretely, stop
using exec.Command("sh","-c", ...) and instead call exec.Command("git", "log",
"--no-merges", "--format=%H", "--", gitRange) in getHashes, and similarly add a
"--" before any revision/hash args in git diff-tree / other git invocations
(e.g., where exec.Command("git","diff-tree",..., hash) is used) so the revision
is passed as a single safe argument rather than parsed as an option.

---

Nitpick comments:
In `@tools/verify-backport/verify-backport_test.go`:
- Around line 491-527: Add a regression test in TestFileResultLabel to cover
main-only unmatched entries: add a case named like "main-only edit" where
fileResult has mainPath set (e.g., "old/foo.go"), bpPath empty, changeType set
to changeEdits (or changeAdditive if appropriate), and onlyMain populated (e.g.,
[]string{"a"}); set want to the expected label that preserves mainPath (e.g.,
"old/foo.go (+1 only in main)") to prevent filename-loss regressions—look for
TestFileResultLabel and the fileResult fields mainPath, bpPath, changeType,
onlyMain to place the new test case.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: e3f0f476-7673-46e7-882f-708b6013d738

📥 Commits

Reviewing files that changed from the base of the PR and between 22618a5 and 582cd31.

📒 Files selected for processing (13)
  • tools/verify-backport/README.md
  • tools/verify-backport/testdata/json_all_matched.golden.json
  • tools/verify-backport/testdata/json_githistory_matched.golden.json
  • tools/verify-backport/testdata/json_githistory_unmatched_closest.golden.json
  • tools/verify-backport/testdata/json_mismatch_message.golden.json
  • tools/verify-backport/testdata/json_missing_backports.golden.json
  • tools/verify-backport/testdata/json_missing_not_reported.golden.json
  • tools/verify-backport/testdata/json_mixed_statuses.golden.json
  • tools/verify-backport/testdata/json_unmatched_no_closest.golden.json
  • tools/verify-backport/verify-backport.go
  • tools/verify-backport/verify-backport_githistory_test.go
  • tools/verify-backport/verify-backport_json_test.go
  • tools/verify-backport/verify-backport_test.go
✅ Files skipped from review due to trivial changes (2)
  • tools/verify-backport/testdata/json_mismatch_message.golden.json
  • tools/verify-backport/testdata/json_unmatched_no_closest.golden.json

Comment on lines +169 to +216
t.Run("cherry-pick with diverged patch-id is unmatched", func(t *testing.T) {
if !commitExists(commitCherryPickHash) || !commitExists(commitOriginalHash) {
t.Skip("cherry-pick or original commit not found in local history")
}

cpMsg := getCommitMessage(t, commitCherryPickHash)
origMsg := getCommitMessage(t, commitOriginalHash)

bp := []CommitInfo{{Hash: commitCherryPickHash, PatchID: commitCherryPickPatchID, Message: cpMsg}}
main := []CommitInfo{{Hash: commitOriginalHash, PatchID: commitOriginalPatchID, Message: origMsg}}

results, _ := matchCommits(bp, main)

if len(results) != 1 {
t.Fatalf("expected 1 result, got %d", len(results))
}
if results[0].Status != StatusUnmatched {
t.Errorf("status = %v, want StatusUnmatched (different patch-ids)", results[0].Status)
}
if results[0].ClosestMatch == nil {
t.Errorf("ClosestMatch should be set via subject fallback (same subject)")
}
})

t.Run("two commits with distinct patch-ids produce correct unmatched main", func(t *testing.T) {
if !commitExists(commitMakefileHash) || !commitExists(commitDockerfileHash) {
t.Skip("required commits not found")
}

makefileMsg := getCommitMessage(t, commitMakefileHash)
dockerfileMsg := getCommitMessage(t, commitDockerfileHash)

bp := []CommitInfo{
{Hash: commitMakefileHash, PatchID: commitMakefilePatchID, Message: makefileMsg},
}
main := []CommitInfo{
{Hash: commitMakefileHash, PatchID: commitMakefilePatchID, Message: makefileMsg},
{Hash: commitDockerfileHash, PatchID: commitDockerfilePatchID, Message: dockerfileMsg},
}

results, unmatched := matchCommits(bp, main)

if len(results) != 1 || results[0].Status != StatusMatch {
t.Errorf("expected 1 MATCH result, got %v", results)
}
if len(unmatched) != 1 || unmatched[0].Hash != commitDockerfileHash {
t.Errorf("expected dockerfile commit as unmatched main, got %v", unmatched)
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

matchCommits contract is asserted incorrectly in these subtests.

These assertions expect unmatched entries in results and treat the second return as “unmatched main”, but matchCommits returns matched results first and unmatched backport commits second. This will make these tests fail against the current function behavior.

Proposed test-fix direction
- results, _ := matchCommits(bp, main)
- if len(results) != 1 { ... }
- if results[0].Status != StatusUnmatched { ... }
- if results[0].ClosestMatch == nil { ... }
+ results, unmatched := matchCommits(bp, main)
+ if len(results) != 0 {
+   t.Fatalf("expected 0 matched results, got %d", len(results))
+ }
+ if len(unmatched) != 1 || unmatched[0].Hash != commitCherryPickHash {
+   t.Fatalf("expected cherry-pick in unmatched backports, got %v", unmatched)
+ }
- results, unmatched := matchCommits(bp, main)
- if len(results) != 1 || results[0].Status != StatusMatch { ... }
- if len(unmatched) != 1 || unmatched[0].Hash != commitDockerfileHash { ... }
+ results, unmatched := matchCommits(bp, main)
+ if len(results) != 1 || results[0].Status != StatusMatch {
+   t.Errorf("expected 1 MATCH result, got %v", results)
+ }
+ if len(unmatched) != 0 {
+   t.Errorf("expected 0 unmatched backport commits, got %v", unmatched)
+ }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/verify-backport/verify-backport_githistory_test.go` around lines 169 -
216, The tests incorrectly assume matchCommits returns matched results then
unmatched main commits; instead matchCommits returns matched results first and
unmatched backport commits as the second return. Update the two failing subtests
to assert the second return value refers to unmatched backport commits (not
unmatched main) — e.g., check that unmatched contains/backport hash
commitMakefileHash or commitCherryPickHash as appropriate, and adjust the Status
expectations (use StatusUnmatched for results where patch-ids differ and
StatusMatch where patch-ids equal) while referencing matchCommits, results,
unmatched, StatusMatch, StatusUnmatched, commitDockerfileHash,
commitMakefileHash, commitCherryPickHash and commitOriginalHash to locate and
correct the assertions.

Comment on lines +483 to +490
path := fr.bpPath
if fr.mainPath != "" && fr.relation == "RENAMED" {
path = fmt.Sprintf("%s -> %s", fr.mainPath, fr.bpPath)
}

bpCount := len(fr.onlyBP)
mainCount := len(fr.onlyMain)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve file path for main-only unmatched entries.

label() falls back to bpPath only. For main-only unpaired files, bpPath is empty, so report lines can lose the filename.

Proposed fix
 func (fr fileResult) label() string {
 	path := fr.bpPath
+	if path == "" {
+		path = fr.mainPath
+	}
 	if fr.mainPath != "" && fr.relation == "RENAMED" {
 		path = fmt.Sprintf("%s -> %s", fr.mainPath, fr.bpPath)
 	}

Also applies to: 690-696

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/verify-backport/verify-backport.go` around lines 483 - 490, The report
path selection currently sets path := fr.bpPath and only handles RENAMED by
combining fr.mainPath -> fr.bpPath, which causes main-only entries (where
fr.bpPath == "") to lose the filename; update the logic in the block using
fr.bpPath/fr.mainPath/fr.relation (the variable named path near
bpCount/mainCount) to prefer a non-empty fr.bpPath but fall back to fr.mainPath
when bpPath is empty, and still format RENAMED as "main -> bp" only when both
are present; apply the same fix to the analogous code around the second
occurrence that the reviewer noted.

Comment on lines +559 to +570
bpByBase := make(map[string]filePatch)
for _, fp := range bpFiles {
bpByPath[fp.path] = fp
bpByBase[filepath.Base(fp.path)] = fp
}

mainByPath := make(map[string]filePatch)
mainByBase := make(map[string]filePatch)
for _, fp := range mainFiles {
mainByPath[fp.path] = fp
mainByBase[filepath.Base(fp.path)] = fp
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Avoid basename-collision overwrite during rename detection.

Using map[string]filePatch by basename drops all but one candidate per basename. In commits with repeated basenames across dirs, rename pairing becomes nondeterministic and can misclassify files.

Refactor direction
- bpByBase := make(map[string]filePatch)
+ bpByBase := make(map[string][]filePatch)
...
- bpByBase[filepath.Base(fp.path)] = fp
+ base := filepath.Base(fp.path)
+ bpByBase[base] = append(bpByBase[base], fp)

- mainByBase := make(map[string]filePatch)
+ mainByBase := make(map[string][]filePatch)
...
- mainByBase[filepath.Base(fp.path)] = fp
+ base := filepath.Base(fp.path)
+ mainByBase[base] = append(mainByBase[base], fp)

Then resolve basename matches from candidate slices (prefer highest overlap / exact line-set match) instead of single-value lookup.

Also applies to: 594-599

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/verify-backport/verify-backport.go` around lines 559 - 570, The code
currently builds bpByBase and mainByBase as map[string]filePatch which causes
basename collisions to overwrite candidates; change these to
map[string][]filePatch (populate from bpFiles and mainFiles) so all same
basenames are preserved, then in the rename-detection logic (referencing
bpByBase, mainByBase and filePatch) resolve basename candidate lists by
computing and preferring the best match (e.g., exact line-set equality or
highest overlap) instead of a single-value lookup; apply the same change to the
other similar maps created at the second location (the block around the other
bpByBase/mainByBase usage).

Add an (AI-assisted) tool to help users verify a backport.
The tool scans all the commits in a backport branch,
summarize the changes, flags the language-agnostic mechanical changes
(e.g. file rename) and code movements, and helps prioritize
the review effort.

The tool intentionally builds on `git` tooling, acting
an orchestrator and custom-built porcelain layer.

The tool includes a LLM-optimized JSON output.
This can help LLM-driven review in the folling ways:
1. Reducing orchestration (single command vs multi-step git pipelines)
2. Returning pre-correlated commit relationships
3. Providing stable machine-parseable output

Token savings are possible but scenario-dependent and are not guaranteed
for every run; benchmarking and measuring over time and multiple usages
are the only way to know for sure.

See the included README.md for the full details.

AI-Attribution: AIA PAI Nc Hin R claude-4.6-opus-1M v1.0
Signed-off-by: Francesco Romani <fromani@redhat.com>
@ffromani ffromani force-pushed the tool-verify-backport branch from 582cd31 to ee0f822 Compare May 6, 2026 14:12
@ffromani

Copy link
Copy Markdown
Member Author

/test ci-e2e-install-hypershift

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant