[release-1.17] Add CI check for invalid characters in file paths by kaovilai · Pull Request #9690 · velero-io/velero

kaovilai · 2026-04-08T21:19:22Z

Cherry-pick of #9553 to release-1.17.

Go's module zip rejects filenames containing certain characters (shell special chars, path separators, and non-letter Unicode such as control/format characters). This adds a GitHub Actions workflow that validates all tracked file paths on every PR to catch these issues before they reach downstream consumers.

No filename fixes needed on this branch.

Note

Responses generated with Claude

Go's module zip rejects filenames containing certain characters (shell special chars like " ' * < > ? ` |, path separators : \, and non-letter Unicode such as control/format characters). This caused a build failure when a changelog file contained an invisible U+200E LEFT-TO-RIGHT MARK (see PR velero-io#9552). Add a GitHub Actions workflow that validates all tracked file paths on every PR to catch these issues before they reach downstream consumers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com> (cherry picked from commit 6d18d9b)

Copilot

Pull request overview

Adds a new GitHub Actions workflow that validates all tracked file paths on every pull request to prevent filenames that would be rejected by Go module zip rules (e.g., invisible format characters like U+200E) from reaching downstream consumers.

Changes:

Introduce a PR workflow that scans git ls-files output and fails the job when disallowed characters are found.
Emit actionable error output and suggested rename commands for problematic paths.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-08T21:30:33Z

+                        clean = '/'.join(
+                            ''.join(c for c in elem if is_ok(c))
+                            for elem in name.split('/')
+                        )
+                        print(f'::error file={name}::File \"{name}\" contains invalid char {char_desc}')
+                        bad_files.append((name, clean, char_desc))


The workflow command ::error file={name}::... embeds the raw filename in a GitHub Actions command. Since % is allowed by your checker, a filename containing sequences like %0A could be interpreted as a newline when the runner decodes workflow commands (command injection / log corruption). Escape %, \r, and \n per the workflow command spec (or avoid using workflow commands for untrusted data and print a plain error instead).

Copilot · 2026-04-08T21:30:34Z

+            # Check each path element (split on /)
+            for element in name.split('/'):
+                for ch in element:
+                    if not is_ok(ch):
+                        cp = ord(ch)
+                        char_name = unicodedata.name(ch, f'U+{cp:04X}')
+                        char_desc = f'U+{cp:04X} ({char_name})'
+                        # Build cleaned path by stripping invalid chars
+                        clean = '/'.join(
+                            ''.join(c for c in elem if is_ok(c))
+                            for elem in name.split('/')
+                        )
+                        print(f'::error file={name}::File \"{name}\" contains invalid char {char_desc}')
+                        bad_files.append((name, clean, char_desc))
+                        break
+


Once an invalid character is found, the script only breaks out of the innermost character loop, so the same path can be reported multiple times (e.g., invalid chars in multiple path elements). Consider tracking a per-file "invalid" flag and breaking out of the element loop (or continue to the next file) after the first finding to keep output/action annotations deterministic.

Copilot · 2026-04-08T21:30:34Z

+                        # Build cleaned path by stripping invalid chars
+                        clean = '/'.join(
+                            ''.join(c for c in elem if is_ok(c))
+                            for elem in name.split('/')
+                        )
+                        print(f'::error file={name}::File \"{name}\" contains invalid char {char_desc}')
+                        bad_files.append((name, clean, char_desc))
+                        break
+
+        if bad_files:
+            print()
+            print('The following files have characters that are invalid in Go module zip archives:')
+            print()
+            for original, clean, desc in bad_files:
+                print(f'  {original}  — {desc}')
+            print()
+            print('To fix, rename the files to remove the problematic characters:')
+            print()
+            for original, clean, desc in bad_files:
+                if clean:
+                    print(f'  mv \"{original}\" \"{clean}\" && git add \"{clean}\"')
+                    print(f'  # or: git mv \"{original}\" \"{clean}\"')
+                else:
+                    print(f'  # {original} — cannot auto-suggest rename (non-UTF-8)')


The auto-suggested clean rename is built by stripping invalid characters from each path element, but it can produce an empty path element (or even an empty path) if the name is comprised entirely of invalid chars. In that case the suggested mv/git mv commands will be invalid; please guard for empty elements/paths and fall back to a message that no safe auto-suggestion is available.

Copilot · 2026-04-08T21:30:34Z

+        # Characters explicitly rejected by Go's fileNameOK
+        # (path separators / and \ are inherent to paths so we check per-element)
+        bad_ascii = set('\"' + \"'\" + '*<>?\`|:')
+


bad_ascii is defined but never used, which adds noise and suggests there is an additional check that isn't actually performed. Consider removing it, or use it to make the intent of the ASCII checks clearer (e.g., explicitly reject those chars and keep allowed_ascii focused on allow-list behavior).

Suggested change

# Characters explicitly rejected by Go's fileNameOK

# (path separators / and \ are inherent to paths so we check per-element)

bad_ascii = set('\"' + \"'\" + '*<>?\`|:')

# ASCII characters allowed by Go's fileNameOK in addition to letters

# and digits. Path separators / and \ are inherent to paths, so we

# validate each path element separately below.

Copilot AI review requested due to automatic review settings April 8, 2026 21:19

kaovilai mentioned this pull request Apr 8, 2026

Add CI check for invalid characters in file paths #9553

Merged

kaovilai changed the title ~~Add CI check for invalid characters in file paths~~ [release-1.17] Add CI check for invalid characters in file paths Apr 8, 2026

github-actions bot requested review from reasonerjt and ywk253100 April 8, 2026 21:24

Copilot started reviewing on behalf of kaovilai April 8, 2026 21:24 View session

github-actions bot assigned kaovilai Apr 8, 2026

Copilot AI reviewed Apr 8, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[release-1.17] Add CI check for invalid characters in file paths#9690

[release-1.17] Add CI check for invalid characters in file paths#9690
kaovilai wants to merge 1 commit intovelero-io:release-1.17from
kaovilai:pr-filepath-check-release-1.17

kaovilai commented Apr 8, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 8, 2026

Uh oh!

Copilot AI Apr 8, 2026

Uh oh!

Copilot AI Apr 8, 2026

Uh oh!

Copilot AI Apr 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

kaovilai commented Apr 8, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants