
fix: Unescape Unicode Characters accepts 4-6 hex digits for U+ prefix #2287

Open
williballenthin wants to merge 1 commit into gchq:master from williballenthin:fix-2242


@williballenthin

The U+ prefix regex was hardcoded to exactly 4 hex digits, rejecting valid astral plane codepoints like U+1F600 and zero-padded forms like U+000041. Widen the quantifier to {4,6} for U+ only; \u and %u retain their fixed 4-digit requirement.
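To illustrate the change, here is a minimal sketch of prefix-dependent quantifiers. It is not the actual CyberChef source: `unescapeUnicode` and `escapeRegExp` are hypothetical names, and the range check against 0x10FFFF is an assumption added so that the sketch never throws on oversized values.

```javascript
// Hypothetical sketch, not the CyberChef implementation.

// Escape regex metacharacters so prefixes like "U+" and "\u" match literally.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function unescapeUnicode(input, prefix) {
    // U+ may carry 4 to 6 hex digits; \u and %u stay at exactly 4.
    const quantifier = prefix === "U+" ? "{4,6}" : "{4}";
    const pattern = new RegExp(
        escapeRegExp(prefix) + "([a-f\\d]" + quantifier + ")",
        "gi" // global, case-insensitive match
    );
    return input.replace(pattern, (match, hex) => {
        const codePoint = parseInt(hex, 16);
        // Assumption: values above U+10FFFF are left untouched, because
        // String.fromCodePoint would throw a RangeError for them.
        return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : match;
    });
}

console.log(unescapeUnicode("U+1F600", "U+"));  // astral plane code point
console.log(unescapeUnicode("U+000041", "U+")); // zero-padded "A"
console.log(unescapeUnicode("%u0041", "%u"));   // still exactly 4 digits
```

One consequence of a greedy {4,6} quantifier in a sketch like this: a 4-digit U+ escape followed directly by more hex digits is read as a longer code point, so `U+00411` decodes as U+0411 rather than as U+0041 followed by `1`.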

Closes #2242

AI disclosure
Claude Code Opus 4.6



Development

Successfully merging this pull request may close these issues.

Bug report: Unescape Unicode Characters only accepts exactly 4 hex digits for U+

2 participants