folly/cli: eliminate double scan in cli_parse_args_from_content EOF path #2647

Open
darion-yaphet wants to merge 1 commit into facebook:main from
darion-yaphet:fix/trailing-whitespace-double-scan

The code that handles a token still in `current` when content ends had
two independent backward scans:

  1. A `while/pop_back` loop that stripped trailing spaces from the
     processed string value (`current`).
  2. A `for` loop that scanned `content` backward from EOF to recompute
     the byte-span `length`, because `current` may have undergone escape
     substitution and its size no longer matches the raw source bytes.

The root cause was that the main scan loop had no record of *where* in
`content` the last non-whitespace character of the current token sat.

Fix: add `token_end_offset` (byte index into `content` of the last
non-space character), updated in every branch that already updates
`token_end_line`/`token_end_col` — seven sites in total.  The EOF path
can now compute `length = token_end_offset + 1 - token_start` in O(1),
removing the backward scan of `content` entirely.  The `while/pop_back`
trim of `current` is kept (it handles the string value) but tightened
to the idiomatic `while (!x.empty() && cond(x.back()))` form.
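
To illustrate the idea, here is a minimal, self-contained sketch of the technique. It is not the code from this PR: `LastToken`, `scan_to_eof`, and the toy grammar (one token running to EOF, with only `\n` and `\\` recognised as escapes) are assumptions made for the example. Only the `token_end_offset` bookkeeping and the O(1) length formula mirror the description above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch (not part of the PR).
struct LastToken {
  std::string value;       // processed value: escapes substituted, trailing spaces trimmed
  std::size_t start = 0;   // byte offset in content of the token's first character
  std::size_t length = 0;  // raw byte span in content, excluding trailing spaces
};

// Toy scanner: everything from the first non-space byte to EOF is one token.
// Interior spaces are kept, `\n` and `\\` are two raw bytes that become one
// value byte, so value.size() can differ from the raw span.
LastToken scan_to_eof(const std::string& content) {
  LastToken tok;
  bool in_token = false;
  std::size_t token_end_offset = 0;  // index of the last non-space byte consumed

  for (std::size_t i = 0; i < content.size(); ++i) {
    const char c = content[i];
    if (c == ' ') {
      // Possibly an interior space; if it turns out to be trailing it is
      // trimmed from value below and never advances token_end_offset.
      if (in_token) tok.value.push_back(c);
      continue;
    }
    if (!in_token) {
      in_token = true;
      tok.start = i;
    }
    const bool is_escape = c == '\\' && i + 1 < content.size() &&
                           (content[i + 1] == 'n' || content[i + 1] == '\\');
    if (is_escape) {
      tok.value.push_back(content[i + 1] == 'n' ? '\n' : '\\');
      ++i;  // consume the second byte of the escape sequence
    } else {
      tok.value.push_back(c);
    }
    token_end_offset = i;  // updated in every branch that consumes a non-space byte
  }

  if (!in_token) return tok;  // content was empty or all spaces

  // Trim the processed value (this backward pass over value remains).
  while (!tok.value.empty() && tok.value.back() == ' ') tok.value.pop_back();

  // O(1): no backward scan of content is needed to find the raw span.
  tok.length = token_end_offset + 1 - tok.start;
  return tok;
}
```

For the input `"  a\\nb c   "` (as a C++ literal), the token starts at offset 2, the last non-space byte is at offset 7, and the raw length is 6, while the processed value `"a\nb c"` is only 5 bytes long. That mismatch is why the length cannot be taken from the size of `current`.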

meta-cla bot added the CLA Signed label on May 8, 2026