
feat: wait on Claude rate limits #147

Open
merahul22 wants to merge 3 commits into snarktank:main from merahul22:feat/claude-rate-limit-wait

Conversation

@merahul22

This PR introduces a retry loop and rate-limit wait mechanism for Claude Code in ralph.sh.

When Ralph runs long tasks, hitting Claude's rate limits previously caused the agent to fail or halt. This PR adds a retry loop in ralph.sh that detects rate-limit responses from Claude, parses the reset time with Python, sleeps until the reset (or for a default fallback duration if parsing fails), and then resumes the same iteration.

Changes made:

🛡️ Add resilient inner retry loop inside ralph.sh for Claude Code.
🕒 Create lib/rate-limit.sh with a Python utility to accurately parse and compute time differences for quota reset messages.
🧪 Enhance test suite with test-rate-limit.sh to cover Claude rate limit parsing, fallback logic, and to prove the Amp execution path remains functioning.

How to test: You can run the included smoke test script to verify that the sleep intervals and retry behaviors trigger correctly:

./test-rate-limit.sh
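
For readers who want a picture of the retry behavior described above, here is a minimal sketch of an inner retry loop. It is not the code from this PR: `run_agent`, `looks_rate_limited`, `MAX_RETRIES`, and `WAIT_SECONDS` are hypothetical names, and the agent is a stub whose first call reports a limit and whose second call succeeds.

```bash
#!/bin/bash
# Sketch of an inner retry loop: rerun the same iteration after a rate-limit wait.
# All names are hypothetical; the agent call is a stub, not the real Claude CLI.

MAX_RETRIES=3     # assumption: cap retries so the loop cannot spin forever
WAIT_SECONDS=0    # assumption: 0 for the demo; a real run would use the parsed reset time
ATTEMPTS=0
AGENT_OUTPUT=""

# Stub agent: the first call reports a usage limit, later calls succeed.
run_agent() {
  ATTEMPTS=$((ATTEMPTS + 1))
  if (( ATTEMPTS == 1 )); then
    AGENT_OUTPUT="You've hit your usage limit. It will reset at 7pm (Asia/Tokyo)"
  else
    AGENT_OUTPUT="iteration complete"
  fi
}

# Case-insensitive substring check against two of the patterns discussed in this thread.
looks_rate_limited() {
  local lower
  lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  [[ "$lower" == *"hit your"* || "$lower" == *"rate limit"* ]]
}

run_iteration_with_retry() {
  local retries=0
  while true; do
    run_agent
    if ! looks_rate_limited "$AGENT_OUTPUT"; then
      return 0                 # normal output: the iteration is done
    fi
    if (( retries >= MAX_RETRIES )); then
      return 1                 # give up after too many waits
    fi
    retries=$((retries + 1))
    sleep "$WAIT_SECONDS"      # wait, then retry the same iteration
  done
}

run_iteration_with_retry
RESULT=$?
echo "attempts=$ATTEMPTS result=$RESULT output=$AGENT_OUTPUT"
```

The retry cap is a design choice of this sketch; whether ralph.sh bounds its retries is not stated in the description above.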

@giobart

giobart commented Apr 21, 2026

This is super useful, I love it!

Although in my case this did not work right away, I had to add the following condition inside the is_rate_limit_output() function:

    || [[ "$output_lower" == *"hit your limit"* ]]

The time parsing is still defaulting to 5h. Maybe the quota limit output recently changed?

@merahul22
Author

merahul22 commented Apr 22, 2026

Thanks for testing this and flagging issues!

Applied two fixes:

  1. Detection pattern - The actual Claude message is "You've hit your usage limit...", where the word "usage" sits between "hit your" and "limit", so the suggested pattern "hit your limit" didn't match as an exact substring. Changed the pattern to "hit your", which catches both variants.

  2. Reset time parsing (5h fallback) - The new Claude message format says "It will reset at 7pm (Asia/Tokyo)" instead of "Your limit will reset at 7pm (Asia/Tokyo)". The old regex required the full "Your limit will reset at" phrase, so it never matched. Broadened it to match just "reset at" (case-insensitive), which works for both old and new formats.

Also added a new smoke test (run_claude_new_format_test) that simulates the new message format end-to-end and asserts the rate limit is detected, the reset time is correctly parsed (no 5h fallback), and Ralph retries the same iteration — all 4 tests pass.
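
To make the two pattern changes concrete, the sketch below checks each candidate pattern against both message variants. The helper `contains` is a hypothetical name, and the sample messages are assembled from the fragments quoted in this thread, so they may not be the exact text Claude prints.

```bash
#!/bin/bash
# Sketch: which substring patterns match which message variant.
# Sample messages are stitched together from fragments quoted in this thread.

# contains TEXT PATTERN: case-insensitive literal substring test
contains() {
  local text pattern
  text=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  pattern=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  [[ "$text" == *"$pattern"* ]]
}

new_msg="You've hit your usage limit. It will reset at 7pm (Asia/Tokyo)"
old_msg="Your limit will reset at 7pm (Asia/Tokyo)"

for pattern in "hit your limit" "hit your" "Your limit will reset at" "reset at"; do
  for msg in "$new_msg" "$old_msg"; do
    if contains "$msg" "$pattern"; then verdict="match"; else verdict="no match"; fi
    printf '%-24s %-8s %s\n' "$pattern" "$verdict" "$msg"
  done
done
```

With these samples, "hit your" and "reset at" match the new-format message while the longer "hit your limit" and "Your limit will reset at" do not, which is the behavior the two fixes rely on.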

@LorhanSohaky

I had to make a few adjustments to get this working:

  1. Added "resets" to detection patterns - The is_rate_limit_output() function now checks for "resets" in addition to the other patterns. This was necessary because the error message format from Claude API includes "resets" (e.g., "resets 12:30am (America/Sao_Paulo)").

  2. Fixed the regex - The reset pattern now begins with [Rr]esets?[[:space:]]+, where the "?" makes the trailing "s" optional so that both "reset" and "resets" match. (Writing it as [Rr]eset[s]? is equivalent; the functional change is that the pattern no longer requires the word "at" after "reset".)

The original regex only handled "Reset at X (timezone)" but not "resets X (timezone)", which caused the parsing to fail and fall back to the 5-hour wait time.

#!/bin/bash

RATE_LIMIT_FALLBACK_WAIT_SECONDS=$((5 * 60 * 60))

is_rate_limit_output() {
  local output_lower
  output_lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')

  [[ "$output_lower" == *"usage limit reached"* ]] \
    || [[ "$output_lower" == *"rate limit"* ]] \
    || [[ "$output_lower" == *"quota exceeded"* ]] \
    || [[ "$output_lower" == *"hit your"* ]] \
    || [[ "$output_lower" == *"resets"* ]]
}

extract_rate_limit_reset_details() {
  local output="$1"
  local reset_time=""
  local reset_timezone=""
  local reset_regex='[Rr]esets?[[:space:]]+([^()[:space:]][^()]*)[[:space:]]\(([^)]+)\)'

  if [[ "$output" =~ $reset_regex ]]; then
    reset_time="${BASH_REMATCH[1]}"
    reset_timezone="${BASH_REMATCH[2]}"
  fi

  if [[ -n "$reset_time" && -n "$reset_timezone" ]]; then
    printf '%s|%s\n' "$reset_time" "$reset_timezone"
    return 0
  fi

  return 1
}

calculate_rate_limit_wait_seconds() {
  local reset_time="$1"
  local reset_timezone="$2"
  local seconds_until_reset=""

  if ! command -v python3 >/dev/null 2>&1; then
    return 1
  fi

  seconds_until_reset=$(python3 - "$reset_time" "$reset_timezone" <<'PY'
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo
import sys

reset_time = sys.argv[1].strip().lower().replace(".", "")
reset_timezone = sys.argv[2].strip()

formats = ("%I%p", "%I:%M%p", "%I %p", "%I:%M %p")
now = datetime.now(ZoneInfo(reset_timezone))

parsed_time = None
for fmt in formats:
    try:
        parsed_time = datetime.strptime(reset_time.upper(), fmt)
        break
    except ValueError:
        continue

if parsed_time is None:
    raise SystemExit(1)

reset_at = now.replace(
    hour=parsed_time.hour,
    minute=parsed_time.minute,
    second=0,
    microsecond=0,
)

if reset_at <= now:
    reset_at += timedelta(days=1)

print(max(1, int((reset_at - now).total_seconds())))
PY
  ) || return 1

  if [[ "$seconds_until_reset" =~ ^[0-9]+$ ]]; then
    printf '%s\n' "$seconds_until_reset"
    return 0
  fi

  return 1
}

format_wait_duration() {
  local total_seconds="$1"
  local hours=$((total_seconds / 3600))
  local minutes=$(((total_seconds % 3600) / 60))
  local seconds=$((total_seconds % 60))

  if (( hours > 0 )); then
    printf '%dh %dm %ds' "$hours" "$minutes" "$seconds"
  elif (( minutes > 0 )); then
    printf '%dm %ds' "$minutes" "$seconds"
  else
    printf '%ds' "$seconds"
  fi
}

handle_rate_limit() {
  local output="$1"
  local wait_seconds="$RATE_LIMIT_FALLBACK_WAIT_SECONDS"
  local reset_details=""
  local reset_time=""
  local reset_timezone=""
  local wait_duration=""

  if ! is_rate_limit_output "$output"; then
    return 1
  fi

  echo "Claude hit a rate limit. Waiting for quota reset before retrying this iteration..."

  if reset_details=$(extract_rate_limit_reset_details "$output"); then
    reset_time="${reset_details%%|*}"
    reset_timezone="${reset_details#*|}"

    if wait_seconds=$(calculate_rate_limit_wait_seconds "$reset_time" "$reset_timezone"); then
      wait_duration=$(format_wait_duration "$wait_seconds")
      echo "Detected reset time: $reset_time ($reset_timezone). Sleeping for $wait_duration."
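    else
      # The failed command substitution left wait_seconds empty; restore the fallback.
      wait_seconds="$RATE_LIMIT_FALLBACK_WAIT_SECONDS"
      wait_duration=$(format_wait_duration "$wait_seconds")
      echo "Couldn't calculate reset time from Claude output. Falling back to $wait_duration."
    fi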
  else
    wait_duration=$(format_wait_duration "$wait_seconds")
    echo "Couldn't parse reset details from Claude output. Falling back to $wait_duration."
  fi

  sleep "$wait_seconds"
  return 0
}
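
As a quick check of what the reset_regex in the script above captures, the sketch below applies it to the two message fragments quoted in this thread. The helper `show_capture` is a hypothetical name used only for this illustration.

```bash
#!/bin/bash
# Sketch: run the reset_regex from the script above against two sample fragments.
reset_regex='[Rr]esets?[[:space:]]+([^()[:space:]][^()]*)[[:space:]]\(([^)]+)\)'

# show_capture TEXT: print "time|timezone" as captured, or "no match"
show_capture() {
  if [[ "$1" =~ $reset_regex ]]; then
    printf '%s|%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
  else
    printf 'no match\n'
  fi
}

show_capture "resets 12:30am (America/Sao_Paulo)"   # newer "resets X (tz)" wording
show_capture "It will reset at 7pm (Asia/Tokyo)"    # older "reset at X (tz)" wording
```

For the first fragment the captured time is "12:30am". For the second, the word "at" ends up inside the captured time ("at 7pm"), which none of the strptime formats in calculate_rate_limit_wait_seconds accept, so that wording would take the fallback path with this version of the regex.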

@merahul22
Author

This is a great catch, thank you! The regex fix to use [Rr]esets?, which makes the "s" optional, is perfect for handling the "resets 12:30am" format.

One small concern: adding just "resets" as a standalone trigger in is_rate_limit_output() might be a bit too broad. Since Claude outputs code and explanations, a sentence like "this function resets the connection" during normal operation would make Ralph falsely detect a rate limit and go to sleep.

What do you think about omitting "resets" from the is_rate_limit_output() check entirely? The existing "hit your" or "usage limit reached" patterns likely already catch the first part of that exact error message anyway. We can definitely keep your [Rr]esets? regex improvement in extract_rate_limit_reset_details() so the parsing succeeds!
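
The false-positive concern can be reproduced with a small sketch that compares the pattern list with and without the bare "resets" trigger. The helper `detect`, the array names, and both sample strings are hypothetical; in particular, the rate-limit message is an assumed wording, not a captured Claude output.

```bash
#!/bin/bash
# Sketch: compare detection with and without the bare "resets" trigger.

# detect TEXT PATTERN...: exit 0 if any pattern is a substring of lowercased TEXT
detect() {
  local text_lower pattern
  text_lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  shift
  for pattern in "$@"; do
    [[ "$text_lower" == *"$pattern"* ]] && return 0
  done
  return 1
}

narrow=("usage limit reached" "rate limit" "quota exceeded" "hit your")
broad=("${narrow[@]}" "resets")

normal_output="This function resets the connection before retrying."
limit_output="You've hit your limit, resets 12:30am (America/Sao_Paulo)"   # assumed wording

if detect "$normal_output" "${broad[@]}"; then
  echo "broad list: false positive on normal output"
fi
if ! detect "$normal_output" "${narrow[@]}"; then
  echo "narrow list: normal output ignored"
fi
if detect "$limit_output" "${narrow[@]}"; then
  echo "narrow list: limit message still detected"
fi
```

Under the assumed wording, the narrow list still catches the limit message through "hit your", so dropping "resets" from detection costs nothing in this case.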

@merahul22
Author

merahul22 commented Apr 25, 2026

I've pushed an update to incorporate these improvements!

A few small adjustments were made during implementation:

  1. Refined the regex (extract_rate_limit_reset_details) - Your [Rr]esets?[[:space:]]+ suggestion was spot on for capturing the new format. I tweaked it to [Rr]esets?([[:space:]]+at)?[[:space:]]+ because omitting the "at" entirely broke the extraction for the older Claude message format ("reset at 7pm"). It now parses both formats.

  2. Omitted the "resets" trigger (is_rate_limit_output) - As mentioned above, to prevent Ralph from accidentally going to sleep when Claude uses the word "resets" in ordinary output, I've left that out. The existing patterns catch this error regardless.

  3. Added tests - Added a dedicated mock test (run_claude_resets_format_test) tailored to the "resets 12:30am" variation to make sure neither format causes a 5-hour fallback regression in the future.

Thanks again for catching this edge case and pointing me in the right direction with the regex!
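
A sketch of how the refined prefix might slot into the extraction regex is shown below. Only the prefix [Rr]esets?([[:space:]]+at)?[[:space:]]+ comes from this comment; the rest of the pattern is carried over from the script posted earlier in the thread, and the helper name `extract` is hypothetical. Because the optional "at" is a capture group of its own, the time and timezone move to groups 2 and 3.

```bash
#!/bin/bash
# Sketch: the optional "at" group shifts the time and timezone to groups 2 and 3.
# The tail of this regex is assumed to match the earlier script; the PR's final
# pattern may differ.
reset_regex='[Rr]esets?([[:space:]]+at)?[[:space:]]+([^()[:space:]][^()]*)[[:space:]]\(([^)]+)\)'

# extract TEXT: print "time|timezone", or return 1 if the pattern does not match
extract() {
  if [[ "$1" =~ $reset_regex ]]; then
    printf '%s|%s\n' "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}"
    return 0
  fi
  return 1
}

extract "It will reset at 7pm (Asia/Tokyo)"
extract "resets 12:30am (America/Sao_Paulo)"
```

Anyone adapting this should remember to update the BASH_REMATCH indexes along with the pattern; a non-capturing group is not available in POSIX extended regular expressions, which is what bash's =~ operator uses.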

@merahul22
Author

@giobart any follow-up on this PR?


4 participants