Task timeout: per-task + global --timeout via abort path #17

Description

@no0dles

What to build

Add a hammerkit-native, runtime-agnostic task timeout so a stuck task (deadlocked process, wedged network call, sandbox that never returns) fails cleanly instead of hanging the whole run.

  • A task accepts an optional timeout (a duration). On expiry the task is aborted via the existing abort path (checkForAbort) and reported as failed with a timeout reason.
  • A global --timeout applies as a default to tasks that declare none; a per-task timeout takes precedence.
  • Enforcement works consistently across the local, Docker, and Kubernetes runtimes: it is a timer that triggers the existing abort signal, not a new cancellation mechanism.
  • A timed-out task MUST NOT write a cache entry (a hang is not a successful build).
  • On timeout the container/process is cleaned up (no orphaned container/volume), even if the timeout fires during the task's own cleanup.
  • Durations use a documented format (30s, 5m); invalid values are rejected at validation time.
  • With no timeout configured, behavior is identical to today (the change is additive).
  • Timeout does not participate in cache identity (ADR-0002).
  • Service readiness timeouts are a separate, deferred concern.
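To make the "timer that triggers the existing abort signal" idea concrete, here is a minimal sketch. It is not hammerkit's actual code: `parseDuration`, `runWithTimeout`, and `TaskResult` are hypothetical names, a standard `AbortController` stands in for the existing abort path (checkForAbort), and the `h` unit is an assumption beyond the `30s` / `5m` examples given above.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not hammerkit APIs.

// Parse a duration such as "30s" or "5m" into milliseconds.
// Only "s" and "m" appear in the issue; "h" is an assumed extension.
function parseDuration(value: string): number {
  const match = /^([1-9]\d*)(s|m|h)$/.exec(value.trim());
  if (!match) {
    throw new Error(`invalid timeout "${value}", expected a value like 30s or 5m`);
  }
  const amount = Number(match[1]);
  switch (match[2]) {
    case "s":
      return amount * 1_000;
    case "m":
      return amount * 60_000;
    default:
      return amount * 3_600_000;
  }
}

type TaskResult =
  | { status: "completed" }
  | { status: "failed"; reason: string };

// Run a task and abort it through the shared signal once the timeout expires.
// The timer only triggers the abort; cancellation itself is left to the task.
async function runWithTimeout(
  run: (signal: AbortSignal) => Promise<void>,
  timeoutMs: number | undefined,
): Promise<TaskResult> {
  const controller = new AbortController();
  let timedOut = false;
  const timer =
    timeoutMs === undefined
      ? undefined
      : setTimeout(() => {
          timedOut = true;
          controller.abort();
        }, timeoutMs);
  try {
    await run(controller.signal);
    // A task that returns normally after the deadline still counts as timed out.
    return timedOut
      ? { status: "failed", reason: `timeout after ${timeoutMs}ms` }
      : { status: "completed" };
  } catch (err) {
    if (timedOut) {
      return { status: "failed", reason: `timeout after ${timeoutMs}ms` };
    }
    throw err; // unrelated failures keep their original error
  } finally {
    clearTimeout(timer);
  }
}
```

In this sketch a caller would skip the cache write whenever the result is `failed`, and passing `undefined` as the timeout leaves the task running exactly as it would without the feature.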

Together with the container-runtime-options slice, this replaces the NOTE-block + manual --test-timeout workaround on the affected sandbox test task.

References: specs/task-timeout/spec.md (US1, US2, FR-001–007); docs/adr/0002.

Acceptance criteria

  • A task that exceeds its timeout fails within the timeout window rather than hanging, reported with a timeout reason (SC-001).
  • After a timeout, no orphaned container/volume remains and no cache entry was written (SC-002).
  • A per-task timeout overrides a global --timeout; a task with no own timeout fails at the global ceiling (SC-003).
  • Timeout behaves consistently across local, Docker, and Kubernetes via the existing abort path.
  • Durations use the documented format; invalid values are rejected at validation time.
  • Timeout does not enter cache identity (the task id is unchanged by setting it).
  • A build file with no timeout behaves identically to the prior version (SC-004).
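The precedence and cache-identity criteria above can be illustrated with a small sketch. The `TaskConfig` shape and the helper names `resolveTimeout` and `cacheIdentityInput` are hypothetical, chosen only for illustration; the real task id computation is defined by ADR-0002 and is not reproduced here.

```typescript
// Hypothetical task shape; field names are assumptions for illustration only.
interface TaskConfig {
  name: string;
  cmds: string[];
  timeout?: string; // e.g. "30s" or "5m"
}

// A per-task timeout wins; otherwise fall back to the global --timeout.
// With neither set the result is undefined, i.e. today's behavior.
function resolveTimeout(task: TaskConfig, globalTimeout?: string): string | undefined {
  return task.timeout ?? globalTimeout;
}

// Build the input for the cache identity from an explicit field list,
// so the timeout can never leak into the task id.
function cacheIdentityInput(task: TaskConfig): string {
  return JSON.stringify({ name: task.name, cmds: task.cmds });
}
```

Listing the identity fields explicitly, rather than serializing the whole task, is one way to keep a later timeout change from invalidating existing cache entries.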

Blocked by

  • None - can start immediately.

    Labels

    agent-ready: Ready for an AFK agent to grab and implement