
feat(disk): add IOPS limiters. #1087

Open
Zenithar wants to merge 1 commit into main from zenithar/disruption/disk-iops-bottleneck


@Zenithar
Contributor

@Zenithar Zenithar commented Jun 5, 2026

What does this PR do?

  • Adds new functionality
  • Alters existing functionality
  • Fixes a bug
  • Improves documentation or testing

Extends the DiskPressure disruption with IOPS throttling support alongside the existing bandwidth (bytes/sec) throttling.

Changes

API (api/v1beta1/disk_pressure.go)

  • Add readIOPSPerSec and writeIOPSPerSec fields to DiskPressureThrottlingSpec
  • Validate all throttle fields are non-negative (zero = remove limit; negative = rejected at admission)
  • Extend GenerateArgs() to emit --read-iops-per-sec / --write-iops-per-sec flags
  • Extend Explain() with IOPS descriptions

Injector (injector/disk_pressure.go)

  • Rename diskPressureThrottleModeRead/Write → diskPressureThrottleModeReadBps/WriteBps to disambiguate from new IOPS modes
  • Add diskPressureThrottleModeReadIops / diskPressureThrottleModeWriteIops enum values
  • cgroups v1: write to blkio.throttle.read_iops_device / blkio.throttle.write_iops_device
  • cgroups v2: write riops= / wiops= key-value pairs to the unified io.max file
  • Inject() and Clean() handle all four throttle modes
  • Add IopsKey observability tag for structured logging
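
For reference, the two kernel interfaces expect slightly different payloads. The sketch below only builds the strings; `v1ThrottleLine` and `v2ThrottleLine` are hypothetical helper names, and the `8:0` device numbers are sample values. Mapping zero to `max` for cgroups v2 is an assumption based on how the kernel's `io.max` format spells "no limit", and is not necessarily what the injector in this PR does.

```go
package main

import (
	"fmt"
	"strconv"
)

// v1ThrottleLine builds the payload for blkio.throttle.read_iops_device or
// blkio.throttle.write_iops_device: "<major>:<minor> <iops>".
// In cgroups v1, writing 0 removes the limit.
func v1ThrottleLine(major, minor uint32, iops uint64) string {
	return fmt.Sprintf("%d:%d %d", major, minor, iops)
}

// v2ThrottleLine builds the payload for the unified io.max file:
// "<major>:<minor> <key>=<value>", where key is "riops" or "wiops".
// Assumption: a zero limit is written as "max", the io.max spelling of "no limit".
func v2ThrottleLine(major, minor uint32, key string, iops uint64) string {
	value := "max"
	if iops > 0 {
		value = strconv.FormatUint(iops, 10)
	}
	return fmt.Sprintf("%d:%d %s=%s", major, minor, key, value)
}

func main() {
	// Sample device 8:0; the numbers are illustrative only.
	fmt.Println(v1ThrottleLine(8, 0, 100))
	fmt.Println(v2ThrottleLine(8, 0, "wiops", 100))
	fmt.Println(v2ThrottleLine(8, 0, "riops", 0))
}
```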

CLI (cli/injector/disk_pressure.go)

  • Wire --read-iops-per-sec / --write-iops-per-sec flags

Docs & Examples

  • Update docs/disk_pressure.md and docs/disruption_catalogue.md
  • Add examples/disk_pressure_read_iops.yaml example manifest
  • Update examples/complete.yaml

CRDs

  • Regenerated chaos.datadoghq.com_disruptions.yaml, _disruptioncrons.yaml, _disruptionrollouts.yaml

Code Quality Checklist

  • The documentation is up to date.
  • My code is sufficiently commented and passes continuous integration checks.
  • I have signed my commit (see Contributing Docs).

Testing

  • I leveraged continuous integration testing
    • by depending on existing unit tests or end-to-end tests.
    • by adding new unit tests or end-to-end tests.
  • I manually tested the following steps:
    • Inject readIOPSPerSec on a cgroups v1 node → verify blkio.throttle.read_iops_device written
    • Inject writeIOPSPerSec on a cgroups v2 node → verify io.max contains wiops=
    • Cleanup path resets all four throttle files to 0
    • locally.
    • as a canary deployment to a cluster.
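
The cgroups v2 check in the manual steps above could be scripted along these lines. This is a sketch under assumptions: `parseIOMaxLine` is a hypothetical helper, and the sample line uses an illustrative device `8:0` in the `MAJ:MIN key=value ...` layout that `io.max` reports.

```go
package main

import (
	"fmt"
	"strings"
)

// parseIOMaxLine splits one io.max line, for example
// "8:0 rbps=max wbps=max riops=max wiops=100", into the device
// identifier and a map of limit keys to their raw string values.
func parseIOMaxLine(line string) (string, map[string]string, error) {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return "", nil, fmt.Errorf("empty io.max line")
	}
	limits := make(map[string]string, len(fields)-1)
	for _, kv := range fields[1:] {
		key, value, ok := strings.Cut(kv, "=")
		if !ok {
			return "", nil, fmt.Errorf("malformed key-value pair %q", kv)
		}
		limits[key] = value
	}
	return fields[0], limits, nil
}

func main() {
	// Illustrative content; a real check would read the cgroup's io.max file.
	dev, limits, err := parseIOMaxLine("8:0 rbps=max wbps=max riops=max wiops=100")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("device %s wiops=%s\n", dev, limits["wiops"])
}
```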

@Zenithar Zenithar self-assigned this Jun 5, 2026
@Zenithar Zenithar marked this pull request as ready for review June 5, 2026 07:27
@Zenithar Zenithar requested a review from a team as a code owner June 5, 2026 07:27
@datadog-prod-us1-5

Tests

🎉 All green!

🧪 All tests passed
❄️ No new flaky tests detected

🎯 Code Coverage (details)
Patch Coverage: 62.96%
Overall Coverage: 39.24% (+0.26%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 842a405

