[CELEBORN-2317] Validate applicationId to prevent worker path traversal#3674
[CELEBORN-2317] Validate applicationId to prevent worker path traversal#3674afterincomparableyum wants to merge 1 commit into
Conversation
|
I'll fix CI issues and push |
cbd03e9 to
1a54265
Compare
The worker builds local shuffle paths by concatenating applicationId received over RPC: `<workingDir>/<appId>/<shuffleId>/<fileName>`. The int32 and fileName is built from int id/epoch/mode, but it was never validated against a charset. With auth disabled, any client on the network could supply `appId = "../foo"` and have the worker mkdir, create, or delete files outside its working dir. With auth enabled, the SASL clientId == applicationId equality check in RpcEndpoint did not constrain format, so a tenant whose registered id contained `..` could still escape.
This change:
- Adds Utils.validateAppId enforcing `^[A-Za-z0-9_-]+$`, which matches
Spark (`application_<ts>_<n>`, `local-<ts>`), Flink, and MR formats.
- Calls it at every worker RPC entry point that takes an applicationId
or shuffleKey from the wire: Controller (ReserveSlots, CommitFiles,
DestroyWorkerSlots), PushDataHandler.handleCore, and the two
checkAuth sites in FetchHandler.
- Adds a canonical path containment check in
StorageManager.createDiskFile (local disk branch) as defense in
depth, before mkdirs() runs.
1a54265 to
6e65966
Compare
|
Thanks for the fix! The path traversal issue is real and the approach is solid. One suggestion: I think the So the checks added in That said, the canonical path containment check in Reviewed with Claude Code |
What changes were proposed in this pull request?
This change:
^[A-Za-z0-9_-]+$, which matches Spark (application_<ts>_<n>,local-<ts>), Flink, and MR formats.Why are the changes needed?
The worker builds local shuffle paths by concatenating applicationId received over RPC:
<workingDir>/<appId>/<shuffleId>/<fileName>. The int32 and fileName is built from int id/epoch/mode, but it was never validated against a charset. With auth disabled, any client on the network could supplyappId = "../foo"and have the worker mkdir, create, or delete files outside its working dir. With auth enabled, the SASL clientId == applicationId equality check in RpcEndpoint did not constrain format, so a tenant whose registered id contained..could still escape.Does this PR resolve a correctness bug?
Yes
Does this PR introduce any user-facing change?
No
How was this patch tested?
Unit tests/CI