Skip to content

ADLS Gen 2#2635

Open
Skrypt wants to merge 29 commits intoAzure:mainfrom
Skrypt:jsavard/adls-gen2
Open

ADLS Gen 2#2635
Skrypt wants to merge 29 commits intoAzure:mainfrom
Skrypt:jsavard/adls-gen2

Conversation

@Skrypt
Copy link
Copy Markdown

@Skrypt Skrypt commented Mar 17, 2026

Adds ADLS Gen 2 feature parity to Azurite. See docs/design/ADLS-gen2-parity.md file for details.

This was created by Claude Sonnet 4.6. Just sharing my progress here.

See OrchardCMS/OrchardCore#19014 for some integration tests.
The Github CI over there uses the current latest Docker image but I've succesfully ran my unit tests using a local Docker container based on this PR.

Maybe we need to create more unit tests. I let you guys decide, but it is looking good for a start.

Adds ADSL Gen 2 feature parity to Azurite. See docs/design/ADLS-gen2-parity.md file for details.
@Skrypt
Copy link
Copy Markdown
Author

Skrypt commented Mar 17, 2026

@microsoft-github-policy-service agree

@Skrypt
Copy link
Copy Markdown
Author

Skrypt commented Mar 18, 2026

hmm ok, just found this doc here: https://github.com/Azure/Azurite/wiki/ADLS-Gen2-Implementation-Guidance
Will push updates based on this. The only thing is that I'm pushing the whole thing as a single PR which will fail to get merged but at least we get something to work with.

Skrypt and others added 2 commits March 18, 2026 02:30
…DFS endpoint

Close compliance gaps with the Azurite ADLS Gen2 Implementation Guidance wiki:

- Add --enableHierarchicalNamespace CLI flag (default: true) so the emulator
  can run in FNS or HNS mode; wired through BlobEnvironment, Environment,
  VSCEnvironment, DfsServer, and FilesystemHandler.
- Replace non-atomic copy+delete rename with persistence-level renameBlob()
  and renameBlobsByPrefix() on both LokiJS and SQL stores, keeping directory
  renames atomic and avoiding unnecessary extent copies.
- Extract and forward If-Match, If-None-Match, If-Modified-Since, and
  If-Unmodified-Since conditional headers on DFS getProperties, read, and
  delete operations via ModifiedAccessConditions.
- Expose full blob lease lifecycle (acquire, release, renew, break, change)
  on the DFS endpoint by detecting x-ms-lease-action and delegating to the
  existing IBlobMetadataStore lease infrastructure.
- Add DFS-specific error helpers for conditional and lease failures.
- Extend dfsProxy.test.ts with tests for conditional headers, all lease
  operations, and atomic directory rename with children verification.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduce the Swagger-generated DFS interface layer as called for by the
Azurite ADLS Gen2 Implementation Guidance wiki:

- swagger/dfs-storage-2023-11-03.json: OpenAPI 2.0 spec covering all DFS
  REST API operations (filesystem CRUD, path CRUD, read, update, lease).
- swagger/dfs.md: AutoRest configuration for DFS code generation.
- src/blob/generated-dfs/: Generated-style TypeScript artifacts mirroring
  the blob generated layer pattern (src/blob/generated/):
  - artifacts/operation.ts: DfsOperation enum (12 operations)
  - artifacts/models.ts: Typed request/response interfaces for all DFS ops
  - artifacts/specifications.ts: Dispatch specs for HTTP-to-operation matching
  - handlers/IFilesystemHandler.ts, IPathHandler.ts: Handler contracts
  - handlers/IHandlers.ts: Combined IDfsHandlers interface
  - handlers/handlerMappers.ts: Operation-to-handler routing map
  - Context.ts: DFS request context for the generated pattern
- package.json: Added build:autorest:dfs script
- DfsRequestListenerFactory.ts: Architecture docs referencing generated layer

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread src/blob/dfs/handlers/PathHandler.ts
Comment thread src/blob/dfs/handlers/PathHandler.ts
Implement the dedicated hierarchy table required by the Azurite ADLS Gen2
Implementation Guidance wiki ("Add table matching each item with parent").

- IBlobMetadataStore: Add registerHnsPath, unregisterHnsPath,
  unregisterHnsPathsByPrefix, renameHnsPaths, isHnsDirectoryEmpty, and
  hnsPathExists methods for managing parent-child relationships.
- LokiBlobMetadataStore: New $HNS_HIERARCHY$ collection indexed on
  (accountName, containerName, path) and (parentPath) for fast lookups.
- SqlBlobMetadataStore: New HnsHierarchy table with unique path index and
  parentPath index; rename operations use transactions.
- PathHandler: create() registers paths in hierarchy; delete() uses
  isHnsDirectoryEmpty() for non-recursive guard and cleans up hierarchy
  records; renamePath() updates hierarchy atomically;
  ensureIntermediateDirectories() registers intermediate dirs.
- Tests: non-empty directory delete returns 409; recursive delete succeeds;
  auto-created intermediate directories are visible via HEAD.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds native ADLS Gen2 (DFS) endpoint support to Azurite by standing up a first-class DFS server that shares the existing blob metadata/extent stores, plus a broad test suite and configuration/documentation updates.

Changes:

  • Introduces a native DFS Express pipeline (context/auth/dispatch + filesystem/path handlers) running on a new configurable DFS host/port.
  • Extends blob metadata stores with atomic rename operations to support DFS rename semantics (single-path + prefix/directory).
  • Adds DFS-focused integration tests and updates CLI/VSC/Docker/README to expose the DFS endpoint and HNS toggle.

Reviewed changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 16 comments.

Show a summary per file
File Description
tests/blob/dfsProxy.test.ts Adds DFS end-to-end tests for filesystem/path ops, append/flush, ACLs, leases, and rename.
src/common/VSCEnvironment.ts Adds dfsHost/dfsPort and enableHierarchicalNamespace to VS Code environment config.
src/common/Telemetry.ts Adds dfsHost/dfsPort to telemetry parameter collection.
src/common/Environment.ts Adds CLI flags for DFS host/port and hierarchical namespace toggle.
src/blob/utils/constants.ts Adds DFS defaults and changes HNS default constant behavior.
src/blob/persistence/SqlBlobMetadataStore.ts Implements SQL rename operations for DFS (single blob + prefix).
src/blob/persistence/LokiBlobMetadataStore.ts Implements Loki rename operations for DFS (single blob + prefix).
src/blob/persistence/IBlobMetadataStore.ts Extends store interface with rename methods used by DFS rename.
src/blob/main.ts Starts DFS server alongside standalone blob server and wires shared stores.
src/blob/dfs/handlers/PathHandler.ts Implements DFS path CRUD, list paths, append/flush, ACLs, leases, and rename.
src/blob/dfs/handlers/FilesystemHandler.ts Implements DFS filesystem CRUD/list and property setting via container operations.
src/blob/dfs/DfsPropertyEncoding.ts Adds shared base64 properties header encode/decode helper.
src/blob/dfs/DfsOperation.ts Defines DFS operation enum for routing/auth mapping.
src/blob/dfs/DfsErrorFactory.ts Adds DFS JSON error shaping (including special-casing HEAD).
src/blob/dfs/DfsContextFactory.ts Creates minimal blob Context objects for store calls from DFS handlers.
src/blob/dfs/DfsContext.ts Extracts DFS request context (account/filesystem/path) and performs version checks.
src/blob/dfs/DfsAuthenticationMiddleware.ts Adds DFS auth middleware reusing blob authenticators.
src/blob/SqlBlobServer.ts Exposes stores publicly to allow DFS to share them.
src/blob/IBlobEnvironment.ts Extends blob environment interface with DFS host/port + HNS toggle.
src/blob/DfsServer.ts Adds new DFS server implementation (HTTP/HTTPS) using shared stores.
src/blob/DfsRequestListenerFactory.ts Adds DFS Express pipeline and operation dispatch.
src/blob/DfsConfiguration.ts Adds DFS configuration wrapper based on existing ConfigurationBase.
src/blob/BlobServer.ts Exposes stores publicly to allow DFS to share them.
src/blob/BlobEnvironment.ts Adds blob-service CLI flags for DFS host/port and HNS toggle.
src/azurite.ts Starts DFS server alongside blob/queue/table in the main Azurite entrypoint.
package.json Adds VS Code extension config entries for dfsHost/dfsPort.
docs/designs/ADLS-gen2-parity.md Documents the ADLS Gen2 parity approach, phases, and design rationale.
README.md Documents DFS host/port usage and docker port exposure.
Dockerfile.Windows Exposes DFS port and starts Azurite with dfsHost binding.
Dockerfile Exposes DFS port and starts Azurite with dfsHost binding.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/blob/dfs/handlers/FilesystemHandler.ts Outdated
Comment thread tests/blob/dfsProxy.test.ts Outdated
Comment thread src/blob/dfs/handlers/PathHandler.ts Outdated
Comment thread src/blob/dfs/handlers/PathHandler.ts
Comment thread src/blob/persistence/SqlBlobMetadataStore.ts Outdated
Comment thread src/blob/dfs/DfsAuthenticationMiddleware.ts Outdated
Comment thread package.json Outdated
Comment thread src/blob/dfs/handlers/PathHandler.ts Outdated
Comment thread src/blob/persistence/SqlBlobMetadataStore.ts Outdated
Comment thread src/blob/dfs/handlers/PathHandler.ts Outdated
Skrypt and others added 2 commits March 18, 2026 03:12
Add comprehensive integration tests using the official Azure DataLake SDK
to validate Azurite DFS endpoint compatibility, as required by the wiki
("Pass all language SDK tests").

- Install @azure/storage-file-datalake@^12.29.0 as devDependency
- tests/blob/dfsSDKIntegration.test.ts: 22 tests across 6 categories:
  - Filesystem: create/delete, getProperties, listFileSystems
  - Directory: create/delete, nested dirs, move (rename)
  - File: create, append+flush+read, multi-chunk write, delete, move
  - ACL: setAccessControl/getAccessControl, setPermissions
  - List paths: recursive and non-recursive listing
  - Cross-API: DFS file readable via Blob API and vice versa

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implement OAuth with ACL enforcement as specified by the Azurite ADLS Gen2
wiki Phase III ("OAuth: ACL works when user login with AAD account").

- OAuthLevel: Add ACL level (--oauth acl) alongside existing BASIC level.
- ConfigurationBase: Accept "acl" as --oauth parameter value.
- DfsContext: Add IDfsAuthenticatedIdentity interface with oid, upn, tid,
  appid fields; add identity field to IDfsContext.
- DfsAuthenticationMiddleware: Extract identity claims (oid, upn, tid,
  appid) from Bearer JWT tokens when ACL mode is active; store in DFS
  context for downstream enforcement.
- DfsAclEnforcer: New module implementing POSIX ACL evaluation:
  - Parses ACL strings ("user::rwx,user:oid:r-x,group::r-x,other::---")
  - Evaluates in POSIX order: owner -> named user -> group -> other
  - Applies mask entries to limit named user/group permissions
  - $superuser and unauthenticated requests bypass checks
  - Maps operations to required permissions (r/w/x)
- PathHandler: Add enforceAcl() helper; check ACL before getProperties,
  read, delete, and update operations; accept OAuthLevel from factory.
- DfsRequestListenerFactory: Pass OAuth level to PathHandler.
- Tests: 19 unit tests for ACL enforcer covering parsing, bypass scenarios,
  owner/named-user/group/other permissions, mask application, and UPN
  matching. All passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Skrypt
Copy link
Copy Markdown
Author

Skrypt commented Mar 18, 2026

I added new commits to comply to the https://github.com/Azure/Azurite/wiki/ADLS-Gen2-Implementation-Guidance.
It changed a lot of code. I would dismiss that copilot code review and restart a new one now.

The recursive delete was relying on the HNS hierarchy table to check for
children, which missed blobs created via the Blob API or before HNS
tracking was added. Now always uses listBlobs prefix scan (no delimiter)
to find ALL descendant blobs regardless of how they were created, then
deletes them all before removing the directory marker itself.

Fixes: DeleteDirectory_WithContents_DeletesAll test failure where
directoryClient.DeleteAsync(recursive: true) deleted the directory
marker but left child blobs intact.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 18, 2026 07:56
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds native ADLS Gen2 (DFS) endpoint support to Azurite by introducing a DFS server pipeline that shares the Blob service’s metadata/extent/account stores, along with HNS directory semantics and initial ACL enforcement.

Changes:

  • Introduces a native DFS server (port 10004 by default) with filesystem/path handlers, auth middleware reuse, and DFS-specific context/error handling.
  • Extends metadata stores (Loki + SQL) to support rename operations and an HNS hierarchy registry.
  • Adds DFS-focused tests (proxy-style REST tests, SDK integration tests, and ACL enforcer unit tests) plus swagger artifacts/docs.

Reviewed changes

Copilot reviewed 45 out of 46 changed files in this pull request and generated 14 comments.

Show a summary per file
File Description
tests/blob/dfsSDKIntegration.test.ts Adds JS SDK integration coverage via @azure/storage-file-datalake.
tests/blob/dfsProxy.test.ts Adds direct DFS REST-level tests and cross-API (Blob↔DFS) checks.
tests/blob/dfsAclEnforcer.test.ts Adds unit tests for ACL parsing/evaluation logic.
swagger/dfs.md Adds AutoRest configuration for DFS server generation.
swagger/dfs-storage-2023-11-03.json Adds DFS swagger spec used for generated artifacts.
src/common/models.ts Extends OAuth level enum with ACL enforcement mode.
src/common/VSCEnvironment.ts Adds VS Code settings for DFS host/port + HNS toggle.
src/common/Telemetry.ts Adds dfsHost/dfsPort telemetry parameter handling.
src/common/Environment.ts Adds CLI flags for dfsHost/dfsPort and HNS enablement.
src/common/ConfigurationBase.ts Parses new --oauth acl option.
src/blob/utils/constants.ts Adds DFS defaults + flips HNS “enabled” constant default.
src/blob/persistence/SqlBlobMetadataStore.ts Adds HNS hierarchy table + rename/HNS operations in SQL store.
src/blob/persistence/LokiBlobMetadataStore.ts Adds rename/HNS operations and HNS collection in Loki store.
src/blob/persistence/IBlobMetadataStore.ts Extends store interface with rename + HNS hierarchy methods.
src/blob/main.ts Starts DFS server alongside blob server in azurite-blob.
src/blob/generated-dfs/handlers/handlerMappers.ts Adds generated DFS handler mapper wiring.
src/blob/generated-dfs/handlers/IPathHandler.ts Adds generated DFS path handler interface.
src/blob/generated-dfs/handlers/IHandlers.ts Adds generated DFS handler registry interface.
src/blob/generated-dfs/handlers/IFilesystemHandler.ts Adds generated DFS filesystem handler interface.
src/blob/generated-dfs/artifacts/specifications.ts Adds operation matching specs for DFS routing.
src/blob/generated-dfs/artifacts/operation.ts Adds generated DFS operation enum.
src/blob/generated-dfs/artifacts/models.ts Adds generated DFS request/response model types.
src/blob/generated-dfs/Context.ts Adds DFS-specific generated context wrapper.
src/blob/dfs/handlers/PathHandler.ts Implements core DFS path operations (create/read/delete/list/update/lease/rename).
src/blob/dfs/handlers/FilesystemHandler.ts Implements DFS filesystem ops (create/delete/properties/list/setProperties).
src/blob/dfs/DfsPropertyEncoding.ts Adds helper for DFS x-ms-properties encoding/decoding.
src/blob/dfs/DfsOperation.ts Adds DFS operation enum for middleware/dispatch.
src/blob/dfs/DfsErrorFactory.ts Adds DFS JSON error writer + common DFS error types.
src/blob/dfs/DfsContextFactory.ts Adds helper to create minimal Blob Context for store calls.
src/blob/dfs/DfsContext.ts Adds DFS context middleware and URL parsing.
src/blob/dfs/DfsAuthenticationMiddleware.ts Reuses blob authenticators for DFS + extracts identity for ACL mode.
src/blob/dfs/DfsAclEnforcer.ts Adds ACL parsing and evaluation logic for DFS operations.
src/blob/SqlBlobServer.ts Exposes stores publicly for DFS server sharing.
src/blob/IBlobEnvironment.ts Extends blob environment interface with DFS host/port + HNS toggle.
src/blob/DfsServer.ts Adds DFS server wrapper around request listener pipeline.
src/blob/DfsRequestListenerFactory.ts Adds Express pipeline for DFS routing/auth/handlers.
src/blob/DfsConfiguration.ts Adds configuration class for DFS endpoint.
src/blob/BlobServer.ts Exposes stores publicly for DFS server sharing.
src/blob/BlobEnvironment.ts Adds CLI flags for DFS host/port and HNS toggle in azurite-blob.
src/azurite.ts Starts DFS server alongside blob/queue/table in the combined entrypoint.
package.json Adds @azure/storage-file-datalake dev dependency and DFS extension settings/script.
docs/designs/ADLS-gen2-parity.md Adds design/phase plan for DFS parity implementation.
README.md Documents DFS host/port config and Docker port exposure.
Dockerfile.Windows Exposes 10004 and passes --dfsHost.
Dockerfile Exposes 10004 and passes --dfsHost.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/blob/dfs/handlers/FilesystemHandler.ts
Comment thread src/blob/DfsRequestListenerFactory.ts Outdated
Comment thread src/blob/persistence/LokiBlobMetadataStore.ts Outdated
Comment thread tests/blob/dfsProxy.test.ts Outdated
Comment thread src/blob/dfs/handlers/PathHandler.ts Outdated
Comment thread src/blob/utils/constants.ts
Comment thread src/blob/dfs/handlers/PathHandler.ts
Comment thread src/blob/dfs/handlers/PathHandler.ts Outdated
Comment thread src/blob/dfs/handlers/PathHandler.ts Outdated
Comment thread src/blob/dfs/handlers/PathHandler.ts
Azure ADLS Gen2 returns 409 when creating a directory that already exists.
The SDK's CreateIfNotExistsAsync relies on this to return null/false for
existing directories. Azurite was returning 201 regardless, causing
CreateDirectory_AlreadyExists_ReturnsFalse to fail.

Now checks if a directory blob with hdi_isfolder=true already exists
before creating, and returns 409 PathAlreadyExists if so. File creates
remain idempotent (overwrite) matching Azure behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Skrypt
Copy link
Copy Markdown
Author

Skrypt commented Mar 18, 2026

I've updated the unit tests on this PR: OrchardCMS/OrchardCore#19014
They run against a custom docker image that I created on my own Github account.

Guard against Express query params and request body being arrays
instead of strings. CodeQL flagged req.query.position and req.body
as potential vectors for type confusion in appendData and flushData.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 18, 2026 17:27
@Skrypt
Copy link
Copy Markdown
Author

Skrypt commented Mar 18, 2026

@copilot open a new pull request to apply changes based on the comments in this thread

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds native ADLS Gen2 (DFS) endpoint support to Azurite by introducing a dedicated DFS server/pipeline that shares the existing Blob metadata + extent stores, plus new persistence helpers (rename + HNS hierarchy) and a broad set of DFS-focused tests and swagger artifacts.

Changes:

  • Introduce a DFS server (port 10004 by default) with native Express handlers for filesystem/path operations, append/flush, leases, rename, and ACL metadata.
  • Extend metadata stores (Loki + SQL) with rename APIs and HNS hierarchy tracking.
  • Add DFS integration/proxy/unit tests, swagger inputs/config, and environment/Docker/README wiring for DFS.

Reviewed changes

Copilot reviewed 45 out of 46 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
tests/blob/dfsSDKIntegration.test.ts New JS SDK integration tests using @azure/storage-file-datalake.
tests/blob/dfsProxy.test.ts New REST-level DFS behavior tests (create/list/append/flush/rename/ACL/leases).
tests/blob/dfsAclEnforcer.test.ts Unit tests for ACL parsing/evaluation.
swagger/dfs.md AutoRest config for DFS codegen.
swagger/dfs-storage-2023-11-03.json DFS swagger used for codegen/artifacts.
src/common/models.ts Add OAuthLevel.ACL enum value.
src/common/VSCEnvironment.ts Add dfsHost/dfsPort + enableHierarchicalNamespace settings.
src/common/Telemetry.ts Include DFS flags in telemetry parameter list.
src/common/Environment.ts Add --dfsHost/--dfsPort and --enableHierarchicalNamespace CLI flags.
src/common/ConfigurationBase.ts Parse --oauth acl into OAuthLevel.ACL.
src/blob/utils/constants.ts Add DFS defaults; switch emulator HNS default to enabled.
src/blob/persistence/SqlBlobMetadataStore.ts Add HNS hierarchy table + rename helpers.
src/blob/persistence/LokiBlobMetadataStore.ts Add HNS hierarchy collection + rename helpers.
src/blob/persistence/IBlobMetadataStore.ts Add interfaces for rename and HNS hierarchy operations.
src/blob/main.ts Start DFS server alongside blob for azurite-blob.
src/blob/generated-dfs/** New generated DFS artifacts/interfaces/specs.
src/blob/dfs/** New DFS middleware, handlers, ACL enforcer, errors, property encoding.
src/blob/SqlBlobServer.ts Expose stores publicly for DFS server sharing.
src/blob/IBlobEnvironment.ts Add DFS + HNS configuration surface.
src/blob/DfsServer.ts New DFS server wrapper over ServerBase.
src/blob/DfsRequestListenerFactory.ts New DFS Express pipeline (context/dispatch/auth/routing/errors).
src/blob/DfsConfiguration.ts New DFS configuration class.
src/blob/BlobServer.ts Expose stores publicly for DFS server sharing.
src/blob/BlobEnvironment.ts Add DFS host/port + HNS flag for azurite-blob.
src/azurite.ts Start DFS server alongside blob/queue/table for azurite.
package.json Add DFS autorest script + @azure/storage-file-datalake dep + VS Code settings for DFS host/port.
docs/designs/ADLS-gen2-parity.md Design/phase plan for ADLS Gen2 parity implementation.
README.md Document DFS host/port + Docker port mapping updates.
Dockerfile.Windows Expose port 10004 and pass --dfsHost.
Dockerfile Expose port 10004 and pass --dfsHost.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/blob/dfs/handlers/PathHandler.ts
Comment thread src/blob/dfs/handlers/PathHandler.ts
Comment thread src/blob/dfs/handlers/FilesystemHandler.ts
Comment thread src/azurite.ts Outdated
Comment thread src/azurite.ts Outdated
Comment thread package.json Outdated
Comment thread src/blob/dfs/handlers/PathHandler.ts Outdated
Comment thread src/blob/dfs/handlers/PathHandler.ts
Comment thread src/blob/DfsRequestListenerFactory.ts Outdated
Comment thread src/blob/persistence/SqlBlobMetadataStore.ts Outdated
@Skrypt
Copy link
Copy Markdown
Author

Skrypt commented Mar 18, 2026

@copilot open a new pull request to apply changes based on the comments in this thread

@Skrypt
Copy link
Copy Markdown
Author

Skrypt commented Mar 18, 2026

I'm sorry, I probably don't have the permission to run agents on this repository.

@Skrypt
Copy link
Copy Markdown
Author

Skrypt commented Mar 29, 2026

No feedback on this one yet after 2 weeks. I know that this is AI generated and this is probably not optimal but at least some feedback would be welcomed. Should I simply drop the PR and consider simply keeping this on my own fork? @sebastienros sorry to ping you on this one. Totally non-related with OC but what's your thoughts on this? This is an open sourced project by Microsoft on the Azure repository. Not sure who's managing this or if I will ever get feedback at this point.

Thanks.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 49 out of 52 changed files in this pull request and generated 8 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/blob/persistence/SqlBlobMetadataStore.ts Outdated
Comment thread tests/blob/dfsProxy.test.ts Outdated
Comment thread src/blob/handlers/BlobHandler.ts
Comment thread src/blob/persistence/SqlBlobMetadataStore.ts
Comment thread src/blob/handlers/ContainerHandler.ts
Comment thread src/blob/handlers/ServiceHandler.ts
Comment thread src/blob/context/BlobStorageContext.ts Outdated
Comment thread src/blob/persistence/SqlBlobMetadataStore.ts
…k, typing, fetch cleanup

SQL REPLACE → prefix-only rewrite (fixes over-replace when prefix appears in path segment):
- renamePathAtomic blob children: REPLACE → destPrefix || SUBSTR(blobName, len+1)
- renamePathAtomic HNS hierarchy: same fix for path and CASE/REPLACE for parentPath
- renameBlobsByPrefix: same prefix-only rewrite

HNS getAccountInfo fallback:
- BlobHandler.getAccountInfo: fall back to enableHierarchicalNamespace when azurite_hns_enabled metadata absent
- ContainerHandler.getAccountInfo: same fallback; add enableHierarchicalNamespace param (default false)
- BlobHandler: add enableHierarchicalNamespace param (default false), threaded from BlobRequestListenerFactory
- BlobBatchHandler: add enableHierarchicalNamespace param, thread to ServiceHandler and from call sites
  in ContainerHandler and ServiceHandler

BlobStorageContext: remove unused environment?: any field

PathHandler.read: handle 304 Not Modified from conditional GET (If-None-Match)

dfsProxy.test.ts: replace all fetch() calls with dfsAxios (axios instance with DataLake user-agent)
to avoid dependency on Node 18+ global fetch

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 49 out of 53 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

… tests

New tests:
- dfsProxy.test.ts: setAccessControlRecursive (set/modify/remove modes),
  out-of-order append 409 ConditionNotMet, wrong-position flush 409 InvalidFlushPosition
- dfsAclIntegration.test.ts: end-to-end OAuth ACL enforcement with HTTPS server
  (named user allow, owner write allow/deny, SAS bypass)

Bugs found and fixed by the new tests:
- BlobTokenAuthenticator: OAuthLevel.ACL fell into the default branch and
  returned undefined (auth failure) — add ACL case alongside BASIC so Bearer
  tokens are validated in ACL mode
- DfsAclEnforcer.checkAcl: checked whether the PATH OWNER is $superuser and
  bypassed ACL for all callers; correct check is whether the CALLER is $superuser

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 51 out of 55 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/blob/handlers/ContainerHandler.ts Outdated
Comment thread src/blob/handlers/BlobBatchHandler.ts
Comment thread src/blob/persistence/SqlBlobMetadataStore.ts
…BatchHandler wiring

ContainerHandler.createContainer: default azurite_hns_enabled to enableHierarchicalNamespace
when x-ms-namespace-enabled header is absent, so Blob-API-created containers honour the
server-wide HNS setting instead of always writing false.

SqlBlobMetadataStore: replace hardcoded || / SUBSTR / double-quoted identifiers in all
prefix-rename literals with two private helpers (prefixReplaceExpr, conditionalPrefixReplaceExpr)
that branch on sequelize.getDialect() to emit correct syntax for SQLite/PostgreSQL,
MySQL/MariaDB, and MSSQL — fixes portability of renamePathAtomic and renameBlobsByPrefix.

BlobBatchHandler: thread disableProductStyle and enableHierarchicalNamespace through
to ContainerHandler construction (was missing both, consistent with ServiceHandler).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 51 out of 55 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/blob/dfs/handlers/PathHandler.ts Outdated
Comment thread src/blob/dfs/handlers/FilesystemHandler.ts
Comment thread src/blob/BlobRequestListenerFactory.ts Outdated
Comment thread src/blob/dfs/handlers/PathHandler.ts
Comment thread src/blob/dfs/handlers/PathHandler.ts
Comment thread src/blob/dfs/handlers/PathHandler.ts
Skrypt and others added 3 commits May 1, 2026 07:51
…te concurrency

BlobRequestListenerFactory: forward body-parser errors to next(err) so oversized/
invalid payloads don't silently reach the DFS router with a bad body.

FilesystemHandler.create: honour x-ms-namespace-enabled request header (explicit
value overrides server default), consistent with ContainerHandler.

PathHandler:
- Recursive directory delete: replace sequential for-await with bounded-concurrency
  Promise.all batches (batch size 16) to avoid timeout on large trees.
- listPaths: add enforceAcl("r") on the target directory before listing paths,
  so callers without read permission cannot enumerate contents in --oauth acl mode.
- renamePath: enforce write ACL on both source and destination before rename,
  closing the bypass that let unauthorized callers move/rename paths.
- enforceAcl: fail closed (403 + log) on metadata-store errors instead of
  silently allowing the request through.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… issues

Critical:
- flushData: include previously-committed blocks in commit list so multi-cycle
  append→flush preserves all data
- deleteContainer: clean up HnsHierarchy rows in both Loki and SQL stores
- FilesystemHandler.getProperties: read per-container azurite_hns_enabled flag
  instead of returning server-wide default; filter it from x-ms-properties
- PathHandler.create: enforce write ACL on parent directory in --oauth acl mode
- DfsContext middleware: catch synchronous throw from checkApiVersion and call next(err)

Major:
- renamePath: Azure overwrite semantics — delete existing dest file or empty dir
  before rename; return DirectoryNotEmpty for non-empty dir targets
- PathHandler.setProperties: filter reserved internal metadata keys from
  x-ms-properties merge (hdi_isfolder, dfsAcl* keys)
- safeGetBlobProperties: only swallow 404, rethrow all other errors
- BlobHandler/ContainerHandler.getAccountInfo: wrap getContainerProperties in
  try/catch; fall back to server-wide HNS default on 404
- SqlBlobMetadataStore.renamePathAtomic: catch SequelizeUniqueConstraintError
  and surface as BlobAlreadyExists
- Batch delete: swallow 404 per-item to tolerate concurrent deletes
- SQL LIKE patterns: escape % and _ wildcards in user-controlled paths via escapeLike()
- PathHandler: replace custom hex ETags with newEtag() at all three sites

Minor:
- Remove dead code: renameBlob, renameBlobsByPrefix, renameHnsPaths,
  isHnsDirectoryEmpty, hnsPathExists from interface and both store implementations
- setAccessControlRecursive: failureCount was const 0 — now increments on error
- FilesystemHandler.setProperties: filter azurite_hns_enabled from x-ms-properties merge
- PathHandler.read: check res.headersSent before sendDfsError; call res.destroy() instead
- Validate maxResults/maxRecords query params (NaN/negative → clamped default)
- ensureIntermediateDirectories called before renamePathAtomic for consistency
- Document user-agent sniffing limitation in BlobRequestListenerFactory
- Document named group ACL limitation in DfsAclEnforcer

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
dfsProxy.test.ts:
- Multi-cycle append→flush: verifies both rounds of data survive (C-1 regression)
- Rename onto existing file: confirms overwrite succeeds (M-1)
- Rename onto non-empty directory: confirms 409 DirectoryNotEmpty (M-1)
- setProperties ignores hdi_isfolder: reserved key must not flip resource type (M-2)
- ETag format: DFS-created blobs must produce "0x..." ETags (M-8)
- Non-numeric position: NaN position does not crash server (m-5)

dfsAclIntegration.test.ts:
- ACL enforcement on create: owner can create inside dir, non-owner is denied (C-5)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 1, 2026 12:28
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 52 out of 56 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/blob/persistence/SqlBlobMetadataStore.ts
Comment thread src/blob/persistence/IBlobMetadataStore.ts
Comment thread tests/BlobTestServerFactory.ts
Skrypt and others added 2 commits May 1, 2026 08:44
4 new criticals, 7 majors, 5 minors identified. All pass-1 items marked fixed.
6 new test gaps recorded. Pass-2 items are pending.
…or handling

Criticals (azurite_hns_enabled integrity):
- FilesystemHandler.setProperties: read existing metadata first, preserve HNS flag
  and existing user metadata before applying x-ms-properties overlay (P2-C-1, P2-m-3)
- FilesystemHandler.extractMetadata: filter x-ms-meta-azurite_hns_enabled to block forgery (P2-C-4)
- ContainerHandler.setMetadata: preserve azurite_hns_enabled from existing metadata
  when Blob API SetContainerMetadata replaces the metadata map (P2-C-2)
- ContainerHandler.getContainerProperties: filter azurite_hns_enabled from user-visible
  metadata in Blob API response (P2-C-3)
- FilesystemHandler: add newEtag() import; use it in setProperties

Majors:
- listPaths: return 404 when specified directory does not exist (P2-M-1)
- listPaths: read stored dfsAclOwner/dfsAclGroup/dfsAclPermissions per entry
  instead of hardcoded $superuser defaults (P2-M-7)
- PathHandler.delete: handle 412 ConditionNotMet from conditional-header mismatch (P2-M-2)
- DfsContext: replace raw res.status(400).json() with sendDfsError so response
  includes x-ms-error-code header (P2-M-3)
- Multi-block read: call stream.destroy() on error to prevent resource leak (P2-M-4)
- breakLease: validate x-ms-lease-break-period for NaN; return 400 if invalid (P2-M-5)
- appendData: document TOCTOU limitation for concurrent appends (P2-M-6)

Minors:
- PathHandler: replace dynamic require("crypto") with top-level import (P2-m-1)
- PathHandler.create: rename inner parentPath to hnsParentPath to avoid shadowing (P2-m-2)
- DfsRequestListenerFactory: restrict resource=filesystem + ctx.path dispatch to GET only (P2-m-4)
- ensureIntermediateDirectories: throw 409 if a file occupies an intermediate path (P2-m-5)

Tests (4 new in dfsProxy.test.ts):
- HNS flag survives filesystem setProperties PATCH
- listPaths returns 404 for non-existent directory
- delete with non-matching If-Match returns 412
- listPaths returns stored ACL owner/group per path

55 DFS @loki tests passing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 1, 2026 12:52
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 52 out of 56 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/blob/persistence/SqlBlobMetadataStore.ts
Comment thread src/blob/dfs/handlers/PathHandler.ts
Comment thread src/blob/dfs/handlers/FilesystemHandler.ts Outdated
Comment thread src/blob/dfs/handlers/FilesystemHandler.ts
Comment thread src/common/Environment.ts
Skrypt and others added 2 commits May 1, 2026 09:09
6 majors, 7 minors, 5 test gaps. All pass-2 items marked fixed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 1, 2026 13:10
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 52 out of 56 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/blob/persistence/LokiBlobMetadataStore.ts
Comment thread src/blob/dfs/handlers/PathHandler.ts
Comment thread src/blob/dfs/handlers/FilesystemHandler.ts Outdated
Comment thread src/blob/DfsRequestListenerFactory.ts
Comment thread src/blob/persistence/SqlBlobMetadataStore.ts
Comment thread src/blob/persistence/SqlBlobMetadataStore.ts
…ead code

Majors:
- Recursive delete: pass lease conditions to child blob deletes so leased children
  are rejected rather than force-deleted (P3-M-1)
- listPaths: add eTag field to prefix (subdirectory) entries in non-recursive listing (P3-M-2)
- setAccessControlRecursive: only include root directory on first continuation page,
  not on every page (inflated directoriesSuccessful count) (P3-M-3)
- setAccessControlRecursive: validate mode param; return 400 InvalidQueryParameterValue
  for any value other than set/modify/remove (P3-M-4)
- read: return 400 PathIsDirectory when GET targets a directory blob (P3-M-5)
- renamePathAtomic: re-key uncommitted blocks staged under old path in both
  Loki (blocks collection) and SQL (BlocksModel) so flush after rename works (P3-M-6)

Minors:
- Document rename overwrite TOCTOU limitation (P3-m-1)
- Fix invalidFlushPosition() in DfsErrorFactory: status 400→409, parameterised
  message; replace inline error object in flushData with the factory call (P3-m-2)
- Single-block read: stream.pipe(res, { end: false }) to prevent double res.end() (P3-m-3)
- FilesystemHandler.list: clamp maxResults with || fallback to prevent NaN (P3-m-4)
- Delete dead DfsPropertyEncoding.ts (never imported) (P3-m-5)
- FilesystemHandler.create: replace custom hex ETag with newEtag() (P3-m-6)
- renamePath: reject rename source paths containing '..' segments (P3-m-7)

Tests (4 new):
- GET on directory path returns 400 PathIsDirectory
- setAccessControlRecursive with invalid mode returns 400
- listPaths non-recursive: subdirectory entries include eTag
- FilesystemHandler.list with non-numeric maxResults does not crash

59 DFS @loki tests passing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants