Commits
29 commits
7dfb9e8
ADSL Gen 2
Skrypt Mar 17, 2026
82fc433
Add HNS toggle, atomic rename, conditional headers, and lease ops to …
Skrypt Mar 18, 2026
bebd6c0
Add DFS Swagger spec and generated interface layer for ADLS Gen2
Skrypt Mar 18, 2026
956c39c
Add HNS parent-child hierarchy table for directory relationship tracking
Skrypt Mar 18, 2026
fb22f44
Add @azure/storage-file-datalake SDK integration tests for DFS endpoint
Skrypt Mar 18, 2026
94ef926
Add Phase III OAuth ACL enforcement for DFS endpoint
Skrypt Mar 18, 2026
fb2c8dc
Fix recursive directory deletion to remove all descendant blobs
Skrypt Mar 18, 2026
07b32a8
Return 409 PathAlreadyExists when creating an existing directory via DFS
Skrypt Mar 18, 2026
23e9269
Fix type confusion through parameter tampering in DFS PathHandler
Skrypt Mar 18, 2026
fb455e0
Fix GetAccountInfo method
Skrypt Apr 30, 2026
2b4cb30
feat(hns): Per-container HNS (Gen2) support, GetAccountInfo returns c…
Skrypt Apr 30, 2026
493d244
refactor: DFS pipeline unified on Blob port, cleanup legacy DFS serve…
Skrypt Apr 30, 2026
d2f12e4
fix(dfs): resolve three ADLS Gen2 issues reported by Izeren
Skrypt May 1, 2026
73d20d3
fix(dfs): reject DFS operations on non-HNS containers with Hierarchic…
Skrypt May 1, 2026
38f4556
fix(dfs): make DFS path rename truly atomic across blobs and HNS hier…
Skrypt May 1, 2026
94c90bf
fix(dfs): address Copilot PR review — remove legacy dfsHost/dfsPort, …
Skrypt May 1, 2026
cba5313
fix(dfs): BlobConfiguration default false, fix test routing and SDK r…
Skrypt May 1, 2026
92d6a73
fix(dfs): address remaining Copilot review comments
Skrypt May 1, 2026
e587892
fix(dfs): address latest Copilot review — REPLACE safety, HNS fallbac…
Skrypt May 1, 2026
77c9c17
test(dfs): add missing Gen2 coverage + fix two bugs discovered by new…
Skrypt May 1, 2026
16d935c
fix(dfs): ContainerHandler HNS default, dialect-safe SQL rename, Blob…
Skrypt May 1, 2026
52db20c
fix(dfs): address Copilot review — ACL, body parser, HNS header, dele…
Skrypt May 1, 2026
c2f6204
fix(dfs): address internal code review — 6 critical, 8 major, 8 minor…
Skrypt May 1, 2026
86c3eba
test(dfs): add coverage for all review-identified test gaps
Skrypt May 1, 2026
4844856
docs: add pass-2 code review findings to ADLS-gen2-review.md
Skrypt May 1, 2026
aefab39
fix(dfs): address pass-2 review — HNS metadata safety, listPaths, err…
Skrypt May 1, 2026
b45b0d2
docs: add pass-3 code review findings to ADLS-gen2-review.md
Skrypt May 1, 2026
29abf72
fix: correct 'telemtry' typo in --disableTelemetry CLI help text
Skrypt May 1, 2026
d83a0e4
fix(dfs): address pass-3 review — correctness, resource management, d…
Skrypt May 1, 2026
4 changes: 3 additions & 1 deletion Dockerfile
@@ -43,5 +43,7 @@ EXPOSE 10000
EXPOSE 10001
# Table Storage Port
EXPOSE 10002
# DFS (ADLS Gen2) Port
EXPOSE 10004

CMD ["azurite", "-l", "/data", "--blobHost", "0.0.0.0","--queueHost", "0.0.0.0", "--tableHost", "0.0.0.0"]
CMD ["azurite", "-l", "/data", "--blobHost", "0.0.0.0", "--dfsHost", "0.0.0.0", "--queueHost", "0.0.0.0", "--tableHost", "0.0.0.0"]
4 changes: 3 additions & 1 deletion Dockerfile.Windows
@@ -67,9 +67,11 @@ EXPOSE 10000
EXPOSE 10001
# Table Storage Port
EXPOSE 10002
# DFS (ADLS Gen2) Port
EXPOSE 10004

ENTRYPOINT "cmd.exe /S /C"

WORKDIR C:\\Node\\node-v22.12.0-win-x64\\

CMD azurite -l c:/data --blobHost 0.0.0.0 --queueHost 0.0.0.0 --tableHost 0.0.0.0
CMD azurite -l c:/data --blobHost 0.0.0.0 --dfsHost 0.0.0.0 --queueHost 0.0.0.0 --tableHost 0.0.0.0
13 changes: 10 additions & 3 deletions README.md
@@ -186,6 +186,8 @@ Following extension configurations are supported:

- `azurite.blobHost` Blob service listening endpoint, by default 127.0.0.1
- `azurite.blobPort` Blob service listening port, by default 10000
- `azurite.dfsHost` DFS service listening endpoint, by default 127.0.0.1
- `azurite.dfsPort` DFS service listening port, by default 10004
- `azurite.blobKeepAliveTimeout` Blob service keep alive timeout in seconds, by default 5
- `azurite.queueHost` Queue service listening endpoint, by default 127.0.0.1
- `azurite.queuePort` Queue service listening port, by default 10001
@@ -214,17 +216,18 @@ Following extension configurations are supported:
> Note. Find more docker image tags in <https://mcr.microsoft.com/v2/azure-storage/azurite/tags/list>

```bash
docker run -p 10000:10000 -p 10001:10001 -p 10002:10002 mcr.microsoft.com/azure-storage/azurite
docker run -p 10000:10000 -p 10004:10004 -p 10001:10001 -p 10002:10002 mcr.microsoft.com/azure-storage/azurite
```

`-p 10000:10000` will expose blob service's default listening port.
`-p 10004:10004` will expose DFS service's default listening port.
`-p 10001:10001` will expose queue service's default listening port.
`-p 10002:10002` will expose table service's default listening port.

Or just run blob service:

```bash
docker run -p 10000:10000 mcr.microsoft.com/azure-storage/azurite azurite-blob --blobHost 0.0.0.0
docker run -p 10000:10000 -p 10004:10004 mcr.microsoft.com/azure-storage/azurite azurite-blob --blobHost 0.0.0.0 --dfsHost 0.0.0.0
```

#### Run Azurite V3 docker image with customized persisted data location
@@ -317,6 +320,7 @@ You can customize the listening address per your requirements.

```cmd
--blobHost 127.0.0.1
--dfsHost 127.0.0.1
--queueHost 127.0.0.1
--tableHost 127.0.0.1
```
@@ -325,13 +329,14 @@ You can customize the listening address per your requirements.

```cmd
--blobHost 0.0.0.0
--dfsHost 0.0.0.0
--queueHost 0.0.0.0
--tableHost 0.0.0.0
```

### Listening Port Configuration

Optional. By default, Azurite V3 will listen to 10000 as blob service port, and 10001 as queue service port, and 10002 as the table service port.
Optional. By default, Azurite V3 will listen to 10000 as the blob service port, 10004 as the DFS service port, 10001 as the queue service port, and 10002 as the table service port.
You can customize the listening port per your requirements.

> Warning: After using a customized port, you need to update connection string or configurations correspondingly in your Storage Tools or SDKs.
@@ -341,6 +346,7 @@ You can customize the listening port per your requirements.

```cmd
--blobPort 8888
--dfsPort 8889
--queuePort 9999
--tablePort 11111
```
@@ -349,6 +355,7 @@ You can customize the listening port per your requirements.

```cmd
--blobPort 0
--dfsPort 0
--queuePort 0
--tablePort 0
```
187 changes: 187 additions & 0 deletions docs/designs/ADLS-gen2-parity.md
@@ -0,0 +1,187 @@
# ADLS Gen2 Parity Implementation Plan

## Context

Azurite currently has a **thin DFS proxy layer** (port 10004) that translates a small subset of ADLS Gen2 DFS REST API calls to Blob REST API calls via HTTP proxying (axios). This covers only filesystem (container) create/delete/HEAD and account listing. Full ADLS Gen2 parity requires native support for path (file/directory) operations, the append-then-flush write pattern, rename/move, ACLs, and list paths — none of which can be achieved by simple query-parameter rewriting.

## Architectural Decision: Hybrid (Native DFS Handlers + Shared Stores)

Replace the HTTP proxy with a **native Express pipeline** in the DFS server that directly accesses `IBlobMetadataStore` and `IExtentStore` — the same store instances used by the blob server.

```
Port 10000 (Blob API) → Blob Handlers → IBlobMetadataStore + IExtentStore
Port 10004 (DFS API) → DFS Handlers → same IBlobMetadataStore + IExtentStore
```

**Why not keep proxying?** DFS operations like List Paths, Create Directory, Rename, ACLs, and append-then-flush have no single blob API equivalent. Proxying would require multi-call orchestration, lose atomicity, and add latency.

### Directory Model

Directories are stored as **zero-length BlockBlobs with `hdi_isfolder=true` metadata** — matching Azure's real internal behavior. No separate table is needed.
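As a sketch of this model (hypothetical types and an in-memory map standing in for Azurite's `IBlobMetadataStore`), a directory create reduces to writing a zero-length blob record carrying the marker metadata:

```typescript
// Illustrative only — not Azurite's actual store code.
interface BlobRecord {
  name: string;
  contentLength: number;
  metadata: Record<string, string>;
}

const store = new Map<string, BlobRecord>();

function createDirectoryMarker(name: string): BlobRecord {
  const record: BlobRecord = {
    name,
    contentLength: 0, // zero-length BlockBlob
    metadata: { hdi_isfolder: "true" }, // directory marker
  };
  store.set(name, record);
  return record;
}

function isDirectory(record: BlobRecord): boolean {
  return record.metadata["hdi_isfolder"] === "true";
}
```

Because the marker is an ordinary blob, directories created via DFS are immediately visible through the Blob API as zero-length blobs.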

### ACL Storage

New fields on `BlobModel`: `dfsAclOwner`, `dfsAclGroup`, `dfsAclPermissions`, `dfsAcl`. LokiJS is schemaless (just add fields); SQL needs ALTER TABLE.

---

## Phase 0: Foundation — Shared Store Access & HNS Flag

**Goal:** Wire DFS server to share stores with blob server; enable HNS mode.

| File | Change |
|------|--------|
| `src/blob/utils/constants.ts` | Set `EMULATOR_ACCOUNT_ISHIERARCHICALNAMESPACEENABLED = true` (or make configurable) |
| `src/blob/DfsProxyServer.ts` → rename to `DfsServer.ts` | Accept `IBlobMetadataStore` + `IExtentStore` in constructor |
| `src/blob/DfsProxyConfiguration.ts` → rename to `DfsConfiguration.ts` | Remove upstream host/port fields (no longer proxying) |
| `src/blob/BlobServer.ts` | Expose `metadataStore` and `extentStore` via public getters |
| `src/azurite.ts` | Pass shared stores to both BlobServer and DfsServer |
| `src/blob/main.ts` | Same wiring for standalone blob+dfs mode |
| `src/blob/DfsRequestListenerFactory.ts` | Rewrite: replace axios proxy with native Express pipeline + DFS routing |
| `src/blob/IBlobEnvironment.ts`, `BlobEnvironment.ts`, `src/common/Environment.ts`, `VSCEnvironment.ts` | Add `--enableHierarchicalNamespace` option |

**Deliverable:** DFS server starts, shares data with blob, existing filesystem tests pass via direct store access.

---

## Phase 1: Path CRUD + List Paths

**Goal:** Create/delete/read files and directories, list paths — the core operations most ADLS Gen2 SDKs depend on.

### New files to create

| File | Purpose |
|------|---------|
| `src/blob/dfs/DfsContext.ts` | DFS request context (account, filesystem, path) — analogous to `BlobStorageContext` |
| `src/blob/dfs/DfsOperation.ts` | Enum of DFS operations for dispatch |
| `src/blob/dfs/DfsDispatchMiddleware.ts` | Routes requests by `resource` param, `action` param, method, and headers |
| `src/blob/dfs/DfsErrorFactory.ts` | JSON error responses (`PathNotFound`, `DirectoryNotEmpty`, etc.) |
| `src/blob/dfs/DfsSerializer.ts` | JSON response serialization (DFS uses JSON, not XML) |
| `src/blob/dfs/handlers/FilesystemHandler.ts` | Filesystem ops → container store operations |
| `src/blob/dfs/handlers/PathHandler.ts` | Path create/delete/read/getProperties + listPaths |

### Operations implemented

- **Create Path** (`PUT ?resource=file|directory`): Creates zero-length BlockBlob; directories get `hdi_isfolder=true` metadata; auto-creates intermediate directories
- **Delete Path** (`DELETE`): Files → `deleteBlob()`; directories with `recursive=true` → delete all blobs with prefix; `recursive=false` → 409 if non-empty
- **Get Path Properties** (`HEAD`): Returns `x-ms-resource-type: file|directory` header
- **Read Path** (`GET`): Streams file content via `downloadBlob()` (follows `BlobHandler.download()` pattern)
- **List Paths** (`GET ?resource=filesystem&directory=...&recursive=true|false`): JSON response with `paths` array; uses `listBlobs()` with prefix/delimiter; supports continuation via `x-ms-continuation`
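The dispatch rule that `DfsDispatchMiddleware` implements — routing by HTTP method plus the `resource` and `action` query parameters — can be sketched as follows. The operation names here are illustrative, not the actual `DfsOperation` enum:

```typescript
// Hypothetical sketch of DFS request dispatch; names are illustrative.
type DfsOp =
  | "CreateFile" | "CreateDirectory" | "DeletePath"
  | "GetPathProperties" | "ReadPath" | "ListPaths"
  | "AppendData" | "FlushData" | "Unknown";

function dispatch(
  method: string,
  params: { resource?: string; action?: string }
): DfsOp {
  const { resource, action } = params;
  switch (method) {
    case "PUT":
      if (resource === "file") return "CreateFile";
      if (resource === "directory") return "CreateDirectory";
      return "Unknown";
    case "PATCH":
      if (action === "append") return "AppendData";
      if (action === "flush") return "FlushData";
      return "Unknown";
    case "DELETE":
      return "DeletePath";
    case "HEAD":
      return "GetPathProperties";
    case "GET":
      // ?resource=filesystem lists paths; a bare GET reads file content.
      return resource === "filesystem" ? "ListPaths" : "ReadPath";
    default:
      return "Unknown";
  }
}
```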

### Existing files modified

| File | Change |
|------|--------|
| `src/blob/persistence/IBlobMetadataStore.ts` | Add `dfsResourceType`, ACL fields to `BlobModel` / `IBlobAdditionalProperties` |
| `src/blob/persistence/LokiBlobMetadataStore.ts` | No schema changes needed (schemaless) |
| `src/blob/persistence/SqlBlobMetadataStore.ts` | Add columns: `dfsResourceType`, `dfsAclOwner`, `dfsAclGroup`, `dfsAclPermissions`, `dfsAcl` |

### Tests

Extend `tests/blob/dfsProxy.test.ts`:
- Create file / directory, verify as blob
- Delete file / empty dir / non-empty dir with recursive
- Get properties with `x-ms-resource-type`
- Read file content
- List paths recursive and non-recursive
- Cross-API: create via DFS → read via Blob API and vice versa

---

## Phase 2: Append-Flush Write Pattern

**Goal:** Implement the DFS file write model (create empty → append chunks → flush to commit).

### Key insight

DFS append-then-flush maps directly to existing **BlockBlob uncommitted blocks** infrastructure: each `action=append` becomes a `stageBlock()`, and `action=flush` becomes `commitBlockList()`. No new persistence methods needed.

### Changes to `src/blob/dfs/handlers/PathHandler.ts`

- **`updatePath_Append(position, body)`**: Write body to `IExtentStore` as extent chunk; record as uncommitted block via `metadataStore.stageBlock()`; validate `position` matches current append offset; return 202
- **`updatePath_Flush(position, close)`**: Commit all staged blocks via `metadataStore.commitBlockList()`; update content length to `position`; return 200 with updated ETag
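A minimal in-memory sketch of this write model (illustrative only — the real implementation goes through `IExtentStore` plus `stageBlock`/`commitBlockList`): each append is accepted only at the current offset, and flush concatenates the staged chunks into the committed content.

```typescript
// Hypothetical stand-in for the append-then-flush session state.
class FileWriteSession {
  private staged: Buffer[] = [];
  private appendOffset = 0;
  committed: Buffer | null = null;

  append(position: number, body: Buffer): void {
    // Out-of-order appends are rejected (the design maps this to a 400).
    if (position !== this.appendOffset) {
      throw new Error("InvalidAppendPosition");
    }
    this.staged.push(body); // analogue of stageBlock()
    this.appendOffset += body.length;
  }

  flush(position: number): number {
    if (position !== this.appendOffset) {
      throw new Error("InvalidFlushPosition");
    }
    this.committed = Buffer.concat(this.staged); // analogue of commitBlockList()
    this.staged = [];
    return this.committed.length;
  }
}
```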

### Tests

- Create → append 3 chunks → flush → read back, verify content
- Append with wrong position → 400
- Large file (multi-MB) append

---

## Phase 3: Rename/Move Path

**Goal:** Atomic rename for files and directories.

### New persistence methods

| Method | Description |
|--------|-------------|
| `IBlobMetadataStore.renameBlob(src, dest)` | Atomic rename of single blob (metadata-only, no extent copy) |
| `IBlobMetadataStore.renameBlobsByPrefix(srcPrefix, destPrefix)` | Atomic rename of all blobs matching prefix (for directory rename) |

### PathHandler addition

- **`renamePath(x-ms-rename-source)`**: Parse source header → for files: `renameBlob()`; for directories: `renameBlobsByPrefix()`. Supports cross-filesystem rename and conditional headers.

### Persistence implementations

- **LokiJS**: Update document `containerName` and `name` properties
- **SQL**: `UPDATE ... SET name = REPLACE(name, oldPrefix, newPrefix) WHERE name LIKE 'prefix%'` in transaction
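The directory-rename semantics can be illustrated with an in-memory analogue of the prefix update (the SQL variant would wrap its `UPDATE ... REPLACE` in a single transaction; a `Map` plays the role of the blob table here):

```typescript
// Illustrative prefix rename — metadata-only, no content copy.
function renameByPrefix(
  blobs: Map<string, Uint8Array>,
  srcPrefix: string,
  destPrefix: string
): number {
  let moved = 0;
  // Snapshot entries first so the map can be mutated while iterating.
  for (const [name, data] of Array.from(blobs.entries())) {
    if (!name.startsWith(srcPrefix)) continue;
    blobs.delete(name);
    blobs.set(destPrefix + name.slice(srcPrefix.length), data);
    moved++;
  }
  return moved;
}
```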

### Tests

- Rename file within filesystem / across filesystems
- Rename directory (verify children moved)
- Rename non-existent → 404
- Rename with conditional headers

---

## Phase 4: ACL Operations

**Goal:** POSIX ACL get/set for emulator parity.

### PathHandler additions

- **`getAccessControl()`**: Read ACL fields from blob record → return as `x-ms-owner`, `x-ms-group`, `x-ms-permissions`, `x-ms-acl` headers. Defaults: `$superuser`/`$superuser`/`rwxr-x---`
- **`setAccessControl(owner, group, permissions, acl)`**: Validate ACL format → update blob record
- **`setAccessControlRecursive(mode, acl)`**: `mode` = set|modify|remove; iterate blobs under prefix; support continuation; return JSON with `directoriesSuccessful`, `filesSuccessful`, `failureCount`
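A hedged sketch of parsing the `x-ms-acl` header these operations exchange — comma-separated `type:qualifier:permissions` entries, optionally prefixed with `default:`. This is simplified for illustration and omits the stricter validation a real implementation would need:

```typescript
// Hypothetical ACL parsing helper, not Azurite's implementation.
interface AclEntry {
  scope: "access" | "default";
  type: string;        // user | group | other | mask
  qualifier: string;   // object ID; empty for the owning user/group
  permissions: string; // e.g. "rwx", "r-x"
}

function parseAcl(header: string): AclEntry[] {
  return header.split(",").map((raw) => {
    let parts = raw.trim().split(":");
    let scope: AclEntry["scope"] = "access";
    if (parts.length === 4 && parts[0] === "default") {
      scope = "default";
      parts = parts.slice(1);
    }
    if (parts.length !== 3) throw new Error(`InvalidAclEntry: ${raw}`);
    const [type, qualifier, permissions] = parts;
    if (!/^[r-][w-][x-]$/.test(permissions)) {
      throw new Error(`InvalidAclPermissions: ${permissions}`);
    }
    return { scope, type, qualifier, permissions };
  });
}
```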

### Tests

- Set/get ACL on file and directory
- Recursive ACL set on directory tree
- Default ACL values on new paths

---

## Phase 5: Polish & Remaining Operations

- **Set Filesystem Properties** (`PATCH ?resource=filesystem`) → `setContainerMetadata()`
- **`x-ms-properties` encoding/decoding** — new `src/blob/dfs/DfsPropertyEncoding.ts` utility (base64 key=value pairs)
- **DFS JSON error format**: `{"error":{"code":"...","message":"..."}}`
- **Lease support** on DFS paths (reuse blob lease infrastructure)
- **SAS validation** on DFS endpoints (reuse existing authenticators)
- **Content-MD5/CRC64 validation** on append
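The `x-ms-properties` encoding mentioned above can be sketched as follows, assuming (as the design states) comma-separated `key=value` pairs with base64-encoded values:

```typescript
// Sketch of the proposed DfsPropertyEncoding utility; hypothetical names.
function encodeProperties(props: Record<string, string>): string {
  return Object.entries(props)
    .map(([k, v]) => `${k}=${Buffer.from(v, "utf8").toString("base64")}`)
    .join(",");
}

function decodeProperties(header: string): Record<string, string> {
  const out: Record<string, string> = {};
  if (header.trim() === "") return out;
  for (const pair of header.split(",")) {
    const i = pair.indexOf("=");
    if (i < 0) throw new Error("InvalidProperties");
    const key = pair.slice(0, i);
    out[key] = Buffer.from(pair.slice(i + 1), "base64").toString("utf8");
  }
  return out;
}
```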

---

## Verification Plan

1. **Unit tests**: Extend `tests/blob/dfsProxy.test.ts` per phase
2. **Cross-API tests**: Verify DFS-created data is visible via Blob API and vice versa
3. **SDK integration**: Test with `@azure/storage-file-datalake` Node.js SDK against the emulator
4. **Manual smoke test**: Run Azurite, use Azure Storage Explorer with DFS endpoint
5. **Existing blob tests**: Ensure `npm test` still passes (no regression)

---

## Critical Reference Files

- `src/blob/handlers/ContainerHandler.ts` — pattern for handler ↔ store interaction
- `src/blob/handlers/BlockBlobHandler.ts` — `stageBlock`/`commitBlockList` for append-flush reuse
- `src/blob/handlers/BlobHandler.ts` — `download()` pattern for Read Path
- `src/blob/persistence/IBlobMetadataStore.ts` — store interface to extend
- `src/blob/generated/handlers/` — handler interface patterns
- `src/blob/middlewares/blobStorageContext.middleware.ts` — context extraction pattern for DfsContext
10 changes: 10 additions & 0 deletions package.json
@@ -208,6 +208,16 @@
"default": 10000,
"description": "Blob service listening port, by default 10000"
},
"azurite.dfsHost": {
"type": "string",
"default": "127.0.0.1",
"description": "DFS service listening endpoint, by default 127.0.0.1"
},
"azurite.dfsPort": {
"type": "number",
"default": 10004,
"description": "DFS service listening port, by default 10004"
},
"azurite.blobKeepAliveTimeout": {
"type": "number",
"default": 5,
42 changes: 39 additions & 3 deletions src/azurite.ts
@@ -18,6 +18,8 @@ import {
} from "./queue/utils/constants";
import SqlBlobServer from "./blob/SqlBlobServer";
import BlobServer from "./blob/BlobServer";
import DfsServer from "./blob/DfsServer";
import DfsConfiguration from "./blob/DfsConfiguration";

import TableConfiguration from "./table/TableConfiguration";
import TableServer from "./table/TableServer";
@@ -30,11 +32,14 @@ import { AzuriteTelemetryClient } from "./common/Telemetry";

function shutdown(
blobServer: BlobServer | SqlBlobServer,
dfsServer: DfsServer,
queueServer: QueueServer,
tableServer: TableServer
) {
const blobBeforeCloseMessage = `Azurite Blob service is closing...`;
const blobAfterCloseMessage = `Azurite Blob service successfully closed`;
const dfsBeforeCloseMessage = `Azurite DFS service is closing...`;
const dfsAfterCloseMessage = `Azurite DFS service successfully closed`;
const queueBeforeCloseMessage = `Azurite Queue service is closing...`;
const queueAfterCloseMessage = `Azurite Queue service successfully closed`;
const tableBeforeCloseMessage = `Azurite Table service is closing...`;
@@ -47,6 +52,11 @@ function shutdown(
console.log(blobAfterCloseMessage);
});

console.log(dfsBeforeCloseMessage);
dfsServer.close().then(() => {
console.log(dfsAfterCloseMessage);
});

console.log(queueBeforeCloseMessage);
queueServer.close().then(() => {
console.log(queueAfterCloseMessage);
@@ -79,6 +89,24 @@ async function main() {
const blobServerFactory = new BlobServerFactory();
const blobServer = await blobServerFactory.createServer(env);
const blobConfig = blobServer.config;
const dfsConfig = new DfsConfiguration(
env.dfsHost(),
env.dfsPort(),
env.blobKeepAliveTimeout(),
env.cert(),
env.key(),
env.pwd()
);
const blobServerAny = blobServer as any;
const enableHns = env.enableHierarchicalNamespace();
const dfsServer = new DfsServer(
dfsConfig,
blobServerAny.metadataStore,
blobServerAny.extentStore,
blobServerAny.accountDataStore,
undefined,
enableHns
);

// TODO: Align with blob DEFAULT_BLOB_PERSISTENCE_ARRAY
// TODO: Join for all paths in the array
@@ -150,6 +178,14 @@ async function main() {
`Azurite Blob service is successfully listening at ${blobServer.getHttpServerAddress()}`
);

console.log(
`Azurite DFS service is starting at ${dfsConfig.getHttpServerAddress()}`
);
await dfsServer.start();
console.log(
`Azurite DFS service is successfully listening at ${dfsServer.getHttpServerAddress()}`
);

// Start server
console.log(
`Azurite Queue service is starting at ${queueConfig.getHttpServerAddress()}`
@@ -175,11 +211,11 @@
process
.once("message", (msg) => {
if (msg === "shutdown") {
shutdown(blobServer, queueServer, tableServer);
shutdown(blobServer, dfsServer, queueServer, tableServer);
}
})
.once("SIGINT", () => shutdown(blobServer, queueServer, tableServer))
.once("SIGTERM", () => shutdown(blobServer, queueServer, tableServer));
.once("SIGINT", () => shutdown(blobServer, dfsServer, queueServer, tableServer))
.once("SIGTERM", () => shutdown(blobServer, dfsServer, queueServer, tableServer));
}

main().catch((err) => {