Skip to content

Add Azure Blob Storage Gen2 (ADLS HNS) support to BlobFileStore#19014

Open
Skrypt wants to merge 25 commits intomainfrom
skrypt/media-azure-gen2
Open

Add Azure Blob Storage Gen2 (ADLS HNS) support to BlobFileStore#19014
Skrypt wants to merge 25 commits intomainfrom
skrypt/media-azure-gen2

Conversation

@Skrypt
Copy link
Copy Markdown
Contributor

@Skrypt Skrypt commented Mar 16, 2026

Make BlobFileStore adaptive: it now auto-detects whether the storage account has Hierarchical Namespace (HNS) enabled and uses native DataLake APIs for atomic moves, real directories, and efficient listing when available. Falls back to standard flat-namespace blob operations otherwise. HNS detection can be overridden via configuration.

All operations go through separate, unified API endpoints that adapt server-side

The Vue app calls the same endpoints regardless of storage backend:

Method Gen2 (HNS) path Gen1 (flat) path
GetDirectoryInfoAsync DataLakeFileSystemClient.GetDirectoryClient.ExistsAsync blob hierarchy listing
GetDirectoryContentFlatAsync reads hdi_isfolder metadata infers dirs from blob paths
TryCreateDirectoryAsync DataLakeDirectoryClient.CreateIfNotExistsAsync creates marker file
TryDeleteDirectoryAsync DataLakeDirectoryClient.DeleteAsync(recursive: true) iterates and deletes blobs individually
MoveFileAsync DataLakeFileClient.RenameAsync (atomic) CopyFileAsync + TryDeleteFileAsync

TODO

Make BlobFileStore adaptive: it now auto-detects whether the storage
account has Hierarchical Namespace (HNS) enabled and uses native
DataLake APIs for atomic moves, real directories, and efficient listing
when available. Falls back to standard flat-namespace blob operations
otherwise. HNS detection can be overridden via configuration.

- Add IFileStoreCapabilities interface and FileStoreCapabilities impl
- Add GetFilesAsync, GetDirectoriesAsync, Capabilities to IFileStore
- Add UseHierarchicalNamespace option to BlobStorageOptions
- Add Azure.Storage.Files.DataLake package dependency
- Add MediaCreatedFileAsync and MediaCopiedFileAsync event hooks
- Wire up EnsureCapabilitiesAsync at startup in Media.Azure module
@Skrypt Skrypt marked this pull request as draft March 16, 2026 17:54
Skrypt added 2 commits March 16, 2026 15:07
Add integration tests that run against Azurite to verify BlobFileStore
behavior in both flat-namespace (Gen1) and hierarchical-namespace (Gen2)
modes. Tests cover capabilities detection, file CRUD, directory
operations, move/copy, and directory content listing.

Gen2 tests that use DataLake APIs will fail until Azurite supports the
DFS endpoint (--dfsPort flag). This is intentional to track when a
newer Azurite version enables full Gen2 testing.

- Add abstract BlobFileStoreTestsBase with 26 test methods
- Add BlobFileStoreGen1Tests and BlobFileStoreGen2Tests subclasses
- Add AzuriteFactAttribute to skip tests when Azurite is unavailable
- Add OrchardCore.FileStorage.AzureBlob reference to test project
- Start Azurite in main_ci and pr_ci workflows on ubuntu runners
Add DfsEndpoint option to BlobStorageOptions for local emulators where
the DFS endpoint runs on a separate port. Parse storage credentials
from the connection string to construct the DataLakeServiceClient when
an explicit DFS endpoint is configured.

Fix Gen2 code paths to handle 404 RequestFailedException from
DataLakePathClient.ExistsAsync, which can throw instead of returning
false when the path does not exist.
Skrypt added 3 commits March 18, 2026 04:11
Reverted both workflows back to docker run steps (instead of services) since that lets us pass --skipApiVersionCheck directly. The command now uses your image with the correct flags:
@Skrypt Skrypt requested a review from Piedone March 19, 2026 20:46
@Piedone
Copy link
Copy Markdown
Member

Piedone commented Mar 21, 2026

I'll try to review this over the weekend.

Copy link
Copy Markdown
Member

@Piedone Piedone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will #14256 use the new APIs for better UX?

Comment thread test/OrchardCore.Tests/OrchardCore.Tests.csproj Outdated
Comment thread src/OrchardCore.Modules/OrchardCore.Media.Azure/Startup.cs Outdated
Comment thread src/OrchardCore/OrchardCore.FileStorage.AzureBlob/BlobFileStore.cs Outdated
Comment thread src/OrchardCore/OrchardCore.FileStorage.Abstractions/IFileStore.cs Outdated
Comment thread src/OrchardCore/OrchardCore.FileStorage.Abstractions/IFileStoreCapabilities.cs Outdated
Skrypt added 3 commits March 28, 2026 01:34
…dia startup

Move BlobFileStore.EnsureCapabilitiesAsync() from the DI singleton factory
(where it blocked via GetAwaiter().GetResult()) into
MediaBlobContainerTenantEvents.ActivatingAsync(), which already runs
asynchronously at tenant startup. BlobFileStore is now registered as its
own singleton so it can be injected.

Also remove the unused StorageProvider member from IFileStoreCapabilities
and FileStoreCapabilities, and remove the default implementation of
IFileStore.Capabilities so every provider must declare capabilities
explicitly. FileSystemStore and AwsFileStore updated accordingly.
private readonly ILogger<FileSystemStore> _logger;
private readonly string _fileSystemPath;

public IFileStoreCapabilities Capabilities { get; } = new FileStoreCapabilities(hasHierarchicalNamespace: false, supportsAtomicMove: false);
Copy link
Copy Markdown
Contributor Author

@Skrypt Skrypt Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm changing this configuration to report that the local FileStorage uses HierarchicalNamespace and supports atomic move. It technically should not break any custom implementations.

Skrypt added 5 commits March 29, 2026 12:19
…lesystem behavior

FileSystemStore was configured with hasHierarchicalNamespace: false and
supportsAtomicMove: false, which does not match its actual implementation.
The local filesystem uses real Directory.* APIs for first-class directory
operations and File.Move() which maps to an atomic OS rename syscall.
Updated both flags to true to align with the implementation and be
consistent with how BlobFileStore Gen2 reports the same native behaviors.
…tification

Add a StorageName default interface member to IFileStore that returns a
human-readable name for the underlying storage backend. Implementations:
FileSystemStore ("Local"), BlobFileStore ("Azure Blob (Gen1)"/"Azure Blob (Gen2)"
based on detected HNS), AwsFileStore ("Amazon S3"). DefaultMediaFileStore
delegates to the inner store. This enables the media UI to display which storage
provider is active at runtime, including distinguishing Azure Gen1 from Gen2.
…tegration project

Extract BlobFileStore Gen1/Gen2 tests from OrchardCore.Tests into a new OrchardCore.Tests.Integration project. This separates integration tests that require external services (Azurite) from unit tests, avoiding unnecessary Docker container startup and extra execution time on every PR commit. The new project is built and run by a dedicated integration_tests.yml workflow that only triggers on review submission, push to main/release, or manual dispatch — matching the same pattern as functional_all_db.yml. Removed Azurite setup from pr_ci.yml and main_ci.yml.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 2, 2026

This pull request has merge conflicts. Please resolve those before requesting a review.

@Skrypt
Copy link
Copy Markdown
Contributor Author

Skrypt commented Apr 2, 2026

@Piedone Same here, moved everything to a new project and also new GH workflow.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 7, 2026

This pull request has merge conflicts. Please resolve those before requesting a review.

# Conflicts:
#	.github/workflows/integration_tests.yml
#	test/OrchardCore.Tests.Integration/OrchardCore.Tests.Integration.csproj
@hishamco
Copy link
Copy Markdown
Member

As we discussed in the latest steering meeting to utilize TestContainers, this closed in favor of #19162

@hishamco hishamco closed this Apr 17, 2026
@Skrypt Skrypt reopened this Apr 17, 2026
@Skrypt Skrypt marked this pull request as ready for review April 30, 2026 16:01
Comment thread src/OrchardCore/OrchardCore.FileStorage.Abstractions/IFileStore.cs Outdated
Comment thread src/OrchardCore/OrchardCore.FileStorage.Abstractions/IFileStore.cs Outdated
/// Default implementation filters <see cref="GetDirectoryContentAsync"/>. Implementations
/// may override this to avoid enumerating directories for better performance.
/// </remarks>
async IAsyncEnumerable<IFileStoreEntry> GetFilesAsync(string path = null)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is supposed to be optimized in the implementations. For instance if a storage contains 1M files and 1 folder , we can now get the folder without listing the files.

However this is not implemented in concrete implementations.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the implementation details that are different in a table on the PR description. It will be easier for you to review that way.

Comment thread src/OrchardCore/OrchardCore.FileStorage.AzureBlob/BlobFileStore.cs Outdated
/// Without this override there is no way to use HNS-dependent features (atomic moves, real
/// directory operations) in either of those scenarios.
/// </remarks>
public bool? UseHierarchicalNamespace { get; set; }
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not be necessary. The BlobStore already checks the API availability and falls back to flat namespace. Is also uses this flag but the user shouldn't care.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's still needed. The logic is that it is possible that the GetAccountInfo fails to return the storage type "Gen1 or Gen2". Then it is a fallback that will look into the BlobStorageOptions appsettings file to fallback to the value set manually.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue would be if the BlobStore would fail to retrieve the Gen type when doing GetAccountInfo because of account security. Then it would fall back to Gen1 for a Gen2 which would be really bad. Either we keep this fallback option here or we remove the automatic discovery of Gen type.

Comment thread src/OrchardCore/OrchardCore.FileStorage.AzureBlob/BlobFileStore.cs Outdated
@Skrypt
Copy link
Copy Markdown
Contributor Author

Skrypt commented Apr 30, 2026

@copilot can you take a look at @sebastienros review / comments and check why we don't have anything specific about the IFileStoreCapabilities since we don't seem to make use of it in this PR.

Skrypt and others added 7 commits April 30, 2026 15:11
Not necessary for this PR.
The Azurite fork now serves DFS requests on the same port as blob,
matching how Azure Blob Storage works in production (single endpoint,
different hostnames). DataLakeServiceClient is created directly from
the connection string, eliminating the need for a separate DfsEndpoint.

- Remove DfsEndpoint from BlobStorageOptions and BlobFileStore
- Remove ParseCredentialsFromConnectionString (no longer needed)
- Update integration tests: auto-detect HNS via GetAccountInfo for Gen2,
  use UseHierarchicalNamespaceOverride=false only for Gen1
- Update CI: single port per Azurite container, no separate DFS port

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…string

Gen1 tests use UseHierarchicalNamespaceOverride=false against an HNS-enabled
Azurite, which triggers the expected warning and forces flat-namespace code paths.
Gen2 tests rely on auto-detection. No separate port or connection string needed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Gen2 tests now create their containers using DataLakeFileSystemClient,
which sends x-ms-namespace-enabled: true and stamps the container as
an HNS filesystem — matching how Gen2 filesystems are created in
production Azure. Gen1 tests continue using BlobContainerClient.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Gen1: TryCreateDirectoryAsync writes a marker blob (OrchardCore.Media.txt)
      to simulate directories in flat namespace storage.
Gen2: TryCreateDirectoryAsync uses the DataLake API to create a real
      directory object — no marker blob is written.

These tests assert the raw blob listing directly, proving Azurite is
routing to the correct pipeline for each storage generation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Skrypt
Copy link
Copy Markdown
Contributor Author

Skrypt commented Apr 30, 2026

I updated the Azurite ADLS pull request and the related docker image to clean up this PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Skrypt
Copy link
Copy Markdown
Contributor Author

Skrypt commented May 1, 2026

@sebastienros I'm done with changes. Mostly needed to fix the Azurite emulator to simplify the code here. The remaining HierarchicalNamespace fields are there as a fallback for when the automatic discovery fails. This can happen when using a connection string to a specific container and that the storage account is protected. The GetAccountInfo() Azure blob storage / Azurite function could return a 403 since the user account would not be allowed to do a GetAccountInfo() to the root of the storage. Something like it ... keeping the fields was suggested by AI for that matter. Maybe it is completely wrong but we would need to do tests with a real Azure Blob Storage and see how it behaves with that use case.

I don't like needing an override from appsettings like this but that's a fallback because we have auto-discovery of the Gen type. I may investigate further when I get time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants