Port to FUSE 3, performance features, test suite, CMake build, and bug fixes #603

Open. matejk wants to merge 35 commits into LinearTapeFileSystem:main from matejk:fuse3-port

matejk commented Jun 11, 2026

Summary

Port of LTFS to libfuse 3 with a compile-time libfuse 2 fallback, an integration test suite, a CMake build coexisting with autotools, and a series of independently committed bug fixes found by review and static analysis during the work (35 commits, based on main).

FUSE 3 support

  • Builds against fuse3 (>= 3.4) by default on Linux, with automatic fallback to fuse2; --with-fuse2 / -DLTFS_WITH_FUSE2=ON forces the legacy API (still the default on macOS and the BSDs).
  • Handlers use FUSE 3 signatures: getattr/truncate absorb the handle variants, rename handles RENAME_NOREPLACE/RENAME_EXCHANGE, use_ino/hard_remove/nullpath_ok move into fuse_config, and FUSE_CAP_ASYNC_READ is cleared to keep tape reads ordered (-o sync_read no longer exists).

Performance

  • 1 MiB FUSE requests (-o max_write=N): measured 1 MiB writes and O_DIRECT reads reaching the daemon, vs 128 KiB on fuse2.
  • -o direct_io: all I/O bypasses the kernel page cache; ~70 % higher sequential write throughput on the file backend and a flat page cache during large streaming jobs.
  • readdirplus: ls -l over 100 files needs 2 getattr requests instead of 102 (effective with libfuse >= 3.17).

Tests and CI

  • 14 integration tests against the file tape backend (no hardware needed), run by both make check and ctest; a Docker wrapper covers non-Linux hosts.
  • CI matrix builds autotools and CMake, each with fuse3 and fuse2; deprecated actions updated (checkout v6, codeql-action v4, create-pull-request v8).

CMake build

  • Coexists with autotools. Shared single sources of truth: the package version (VERSION file) and the message-bundle compile script. ICU is found via CMake's module instead of the removed icu-config. Plugin names and install layout match autotools.

Bug fixes (one commit each)

  • Data loss on writes larger than the tape block size: the unified scheduler stored one block but reported the full count. Latent upstream; activated by 1 MiB requests. Regression test included.
  • arch_open passed the Windows share flag as the open(2) mode, creating write-only (0200) files; the file backend was unusable as a non-root user.
  • Malformed-index hardening (untrusted tape data): NULL dereference on an empty WORM xattr value, unbounded directory recursion, unchecked realloc, missing malloc check, NULL XML tag name.
  • Release of unheld meta_lock in ltfs_fsops_unlink/rename WORM error paths; double free of the new name on a late rename failure.
  • Drive encryption detection treated MODE SENSE success as failure, so is_ame/sg_set_key could never succeed.
  • Inverted exit status of ltfs --device-list.
  • Reads of uninitialized values (pews, profiler return, dump transfer length); use of unset ICU output pointers after failed normalization; realloc leak in the simple KMI.
  • NULL-path crash in chmod/utimens under nullpath_ok; pthread_yield replaced with sched_yield; stray 2> /dev/null token in the SNMP link flags that silently discarded linker errors; parallel_direct_writes feature-detected instead of version-checked (bitfield, libfuse >= 3.15).

Verification

All 14 tests pass in the four matrix configurations on Ubuntu 24.04 (libfuse 3.14) and 25.10 (libfuse 3.17); the CMake install tree matches the autotools layout; ltfs --version reports the identical version string from both builds.

Issues addressed

Fixes #465
Fixes #521
Fixes #591

Related: #498 (the MODE SENSE fix repairs encryption setup on supported
drives, but the LTO capability rejection reported there is unchanged)

matejk added 30 commits June 11, 2026 10:47

Shell-based tests driven by automake (make check) covering mount and
unmount, data roundtrips, truncate, rename, directory operations,
xattrs, symlinks, large directories, and a tar roundtrip. Tests mount
a file-backend volume from the build tree, so no installation or tape
hardware is needed; they are skipped on hosts without /dev/fuse.

tests/run-in-docker.sh builds and runs the suite in an Ubuntu
container for development on non-Linux hosts.

This establishes a regression baseline for the upcoming FUSE 3 port.

The POSIX arch_open macro passed the Windows-style share flag as the
mode argument of open(2) and ignored the actual permission argument.
Files created by the file tape backend got mode 0200 (write-only), so
a non-root user could not reopen records it had just written; mounting
a freshly formatted file-backend volume failed with EDEV_RW_PERM.
Running as root masked the problem.
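
The effect of the mode argument can be reproduced outside LTFS. The sketch below is not the arch_open macro itself; create_and_get_mode is a hypothetical helper that creates a file with a given mode argument and reports the permission bits the file actually received.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of LTFS): create `path` with the given
 * mode argument and return the permission bits of the new file, or -1
 * on error. The file is removed again before returning. */
static int create_and_get_mode(const char *path, mode_t mode)
{
    struct stat st;
    mode_t old_umask;
    int fd, ret;

    assert(path != NULL);
    unlink(path);               /* O_CREAT applies the mode only to a new file */
    old_umask = umask(0);       /* keep the result independent of the caller's umask */
    fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, mode);
    umask(old_umask);
    if (fd < 0)
        return -1;
    ret = (fstat(fd, &st) == 0) ? (int)(st.st_mode & 0777) : -1;
    close(fd);
    unlink(path);
    return ret;
}
```

Passing 0200 as the mode yields a write-only file, which the creating user cannot reopen for reading unless it runs as root; passing the intended permission argument (for example 0644) yields a file that can be read back.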
configure builds against fuse3 (>= 3.4.0) by default on Linux;
--with-fuse2 selects the libfuse 2 API and remains the default on
macOS, FreeBSD, and NetBSD. FUSE_USE_VERSION becomes 31 for fuse3
builds.

API changes for FUSE 3:
- getattr/truncate absorb fgetattr/ftruncate via the fuse_file_info
  argument; chmod, chown, and utimens gain the argument.
- rename handles flags: RENAME_NOREPLACE returns EEXIST when the
  target exists, RENAME_EXCHANGE is rejected with EINVAL.
- readdir and the directory filler gain flag arguments.
- init receives struct fuse_config; use_ino, hard_remove, and
  nullpath_ok move there from mount options.
- FUSE_CAP_ASYNC_READ is cleared in init to keep tape reads ordered;
  FUSE 3 removed the -o sync_read option and enables asynchronous
  reads by default.
- big_writes is gone (always enabled); fuse_parse_cmdline uses
  struct fuse_cmdline_opts.

The fuse2 code paths are unchanged and selected with --with-fuse2.
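
The rename flag rules listed above can be sketched as a small decision function. This is an illustration only: check_rename_flags is a hypothetical name, not the handler in this branch, and the flag values are the ones documented for Linux renameat2(2), defined locally so the sketch compiles without kernel headers.

```c
#include <assert.h>
#include <errno.h>

/* Values documented for renameat2(2); defined here only if the
 * including environment has not already provided them. */
#ifndef RENAME_NOREPLACE
#define RENAME_NOREPLACE (1u << 0)
#endif
#ifndef RENAME_EXCHANGE
#define RENAME_EXCHANGE (1u << 1)
#endif

/* Hypothetical pre-check: returns 0 when the rename may proceed,
 * or a negative errno value as FUSE handlers do. */
static int check_rename_flags(unsigned int flags, int target_exists)
{
    assert(target_exists == 0 || target_exists == 1);

    if (flags & RENAME_EXCHANGE)
        return -EINVAL;         /* atomic exchange is not supported */
    if ((flags & RENAME_NOREPLACE) && target_exists)
        return -EEXIST;         /* caller asked not to overwrite */
    return 0;
}
```

A plain rename (flags == 0) onto an existing target still returns 0 here, which matches the classic overwrite semantics of rename(2).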
A small C helper drives renameat2() with RENAME_NOREPLACE and
RENAME_EXCHANGE and ftruncate() on an open descriptor. Inode numbers
must survive a remount, which verifies that the LTFS index UIDs are
passed through (use_ino) on both FUSE versions.

Set conn->max_write in init (libfuse >= 3.6 negotiates the matching
max_pages with the kernel), raising request sizes from 128 KiB to
1 MiB: two default-size tape blocks per kernel round trip. The new
-o max_write option overrides the size for kernels that allow more
via fs.fuse.max_pages_limit.

The request-size test asserts that 512 KiB+ requests reach the daemon
on FUSE 3 builds (measured: 1 MiB writes and O_DIRECT reads; 128 KiB
on FUSE 2 with big_writes).
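
The sizes quoted above can be checked with a short calculation. The 4 KiB page size is an assumption (it is typical on x86-64 but is not stated in this commit), and units_needed is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

#define KIB ((size_t)1024)
#define MIB (1024 * KIB)

/* Number of `unit`-sized pieces needed to cover `size` bytes, rounded up. */
static size_t units_needed(size_t size, size_t unit)
{
    assert(unit > 0);
    return (size + unit - 1) / unit;
}
```

Under that assumption, units_needed(1 * MIB, 4 * KIB) gives 256 pages per request and units_needed(1 * MIB, 512 * KIB) gives the two default-size tape blocks mentioned above, while a 128 KiB request covers 32 pages.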
Sets FOPEN_DIRECT_IO on every open: reads and writes bypass the kernel
page cache and reach the daemon at the application's I/O size (up to
the negotiated maximum), so streaming a large archive does not fill or
churn the host page cache and data is not buffered twice (the LTFS
write pool remains the only buffering layer). On libfuse >= 3.14
parallel_direct_writes is set as well.

The trade-offs are documented in the help text: no mmap on direct-I/O
files and no kernel readahead, so small-block applications pay one
round trip per call. The default behavior is unchanged.

Applications can keep using O_DIRECT on individual files without this
option, as before.

ltfs_fsops_readdir_attr lists a directory and hands each entry's
attributes to the filler, copied from the in-memory index without a
path lookup per entry. The FUSE readdir handler uses it to answer
READDIRPLUS requests with FUSE_FILL_DIR_PLUS, and init clears
FUSE_CAP_READDIRPLUS_AUTO so every listing chunk carries attributes
(the kernel heuristic only requests them for the first chunk, which
left most entries to individual getattr calls).

Measured with the new test: ls -l over 100 files needs 2 getattr
requests on FUSE 3, down from 102 on FUSE 2.

configure now auto-detects: fuse3 is preferred on Linux with a
fallback to fuse2 when the fuse3 development files are absent, so
existing build environments keep working; --with-fuse2 forces the
legacy API. README documents the new prerequisites, mount options,
and the test suite. A new workflow builds both configurations on
Ubuntu and runs make check.

_unified_insert_new_request copies at most one cache block (the tape
block size) but returned the full requested count, so the append loop
in unified_write advanced past data that was never stored. Writes
larger than the block size silently lost everything after the first
block while reporting success.

Latent upstream because libfuse 2 caps requests at 128 KiB, below the
default 512 KiB block. This branch negotiates 1 MiB requests, so a
single large write (e.g. dd bs=1M) corrupts data in the default
configuration. Return the number of bytes actually stored.

The new test writes multi-block single writes of random data and
verifies the content, which the existing tests missed because they
either wrote in block-sized chunks (cp, tar) or did not verify
content (dd from /dev/zero).
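
The shape of the fix can be shown with a simplified model of the append loop. The names below are hypothetical stand-ins, not the scheduler code: insert_one_block plays the role of the per-request insert and append_all the role of the loop in unified_write.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-request insert: copies at most one
 * block into `cache` and returns the number of bytes actually stored. */
static size_t insert_one_block(char *cache, const char *buf, size_t count,
                               size_t block_size)
{
    size_t stored = count < block_size ? count : block_size;

    memcpy(cache, buf, stored);
    return stored;              /* the bug was equivalent to returning `count` */
}

/* Append loop: advance by what was stored, not by what was requested.
 * Returns the total number of bytes stored in `dst`. */
static size_t append_all(char *dst, const char *buf, size_t count,
                         size_t block_size)
{
    size_t done = 0;

    assert(block_size > 0);
    while (done < count)
        done += insert_one_block(dst + done, buf + done, count - done,
                                 block_size);
    return done;
}
```

If insert_one_block returned count instead of stored, the loop would finish after a single iteration with only the first block copied, which is the silent data loss described above.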
_xml_parse_dirtree recurses once per directory level, and the index
reader sets XML_PARSE_HUGE, which removes libxml2's own nesting limit.
A crafted index with deeply nested <directory> elements could recurse
until the C stack overflows while mounting an untrusted cartridge.

Add a depth bound (a stack-safety guard, not an LTFS format limit; the
format defines no maximum depth) set well above any tree that fits in a
conventional PATH_MAX, so it cannot reject a volume produced from a real
filesystem. Introduces LTFS_XML_DEEP_NESTING (5051).
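
A depth guard of this kind can be sketched as follows. The tree type, the function name, and the max_depth parameter are hypothetical; the commit does not state the actual bound, so the sketch takes it as an argument. Only the error code value comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

#define LTFS_XML_DEEP_NESTING 5051   /* error code named in this commit */

/* Hypothetical directory node: first child plus next sibling. */
struct demo_dir {
    struct demo_dir *first_child;
    struct demo_dir *next_sibling;
};

/* Recursive walk that refuses to descend below `max_depth` levels, so a
 * crafted tree cannot exhaust the C stack. Returns 0 or a negative error. */
static int walk_dirtree(const struct demo_dir *dir, unsigned int depth,
                        unsigned int max_depth)
{
    const struct demo_dir *child;
    int ret;

    assert(dir != NULL);
    if (depth > max_depth)
        return -LTFS_XML_DEEP_NESTING;

    for (child = dir->first_child; child != NULL; child = child->next_sibling) {
        ret = walk_dirtree(child, depth + 1, max_depth);
        if (ret < 0)
            return ret;         /* propagate the first error upward */
    }
    return 0;
}
```

The check runs before any per-directory work, so the cost of rejecting a hostile index is bounded by the limit rather than by the nesting depth of the input.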
An <xattr> with the key ltfs.vendor.IBM.immutable or appendonly and an
empty (self-closing) value sets xattr->value to NULL, which was then
passed to strcmp when deciding the WORM flags. A crafted index crashes
the mount. Guard the value before comparing.
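
A minimal sketch of the guard, assuming the flag is decided by comparing the xattr value with an expected string. xattr_flag_is_set is a hypothetical helper, and the expected string is left to the caller because the commit does not say what the value is compared against.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true only when a value is present and equal to
 * `expected`. A NULL value (empty or self-closing element) is treated
 * as "flag not set" instead of being passed to strcmp. */
static bool xattr_flag_is_set(const char *value, const char *expected)
{
    assert(expected != NULL);
    return value != NULL && strcmp(value, expected) == 0;
}
```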
The glob_patterns array was grown with realloc without checking the
result, then dereferenced immediately. On allocation failure the old
pointer leaked and the next write dereferenced NULL. Use a temporary
and fail cleanly.
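
The temporary-pointer pattern looks roughly like this. append_pattern and DEMO_NO_MEMORY are hypothetical names for illustration; the real code grows glob_patterns and uses the LTFS error codes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_NO_MEMORY 1   /* hypothetical error code for this sketch */

/* Append a copy of `pattern` to a growable array of strings. On failure
 * the caller's array and count are left untouched and still valid. */
static int append_pattern(char ***patterns, size_t *count, const char *pattern)
{
    char **tmp;
    char *copy;

    assert(patterns != NULL && count != NULL && pattern != NULL);

    copy = malloc(strlen(pattern) + 1);
    if (copy == NULL)
        return -DEMO_NO_MEMORY;
    strcpy(copy, pattern);

    /* Assign realloc's result to a temporary: on failure it returns NULL
     * and the original block stays allocated, so *patterns must not be
     * overwritten. */
    tmp = realloc(*patterns, (*count + 1) * sizeof(*tmp));
    if (tmp == NULL) {
        free(copy);
        return -DEMO_NO_MEMORY;
    }

    tmp[*count] = copy;
    *patterns = tmp;
    (*count)++;
    return 0;
}
```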
The percent-decode buffer was allocated from an attacker-controlled
name length and written without checking the allocation, so allocation
failure caused a NULL write. Return -LTFS_NO_MEMORY instead.

libxml2 returns a NULL node name for some node types; the loop
condition passed it straight to strcmp. Skip NULL names instead of
dereferencing them while scanning untrusted index XML.

The shared out: path releases parent->meta_lock via
fs_release_dentry_unlocked(), but the lock was only acquired after the
WORM and non-empty-directory checks, so those error paths unlocked a
lock they never held (undefined behaviour; can corrupt the rwlock
state). Acquire parent->meta_lock once before those checks so every
path to out: holds it; lock ordering is preserved (parent contents
before parent meta before child meta).

The directory WORM checks ran right after lookup, before todir's
meta_lock was acquired, then jumped to out_release which releases
todir->meta_lock via fs_release_dentry_unlocked() whenever todir !=
fromdir. Renaming into or out of a WORM directory therefore unlocked a
lock that was never held, and the immutable/appendonly fields were read
without meta_lock. Move the check to after both directory meta_locks
are held, mirroring the existing source/target entry WORM check.

After the destination name buffers are assigned to fromdentry, a later
failure (e.g. fs_add_key_to_hash_table) reached out_free, which freed
those same buffers because ret < 0 — leaving fromdentry with dangling
name/platform_safe_name pointers that are freed again when the dentry is
disposed. Clear the locals once ownership moves to fromdentry.
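
The ownership rule can be illustrated with a reduced model. demo_dentry, rename_entry, and the fail_late switch are hypothetical; they only mimic the control flow described above (assign the buffer, hit a late failure, reach the cleanup label).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced dentry: owns its name buffer. */
struct demo_dentry {
    char *name;
};

/* Give `d` a new name. `fail_late` simulates a failure that happens
 * after the new buffer has already been handed to the dentry. */
static int rename_entry(struct demo_dentry *d, const char *new_name,
                        int fail_late)
{
    int ret = 0;
    char *name;

    assert(d != NULL && new_name != NULL);

    name = malloc(strlen(new_name) + 1);
    if (name == NULL)
        return -1;
    strcpy(name, new_name);

    free(d->name);
    d->name = name;
    name = NULL;                /* ownership moved: the local must not be freed */

    if (fail_late) {
        ret = -1;
        goto out_free;
    }

out_free:
    if (ret < 0)
        free(name);             /* free(NULL) is a no-op after the transfer */
    return ret;
}
```

Without the `name = NULL` line the cleanup path would free the buffer the dentry now points to, and disposing the dentry later would free it a second time.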
show_device_list returns 0 on success and non-zero on failure, but the
caller returned 0 on failure and 1 on success, so scripts checking the
exit status saw the opposite result.

The init callback sets fuse_config.nullpath_ok, so FUSE 3 may invoke
the setattr handlers with path == NULL and a valid file handle for an
open (possibly unlinked) file. chmod and utimens passed that NULL path
into ltfsmsg ("%s") and the path-based fsops, which is undefined
behaviour and fails the operation. Dispatch on the file handle like
getattr and truncate already do, using ltfs_fsops_set_readonly() and
ltfs_fsops_utimens().
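
The dispatch can be sketched without libfuse. Everything below is a hypothetical model (demo_file, demo_lookup, demo_chmod); it only shows the order of the checks: prefer the open handle, fall back to the path, and never hand a NULL path to path-based code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory file table standing in for the index. */
struct demo_file {
    const char *name;
    unsigned int mode;
};

static struct demo_file demo_table[] = {
    { "/a", 0644 },
    { "/b", 0600 },
};

/* Path-based lookup; callers must guarantee a non-NULL path. */
static struct demo_file *demo_lookup(const char *path)
{
    size_t i;

    assert(path != NULL);
    for (i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++)
        if (strcmp(demo_table[i].name, path) == 0)
            return &demo_table[i];
    return NULL;
}

/* chmod-style handler: `fh` is the open file handle, or NULL when the
 * request carries only a path. */
static int demo_chmod(const char *path, struct demo_file *fh, unsigned int mode)
{
    struct demo_file *f = fh;

    if (f == NULL) {
        if (path == NULL)
            return -EINVAL;     /* neither handle nor path: nothing to act on */
        f = demo_lookup(path);
        if (f == NULL)
            return -ENOENT;
    }
    f->mode = mode & 07777;
    return 0;
}
```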
pthread_yield has been deprecated since glibc 2.34 and is not provided
by some C libraries (e.g. musl). sched_yield is the POSIX standard and
was already used on the macOS and BSD branches; use it everywhere.

sg_modesense returns the transferred byte count (> 0) on success, but
is_ame and sg_set_key compared the result against 0/DEVICE_GOOD. is_ame
therefore always reported the drive as non-AME, and sg_set_key bailed
out before issuing SECURITY PROTOCOL OUT, so setting a data key always
failed on encrypting drives. Compare against < 0 like the other
sg_modesense callers, and normalize sg_set_key's success return.

Not verified on encrypting tape hardware; the logic follows the
documented sg_modesense return convention.
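
The difference between the two checks, in a reduced form. demo_modesense is a hypothetical stand-in that follows the return convention described above (positive byte count on success, negative on failure); it does not talk to a device.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a call that returns the transferred byte
 * count on success and a negative error code on failure. */
static int demo_modesense(bool device_ok, int transfer_len)
{
    assert(transfer_len > 0);
    return device_ok ? transfer_len : -1;
}

/* Pre-fix style check: treats every non-zero result as a failure, so a
 * successful transfer (> 0) is misreported. */
static bool modesense_failed_old(int ret)
{
    return ret != 0;
}

/* Fixed check: only negative results are failures. */
static bool modesense_failed(int ret)
{
    return ret < 0;
}
```

With a successful 32-byte transfer the old check reports a failure and the fixed check does not; both agree on a negative return.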
The ICU normalize helpers leave their output pointer unset when they
return an error, but the callers compared that pointer against the
input (to decide whether to free a no-op result) before checking the
return code. On an ICU failure this read uninitialized memory and could
leak the input buffer. Check the return code first in all five
callers.

Three sites used a value that may never have been set:
- tape_get_pews leaves *pews unset on -LTFS_UNSUPPORTED, which the
  caller treats as non-fatal before computing pews + 10.
- ltfs_profiler_set left ret unset when the volume had neither an
  iosched nor a device handle, then tested it.
- _get_dump derived a transfer length from cap_buf without checking
  the READ BUFFER result, so a failed read produced a garbage length.

Initialize pews and ret, and check the READ BUFFER result.

realloc was assigned back to priv.dk_list, so a failure overwrote the
only pointer to the existing buffer with NULL and leaked it. Use a
temporary and free the original on failure.

The version string lived only in configure.ac's AC_INIT. Move it to a
neutral VERSION file (line 1: numeric X.Y.Z.W, line 2: optional suffix)
so a second build system can read the same source. AC_INIT now joins
the two lines via m4_esyscmd_s; the resulting PACKAGE_VERSION is
unchanged ("2.4.8.4 (Prelim)").

…stem

make_message_src.sh hardcoded the genrb/pkgdata paths per-OS and
assumed the in-source messages/ layout (cd into the bundle dir, write
the archive to ../../). Honor $GENRB/$PKGDATA from the environment
when set, and accept optional source-bundle and output directories so
an out-of-source build can call the same script. The one-argument form
keeps the historical behavior, so messages/Makefile.am is unchanged.

The harness hardcoded the libtool .libs/ paths for plugins and the
fuse3 detection. Search both the autotools and the plain-subdirectory
(CMake) locations so the same test scripts run under make check and
ctest.

CMake >= 3.18, coexisting with the autotools build; both read the
package version from the shared VERSION file. Linux is fully wired
(libltfs, the ltfs/mkltfs/ltfsck executables, the sg/file/itdtimg tape
backends, both I/O schedulers, both key managers); dependency
detection for the macOS and BSD backends is in place, gated by
platform, so they can be added without re-plumbing.

Notable differences from the autotools implementation:
- ICU is found through CMake's ICU module instead of the icu-config
  tool that ICU removed in 2018, and ICU6x is defined automatically
  for ICU >= 60. libxml2 uses find_package(LibXml2); fuse/uuid/
  net-snmp have no CMake packages and use pkg-config imported targets.
- The message bundles are compiled by the same shared
  messages/make_message_src.sh, invoked with explicit source/output
  directories and the discovered genrb/pkgdata.
- The per-backend source symlinks and the CRC_OPTIMIZE compile rule
  are replaced by direct source paths and one SSE4.2-flagged OBJECT
  library linked into every tape backend.
- Plugins are MODULE libraries named exactly as ltfs.conf expects
  (libtape-sg.so etc.), installed to <libdir>/ltfs.
- ctest runs the same tests/t/*.sh suite via the shared harness
  (SKIP_RETURN_CODE 77 outside Linux/fuse hosts).

Verified in an Ubuntu VM: fuse3 and fuse2 configurations both build
and pass all 14 tests; the install tree matches autotools (bin, lib,
lib/ltfs plugins, etc/ltfs.conf{,.local}, pkgconfig, man, headers);
ltfs --version reports the identical version string; the autotools
build and make check stay green.

The stderr redirect for net-snmp-config --agent-libs was placed outside
the backticks, so the literal string "2> /dev/null" became part of
SNMP_MODULE_LIBS_A and ended up on every libtool link line. The shell
then redirected the linker's stderr to /dev/null, silently discarding
link errors (and making real failures undiagnosable). Only visible on
systems where net-snmp is installed.

matejk added 5 commits June 11, 2026 10:48

The field is only present in libfuse >= 3.15, but the guard tested
FUSE_VERSION >= 3.14, which broke the build on Ubuntu 24.04 (libfuse
3.14.0). Probe the struct member at configure time instead of trusting
version numbers; the CMake check uses a compile test because the
member is a bitfield and sizeof-based probes reject it.
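
Why a compile test is needed can be shown with a stand-in struct. demo_config is hypothetical and is not the libfuse definition; it only has a bit-field member with the same name.

```c
#include <assert.h>

/* Hypothetical stand-in for a configuration struct whose members are
 * bit-fields. */
struct demo_config {
    unsigned int nullpath_ok : 1;
    unsigned int parallel_direct_writes : 1;
};

/* A sizeof-based probe such as
 *     sizeof(((struct demo_config *)0)->parallel_direct_writes)
 * is rejected by the compiler, because C does not allow sizeof on a
 * bit-field. An assignment, by contrast, compiles exactly when the
 * member exists, which is what a compile test relies on. */
static int probe_member(void)
{
    struct demo_config cfg = { 0 };

    cfg.parallel_direct_writes = 1;
    assert(cfg.nullpath_ok == 0);   /* other members are untouched */
    return (int)cfg.parallel_direct_writes;
}
```

A build system can compile a snippet like probe_member against the real header and define a feature macro only if compilation succeeds.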
On libfuse 3.14 the kernel never sends READDIRPLUS to the high-level
API, so listing a directory still produces one getattr per entry and
the strict assertion fails. Gate it on the runtime library version
(verified effective on 3.17); older libraries get an informational
message instead.

actions/checkout v4 runs on the deprecated Node.js 20; move all
workflows to v6. codeql-action v2 and create-pull-request v5 are
likewise outdated; move to v4 and v8. The test workflow matrix now
covers both build systems (autotools and CMake) with both FUSE
versions, running the same suite via make check and ctest. The test
container image gets cmake for the same purpose.

xattr.h included fuse.h without ltfs_fuse_version.h. libfuse 2 headers
default to an old API level when FUSE_USE_VERSION is undefined, but
libfuse 3 headers reject it, breaking every translation unit that
pulls in xattr.h. The release branch received the same change as part
of the FreeBSD build fix (b3e3355).

The helper is built by make check / ctest; the binary was committed by
accident.