fix(performance): Use bounded HTTP Range requests for indexed BAM queries #1998
TechIsCool wants to merge 1 commit into samtools:develop
Conversation
When reading indexed BAM files from remote URLs (HTTP, S3, etc.), seeking to a chunk offset would then read unbounded to EOF. For small queries against large files, this downloads far more data than needed.

This adds bgzf_seek_limit(), which accepts the chunk end offset from the BAM index, enabling bounded Range requests (bytes=X-Y) instead of unbounded ones (bytes=X-) in the libcurl backend.

Changes:
- hfile.h/hfile.c: Add readahead_limit field and setter
- bgzf.h/bgzf.c: Add bgzf_seek_limit(), which passes the limit to hfile
- hfile_libcurl.c: Use CURLOPT_RANGE with bounds when a limit is set
- hts.c: Call bgzf_seek_limit() with the chunk end in hts_itr_next()

The limit is cleared after each hseek(), so it only affects reads immediately following a seek.

Signed-off-by: David Beck <techiscool@gmail.com>
This changes the public …. This problem has already been fixed in the ….
Thanks for the review. I am unsure whether the PR could exist without impacting the ABI. For context, we use ….

Benchmarks

I tested ….
For small regions, GET counts are similar but S3 transfers ~5x more data (fixed 1 MB chunks). For larger regions, the 1 MB chunking causes 44 separate requests vs 1 bounded request.

Reproduction

```shell
IFACE=$(ls /sys/class/net/ | grep -v lo | head -1)
REGIONS="1:1000000-1000100 2:5000000-5000100 3:10000000-10000100 5:50000000-50000100 7:117188547-117188800"
BAM="1000genomes/phase3/data/NA12878/exome_alignment/NA12878.mapped.ILLUMINA.bwa.CEU.exome.20121211.bam"

test_region() {
    local bam=$1 bai=$2 region=$3
    local before=$(cat /sys/class/net/$IFACE/statistics/rx_bytes)
    local start=$(date +%s.%N)
    samtools view -X "$bam" "$bai" "$region" >/dev/null 2>&1
    local elapsed=$(echo "$(date +%s.%N) - $start" | bc)
    echo "$region: ${elapsed}s, $(($(cat /sys/class/net/$IFACE/statistics/rx_bytes) - before)) bytes"
}

echo "=== S3 ==="
for r in $REGIONS; do test_region "s3://$BAM" "s3://$BAM.bai" "$r"; done
echo "=== HTTPS ==="
for r in $REGIONS; do test_region "https://s3.amazonaws.com/$BAM" "https://s3.amazonaws.com/$BAM.bai" "$r"; done
```

Additional finding: On slower links, …. Adding ….
Problem
When reading remote BAM files with an index, htslib seeks to each chunk's start offset but issues unbounded Range requests. The server advertises gigabytes of Content-Length even though we only need kilobytes:
```
bytes=8224425-
bytes=1631423494-
bytes=7287649006-
```

The client terminates early, but "early termination" isn't free: data already in flight still transfers. We have also found that being specific about what is needed improves S3 responsiveness.
Solution
The BAM index already contains chunk end offsets. Pass them through to the HTTP layer.
EC2 Benchmark
(35 MB/s bandwidth)
Environment: EC2 m8azn.medium (up to 25 Gbps bandwidth), us-east-1
Test file:
s3://1000genomes/.../NA12878.mapped.ILLUMINA.bwa.CEU.exome.20121211.bam (17.3 GB)

Measurement: Wall clock time + actual wire transfer via /sys/class/net/<iface>/statistics/rx_bytes

It appears S3 optimizes resource allocation for bounded requests, leading to much faster responses. The time improvement exceeds the bandwidth savings, suggesting that S3 can serve bounded requests more efficiently.
Per-Request Comparison (5 regions)
Unbounded (Range: bytes=X-) vs bounded (Range: bytes=X-Y)

Local Benchmark
(2 MB/s bandwidth)
Environment: macOS M1 (up to 100 Mbps), California
Test file:
s3://1000genomes/.../NA12878.mapped.ILLUMINA.bwa.CEU.exome.20121211.bam (17.3 GB)

Reproduction