optimize performance of array_to_qualitystring by jchorl · Pull Request #1363 · pysam-developers/pysam

jchorl · 2025-10-02T22:13:06Z

I was profiling some code and found that the majority of the time is spent in array_to_qualitystring. This is particularly impactful on huge files with many reads.

The culprit is the allocation, copying, and computation in Python. This optimization should allow all of the logic to be compiled down to C.
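
For readers unfamiliar with the function, here is a rough pure-Python sketch of the conversion under discussion: each Phred quality value is shifted by an offset (33 by default) and emitted as one ASCII character. This is not the PR's Cython code. The function names are hypothetical, and the second variant swaps in `bytes.translate` as a stand-in for "run the whole loop in C" so that it works without a compiler. It assumes the qualities live in an `array('B')` and stay within the printable ASCII range.

```python
from array import array


def qualitystring_per_element(qualities, offset=33):
    """Hypothetical slow path: one chr() call and one tiny str per base."""
    if qualities is None:
        return None
    return "".join(chr(q + offset) for q in qualities)


def make_offset_table(offset=33):
    """256-entry lookup table mapping a raw quality byte to its shifted byte.

    The modulo only keeps the table well-formed for out-of-range values;
    real Phred scores are assumed to be small enough not to wrap.
    """
    return bytes((q + offset) % 256 for q in range(256))


_TABLE_33 = make_offset_table(33)


def qualitystring_bulk(qualities, table=_TABLE_33):
    """Hypothetical fast path: copy, translate, and decode, each done in C."""
    if qualities is None:
        return None
    return qualities.tobytes().translate(table).decode("ascii")


quals = array("B", [0, 10, 40])
print(qualitystring_per_element(quals))  # !+I
print(qualitystring_bulk(quals))         # !+I
```

Both variants give the same string. The difference is only where the per-base work happens, which is the point the profiling result above is making.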

Bench results:

Before:

---------------------------------------------------------- benchmark: 1 tests ----------------------------------------------------------
Name (time in us)                           Min       Max     Mean  StdDev   Median     IQR   Outliers  OPS (Kops/s)  Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------
test_fasta_iteration_long_sequences     75.7550  126.3460  78.9250  2.1202  78.4110  0.8720  1160;1541       12.6703   11453           1
----------------------------------------------------------------------------------------------------------------------------------------

After:

-------------------------------------------------------- benchmark: 1 tests -------------------------------------------------------
Name (time in us)                          Min      Max    Mean  StdDev  Median     IQR  Outliers  OPS (Kops/s)  Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------
test_fasta_iteration_long_sequences     1.2620  14.7180  1.3264  0.1447  1.3130  0.0200  409;1397      753.9372   45268           1
-----------------------------------------------------------------------------------------------------------------------------------

jmarshall · 2025-10-08T21:47:31Z

Thanks, this looks like a good approach.

Eventually I want to add entry points to HTSlib so that we can just call HTSlib's SIMD-optimised versions of these conversions, but this is a big win in the meantime.

jchorl · 2025-10-14T19:12:57Z

@jmarshall what would be the process to get this merged/released?

jchorl · 2026-02-05T21:38:50Z

@jmarshall I was just profiling a process and again found this to be a bottleneck. Any chance we can get this merged?

jchorl · 2026-04-17T15:50:15Z

Thank you for pushing this through!

jmarshall · 2026-04-17T20:26:17Z

Thanks for diving into memoryviews! Clearly we should see if there is other pysam code that would benefit from them as well.

The HTSlib approach I mentioned is now samtools/htslib#1974 but it will be a while before that lands.

optimize performance of array_to_qualitystring

21d1384

jmarshall added 2 commits April 17, 2026 20:59

Extend tests and incorporate into AlignedSegment_test.py

e0c8a42

Improve memoryview version of array_to_qualitystring()

24abe66

The data is contiguous so use [::1] to omit stride calculations; use size_t rather than ssize_t to omit check for end-relative indexing.
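
To illustrate what those two declarations save, here is a small Python sketch of the bookkeeping that generic 1-D buffer indexing involves. It is only an analogy, not the Cython code from the commit, and `element_offset` is a hypothetical helper. With a general view, each access multiplies the index by a stride and first has to handle a negative (end-relative) index. A view known to be contiguous has a stride equal to the item size, and an unsigned index can never be negative, so both steps can be dropped.

```python
from array import array

quals = array("B", [30, 31, 32, 33, 34, 35])

contig = memoryview(quals)   # contiguous: stride equals itemsize
strided = contig[::2]        # every other element: stride is 2 bytes


def element_offset(view, i):
    """Byte offset of element i in a 1-D view, the generic way."""
    if i < 0:
        i += view.shape[0]        # end-relative index handling
    return i * view.strides[0]    # stride multiplication


print(contig.strides, contig.c_contiguous)    # (1,) True
print(strided.strides, strided.c_contiguous)  # (2,) False
print(element_offset(contig, -1))             # 5
print(element_offset(strided, 2))             # 4
```

For the contiguous byte view the offset is just the index itself, which is why declaring the memoryview as `[::1]` lets the compiler emit plain pointer arithmetic.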

jmarshall approved these changes Apr 17, 2026

jmarshall merged commit 48688fc into pysam-developers:master Apr 17, 2026
16 of 17 checks passed

jmarshall merged 3 commits into pysam-developers:master from jchorl:jchorl/perf
