Length of the sequence affects runtime #11

@BiotechPedro

Hi!

Why does grepq take longer to "grep" for a single N depending on the length of the sequences?

time grepq --read-gzip N.txt DNA_R1.fastq.gz -c outputs:

237844016

real	0m25.290s
user	0m55.951s
sys	0m1.247s

but,

time grepq --read-gzip N.txt DNA_R2.fastq.gz -c outputs:

237844016

real	1m7.055s
user	1m25.311s
sys	0m3.354s

The line lengths in each read are {39, 18, 1, 18} for DNA_R1.fastq.gz and {39, 74, 1, 74} for DNA_R2.fastq.gz. If it helps, I am running on a node with 5 CPUs and 15G of memory.
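
For context, the per-record sizes implied by these line lengths can be tallied with a short script. This is only a sketch. It assumes the four numbers are the header, sequence, separator, and quality lines of each FASTQ record, and that each line ends in a single newline byte. The names are made up for illustration, and it says nothing about how grepq works internally.

```python
# Line lengths per record as reported above, assumed to be
# (header, sequence, separator, quality) without the trailing newline.
R1_LINE_LENGTHS = [39, 18, 1, 18]
R2_LINE_LENGTHS = [39, 74, 1, 74]


def record_bytes(line_lengths, newline_bytes=1):
    """Uncompressed bytes per record, assuming one newline per line."""
    return sum(line_lengths) + newline_bytes * len(line_lengths)


r1_bytes = record_bytes(R1_LINE_LENGTHS)
r2_bytes = record_bytes(R2_LINE_LENGTHS)
size_ratio = r2_bytes / r1_bytes

# Wall-clock ("real") times from the two runs above, in seconds.
r1_real = 25.290
r2_real = 1 * 60 + 7.055
time_ratio = r2_real / r1_real

print(f"R1: {r1_bytes} bytes/record, R2: {r2_bytes} bytes/record")
print(f"size ratio R2/R1: {size_ratio:.2f}")
print(f"real-time ratio R2/R1: {time_ratio:.2f}")
```

Under those assumptions, R2 holds 2.4 times as many uncompressed bytes per record as R1, while its wall-clock time is about 2.65 times longer.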

Thanks,
Pedro
