Masurca by LiaOb21 · Pull Request #11049 · nf-core/modules

LiaOb21 · 2026-03-25T18:29:23Z

Masurca module [draft]

Supports assembly with Cabog assembly only. Flye and SOAPdenovo are not supported.

Supported reads:

single end or merged reads
paired end only
paired end + mate pair (jump)
paired end + nanopore
paired end + pacbio
paired end + nanopore + pacbio

Does not support other kinds of reads (Sanger, 454, etc) and synteny guided assembly with reference genome

PR checklist

Closes #10945

chriswyatt1 · 2026-03-26T14:42:53Z

Are the value here pipeline wide, or specific to each genome being run? Is that why they need to be specific channel inpus?
val fragment_mean
val fragment_stdev
val jump_mean
val jump_stdev

chriswyatt1 · 2026-03-26T14:44:33Z

path(other_reads) doesn't appear to be used, so could be removed

chriswyatt1 · 2026-03-26T14:49:25Z

Looks like a specific build type, so I think seqera containers won't be able to handle that.
https://github.com/alekseyzimin/masurca?tab=readme-ov-file#compileinstall-requirements
We will need to build this manually I think. I can have a go

Then we will need to do this:
https://nf-co.re/docs/contributing/guidelines/recommendations/custom_containers

LiaOb21 · 2026-03-26T14:50:54Z

Are the value here pipeline wide, or specific to each genome being run? Is that why they need to be specific channel inpus? val fragment_mean val fragment_stdev val jump_mean val jump_stdev

I think they are meant to be as realistic as possible for specific genomes. However, I ran all the tests locally and manually in a conda env using the same nf-core datasets I'm using in main.test.nf and using the same values, and they all gave me the expected outputs.

path(other_reads) doesn't appear to be used, so could be removed

Yes, I was initially planning to integrate that as well, but then I decided to leave it. I forgot to remove it from the input tuple.

Looks like a specific build type, so I think seqera containers won't be able to handle that. https://github.com/alekseyzimin/masurca?tab=readme-ov-file#compileinstall-requirements We will need to build this manually I think. I can have a go

Thank you @chriswyatt1! 😊

chriswyatt1 · 2026-03-26T14:51:42Z

OK, Just thinking to make this module more atomic, we should have these values as ext args. But if its helpful, to have this in a channel, I think it can be justified. If we think that any user would need to parse this information in for multiple genomes always, rather than set them generally in the modules.config say

chriswyatt1 · 2026-03-26T14:54:02Z

The only other thing of general comments, is about all the echo statements. Is this essential, or maybe we could move some of it into a script maybe. I will check out what other modules do in this case

LiaOb21 · 2026-03-26T17:12:15Z

OK, Just thinking to make this module more atomic, we should have these values as ext args. But if its helpful, to have this in a channel, I think it can be justified. If we think that any user would need to parse this information in for multiple genomes always, rather than set them generally in the modules.config say

I think the ideal use would be to have a specific value per genome. Personally I'm using an average fixed value, as I often don't have information about the fragment size. Also from masurca documentation is not completely clear if they refer to the fragment size or the insert size. I don't know what would be more appropriate for the nf-core module, so I'll follow your suggestion 😊

The only other thing of general comments, is about all the echo statements. Is this essential, or maybe we could move some of it into a script maybe. I will check out what other modules do in this case

Yes, so that's the main reason why I was so confused on how setting up this module. I asked in the community and they suggested to have a look to these two modules:

controlfreec also uses echo commands and I thought that was probably the easiest way to build the config for masurca. But happy to change if there is any better way to deal with this! 😊

chriswyatt1 · 2026-03-26T17:14:52Z

I made a container, not sure it will work, but worth a try:

quay.io/ecoflowucl/masurca so in the module you can just put ecoflowucl/masurca

test with docker pull quay.io/ecoflowucl/masurca:v4.1.4

Dockerfile is here: https://github.com/Eco-Flow/docker-build/tree/main/masurca

chriswyatt1 · 2026-03-27T07:22:39Z

I think the container works. But yes, there are some output files that have time stamps and specific work dir paths, which will change on each run. So we need to make the nf-test more specific. Pick files that won't change
assemble.sh
#!/bin/bash

assemble.sh generated by masurca:

CONFIG_PATH="/Users/cwyatt/Downloads/modules-2/.nf-test/tests/aad480bb9b0561d1bd4bc958001b0e93/work/e4/920dca1167539f25218df125acaf84/test_masurca_config.txt"

test-masurca.log:
[Thu Mar 26 18:02:31 UTC 2026] Processing pe library reads
[Thu Mar 26 18:02:32 UTC 2026] Average PE read length 100
[Thu Mar 26 18:02:32 UTC 2026] Using kmer size of 67 for the graph
[Thu Mar 26 18:02:32 UTC 2026] MIN_Q_CHAR: 33
[Thu Mar 26 18:02:32 UTC 2026] Creating mer database for Quorum

chriswyatt1 · 2026-03-30T21:46:32Z

@LiaOb21 did you manage to test the container? I think it worked,,, but let me know if not, and I can look into it

LiaOb21 · 2026-04-01T14:09:14Z

@LiaOb21 did you manage to test the container? I think it worked,,, but let me know if not, and I can look into it

Hey @chriswyatt1 the container works! 😊 thank you so much!

Looks like a specific build type, so I think seqera containers won't be able to handle that. https://github.com/alekseyzimin/masurca?tab=readme-ov-file#compileinstall-requirements We will need to build this manually I think. I can have a go

Then we will need to do this: https://nf-co.re/docs/contributing/guidelines/recommendations/custom_containers

We can proceed with this, in the meantime I'll try to set up the tests in a way that they can ignore the timestamps and so on

.vscode/extensions.json

.vscode/settings.json

modules/nf-core/masurca/main.nf

LiaOb21 · 2026-04-07T08:27:48Z

@nf-core-bot fix linting please

LiaOb21 · 2026-04-07T11:57:13Z

Hello @mashehu @maxulysse and @chriswyatt1! 😊
I think this is almost ready for review.

As regards the container, here is the conda recipe: https://github.com/bioconda/bioconda-recipes/blob/master/recipes/masurca/meta.yaml

The reason why the biocontainer obtained from conda doesn't work is because some dependencies, like for example file, are not in the recipe although they are used from masurca. The tests work with conda because these dependencies are normally available in a linux environment, but they fail with docker for obvious reasons. However, the container created by @chriswyatt1 works, and this is what I am using at the moment, and I already asked on slack to host it in the nf-core org.

Also, I was having some linting issues:

╭─ [✗] 1 Module Test Failed ───────────────────────────────────────────────────╮
│             ╷                               ╷                                │
│ Module name │ File path                     │ Test message                   │
│╶────────────┼───────────────────────────────┼───────────────────────────────╴│
│ masurca     │ modules/nf-core/masurca/main… │ main_nf_ext_key: Invalid 'ext' │
│             │                               │ keys detected:                 │
│             │                               │ ext.extend_jump_reads,         │
│             │                               │ ext.extend_jump_reads,         │
│             │                               │ ext.graph_kmer_size,           │
│             │                               │ ext.use_linking_mates,         │
│             │                               │ ext.use_linking_mates,         │
│             │                               │ ext.lhe_coverage,              │
│             │                               │ ext.mega_reads_one_pass,       │
│             │                               │ ext.mega_reads_one_pass,       │
│             │                               │ ext.limit_jump_coverage,       │
│             │                               │ ext.ca_parameters,             │
│             │                               │ ext.close_gaps,                │
│             │                               │ ext.close_gaps, ext.jf_size    │
│             ╵                               ╵                                │
╰──────────────────────────────────────────────────────────────────────────────╯

Then I saw this issue: #8716
And I moved the task.ext to inputs. The linting now works. I left the task.ext commented just to double check with you if this is the correct approach and go back quickly if necessary.

And, finally, the tests. These are avoiding the md5 check because masurca works with absolute paths, and these paths change for every run making the md5 inconsistent. Please let me know if I used the right approach or if there is anything else to change.

Thank you so much for your help! 😊

modules/nf-core/masurca/main.nf

Co-authored-by: Famke Bäuerle <45968370+famosab@users.noreply.github.com>

modules/nf-core/masurca/environment.yml

modules/nf-core/masurca/main.nf

LiaOb21 · 2026-04-10T10:21:37Z

Hi @mashehu, the bioconda recipe has been update and works. I tried to obtain the docker and singularity with wave cli using both pixi:v1 and default template:

---
# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json
channels:
  - conda-forge
  - bioconda
dependencies:
  - "bioconda::masurca=4.1.4=ha5bb246_1"

wave --conda-file environment.yml --freeze --platform linux/amd64 -o yaml --build-template conda/pixi:v1 --await
buildId: bd-2788c9bd40859936_1
cached: true
containerImage: community.wave.seqera.io/library/masurca:4.1.4--2788c9bd40859936
duration: PT2M52.608153482S
freeze: true
mirror: false
requestId: 4a0e1f34d473
scanId: sc-e633b339ce626493_1
status: DONE
succeeded: true
targetImage: community.wave.seqera.io/library/masurca:4.1.4--2788c9bd40859936

wave --conda-file environment.yml --freeze --platform linux/amd64 -o yaml --build-template conda/pixi:v1 --singularity --await
buildId: bd-2d0ee6b415d9e77e_1
cached: false
containerImage: oras://community.wave.seqera.io/library/masurca:4.1.4--2d0ee6b415d9e77e
duration: PT3M32.779779428S
freeze: true
mirror: false
requestId: 5fbb5d604d9e
status: DONE
succeeded: true
targetImage: oras://community.wave.seqera.io/library/masurca:4.1.4--2d0ee6b415d9e77e

wave --conda-file environment.yml --freeze --platform linux/amd64 -o yaml --await
buildId: bd-6c6f779135534097_1
cached: false
containerImage: community.wave.seqera.io/library/masurca:4.1.4--6c6f779135534097
duration: PT1M59.355046634S
freeze: true
mirror: false
requestId: 7bc08416d07c
scanId: sc-680889439ee593d6_1
status: DONE
succeeded: true
targetImage: community.wave.seqera.io/library/masurca:4.1.4--6c6f779135534097

wave --conda-file environment.yml --freeze --platform linux/amd64 -o yaml --singularity --await
buildId: bd-40a5894c07082589_1
cached: false
containerImage: oras://community.wave.seqera.io/library/masurca:4.1.4--40a5894c07082589
duration: PT2M39.588376664S
freeze: true
mirror: false
requestId: f7d8cd412b66
status: DONE
succeeded: true
targetImage: oras://community.wave.seqera.io/library/masurca:4.1.4--40a5894c07082589

Again, the new conda recipe works, while both the docker versions (pixi and default) give me the error:
Overlap/unitig failed, check output under CA/ and runCA1.out, try re-generating and re-running assemble.sh for each of the tests.

Is there anything else I can try? I believe I added all the required dependencies to the bioconda recipe now.

LiaOb21 · 2026-04-10T10:32:21Z

I forgot to add that I tested the docker container from bioconda fetch artifacts with same test data and masurca config files and it works without problems.

mahesh-panchal · 2026-04-13T11:12:14Z

modules/nf-core/masurca/main.nf

+    config_lines << "END"
+    config_lines << ""
+
+    // PARAMETERS section


It's been a very long time since I've used Masura so I don't remember much about what's required input. Can these be supplied through ext.args?

The idea would be that you would have

config_lines.addAll(args.collect{ key, value -> "${key}=${value}"})

So ext.args is a map in this case (for example like modules VIBER or MOBSTER).

The user would then supply lines through the ext.args

withName: 'MASURCA' { ext.args = [ LHE_COVERAGE: 35, SOAP_ASSEMBLY: 0, ]

Hi @mahesh-panchal, I moved several arguments to inputs in this commit: LiaOb21@c3c4faf
I made this change because the linter was throwing an error, if I remember well it wanted those arguments documented in meta.yml, though I'm not entirely sure. I believe the map won't trigger the same issue, so we could try that approach.
However, regarding the Masurca inputs, the only guidance available on configuring it comes from comments within the config file itself. Unfortunately, those comments don't clearly specify which parameters are optional. In the past, I attempted to omit some non-required parameters based on the input files I was working with, but this caused the generation of assemble.sh to fail.

LiaOb21 and others added 7 commits March 20, 2026 16:56

masurca started

6851bf6

still working on it

40c1b3b

Merge branch 'nf-core:master' into masurca

1542aa3

probably close

5c56309

still work in progress

c8f84d0

still work in progress

d6e1274

Merge branch 'nf-core:master' into masurca

800d828

LiaOb21 added help wanted Extra attention is needed nf-test container-issue labels Mar 26, 2026

github-project-automation bot added this to nf-test Migration Mar 26, 2026

github-project-automation bot moved this to Todo in nf-test Migration Mar 26, 2026

LiaOb21 removed the nf-test label Mar 26, 2026

Merge branch 'nf-core:master' into masurca

2ea84b6

Merge branch 'nf-core:master' into masurca

7fd2684

Merge branch 'nf-core:master' into masurca

9fc2229

maxulysse reviewed Apr 2, 2026

View reviewed changes

.vscode/extensions.json Show resolved Hide resolved

maxulysse reviewed Apr 2, 2026

View reviewed changes

.vscode/settings.json Show resolved Hide resolved

mashehu reviewed Apr 2, 2026

View reviewed changes

modules/nf-core/masurca/main.nf Outdated Show resolved Hide resolved

Merge branch 'nf-core:master' into masurca

3afe0d3

LiaOb21 commented Apr 2, 2026

View reviewed changes

modules/nf-core/masurca/main.nf Outdated Show resolved Hide resolved

LiaOb21 added 2 commits April 2, 2026 18:05

syntax error fixed

20d7f8f

tests work locally

cc2154d

LiaOb21 and others added 2 commits April 7, 2026 09:33

Merge branch 'master' into masurca

90c2667

moved task.ext to input due to linting errors

c3c4faf

LiaOb21 and others added 2 commits April 8, 2026 12:44

Merge branch 'nf-core:master' into masurca

c7c98ae

Merge branch 'master' into masurca

fd696fd

famosab reviewed Apr 8, 2026

View reviewed changes

modules/nf-core/masurca/main.nf Show resolved Hide resolved

modules/nf-core/masurca/main.nf Show resolved Hide resolved

modules/nf-core/masurca/main.nf Outdated Show resolved Hide resolved

modules/nf-core/masurca/main.nf Outdated Show resolved Hide resolved

Update modules/nf-core/masurca/main.nf

608f407

Co-authored-by: Famke Bäuerle <45968370+famosab@users.noreply.github.com>

mashehu reviewed Apr 8, 2026

View reviewed changes

modules/nf-core/masurca/environment.yml Outdated Show resolved Hide resolved

LiaOb21 added 2 commits April 8, 2026 15:49

linting masurca main.nf as suggested

6956864

moving config printing to groovy

3f1a2fe

mahesh-panchal reviewed Apr 9, 2026

View reviewed changes

modules/nf-core/masurca/main.nf Outdated Show resolved Hide resolved

LiaOb21 commented Apr 9, 2026

View reviewed changes

modules/nf-core/masurca/main.nf Show resolved Hide resolved

mahesh-panchal reviewed Apr 9, 2026

View reviewed changes

modules/nf-core/masurca/main.nf Show resolved Hide resolved

join operator and /home/lobinu/git_repos/modules adjusted

155a251

LiaOb21 mentioned this pull request Apr 9, 2026

Update masurca: adding missing dependencies bioconda/bioconda-recipes#64293

Merged

LiaOb21 and others added 3 commits April 9, 2026 22:40

Merge branch 'master' into masurca

5338c3b

updating envs

d0b1fa4

Merge branch 'master' into masurca

8df2273

LiaOb21 marked this pull request as ready for review April 10, 2026 10:22

mashehu added 3 commits April 10, 2026 12:55

try to get more of the logs

39d0355

more logs

ae75c79

even more logging

c4d756a

mahesh-panchal reviewed Apr 13, 2026

View reviewed changes

Conversation

LiaOb21 commented Mar 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR checklist

Uh oh!

chriswyatt1 commented Mar 26, 2026

Uh oh!

chriswyatt1 commented Mar 26, 2026

Uh oh!

chriswyatt1 commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

LiaOb21 commented Mar 26, 2026

Uh oh!

chriswyatt1 commented Mar 26, 2026

Uh oh!

chriswyatt1 commented Mar 26, 2026

Uh oh!

LiaOb21 commented Mar 26, 2026

Uh oh!

chriswyatt1 commented Mar 26, 2026

Uh oh!

chriswyatt1 commented Mar 27, 2026

assemble.sh generated by masurca:

Uh oh!

chriswyatt1 commented Mar 30, 2026

Uh oh!

LiaOb21 commented Apr 1, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

LiaOb21 commented Apr 7, 2026

Uh oh!

LiaOb21 commented Apr 7, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

LiaOb21 commented Apr 10, 2026

Uh oh!

LiaOb21 commented Apr 10, 2026

Uh oh!

mahesh-panchal Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

LiaOb21 Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

LiaOb21 commented Mar 25, 2026 •

edited

Loading

chriswyatt1 commented Mar 26, 2026 •

edited

Loading