Merged
26 commits
9f1b855
added tests for subworkflow BAM_INFER_SEX for 1 sample
sofiademmou Mar 25, 2026
f2fed59
updated changelog
sofiademmou Mar 25, 2026
6683f38
updated multisample tests for BAM_INFER_SEX
sofiademmou Mar 26, 2026
fdf5540
removed useless parameter for test
sofiademmou Mar 26, 2026
4e49b04
updated snapshot
sofiademmou Mar 26, 2026
f665f27
changed assertions for multisample tests
sofiademmou Mar 26, 2026
2032f05
Merge branch 'dev' of github.com:genomic-medicine-sweden/nallo into s…
sofiademmou Mar 27, 2026
5df5fc0
split hifiasm into 2 processes (bin and assembly)
sofiademmou Mar 27, 2026
ee24447
updated changelog
sofiademmou Mar 27, 2026
4b9d671
removed unrelated update
sofiademmou Mar 27, 2026
183d219
updated changelog
sofiademmou Mar 27, 2026
c135aca
updated config for genome_assembly
sofiademmou Mar 27, 2026
e0fc0a9
Merge branch 'dev' of github.com:genomic-medicine-sweden/nallo into s…
sofiademmou Apr 8, 2026
8fa55cb
added key matching for bins and reads so bins can be used for assembly
sofiademmou Apr 8, 2026
cb43236
updated snapshots
sofiademmou Apr 8, 2026
f16be8d
Merge branch 'dev' into split_hifiasm
sofiademmou Apr 10, 2026
fe3eb5a
changed assertions for unstable md5
sofiademmou Apr 10, 2026
fec8384
updated snapshots
sofiademmou Apr 10, 2026
f43182b
Update conf/modules/genome_assembly.config
sofiademmou Apr 15, 2026
7bbe899
replaced complex keying of bins/reads with join
sofiademmou Apr 15, 2026
3e86f7b
updated hifiasm module
sofiademmou Apr 21, 2026
7261fe4
removed afterscript (not needed anymore with optional raw_unitigs out…
sofiademmou Apr 21, 2026
c3be48e
added yak to ch_hifiasm_assembly_in and --bin-only to HIFIASM_ASSEMBL…
sofiademmou Apr 22, 2026
0f988c3
fixed formatting
sofiademmou Apr 23, 2026
a7643be
condensed hifiasm params into one
sofiademmou Apr 23, 2026
4b39334
fixed bug in config and formatting
sofiademmou Apr 23, 2026
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -47,6 +47,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - [#962](https://github.com/genomic-medicine-sweden/nallo/pull/962) - Changed peddy inputs from ranked to annotated or unannotated variants, depending on availability
 - [#964](https://github.com/genomic-medicine-sweden/nallo/pull/964) - Limit `--snv_calling_processes` to `1` for sentieon due to issues with duplicated variants (see #926)
 - [#966](https://github.com/genomic-medicine-sweden/nallo/pull/966) - Refactored the code related to variant ranking to reduce code duplication
+- [#969](https://github.com/genomic-medicine-sweden/nallo/pull/969) - Split hifiasm process into two so they can have different resources: first only create bins then assembly with the bins already created
 - [#974](https://github.com/genomic-medicine-sweden/nallo/pull/974) - Updated FastQC nf-core module
 - [#975](https://github.com/genomic-medicine-sweden/nallo/pull/975) - Moved params from `nallo.nf` to main workflow
 - [#975](https://github.com/genomic-medicine-sweden/nallo/pull/975) - Renamed `ch_databases` to `ch_echtvar_databases` for clarity
2 changes: 1 addition & 1 deletion conf/base.config
@@ -50,7 +50,7 @@ process {
     withLabel:process_high_memory {
         memory = { 200.GB * task.attempt }
     }
-    withName: 'HIFIASM' {
+    withName: 'HIFIASM_ASSEMBLY|HIFIASM_BINS' {
         time = { 36.h * task.attempt }
     }
     withLabel:error_ignore {
5 changes: 3 additions & 2 deletions conf/modules/genome_assembly.config
@@ -24,12 +24,13 @@ process {
         ]
     }

-    withName: '.*:GENOME_ASSEMBLY:HIFIASM' {
+    withName: '.*:GENOME_ASSEMBLY:HIFIASM_ASSEMBLY|.*:GENOME_ASSEMBLY:HIFIASM_BINS' {
         ext.args = { [
             "${params.extra_hifiasm_options}",
             "${params.hifiasm_preset}",
             '--dual-scaf',
-            '--telo-m CCCTAA'
+            '--telo-m CCCTAA',
+            '--bin-only'
         ].join(' ') }
     }

Review comment (Collaborator) on the new `withName` selector: 👍

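The changelog entry for this PR gives "different resources" as the reason for the split, while the selector above still applies one shared block to both aliases. A minimal sketch of how per-alias resources could be layered on top, assuming the two steps really do have different needs; the `cpus` and `memory` values are hypothetical placeholders and are not taken from this PR:

```nextflow
// Hypothetical per-alias resource overrides; the numbers are placeholders only.
process {
    withName: '.*:GENOME_ASSEMBLY:HIFIASM_BINS' {
        cpus   = 16
        memory = { 64.GB * task.attempt }
    }

    withName: '.*:GENOME_ASSEMBLY:HIFIASM_ASSEMBLY' {
        cpus   = 32
        memory = { 200.GB * task.attempt }
    }
}
```

Because each alias gets its own `withName` selector here, the shared `ext.args` block can stay untouched and only the resource directives differ.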
2 changes: 1 addition & 1 deletion modules.json
@@ -203,7 +203,7 @@
                     },
                     "hifiasm": {
                         "branch": "master",
-                        "git_sha": "b00cfb8947da7842e8f249cc0c5bb8544cfac18e",
+                        "git_sha": "a20dba808c5ce99afe6d40d7fc072c74ca317c8d",
                         "installed_by": ["modules"]
                     },
                     "hificnv": {
8 changes: 4 additions & 4 deletions modules/nf-core/hifiasm/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 3 additions & 3 deletions modules/nf-core/hifiasm/meta.yml

19 changes: 10 additions & 9 deletions modules/nf-core/hifiasm/tests/main.nf.test

7 changes: 6 additions & 1 deletion modules/nf-core/hifiasm/tests/nextflow.config

87 changes: 52 additions & 35 deletions subworkflows/local/genome_assembly/main.nf
@@ -1,25 +1,25 @@
-include { CAT_FASTQ } from '../../../modules/nf-core/cat/fastq/main'
-include { HIFIASM } from '../../../modules/nf-core/hifiasm'
-include { YAK_COUNT } from '../../../modules/nf-core/yak/count/main'
-include { GFASTATS } from '../../../modules/nf-core/gfastats/main'
+include { CAT_FASTQ } from '../../../modules/nf-core/cat/fastq/main'
+include { HIFIASM as HIFIASM_BINS } from '../../../modules/nf-core/hifiasm'
+include { HIFIASM as HIFIASM_ASSEMBLY } from '../../../modules/nf-core/hifiasm'
+include { YAK_COUNT } from '../../../modules/nf-core/yak/count/main'
+include { GFASTATS } from '../../../modules/nf-core/gfastats/main'

 // This subworkflow assembles and outputs haplotypes from a set of reads (grouped per sample), using hifiasm and gfastats.
 // It assumes that while each sample can have multiple files, each sample belongs to one family at most.
 workflow GENOME_ASSEMBLY {

     take:
-    ch_reads // channel: [ val(meta), fastqs ]
-    trio_binning // bool: Should we use trio binning mode where possible?
+    ch_reads // channel: [ val(meta), fastqs ]
+    trio_binning // bool: Should we use trio binning mode where possible?

     main:
     if (trio_binning) {
         // First, we need to branch the samples based on their relationship
         ch_reads
             .branch { meta, _reads ->
                 def is_parent = meta.relationship in ['father', 'mother']
-                paired_parents : is_parent && meta.has_other_parent
-                children_with_both_parents : meta.relationship == 'child' && meta.two_parents
-                other : true
+                paired_parents: is_parent && meta.has_other_parent
+                children_with_both_parents: meta.relationship == 'child' && meta.two_parents
+                other: true
             }
             .set { ch_branched_samples }

@@ -32,85 +32,102 @@ workflow GENOME_ASSEMBLY {
             }
             .set { ch_paired_parents_for_yak }

-        CAT_FASTQ (
+        CAT_FASTQ(
             ch_paired_parents_for_yak.cat
         )

-        YAK_COUNT (
+        YAK_COUNT(
             CAT_FASTQ.out.reads.concat(ch_paired_parents_for_yak.no_cat)
         )

         YAK_COUNT.out.yak
             // Because a parent can have multiple children, and meta.children is a list of all children,
             // we need to return one tuple per child.
             .flatMap { meta, yak ->
                 (meta.children ?: []).collect { child_id ->
                     [child_id, meta, yak]
                 }
             }
             .branch { child_id, meta, yak ->
                 paternal: meta.relationship == 'father'
-                    return [ child_id, yak ]
+                    return [child_id, yak]
                 maternal: meta.relationship == 'mother'
-                    return [ child_id, yak ]
+                    return [child_id, yak]
             }
             .set { ch_yak_output }

         // Creates the input for trio-binned assemblies (children with both parents)
         ch_branched_samples.children_with_both_parents
-            .map { meta, reads -> [ meta.id, meta, reads ] }
+            .map { meta, reads -> [meta.id, meta, reads] }
             .join(ch_yak_output.paternal)
             .join(ch_yak_output.maternal)
             .map { _id, meta, reads, yak_paternal, yak_maternal ->
-                [ meta, reads, yak_paternal, yak_maternal ]
+                [meta, reads, yak_paternal, yak_maternal]
             }
             .set { ch_with_both_parents }

         // Create the input for hifiasm by combining the non-trio binned samples with the trio-binned samples.
         ch_branched_samples.other
             .concat(ch_branched_samples.paired_parents)
             .map { meta, fastqs ->
-                [ meta, fastqs, [], [] ]
+                [meta, fastqs, [], []]
             }
             .concat(ch_with_both_parents)
             .multiMap { meta, reads, yak_paternal, yak_maternal ->
-                reads : [ meta, reads , [] ]
-                yak : [ meta, yak_paternal, yak_maternal ]
+                reads: [meta, reads, []]
+                yak: [meta, yak_paternal, yak_maternal]
             }
             .set { ch_hifiasm_in }
-    } else {
+    }
+    else {
         ch_reads
             .multiMap { meta, reads ->
-                reads : [ meta, reads, [] ]
-                yak : [ [], [], [] ]
+                reads: [meta, reads, []]
+                yak: [meta, [], []]
             }
             .set { ch_hifiasm_in }
     }

-    HIFIASM (
+    HIFIASM_BINS(
         ch_hifiasm_in.reads,
         ch_hifiasm_in.yak,
-        [[],[],[]],
-        [[],[]]
+        [[], [], []],
+        [[], []],
     )

+    // Explicitly key bins/reads/yak by sample ID before assembly so each sample gets its own bins and yaks.
+    ch_hifiasm_in.reads
+        .join(ch_hifiasm_in.yak, failOnMismatch: true, failOnDuplicate: true)
+        .join(HIFIASM_BINS.out.bin_files, failOnMismatch: true, failOnDuplicate: true)
+        .multiMap { meta, reads, ul_reads, yak_paternal, yak_maternal, bin_files ->
+            reads: [meta, reads, ul_reads]
+            bins: [meta, bin_files]
+            yak: [meta, yak_paternal, yak_maternal]
+        }
+        .set { ch_hifiasm_assembly_in }

Review thread on lines +100 to +105:

Collaborator: Maybe a `nextflow lint -harshil-alignment -format` on this file... I hope we can run it automatically soon 🤞

Contributor Author: Done! :) Should I use it on all the main.nf files for nallo in the meantime to make sure they have the right format?

Collaborator: Thanks. There is an issue where it removes inline comments, so it's not perfect yet. We will have to do it manually for a bit longer on the files we modify.

+    HIFIASM_ASSEMBLY(
+        ch_hifiasm_assembly_in.reads,
+        ch_hifiasm_assembly_in.yak,
+        [[], [], []],
+        ch_hifiasm_assembly_in.bins,
+    )

-    HIFIASM.out.hap1_contigs
-        .map { meta, fasta -> [ meta + [ 'haplotype': 1 ], fasta ] }
+    HIFIASM_ASSEMBLY.out.hap1_contigs
+        .map { meta, fasta -> [meta + ['haplotype': 1], fasta] }
         .set { ch_gfastats_paternal_in }

-    HIFIASM.out.hap2_contigs
-        .map { meta, fasta -> [ meta + [ 'haplotype': 2 ], fasta ] }
+    HIFIASM_ASSEMBLY.out.hap2_contigs
+        .map { meta, fasta -> [meta + ['haplotype': 2], fasta] }
         .set { ch_gfastats_maternal_in }

     GFASTATS(
         ch_gfastats_paternal_in.mix(ch_gfastats_maternal_in),
         'fasta',
         '',
         '',
-        [[],[]],
-        [[],[]],
-        [[],[]],
-        [[],[]]
+        [[], []],
+        [[], []],
+        [[], []],
+        [[], []],
     )

     emit:
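
The keyed `join` that this diff adds before `HIFIASM_ASSEMBLY` can be tried in isolation. A minimal sketch, assuming a standalone script with made-up sample IDs and file names (none of them come from the pipeline); it only illustrates how `join` with `failOnMismatch` and `failOnDuplicate` pairs items by their first element:

```nextflow
// Standalone illustration; sample IDs and file names are hypothetical.
workflow {
    // Reads keyed by a meta map, in the same [meta, reads, ul_reads] shape as ch_hifiasm_in.reads.
    ch_reads = channel.of(
        [[id: 'sample1'], 'sample1.fastq.gz', []],
        [[id: 'sample2'], 'sample2.fastq.gz', []]
    )

    // Bin files keyed by the same meta maps, deliberately emitted in a different order.
    ch_bins = channel.of(
        [[id: 'sample2'], 'sample2.bin'],
        [[id: 'sample1'], 'sample1.bin']
    )

    // join matches on the first element (the meta map), so emission order does not matter.
    // failOnMismatch aborts the run if a key is missing from either channel;
    // failOnDuplicate aborts if the same key appears more than once.
    ch_reads
        .join(ch_bins, failOnMismatch: true, failOnDuplicate: true)
        .view()
}
```

Each joined item has the key first, then the remaining elements from the left channel, then those from the right, which is why the `multiMap` closure in the diff can destructure `meta, reads, ul_reads, yak_paternal, yak_maternal, bin_files` in that order. It also suggests why the non-trio `yak` entry changed from `[ [], [], [] ]` to `[meta, [], []]`: without `meta` in first position there would be no key to join on.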