TrioBinChildLongReads
- description
- A workflow that performs trio-binning of child long reads given parental (short) reads. Based on the trio-canu publication: "De novo assembly of haplotype-resolved genomes with trio binning", https://www.nature.com/articles/nbt.4277
We divide the workflow into two parts:
- part one: collect k-mer stats given parental (short) reads
- part two: given the k-mer stats database from part one, classify child long reads
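The two parts can be illustrated with a toy sketch. This is not the workflow's implementation (the task names in the inputs below indicate that k-mer counting is done with meryl); it only shows the idea from the cited paper on plain Python sets. The function names, the tiny sequences, and k=4 are all hypothetical, and the sketch makes simplifying assumptions: it compares raw k-mer hit counts without any normalization and ignores reverse complements.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(father_reads, mother_reads, k):
    """Part one (toy version): find k-mers seen in only one parent."""
    father = set().union(*(kmers(r, k) for r in father_reads))
    mother = set().union(*(kmers(r, k) for r in mother_reads))
    # Subtract the shared k-mers so each set is specific to one parent.
    return father - mother, mother - father


def classify_read(read, father_only, mother_only, k):
    """Part two (toy version): assign a child read to the parent whose
    specific k-mers it contains more of; ties stay unassigned."""
    read_kmers = kmers(read, k)
    father_hits = len(read_kmers & father_only)
    mother_hits = len(read_kmers & mother_only)
    if father_hits > mother_hits:
        return "father"
    if mother_hits > father_hits:
        return "mother"
    return "unassigned"


# Hypothetical toy data, far shorter than real reads.
K = 4
father_only, mother_only = parent_specific_kmers(
    ["ACGTACGTAA"], ["ACGTACGTCC"], K
)
for read in ["TACGTAAG", "TACGTCCG", "ACGTACG"]:
    print(read, classify_read(read, father_only, mother_only, K))
```

The three possible labels correspond to the three workflow outputs: reads assigned to the father, reads assigned to the mother, and unassigned reads.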
Inputs
Required
child_long_reads_bucket
(String, required): GCS bucket path holding FASTA/FASTQ of child long reads
father_short_reads_bucket
(String, required): GCS bucket path holding FASTA/FASTQ of (short) reads of paternal origin
genome_size
(String, required): an estimate of the genome size of the species (affects k-value picking)
long_read_platform
(String, required): platform of long read sequencing; currently only one of [pacbio-raw, nanopore-raw] is supported
mother_short_reads_bucket
(String, required): GCS bucket path holding FASTA/FASTQ of (short) reads of maternal origin
vm_local_monitoring_script
(File, required): GCS file holding a resource monitoring script that runs locally and collects info for a very specific purpose
workdir_name
(String, required): name of working directory
Optional
kmerSize
(Int?): [optional] force a specific k-value when collecting k-mer stats on the parents
run_with_debug
(Boolean?): [optional] whether to run in debug mode (uses significantly more disk space and produces more logs); defaults to false
AssignChildLongReads.runtime_attr_override
(RuntimeAttr?)
CollectParentsKmerStats.MerylCount.runtime_attr_override
(RuntimeAttr?)
CollectParentsKmerStats.MerylMergeAndSubtract.runtime_attr_override
(RuntimeAttr?)
CollectParentsKmerStats.ParentalReadsRepartitionAndMerylConfigure.runtime_attr_override
(RuntimeAttr?)
Defaults
child_read_assign_memoryG_est
(Int, default=32): [default-valued] estimate of how many GB of memory to allocate to the child long-read classification step
child_read_assign_threads_est
(Int, default=36): [default-valued] estimate of how many threads to allocate to the child long-read classification step
meryl_operations_threads_est
(Int, default=8): [default-valued] estimate of how many threads to allocate to the k-mer stats collection step
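As an illustration of how these inputs fit together, here is a minimal sketch that assembles an inputs JSON file in the style Cromwell expects, with each key prefixed by the workflow name. The helper build_inputs, every bucket path, the working directory name, and the k-value are hypothetical placeholders. The "3.1g" genome size assumes the value is written in Canu's genomeSize notation, which this page does not confirm.

```python
import json

WORKFLOW = "TrioBinChildLongReads"
SUPPORTED_PLATFORMS = {"pacbio-raw", "nanopore-raw"}
REQUIRED_NAMES = {
    "child_long_reads_bucket",
    "father_short_reads_bucket",
    "genome_size",
    "long_read_platform",
    "mother_short_reads_bucket",
    "vm_local_monitoring_script",
    "workdir_name",
}


def build_inputs(required, optional=None):
    """Check the required inputs and return a dict keyed by
    '<workflow name>.<input name>' (hypothetical helper)."""
    missing = REQUIRED_NAMES - required.keys()
    if missing:
        raise ValueError(f"missing required inputs: {sorted(missing)}")
    if required["long_read_platform"] not in SUPPORTED_PLATFORMS:
        raise ValueError(
            f"long_read_platform must be one of {sorted(SUPPORTED_PLATFORMS)}"
        )
    merged = {**required, **(optional or {})}
    return {f"{WORKFLOW}.{name}": value for name, value in merged.items()}


# All values below are placeholders, not real locations.
example_required = {
    "child_long_reads_bucket": "gs://example-bucket/child/long_reads",
    "father_short_reads_bucket": "gs://example-bucket/father/short_reads",
    "mother_short_reads_bucket": "gs://example-bucket/mother/short_reads",
    "genome_size": "3.1g",  # assumed Canu-style notation
    "long_read_platform": "pacbio-raw",
    "vm_local_monitoring_script": "gs://example-bucket/scripts/monitor.sh",
    "workdir_name": "trio_binning_work",
}

# kmerSize is optional; 21 is an arbitrary illustrative value.
inputs = build_inputs(example_required, {"kmerSize": 21})
print(json.dumps(inputs, indent=2, sort_keys=True))
```

Inputs listed under Defaults can be left out, in which case the documented default values (32, 36, and 8) apply.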
Outputs
reads_assigned_to_father
(File)
reads_assigned_to_mother
(File)
unassigned_reads
(File)