CollectParentsKmerStats

description: A workflow that performs trio-binning of child long reads given parental short reads, based on the trio-canu publication (https://www.nature.com/articles/nbt.4277). This sub-workflow covers part one: collecting k-mer stats from the parental short reads.

Inputs

Required

father_short_reads_bucket (String, required): GCS bucket path holding FASTA/FASTQ of (short) reads of paternal origin
genome_size (String, required): an estimate of the genome size of the species (affects k-value picking)
mother_short_reads_bucket (String, required): GCS bucket path holding FASTA/FASTQ of (short) reads of maternal origin
workdir_name (String, required): name of working directory

Optional

kmerSize (Int?): [optional] force a specific k-value when collecting k-mer stats on parents
run_with_debug (Boolean?): [optional] whether to run in debug mode (uses significantly more disk space and produces more logs); defaults to false
MerylCount.runtime_attr_override (RuntimeAttr?)
MerylMergeAndSubtract.runtime_attr_override (RuntimeAttr?)
ParentalReadsRepartitionAndMerylConfigure.runtime_attr_override (RuntimeAttr?)

Defaults

meryl_operations_threads_est (Int, default=8): [default-valued] estimate of how many threads to allocate to the k-mer stats collection step
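
Example inputs

The inputs above are usually supplied as a JSON file whose keys are prefixed with the workflow name, as Cromwell-style runners expect. The following is a minimal sketch of building such a file in Python. The bucket paths, the working directory name, and the `genome_size` format ("3.1g") are illustrative assumptions rather than values taken from this workflow, and `build_inputs` is a hypothetical helper, not part of the pipeline.

```python
import json

WORKFLOW = "CollectParentsKmerStats"

# Inputs documented as required for this workflow.
REQUIRED = (
    "father_short_reads_bucket",
    "mother_short_reads_bucket",
    "genome_size",
    "workdir_name",
)


def build_inputs(values: dict) -> dict:
    """Return an inputs mapping with workflow-qualified keys.

    Raises ValueError if any documented required input is absent.
    """
    missing = [name for name in REQUIRED if name not in values]
    if missing:
        raise ValueError(f"missing required inputs: {', '.join(missing)}")
    return {f"{WORKFLOW}.{name}": value for name, value in values.items()}


example = build_inputs({
    # Hypothetical GCS paths, for illustration only.
    "father_short_reads_bucket": "gs://example-bucket/trio/father",
    "mother_short_reads_bucket": "gs://example-bucket/trio/mother",
    # String input; the exact accepted format is an assumption here.
    "genome_size": "3.1g",
    "workdir_name": "trio_kmer_stats",
    # Documented default, written out explicitly.
    "meryl_operations_threads_est": 8,
})

# Serialize to the JSON text that would be saved as the inputs file.
print(json.dumps(example, indent=2))
```

Optional inputs such as kmerSize or run_with_debug can be added to the same dictionary; leaving them out keeps the workflow's own defaults.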

Outputs

Father_haplotype_merylDB (Array[File])
Mother_haplotype_merylDB (Array[File])
Father_reads_statistics (File)
Mother_reads_statistics (File)

Dot Diagram

Figure: DOT graph of the CollectParentsKmerStats workflow.