Utils
ChunkManifest
- description
- Chunk a manifest file into smaller files
Inputs
Required
manifest
(File, required): The manifest file to chunk
manifest_lines_per_chunk
(Int, required): The number of lines to include in each chunk
Optional
runtime_attr_override
(RuntimeAttr?): Override the default runtime attributes
Outputs
manifest_chunks
(Array[File])
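The chunking described above can be sketched in Python. This is an illustration only, not the task's actual command; the function name chunk_lines and the sample manifest entries are hypothetical, and it assumes the final chunk may hold fewer lines than manifest_lines_per_chunk.

```python
from typing import List


def chunk_lines(lines: List[str], lines_per_chunk: int) -> List[List[str]]:
    """Split lines into consecutive chunks of at most lines_per_chunk lines."""
    if lines_per_chunk <= 0:
        raise ValueError("lines_per_chunk must be positive")
    return [
        lines[i:i + lines_per_chunk]
        for i in range(0, len(lines), lines_per_chunk)
    ]


# Hypothetical manifest with five entries, two lines per chunk.
manifest = ["a.bam", "b.bam", "c.bam", "d.bam", "e.bam"]
chunks = chunk_lines(manifest, 2)
print(len(chunks))  # three chunks; the last one holds the single leftover line
```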
SortSam
- description
- Sort a BAM file by coordinate order
Inputs
Required
input_bam
(File, required): The BAM file to sort
prefix
(String, required): The basename for the output BAM file
Optional
runtime_attr_override
(RuntimeAttr?): Override the default runtime attributes
Outputs
output_bam
(File)
output_bam_index
(File)
MakeChrIntervalList
- description
- Make a Picard-style list of intervals for each chromosome in the reference genome
Inputs
Required
ref_dict
(File, required): The reference dictionary
Optional
runtime_attr_override
(RuntimeAttr?): Override the default runtime attributes
Defaults
filter
(Array[String], default=['random', 'chrUn', 'decoy', 'alt', 'HLA', 'EBV']): A list of strings to filter out of the reference dictionary
Outputs
chrs
(Array[Array[String]])
interval_list
(File)
contig_interval_strings
(Array[String])
contig_interval_list_files
(Array[File])
ExtractIntervalNamesFromIntervalOrBamFile
- description
- Pulls the contig names and regions out of an interval list or bed file.
Inputs
Required
interval_file
(File, required): Interval list or bed file from which to extract contig names and regions.
Optional
runtime_attr_override
(RuntimeAttr?): Override the default runtime attributes
Outputs
interval_info
(Array[Array[String]])
MakeIntervalListFromSequenceDictionary
- description
- Make a Picard-style list of intervals that covers the given reference genome dictionary, with intervals no larger than the given size limit.
Inputs
Required
ref_dict
(File, required): The reference dictionary
Optional
runtime_attr_override
(RuntimeAttr?): Override the default runtime attributes
Defaults
ignore_contigs
(Array[String], default=['random', 'chrUn', 'decoy', 'alt', 'HLA', 'EBV']): A list of strings to filter out of the reference dictionary
max_interval_size
(Int, default=10000)
Outputs
interval_list
(File)
interval_info
(Array[Array[String]])
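A rough Python sketch of the tiling this task describes follows. It is not the task's implementation. The function names are hypothetical, and it assumes that ignore_contigs entries are matched as substrings of the contig name and that intervals are 1-based and inclusive, as in Picard interval lists.

```python
from typing import Iterable, List, Sequence, Tuple

DEFAULT_IGNORE = ("random", "chrUn", "decoy", "alt", "HLA", "EBV")


def parse_dict_contigs(dict_lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Read (name, length) pairs from the @SQ lines of a sequence dictionary."""
    contigs = []
    for line in dict_lines:
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:value.
        fields = dict(f.split(":", 1) for f in line.rstrip("\n").split("\t")[1:])
        contigs.append((fields["SN"], int(fields["LN"])))
    return contigs


def tile_intervals(
    contigs: Sequence[Tuple[str, int]],
    max_interval_size: int = 10000,
    ignore_contigs: Sequence[str] = DEFAULT_IGNORE,
) -> List[Tuple[str, int, int]]:
    """Cover each kept contig with 1-based, inclusive intervals of bounded size."""
    intervals = []
    for name, length in contigs:
        # Assumption: a contig is dropped if any ignore string occurs in its name.
        if any(pattern in name for pattern in ignore_contigs):
            continue
        for start in range(1, length + 1, max_interval_size):
            end = min(start + max_interval_size - 1, length)
            intervals.append((name, start, end))
    return intervals


# Hypothetical dictionary contents for illustration.
example_dict = [
    "@HD\tVN:1.6\tSO:unsorted",
    "@SQ\tSN:chr1\tLN:25000",
    "@SQ\tSN:chrUn_KI270302v1\tLN:2274",
    "@SQ\tSN:chrM\tLN:16569",
]
intervals = tile_intervals(parse_dict_contigs(example_dict))
for contig, start, end in intervals:
    print(f"{contig}\t{start}\t{end}")
```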
CreateIntervalListFileFromIntervalInfo
- description
- Make a Picard-style interval list file from the given interval info.
Inputs
Required
contig
(String, required): Contig for the interval.
end
(String, required): End position for the interval.
start
(String, required): Start position for the interval.
Optional
runtime_attr_override
(RuntimeAttr?): Override the default runtime attributes
Outputs
interval_list
(File)
CountBamRecords
- description
- Count the number of records in a bam file
Inputs
Required
bam
(File, required); localization_optional: true; description: The bam file
Optional
runtime_attr_override
(RuntimeAttr?): Override the default runtime attributes
Outputs
samools_error
(File?)
num_records
(Int)
DownsampleSam
- description
- Downsample the given bam / sam file using Picard/GATK's DownsampleSam tool.
- author
- Jonn Smith
- jonn@broadinstitute.org
Inputs
Required
bam
(File, required): BAM file to be filtered.
Optional
runtime_attr_override
(RuntimeAttr?)
Defaults
extra_args
(String, default=""): [Optional] Extra arguments to pass into DownsampleSam.
prefix
(String, default="downsampled_reads"): [Optional] Prefix string to name the output file (Default: downsampled_reads).
probability
(Float, default=0.01): [Optional] Probability that a read will be emitted (Default: 0.01).
random_seed
(Int, default=1)
strategy
(String, default="HighAccuracy"): [Optional] Strategy to use to downsample the given bam file (Default: HighAccuracy).
Outputs
output_bam
(File)
output_bam_index
(File)
Sum
- description
- Sum a list of integers.
Inputs
Required
ints
(Array[Int], required): List of integers to be summed.
Optional
runtime_attr_override
(RuntimeAttr?)
Defaults
prefix
(String, default="sum"): [Optional] Prefix string to name the output file (Default: sum).
Outputs
sum
(Int)
sum_file
(File)
Uniq
- description
- Find the unique elements in a list of strings.
Inputs
Required
strings
(Array[String], required): List of strings to be filtered.
Optional
runtime_attr_override
(RuntimeAttr?)
Outputs
unique_strings
(Array[String])
Timestamp
- description
- Get the current timestamp.
Inputs
Required
dummy_dependencies
(Array[String], required): List of dummy dependencies to force recomputation.
Optional
runtime_attr_override
(RuntimeAttr?): Override the default runtime attributes.
Outputs
timestamp
(String)
ConvertReads
- description
- Convert reads from one format to another.
Inputs
Required
output_format
(String, required): Output format.
reads
(File, required): Reads to be converted.
Outputs
converted_reads
(File)
BamToBed
- description
- Convert a BAM file to a bed file.
Inputs
Required
bam
(File, required): BAM file to be converted.
prefix
(String, required): Prefix for the output bed file.
Optional
runtime_attr_override
(RuntimeAttr?): Override the default runtime attributes.
Outputs
bed
(File)
BamToFastq
- description
- Convert a BAM file to a fastq file.
Inputs
Required
bam
(File, required): BAM file to be converted.
prefix
(String, required): Prefix for the output fastq file.
Optional
runtime_attr_override
(RuntimeAttr?): Override the default runtime attributes.
Outputs
reads_fq
(File)
MergeFastqs
- description
- Merge fastq files.
Inputs
Required
fastqs
(Array[File], required): Fastq files to be merged.
Optional
runtime_attr_override
(RuntimeAttr?): Override the default runtime attributes.
Defaults
prefix
(String, default="merged"): Prefix for the output fastq file.
Outputs
merged_fastq
(File)
MergeBams
- description
- Merge several input BAMs into a single BAM.
Inputs
Required
bams
(Array[File], required): Input array of BAMs to be merged.
Optional
runtime_attr_override
(RuntimeAttr?): Override the default runtime attributes.
Defaults
prefix
(String, default="out"): Prefix for the output BAM.
Outputs
merged_bam
(File)
merged_bai
(File)
Index
- description
- samtools index a BAM file.
Inputs
Required
bam
(File, required): BAM file to be indexed.
Optional
runtime_attr_override
(RuntimeAttr?): Override the default runtime attributes.
Outputs
bai
(File)
SubsetBam
- description
- Subset a BAM file to a specified locus.
Inputs
Required
bai
(File, required): index for bam file
bam
(File, required); description: bam to subset; localization_optional: true
locus
(String, required): genomic locus to select
Optional
runtime_attr_override
(RuntimeAttr?): Override the default runtime attributes.
Defaults
prefix
(String, default="subset"): prefix for output bam and bai file names
Outputs
subset_bam
(File)
subset_bai
(File)
ResilientSubsetBam
- description
- For subsetting a high-coverage BAM stored in GCS, without localizing (more resilient to auth. expiration).
Inputs
Required
bai
(File, required)
bam
(File, required); localization_optional: true
interval_id
(String, required): an ID string for representing the intervals in the interval list file
interval_list_file
(File, required): a Picard-style interval list file to subset reads with
prefix
(String, required): prefix for output bam and bai file names
Optional
runtime_attr_override
(RuntimeAttr?)
Outputs
subset_bam
(File)
subset_bai
(File)
Bamtools
- description
- Runs a given bamtools command on a bam file
Inputs
Required
args
(String, required): arguments to pass to bamtools
bamfile
(File, required): bam file to run bamtools on
cmd
(String, required): bamtools command to run
Optional
runtime_attr_override
(RuntimeAttr?)
Defaults
prefix
(String, default="out")
Outputs
bam
(File)
DeduplicateBam
- description
- Utility to drop (occasionally occurring) duplicate records in the input BAM
Inputs
Required
aligned_bai
(File, required): input BAM index file
aligned_bam
(File, required): input BAM file
Optional
runtime_attr_override
(RuntimeAttr?): override default runtime attributes
Defaults
same_name_as_input
(Boolean, default=true): if true, output BAM will have the same name as input BAM, otherwise it will have the input basename with .dedup suffix
Outputs
corrected_bam
(File)
corrected_bai
(File)
Cat
- description
- Utility that concatenates a group of files into a single output file. If has_header is true, the output keeps the header on its first line; if has_header is false, the files are concatenated without headers.
Inputs
Required
files
(Array[File], required): text files to combine
Optional
runtime_attr_override
(RuntimeAttr?)
Defaults
has_header
(Boolean, default=false): files have a redundant header
out
(String, default="out.txt"): [default-valued] output filename
Outputs
combined
(File)
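One plausible reading of the has_header behavior, sketched in Python. This is an interpretation of the description, not the task's script: it assumes that when has_header is true every input starts with the same one-line header, and that only the first copy is kept. The function name cat_lines and the sample rows are hypothetical.

```python
from typing import List


def cat_lines(files: List[List[str]], has_header: bool = False) -> List[str]:
    """Concatenate files given as lists of lines, keeping one header if requested."""
    combined: List[str] = []
    for index, lines in enumerate(files):
        # Assumption: the header is exactly one line and repeats in every file.
        if has_header and index > 0:
            lines = lines[1:]
        combined.extend(lines)
    return combined


# Two hypothetical inputs that share the header line "id<TAB>value".
first = ["id\tvalue", "a\t1"]
second = ["id\tvalue", "b\t2"]
with_header = cat_lines([first, second], has_header=True)
without_header = cat_lines([first, second], has_header=False)
print(len(with_header), len(without_header))
```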
ComputeGenomeLength
- description
- Utility to compute the length of a genome from a FASTA file
Inputs
Required
fasta
(File, required): FASTA file
Optional
runtime_attr_override
(RuntimeAttr?)
Outputs
length
(Float)
ListFilesOfType
- description
- Utility to list files of a given type in a directory
Inputs
Required
gcs_dir
(String, required): input directory
suffixes
(Array[String], required): suffix(es) for files
Optional
runtime_attr_override
(RuntimeAttr?)
Defaults
recurse
(Boolean, default=false): if true, recurse through subdirectories
Outputs
files
(Array[String])
manifest
(File)
StopWorkflow
- description
- Utility to stop a workflow
Inputs
Required
reason
(String, required): reason for stopping
Outputs
None
InferSampleName
- description
- Infer sample name encoded on the @RG line of the header section. Fails if multiple values found, or if SM ~= unnamedsample.
Inputs
Required
bai
(File, required)
bam
(File, required); localization_optional: true; description: BAM file
Outputs
sample_name
(String)
CheckOnSamplenames
- description
- Makes sure the provided sample names are all the same, i.e. no mixture of sample names
Inputs
Required
sample_names
(Array[String], required): sample names
Outputs
None
ComputeAllowedLocalSSD
- description
- Compute the number of local SSDs allowed by Google
Inputs
Required
intended_gb
(Int, required): intended number of GB
Outputs
numb_of_local_ssd
(Int)
RandomZoneSpewer
- description
- Spews a random GCP zone
Inputs
Required
num_of_zones
(Int, required): number of zones to spew
Outputs
zones
(String)
GetCurrentTimestampString
- volatile
- true
- description
- Get the current timestamp as a string
Inputs
Defaults
date_format
(String, default="%Y%m%d_%H%M%S_%N"): The date format string to use. See the Unix date command for more info.
Outputs
timestamp_string
(String)
GetRawReadGroup
- description
- Get the raw read group from a bam file (assumed to have 1 read group only)
Inputs
Required
gcs_bam_path
(String, required): path to bam file in GCS
Optional
runtime_attr_override
(RuntimeAttr?): override the runtime attributes
Outputs
rg
(String)
GetReadsInBedFileRegions
- description
- Get the reads from the given bam path which overlap the regions in the given bed file.
Inputs
Required
gcs_bam_path
(String, required): GCS URL to bam file from which to extract reads.
regions_bed
(File, required): Bed file containing regions for which to extract reads.
Optional
runtime_attr_override
(RuntimeAttr?): Runtime attributes override struct.
Defaults
prefix
(String, default="reads"): [default-valued] prefix for output BAM
Outputs
bam
(File)
bai
(File)
MapToTsv
- description
- Convert a map to a tsv file
Inputs
Required
my_map
(Map[String,Float], required): The map to convert
name_of_file
(String, required): The name of the file to write to
Outputs
result
(File)
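As an illustration of the conversion, a minimal Python sketch follows. The exact layout the task writes is not stated here; the sketch assumes one key and one value per row, separated by a tab, with no header. The function name map_to_tsv and the sample metric names are hypothetical, and the float formatting is Python's default rather than anything the task guarantees.

```python
from typing import Dict


def map_to_tsv(my_map: Dict[str, float]) -> str:
    """Render a string-to-float map as two-column TSV text, one entry per row."""
    # Assumption: rows follow the map's insertion order and have no header line.
    return "".join(f"{key}\t{value}\n" for key, value in my_map.items())


# Hypothetical metrics map.
tsv_text = map_to_tsv({"mean_coverage": 1.5, "read_n50": 20000.0})
print(tsv_text, end="")
```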
CreateIGVSession
- description
- Create an IGV session given a list of IGV compatible file paths. Adapted / borrowed from https://github.com/broadinstitute/palantir-workflows/blob/mg_benchmark_compare/BenchmarkVCFs .
Inputs
Required
input_bams
(Array[String], required)
input_vcfs
(Array[String], required)
output_name
(String, required)
reference_short_name
(String, required)
Optional
runtime_attr_override
(RuntimeAttr?)
Outputs
igv_session
(File)
SplitContigToIntervals
- author
- Jonn Smith
- notes
- Splits the given contig into intervals of the given size.
Inputs
Required
contig
(String, required)
prefix
(String, required)
ref_dict
(File, required)
ref_fasta
(File, required)
ref_fasta_fai
(File, required)
Optional
runtime_attr_override
(RuntimeAttr?)
Defaults
size
(Int, default=200000)
Outputs
full_bed_file
(File)
individual_bed_files
(Array[File])
ResolveMapKeysInPriorityOrder
- description
- Gets the first key in the map that exists. If no keys exist, returns an empty string.
Inputs
Required
keys
(Array[String], required): Array[String] of keys to check in order of priority
map
(Map[String,String], required): Map[String, String] to resolve.
Outputs
key
(String)
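The lookup described for this task amounts to a first-match scan over the priority list. A minimal Python sketch, with the hypothetical function name resolve_key and made-up map contents, assuming "exists" means the key is present in the map regardless of its value:

```python
from typing import Dict, List


def resolve_key(mapping: Dict[str, str], keys: List[str]) -> str:
    """Return the first key from keys that is present in mapping, else ''."""
    for key in keys:
        if key in mapping:
            return key
    return ""


# Hypothetical map and priority order.
files_by_kind = {"aligned_bam": "sample.bam", "fastq": "sample.fq.gz"}
chosen = resolve_key(files_by_kind, ["unaligned_bam", "aligned_bam", "fastq"])
missing = resolve_key(files_by_kind, ["cram"])
print(repr(chosen), repr(missing))
```

Note that the result is the key itself, matching the key (String) output, not the value stored under it.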