DownloadFromWeb
Description
This WDL pipeline downloads directories from HTTP/FTP/SFTP servers in parallel and stores the results in the specified GCS directory. It is essentially a Cromwell/GCP reimagining of the Nextflow/AWS downloading pipeline from @alaincoletta (see: http://broad.io/aws_dl).
Inputs
Required
gcs_out_root_dir
(String, required): GCS bucket to store the reads, variants, and metrics files
manifest
(File, required): A file with a list of SRA ID(s) to download on each line
Optional
DownloadFiles.runtime_attr_override
(RuntimeAttr?)
Defaults
num_simultaneous_downloads
(Int, default=10): The number of files to fetch simultaneously.
prepend_dir_name
(Boolean, default=true): If true, place the files in a subdirectory based on the basename of the FTP dir.
DownloadFiles.disk_size_gb
(Int, default=100)
DownloadFiles.num_cpus
(Int, default=4)
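To illustrate what a cap such as num_simultaneous_downloads means in practice, here is a minimal Python sketch of bounded parallel fetching. It is not the pipeline's actual implementation: `download_all` and `CountingFetcher` are hypothetical names, and the fetcher is a stand-in that only simulates work rather than contacting a server.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def download_all(urls, fetch, num_simultaneous_downloads=10):
    """Call fetch(url) for every URL, running at most N calls at once."""
    with ThreadPoolExecutor(max_workers=num_simultaneous_downloads) as pool:
        # map() returns results in the same order as the input URLs.
        return list(pool.map(fetch, urls))


class CountingFetcher:
    """Hypothetical stand-in fetcher: no network, records peak concurrency."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def __call__(self, url):
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        time.sleep(0.01)  # simulate transfer time
        with self._lock:
            self._active -= 1
        # Return the file name portion of the URL.
        return url.rsplit("/", 1)[-1]
```

Passing a `CountingFetcher` instance as `fetch` with `num_simultaneous_downloads=3` keeps the observed peak concurrency at or below 3, regardless of how many URLs are supplied.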
Outputs
DownloadFiles.out
(Array[String])
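Example
The inputs above can be assembled into an inputs JSON file for Cromwell. The sketch below assumes the usual Cromwell key convention (`<workflow>.<input>` and `<workflow>.<task>.<input>`) and that the workflow is named DownloadFromWeb; the helper `build_inputs`, the bucket, and the manifest path are hypothetical placeholders.

```python
import json


def build_inputs(gcs_out_root_dir, manifest, optional=None):
    """Build a Cromwell-style inputs dict keyed by 'DownloadFromWeb.<input>'."""
    workflow = "DownloadFromWeb"  # assumed workflow name
    inputs = {
        f"{workflow}.gcs_out_root_dir": gcs_out_root_dir,
        f"{workflow}.manifest": manifest,
    }
    # Optional and default-valued inputs are only written when overridden.
    for name, value in (optional or {}).items():
        inputs[f"{workflow}.{name}"] = value
    return inputs


example = build_inputs(
    "gs://my-bucket/downloads",     # hypothetical output location
    "gs://my-bucket/manifest.txt",  # hypothetical manifest file
    {"num_simultaneous_downloads": 5, "DownloadFiles.disk_size_gb": 200},
)
print(json.dumps(example, indent=2))
```

Inputs that are left out (here prepend_dir_name and DownloadFiles.num_cpus) keep the defaults listed above.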