DownloadFromFTP
- description
- Download files from FTP in parallel and store the results in the specified GCS dir. This pipeline is essentially a Cromwell/GCP reimagining of the Nextflow/AWS downloading pipeline from @alaincoletta (see: http://broad.io/aws_dl).
Inputs
Required
ftp_dirs
(Array[String], required): The FTP directories to download.
gcs_out_root_dir
(String, required)
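A minimal inputs file only needs the two required values. The sketch below builds one in Python, assuming the usual Cromwell convention of prefixing each input with the workflow name (taken here to be `DownloadFromFTP`). The FTP URL and bucket path are hypothetical placeholders, and `render_inputs` is an illustrative helper, not part of the pipeline.

```python
import json

# Hypothetical values: replace with real FTP directories and a bucket you own.
inputs = {
    "DownloadFromFTP.ftp_dirs": ["ftp://ftp.example.org/pub/dataset_a"],
    "DownloadFromFTP.gcs_out_root_dir": "gs://example-bucket/downloads",
}


def render_inputs(values):
    """Serialize workflow inputs as a Cromwell-style JSON document."""
    return json.dumps(values, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Redirect this output to a file and pass it to Cromwell as the inputs JSON.
    print(render_inputs(inputs))
```

Optional and default-valued inputs can be added to the same dictionary under the same `DownloadFromFTP.` prefix.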
Optional
ComputeDiskSize.runtime_attr_override
(RuntimeAttr?)
DownloadFTPFile.runtime_attr_override
(RuntimeAttr?)
GetFileManifest.runtime_attr_override
(RuntimeAttr?)
Defaults
exclude
(Array[String], default=[]): Simple substring patterns to exclude from the download.
num_simultaneous_downloads
(Int, default=10): The number of files to fetch simultaneously.
prepend_dir_name
(Boolean, default=true): If true, place the files in a subdirectory based on the basename of the FTP dir.
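To make the `exclude` and `prepend_dir_name` descriptions concrete, here is a rough Python sketch of the behavior they suggest. It is an interpretation of the descriptions above, not the pipeline's actual code: `is_excluded` and `destination` are hypothetical helpers, and the exact output layout is an assumption.

```python
import posixpath


def is_excluded(path, exclude):
    """Return True if any exclusion pattern occurs as a substring of path."""
    return any(pattern in path for pattern in exclude)


def destination(gcs_out_root_dir, ftp_dir, filename, prepend_dir_name=True):
    """Build the GCS destination for one file (assumed layout)."""
    root = gcs_out_root_dir.rstrip("/")
    if prepend_dir_name:
        # Use the last path component of the FTP directory as a subdirectory.
        subdir = posixpath.basename(ftp_dir.rstrip("/"))
        return f"{root}/{subdir}/{filename}"
    return f"{root}/{filename}"


if __name__ == "__main__":
    files = ["reads.fastq.gz", "reads.fastq.gz.md5", "README.txt"]
    kept = [f for f in files if not is_excluded(f, [".md5"])]
    for name in kept:
        print(destination("gs://example-bucket/downloads",
                          "ftp://ftp.example.org/pub/dataset_a", name))
```

Because matching is plain substring containment in this sketch, a pattern such as `.md5` drops any path containing that text anywhere, and an empty-string pattern would exclude every file.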
Outputs
GetFileManifest.manifest
(File)
DownloadFTPFile.out
(Array[String?])
ComputeDiskSize.max_size_bytes
(Array[Float])