gnomad_qc.v4.sample_qc.platform_inference
Script to assign platforms by running HDBSCAN on the results of a PCA of the per-interval fraction of bases with DP over 0.
usage: gnomad_qc.v4.sample_qc.platform_inference.py [-h] [-o] [--test]
[--calling-interval-name {ukb,broad,intersection}]
[--calling-interval-padding {0,50}]
[--run-platform-pca]
[--n-platform-pcs N_PLATFORM_PCS]
[--assign-platforms]
[--n-assignment-pcs N_ASSIGNMENT_PCS]
[--hdbscan-min-samples HDBSCAN_MIN_SAMPLES]
[--hdbscan-min-cluster-size HDBSCAN_MIN_CLUSTER_SIZE]
[--slack-channel SLACK_CHANNEL]
Named Arguments
- -o, --overwrite
Overwrite output files.
Default: False
- --test
Use the v4 test dataset instead of the full dataset.
Default: False
- --calling-interval-name
Possible choices: ukb, broad, intersection
Name of calling intervals to use for interval coverage. One of: ‘ukb’, ‘broad’, or ‘intersection’.
Default: “intersection”
- --calling-interval-padding
Possible choices: 0, 50
Number of base pairs of padding to apply to the calling intervals. One of 0 or 50 bp.
Default: 50
- --run-platform-pca
Runs platform PCA (assumes the coverage MatrixTable has already been computed, --compute-coverage).
Default: False
- --n-platform-pcs
Number of platform PCs to compute.
Default: 30
- --assign-platforms
Assigns platforms by running HDBSCAN on the results of a PCA of the per-interval fraction of bases with DP over 0.
Default: False
- --n-assignment-pcs
Number of platform PCs to use for platform assignment.
Default: 10
- --hdbscan-min-samples
Minimum samples parameter for HDBSCAN. If not specified, --hdbscan-min-cluster-size is used.
- --hdbscan-min-cluster-size
Minimum cluster size parameter for HDBSCAN.
Default: 50
- --slack-channel
Slack channel to post results and notifications to.
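The options above can be mirrored in a small argparse parser. The following is a minimal sketch, not the script's actual parser: the function name `build_parser`, the description string, and the `int` argument types are assumptions made for illustration, while the flag names, choices, and defaults are taken from the listing above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Assign platforms from platform PCA results using HDBSCAN."
    )
    # Boolean switches, all documented with a default of False.
    parser.add_argument("-o", "--overwrite", action="store_true")
    parser.add_argument("--test", action="store_true")
    parser.add_argument("--run-platform-pca", action="store_true")
    parser.add_argument("--assign-platforms", action="store_true")
    # Calling interval selection.
    parser.add_argument(
        "--calling-interval-name",
        choices=["ukb", "broad", "intersection"],
        default="intersection",
    )
    parser.add_argument(
        "--calling-interval-padding", type=int, choices=[0, 50], default=50
    )
    # PCA and clustering parameters (int types are assumed, not documented).
    parser.add_argument("--n-platform-pcs", type=int, default=30)
    parser.add_argument("--n-assignment-pcs", type=int, default=10)
    parser.add_argument("--hdbscan-min-samples", type=int, default=None)
    parser.add_argument("--hdbscan-min-cluster-size", type=int, default=50)
    parser.add_argument("--slack-channel")
    return parser


# Parse an explicit argument list rather than sys.argv so the example is
# reproducible when run as a snippet.
example_args = build_parser().parse_args(["--assign-platforms", "--test"])
```

Because `--hdbscan-min-samples` has no documented default, the sketch leaves it as `None` so that downstream code can apply the documented fallback to `--hdbscan-min-cluster-size`.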
Module Functions
- Assign platforms based on PCA of per interval fraction of bases over DP 0.
- Get script argument parser.
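Two of the documented behaviours, the fallback of --hdbscan-min-samples to --hdbscan-min-cluster-size and the restriction of clustering input to the first --n-assignment-pcs PCs, can be illustrated in plain Python. This is a sketch under stated assumptions: `resolve_hdbscan_params` and `select_assignment_pcs` are hypothetical helpers, not functions of this module, and the score vector is made-up input.

```python
from typing import Dict, List, Optional, Sequence


def resolve_hdbscan_params(
    min_cluster_size: int = 50, min_samples: Optional[int] = None
) -> Dict[str, int]:
    """Return clustering keyword arguments, applying the documented fallback."""
    if min_samples is None:
        # Documented behaviour: fall back to the minimum cluster size.
        min_samples = min_cluster_size
    return {"min_cluster_size": min_cluster_size, "min_samples": min_samples}


def select_assignment_pcs(
    scores: Sequence[Sequence[float]], n_assignment_pcs: int = 10
) -> List[List[float]]:
    """Keep only the first n_assignment_pcs PCs of each sample's scores."""
    return [list(sample_scores[:n_assignment_pcs]) for sample_scores in scores]


# One illustrative sample with 30 PC scores (invented values).
params = resolve_hdbscan_params()
features = select_assignment_pcs(
    [[0.1 * i for i in range(30)]], n_assignment_pcs=10
)
```

With the `hdbscan` package installed, one plausible next step would be `hdbscan.HDBSCAN(**params).fit_predict(features)` on the scores for all samples, which returns one cluster label per sample and uses -1 for samples it treats as noise. How the script itself handles such samples is not stated on this page.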