gnomad_qc.v4.resources.release
Script containing release related resources.
Module Functions
- annotation_hists_params_path – Return path to file containing dictionary of parameters for site metric histograms.
- qual_hists_json_path – Fetch filepath for qual histograms JSON.
- release_ht_path – Fetch filepath for release (variant-only) Hail Tables.
- release_sites – Retrieve versioned resource for sites-only release Table.
- release_vcf_path – Fetch bucket for release (sites-only) VCFs.
- release_header_path – Fetch path to pickle file containing VCF header dictionary.
- append_to_vcf_header_path – Fetch path to TSV file containing extra fields to append to VCF header.
- release_coverage_path – Fetch filepath for all sites coverage or allele number release Table.
- release_coverage_tsv_path – Fetch path to coverage TSV file.
- release_all_sites_an_tsv_path – Fetch path to all sites AN TSV file.
- release_coverage – Retrieve versioned resource for coverage release Table.
- release_all_sites_an – Retrieve versioned resource for all sites allele number release Table.
- included_datasets_json_path – Fetch filepath for the JSON containing all datasets used in the release.
- validated_release_ht – Retrieve versioned resource for validated sites-only release Table.
- get_freq_array_readme – Fetch README for freq array for the specified data_type.
- get_false_dup_genes_path – Retrieve path for the liftover table containing three genes of interest within a false duplication in GRCh38.
- gnomad_qc.v4.resources.release.annotation_hists_params_path(release_version='4.1', data_type='exomes')[source]
Return path to file containing dictionary of parameters for site metric histograms.
The keys of the dictionary are the names of the site quality metrics while the values are: [lower bound, upper bound, number of bins]. For example, “InbreedingCoeff”: [-0.25, 0.25, 50].
- Parameters:
release_version (str) – Release version. Defaults to CURRENT_RELEASE.
data_type (str) – Data type of annotation resource, e.g. “exomes” or “genomes”. Default is “exomes”.
test – Whether to use a tmp path for testing. Default is False.
- Return type:
str
- Returns:
Path to file with annotation histograms
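As a rough illustration of how a `[lower bound, upper bound, number of bins]` entry could be turned into histogram bin edges, here is a minimal sketch. It assumes equal-width bins, which the text does not state explicitly, and the helper `bin_edges` is a hypothetical name, not part of gnomad_qc.

```python
from typing import Dict, List


def bin_edges(lower: float, upper: float, n_bins: int) -> List[float]:
    """Return the n_bins + 1 equally spaced edges between lower and upper."""
    width = (upper - lower) / n_bins
    return [lower + i * width for i in range(n_bins + 1)]


# Same shape as the example entry in the text: metric name -> [lower, upper, bins].
hist_params: Dict[str, list] = {"InbreedingCoeff": [-0.25, 0.25, 50]}

lower, upper, n_bins = hist_params["InbreedingCoeff"]
edges = bin_edges(lower, upper, n_bins)

# Number of edges, first edge, and the (rounded) width of the first bin.
print(len(edges), edges[0], round(edges[1] - edges[0], 6))
```

A dictionary with one such entry per site quality metric keeps the binning choices in a single file that can be read wherever the histograms are computed.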
- gnomad_qc.v4.resources.release.qual_hists_json_path(release_version='4.1', data_type='exomes', test=False)[source]
Fetch filepath for qual histograms JSON.
- Parameters:
release_version (str) – Release version. Defaults to CURRENT_RELEASE.
data_type (str) – Data type ‘exomes’ or ‘genomes’. Default is ‘exomes’.
test (bool) – Whether to use a tmp path for testing. Default is False.
- Return type:
str
- Returns:
File path for histogram JSON
- gnomad_qc.v4.resources.release.release_ht_path(data_type='exomes', release_version='4.1', public=False, test=False)[source]
Fetch filepath for release (variant-only) Hail Tables.
- Parameters:
data_type (str) – Data type of release resource to return. Should be one of ‘exomes’, ‘genomes’ or ‘joint’. Default is ‘exomes’.
release_version (str) – Release version. Default is CURRENT_RELEASE.
public (bool) – Whether release sites Table path returned is from public or private bucket. Default is False.
test (bool) – Whether to use a tmp path for testing. Default is False.
- Return type:
str
- Returns:
File path for desired release Hail Table.
- gnomad_qc.v4.resources.release.release_sites(data_type='exomes', public=False, test=False)[source]
Retrieve versioned resource for sites-only release Table.
- Parameters:
data_type (str) – Data type of release resource to return. Should be one of ‘exomes’, ‘genomes’, or ‘joint’. Default is ‘exomes’.
public (bool) – Whether release sites Table path returned is from public or private bucket. Default is False.
test (bool) – Whether to use a tmp path for testing. Default is False.
- Return type:
VersionedTableResource
- Returns:
Sites-only release Table.
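To make the idea of a "versioned resource" concrete, the sketch below models one as a default version plus a mapping from version strings to paths. `VersionedPathSketch`, its method, the version strings, and the example paths are all hypothetical stand-ins chosen for illustration; this is not the `VersionedTableResource` class itself, and the paths are not the real bucket layout.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class VersionedPathSketch:
    """Hypothetical stand-in: a default version plus a version -> path map."""

    default_version: str
    versions: Dict[str, str]

    def path(self, version: Optional[str] = None) -> str:
        """Return the path for `version`, or for the default version if None."""
        key = self.default_version if version is None else version
        if key not in self.versions:
            raise KeyError(f"Unknown version: {key}")
        return self.versions[key]


# Placeholder versions and paths, for illustration only.
sites = VersionedPathSketch(
    default_version="4.1",
    versions={
        "4.0": "example/release/4.0/sites.ht",
        "4.1": "example/release/4.1/sites.ht",
    },
)

print(sites.path())       # path registered for the default version
print(sites.path("4.0"))  # path registered for an explicitly requested version
```

Grouping all versions behind one object lets callers ask for "the current table" without hard-coding a version, while older versions stay addressable.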
- gnomad_qc.v4.resources.release.release_vcf_path(release_version=None, test=False, data_type='exomes', contig=None)[source]
Fetch bucket for release (sites-only) VCFs.
- Parameters:
release_version (Optional[str]) – Release version. When no release_version is supplied CURRENT_RELEASE is used.
test (bool) – Whether to use a tmp path for testing. Default is False.
data_type (str) – Data type of release resource to return. Should be one of ‘exomes’ or ‘genomes’. Default is ‘exomes’.
contig (Optional[str]) – String containing the name of the desired reference contig. Default is the full (all contigs) sites VCF path.
- Return type:
str
- Returns:
Filepath for the desired VCF.
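The two optional parameters above follow a common pattern: `None` means "fall back to a default". The sketch below shows that pattern only. The function `sketch_vcf_name`, the placeholder `CURRENT_RELEASE` value, and the file-name layout are assumptions made for illustration and do not reproduce the real bucket paths.

```python
from typing import Optional

# Placeholder value for illustration; the real constant lives in gnomad_qc.
CURRENT_RELEASE = "4.1"


def sketch_vcf_name(
    release_version: Optional[str] = None,
    data_type: str = "exomes",
    contig: Optional[str] = None,
) -> str:
    """Build a hypothetical VCF file name (not the real bucket layout)."""
    # None falls back to the current release, as described for release_version.
    if release_version is None:
        release_version = CURRENT_RELEASE
    if data_type not in ("exomes", "genomes"):
        raise ValueError("data_type must be 'exomes' or 'genomes'")
    # No contig means the full (all contigs) file; otherwise add the contig name.
    contig_part = f".{contig}" if contig else ""
    return f"example.{data_type}.v{release_version}.sites{contig_part}.vcf.bgz"


print(sketch_vcf_name())
print(sketch_vcf_name(release_version="4.0", data_type="genomes", contig="chr1"))
```

Resolving the default inside the function body, rather than in the signature, lets callers pass `None` through from their own optional arguments.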
- gnomad_qc.v4.resources.release.release_header_path(release_version=None, data_type='exomes', test=False)[source]
Fetch path to pickle file containing VCF header dictionary.
- Parameters:
release_version (Optional[str]) – Release version. When no release_version is supplied CURRENT_RELEASE is used.
data_type (str) – Data type of release resource to return. Should be one of ‘exomes’ or ‘genomes’. Default is ‘exomes’.
test (bool) – Whether to use a tmp path for testing. Default is False.
- Return type:
str
- Returns:
Filepath for header dictionary pickle.
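For readers unfamiliar with storing a dictionary as a pickle file, here is a minimal round trip using Python's standard `pickle` module. The dictionary contents are hypothetical (the real header dictionary's keys are not described here), and a local temporary file stands in for the path this function returns.

```python
import os
import pickle
import tempfile

# Hypothetical header content; the real dictionary's structure is not given here.
header = {"info": {"AC": {"Number": "A", "Description": "Alternate allele count"}}}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "header_dict.pickle")

    # Write the dictionary in binary mode.
    with open(path, "wb") as handle:
        pickle.dump(header, handle)

    # Read it back; only unpickle files from a trusted source.
    with open(path, "rb") as handle:
        loaded = pickle.load(handle)

print(loaded == header)
```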
- gnomad_qc.v4.resources.release.append_to_vcf_header_path(subset=None, release_version='4.1', data_type='exomes')[source]
Fetch path to TSV file containing extra fields to append to VCF header.
Extra fields are VEP and dbSNP versions.
- Parameters:
subset (str) – One of the possible release subsets.
release_version (str) – Release version. Defaults to CURRENT_RELEASE.
data_type (str) – Data type of release resource to return. Should be one of ‘exomes’ or ‘genomes’. Default is ‘exomes’.
- Return type:
str
- Returns:
Filepath for extra fields TSV file.
- gnomad_qc.v4.resources.release.release_coverage_path(data_type='exomes', release_version='4.1', public=True, test=False, stratify=True, coverage_type='coverage')[source]
Fetch filepath for all sites coverage or allele number release Table.
- Parameters:
data_type (str) – ‘exomes’ or ‘genomes’. Default is ‘exomes’.
release_version (str) – Release version.
public (bool) – Determines whether release coverage Table is read from public or private bucket. Default is public.
test (bool) – Whether to use a tmp path for testing. Default is False.
stratify (bool) – Whether to stratify results by platform and subset. Default is True.
coverage_type (str) – ‘coverage’ or ‘allele_number’. Default is ‘coverage’.
- Return type:
str
- Returns:
File path for desired coverage Hail Table.
- gnomad_qc.v4.resources.release.release_coverage_tsv_path(data_type='exomes', release_version='4.0', test=False)[source]
Fetch path to coverage TSV file.
- Parameters:
data_type (str) – ‘exomes’ or ‘genomes’. Default is ‘exomes’.
release_version (str) – Release version. Default is CURRENT_COVERAGE_RELEASE[“exomes”].
test (bool) – Whether to use a tmp path for testing. Default is False.
- Return type:
str
- Returns:
Coverage TSV path.
- gnomad_qc.v4.resources.release.release_all_sites_an_tsv_path(data_type='exomes', release_version=None, test=False)[source]
Fetch path to all sites AN TSV file.
- Parameters:
data_type (str) – ‘exomes’ or ‘genomes’. Default is ‘exomes’.
release_version (str) – Release version. Default is CURRENT_ALL_SITES_AN_RELEASE[data_type].
test (bool) – Whether to use a tmp path for testing. Default is False.
- Return type:
str
- Returns:
All sites AN TSV path.
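The default described for `release_version` depends on `data_type`, so it cannot be written as a fixed value in the signature. One way this kind of lookup could work is sketched below. The helper `resolve_an_release` is hypothetical, and the dictionary values are obvious placeholders rather than real release versions.

```python
from typing import Dict, Optional

# Placeholder values for illustration only; not the real release versions.
CURRENT_ALL_SITES_AN_RELEASE: Dict[str, str] = {
    "exomes": "v_exomes",
    "genomes": "v_genomes",
}


def resolve_an_release(
    data_type: str = "exomes", release_version: Optional[str] = None
) -> str:
    """Return release_version, or the per-data-type default when it is None."""
    if data_type not in CURRENT_ALL_SITES_AN_RELEASE:
        raise ValueError(f"Unsupported data_type: {data_type}")
    if release_version is None:
        release_version = CURRENT_ALL_SITES_AN_RELEASE[data_type]
    return release_version


print(resolve_an_release())                      # default for exomes
print(resolve_an_release("genomes"))             # default for genomes
print(resolve_an_release("genomes", "custom"))   # explicit version wins
```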
- gnomad_qc.v4.resources.release.release_coverage(data_type='exomes', public=False, test=False, stratify=True)[source]
Retrieve versioned resource for coverage release Table.
- Parameters:
data_type (str) – ‘exomes’ or ‘genomes’. Default is ‘exomes’.
public (bool) – Determines whether release coverage Table is read from public or private bucket. Default is private.
test (bool) – Whether to use a tmp path for testing. Default is False.
stratify (bool) – Whether to stratify results by platform and subset. Default is True.
- Return type:
VersionedTableResource
- Returns:
Coverage release Table.
- gnomad_qc.v4.resources.release.release_all_sites_an(data_type='exomes', public=False, test=False)[source]
Retrieve versioned resource for all sites allele number release Table.
- Parameters:
data_type (str) – ‘exomes’ or ‘genomes’. Default is ‘exomes’.
public (bool) – Determines whether release allele number Table is read from public or private bucket. Default is private.
test (bool) – Whether to use a tmp path for testing. Default is False.
- Return type:
VersionedTableResource
- Returns:
All sites allele number release Table.
- gnomad_qc.v4.resources.release.included_datasets_json_path(data_type='exomes', test=False, release_version='4.1')[source]
Fetch filepath for the JSON containing all datasets used in the release.
- Parameters:
data_type (str) – ‘exomes’ or ‘genomes’. Default is ‘exomes’.
test (bool) – Whether to use a tmp path for testing. Default is False.
release_version (str) – Release version. Defaults to CURRENT_RELEASE.
- Return type:
str
- Returns:
File path for the JSON of datasets included in the release version.
- gnomad_qc.v4.resources.release.validated_release_ht(test=False, data_type='exomes')[source]
Retrieve versioned resource for validated sites-only release Table.
- Parameters:
test (bool) – Whether to use a tmp path for testing. Default is False.
data_type (str) – ‘exomes’ or ‘genomes’. Default is ‘exomes’.
- Return type:
VersionedTableResource
- Returns:
Validated release Table
- gnomad_qc.v4.resources.release.get_freq_array_readme(data_type='exomes')[source]
Fetch README for freq array for the specified data_type.
- Parameters:
data_type (str) – ‘exomes’ or ‘genomes’. Default is ‘exomes’.
- Return type:
str
- Returns:
README for freq array.
- gnomad_qc.v4.resources.release.get_false_dup_genes_path(release_version='4.1', test=False)[source]
Retrieve path for the liftover table containing three genes of interest within a false duplication in GRCh38.
- Parameters:
release_version (str) – Release version. Defaults to CURRENT_RELEASE.
test (bool) – Whether to use a tmp path for testing. Default is False.
- Return type:
str
- Returns:
Combined custom liftover table path for the three genes in false duplication.