gnomad_qc.v5.resources.sample_qc

Script containing sample QC-related resources.

Module Functions

gnomad_qc.v5.resources.sample_qc.get_sample_qc_root([...])

Return the root GCS path to sample QC results.

gnomad_qc.v5.resources.sample_qc.get_sample_qc([...])

Get AoU sample QC annotations generated by Hail for the specified stratification.

gnomad_qc.v5.resources.sample_qc.get_aou_mt_union([test])

Get the union of AoU ACAF and exome MatrixTables.

gnomad_qc.v5.resources.sample_qc.get_joint_qc([test])

Get joint (exomes + genomes) gnomAD v4 + AoU dense VersionedMatrixTableResource.

gnomad_qc.v5.resources.sample_qc.get_cuking_input_path([...])

Return the path containing the input files read by cuKING.

gnomad_qc.v5.resources.sample_qc.get_cuking_output_path([...])

Return the path containing the output files written by cuKING.

gnomad_qc.v5.resources.sample_qc.relatedness([...])

Get the VersionedTableResource for relatedness results.

gnomad_qc.v5.resources.sample_qc.get_sample_qc_root(version='5.0', test=False, data_type='genomes', data_set='aou')

Return the root GCS path to sample QC results.

Parameters:
  • version (str) – Sample QC version (default: CURRENT_SAMPLE_QC_VERSION).

  • test (bool) – If True, return a temporary path (e.g., for testing or development).

  • data_type (str) – Data type (e.g., “genomes” or “exomes”).

  • data_set (str) – Dataset identifier (e.g., “aou”, “hgdp_tgp”).

Return type:

str

Returns:

GCS path to the sample QC directory.
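As a rough illustration of how these four arguments might combine, the sketch below builds a root path from them. It is not the gnomad_qc implementation: the function name sample_qc_root_sketch, both bucket names, and the directory layout are hypothetical placeholders chosen for this example. Only the parameter names and defaults come from the signature above.

```python
def sample_qc_root_sketch(
    version: str = "5.0",
    test: bool = False,
    data_type: str = "genomes",
    data_set: str = "aou",
) -> str:
    """Hypothetical stand-in showing how the arguments could select a path."""
    # Placeholder buckets (assumptions); the real function resolves its own.
    bucket = "gs://example-tmp-bucket" if test else "gs://example-bucket"
    # Assumed layout: <bucket>/sample_qc/<data_set>/<data_type>/<version>
    return f"{bucket}/sample_qc/{data_set}/{data_type}/{version}"


# With test=True only the bucket prefix changes in this sketch.
print(sample_qc_root_sketch())
print(sample_qc_root_sketch(test=True, data_type="exomes"))
```

The point of the sketch is the test switch: everything below the bucket prefix stays the same, so test and full runs can share downstream path logic.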

gnomad_qc.v5.resources.sample_qc.get_sample_qc(strat='all', test=False)

Get AoU sample QC annotations generated by Hail for the specified stratification.

Possible values for strat:
  • bi_allelic

  • multi_allelic

  • all

Parameters:
  • strat (str) – Which stratification to return.

  • test (bool) – Whether to use a tmp path for analysis of the test VDS instead of the full VDS.

Return type:

VersionedTableResource

Returns:

Sample QC table.
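Because strat accepts only the three values listed above, a caller may want to check the value before requesting the resource. The helper below is a minimal sketch of such a check; check_strat and VALID_STRATS are hypothetical names introduced here and are not part of gnomad_qc.

```python
# The three stratifications documented for get_sample_qc.
VALID_STRATS = ("bi_allelic", "multi_allelic", "all")


def check_strat(strat: str = "all") -> str:
    """Return strat unchanged if it is a documented value, else raise."""
    if strat not in VALID_STRATS:
        raise ValueError(
            f"strat must be one of {VALID_STRATS}, got {strat!r}"
        )
    return strat


# Intended use (requires gnomad_qc, so it is shown as a comment only):
#   from gnomad_qc.v5.resources.sample_qc import get_sample_qc
#   resource = get_sample_qc(strat=check_strat("bi_allelic"), test=True)
print(check_strat("bi_allelic"))
```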

gnomad_qc.v5.resources.sample_qc.get_aou_mt_union(test=True)

Get the union of AoU ACAF and exome MatrixTables.

Parameters:

test (bool) – Whether to use a tmp path for a test resource. Default is True.

Return type:

MatrixTableResource

Returns:

MatrixTableResource containing the union of AoU ACAF and exome MTs.

gnomad_qc.v5.resources.sample_qc.get_joint_qc(test=False)

Get joint (exomes + genomes) gnomAD v4 + AoU dense MatrixTableResource.

Parameters:

test (bool) – Whether to use a tmp path for a test resource.

Return type:

VersionedMatrixTableResource

Returns:

VersionedMatrixTableResource of QC sites.

gnomad_qc.v5.resources.sample_qc.get_cuking_input_path(version='5.0', test=False, environment='rwb')

Return the path containing the input files read by cuKING.

Those files correspond to Parquet tables derived from the dense QC matrix.

Parameters:
  • version (str) – Sample QC version (default: CURRENT_SAMPLE_QC_VERSION).

  • test (bool) – Whether to return a path corresponding to a test subset. Default is False.

  • environment (str) – Compute environment, either ‘dataproc’ or ‘rwb’. Default is ‘rwb’.

Return type:

str

Returns:

Temporary path to hold Parquet input tables for running cuKING.

gnomad_qc.v5.resources.sample_qc.get_cuking_output_path(version='5.0', test=False, environment='rwb')

Return the path containing the output files written by cuKING.

Those files correspond to Parquet tables containing relatedness results.

Parameters:
  • version (str) – Sample QC version (default: CURRENT_SAMPLE_QC_VERSION).

  • test (bool) – Whether to return a path corresponding to a test subset. Default is False.

  • environment (str) – Compute environment, either ‘dataproc’ or ‘rwb’. Default is ‘rwb’.

Return type:

str

Returns:

Temporary path holding the Parquet output tables written by cuKING.
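get_cuking_input_path and get_cuking_output_path take the same three arguments, so one way to keep the input and output locations in step is to build the keyword arguments once and pass them to both. The sketch below assumes that approach; cuking_path_kwargs is a hypothetical helper, not a gnomad_qc function, and its environment check reflects only the two values named in the parameter description.

```python
def cuking_path_kwargs(
    version: str = "5.0",
    test: bool = False,
    environment: str = "rwb",
) -> dict:
    """Validate environment and bundle the shared cuKING path arguments."""
    if environment not in ("dataproc", "rwb"):
        raise ValueError(
            f"environment must be 'dataproc' or 'rwb', got {environment!r}"
        )
    return {"version": version, "test": test, "environment": environment}


kwargs = cuking_path_kwargs(test=True, environment="dataproc")
# Intended use (requires gnomad_qc, so it is shown as comments only):
#   from gnomad_qc.v5.resources.sample_qc import (
#       get_cuking_input_path,
#       get_cuking_output_path,
#   )
#   input_path = get_cuking_input_path(**kwargs)
#   output_path = get_cuking_output_path(**kwargs)
print(kwargs)
```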

gnomad_qc.v5.resources.sample_qc.relatedness(test=False, raw=False)

Get the VersionedTableResource for relatedness results.

Parameters:
  • test (bool) – Whether to use a tmp path for a test resource.

  • raw (bool) – Whether to return the raw cuKING output in Hail Table format. If False, returns the processed relatedness table. Default is False.

Return type:

VersionedTableResource

Returns:

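VersionedTableResource for the relatedness results: the raw cuKING output if raw is True, otherwise the processed relatedness table.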