gnomad.resources.resource_utils

gnomad.resources.resource_utils.GNOMAD_PUBLIC_BUCKETS

Public buckets used to stage gnomAD data.

gnomad.resources.resource_utils.BaseResource([...])

Generic abstract resource class.

gnomad.resources.resource_utils.TableResource([...])

A Hail Table resource.

gnomad.resources.resource_utils.MatrixTableResource([...])

A Hail MatrixTable resource.

gnomad.resources.resource_utils.VariantDatasetResource([...])

A Hail VariantDataset resource.

gnomad.resources.resource_utils.PedigreeResource([...])

A pedigree resource.

gnomad.resources.resource_utils.BlockMatrixResource([...])

A Hail BlockMatrix resource.

gnomad.resources.resource_utils.ExpressionResource([...])

A Hail Expression resource.

gnomad.resources.resource_utils.BaseVersionedResource(...)

Class for a versioned resource.

gnomad.resources.resource_utils.VersionedTableResource(...)

Versioned Table resource.

gnomad.resources.resource_utils.VersionedMatrixTableResource(...)

Versioned MatrixTable resource.

gnomad.resources.resource_utils.VersionedVariantDatasetResource(...)

Versioned VariantDataset resource.

gnomad.resources.resource_utils.VersionedPedigreeResource(...)

Versioned Pedigree resource.

gnomad.resources.resource_utils.VersionedBlockMatrixResource(...)

Versioned BlockMatrix resource.

gnomad.resources.resource_utils.ResourceNotAvailable

Exception raised if a resource is not available from the selected source.

gnomad.resources.resource_utils.GnomadPublicResource([...])

Base class for the gnomAD project's public resources.

gnomad.resources.resource_utils.GnomadPublicTableResource([...])

Resource class for a public Hail Table published by the gnomAD project.

gnomad.resources.resource_utils.GnomadPublicMatrixTableResource([...])

Resource class for a public Hail MatrixTable published by the gnomAD project.

gnomad.resources.resource_utils.GnomadPublicPedigreeResource([...])

Resource class for a public pedigree published by the gnomAD project.

gnomad.resources.resource_utils.GnomadPublicBlockMatrixResource([...])

Resource class for a public Hail BlockMatrix published by the gnomAD project.

gnomad.resources.resource_utils.DataException

gnomad.resources.resource_utils.import_sites_vcf(...)

Import site-level data from a VCF into a Hail Table.

gnomad.resources.resource_utils.import_gencode(...)

Import GENCODE annotations GTF file as a Hail Table.

gnomad.resources.resource_utils.GNOMAD_PUBLIC_BUCKETS = ('gnomad-public', 'gnomad-public-requester-pays')

Public buckets used to stage gnomAD data.

gnomad-public is a legacy bucket and contains one readme text file.

The gnomAD Production Team writes output data to gnomad-public-requester-pays, and all data in this bucket syncs to the public bucket gcp-public-data--gnomad.
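
As a small illustration of how this tuple might be used, the sketch below checks whether a gs:// path lives in one of the listed buckets. The helper name is_gnomad_public_path is hypothetical and not part of the library, and the tuple is copied from the value documented above so that the snippet runs without gnomad installed.

```python
from urllib.parse import urlparse

# Copied from the documented value of GNOMAD_PUBLIC_BUCKETS.
GNOMAD_PUBLIC_BUCKETS = ("gnomad-public", "gnomad-public-requester-pays")


def is_gnomad_public_path(path: str) -> bool:
    """Return True if `path` is a gs:// URL in one of the public buckets."""
    parsed = urlparse(path)
    # For gs://bucket/key, the bucket name is parsed as the network location.
    return parsed.scheme == "gs" and parsed.netloc in GNOMAD_PUBLIC_BUCKETS
```

Note that a path in the synced bucket gcp-public-data--gnomad would not match, because that bucket is not listed in the tuple.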

class gnomad.resources.resource_utils.BaseResource(path=None, import_args=None, import_func=None)[source]

Generic abstract resource class.

Parameters:
  • path (Optional[str]) – The resource path

  • import_args (Optional[Dict[str, Any]]) – Any sources that are required for the import and need to be kept track of (e.g. .vcf path for an imported VCF)

  • import_func (Optional[Callable]) – A function used to import the resource. import_func will be passed the import_args dictionary as kwargs.

expected_file_extensions: List[str] = []

Expected file extensions for this resource type. If path doesn’t end with one of these, a warning is logged.

property path

abstract import_resource(overwrite=True, **kwargs)[source]

Abstract method to import the resource using its import_func and write it to its path.

Parameters:
  • overwrite (bool) – If True, overwrite an existing file at the destination.

  • kwargs – Any other parameters to be passed to the underlying hail write function (acceptable parameters depend on specific resource types)

Return type:

None
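
The constructor contract described above (an extension check that only warns, and import_args being unpacked into import_func as keyword arguments) can be modelled with a small stand-in class. This is a simplified sketch, not the gnomad implementation: the class SketchResource, its run_import method and the logger name are all hypothetical.

```python
import logging
from typing import Any, Callable, Dict, List, Optional

logger = logging.getLogger("resource_sketch")  # hypothetical logger name


class SketchResource:
    """Simplified stand-in for a resource; not the gnomad implementation."""

    expected_file_extensions: List[str] = [".ht"]

    def __init__(
        self,
        path: Optional[str] = None,
        import_args: Optional[Dict[str, Any]] = None,
        import_func: Optional[Callable] = None,
    ):
        self.path = path
        self.import_args = import_args
        self.import_func = import_func
        # Warn (rather than fail) when the path has an unexpected extension.
        if path is not None and not any(
            path.endswith(ext) for ext in self.expected_file_extensions
        ):
            logger.warning("Unexpected extension for path %s", path)

    def run_import(self) -> Any:
        # import_args is unpacked into keyword arguments for import_func.
        return self.import_func(**(self.import_args or {}))
```

In this model, a resource built with import_args={"path": "in.vcf.bgz"} would call import_func(path="in.vcf.bgz") when run_import() is invoked.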

class gnomad.resources.resource_utils.TableResource(path=None, import_args=None, import_func=None)[source]

A Hail Table resource.

Parameters:
  • path (Optional[str]) – The Table path (typically ending in .ht)

  • import_args (Optional[Dict[str, Any]]) – Any sources that are required for the import and need to be kept track of and/or passed to the import_func (e.g. .vcf path for an imported VCF)

  • import_func (Optional[Callable]) – A function used to import the Table. import_func will be passed the import_args dictionary as kwargs.

expected_file_extensions: List[str] = ['.ht']

Expected file extensions for this resource type. If path doesn’t end with one of these, a warning is logged.

ht(force_import=False, read_args=None)[source]

Read and return the Hail Table resource.

Parameters:
  • force_import (bool) – If True, force the import of the resource even if it already exists.

  • read_args (Optional[Dict[str, Any]]) – Any additional arguments to pass to hl.read_table.

Return type:

Table

Returns:

Hail Table resource

import_resource(overwrite=True, **kwargs)[source]

Import the TableResource using its import_func and write it to its path.

Parameters:
  • overwrite (bool) – If True, overwrite an existing file at the destination.

  • kwargs – Any other parameters to be passed to hl.Table.write

Return type:

None

Returns:

Nothing
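
A possible way to wire a TableResource to import_sites_vcf is sketched below. The bucket and file names are hypothetical, and the sketch assumes that import_sites_vcf forwards its keyword arguments to hl.import_vcf (path, force_bgz and reference_genome are parameters of that Hail function). The gnomad import is done lazily inside the function so that the snippet loads even where gnomad and Hail are not installed.

```python
# Hypothetical locations, for illustration only.
SITES_VCF = "gs://my-bucket/sites.vcf.bgz"
SITES_HT = "gs://my-bucket/sites.ht"

# Keyword arguments that would be handed to import_func on import.
IMPORT_ARGS = {
    "path": SITES_VCF,
    "force_bgz": True,
    "reference_genome": "GRCh38",
}


def make_sites_resource():
    """Build a TableResource backed by a sites VCF (requires gnomad)."""
    # Imported here so the module itself does not depend on gnomad/Hail.
    from gnomad.resources.resource_utils import TableResource, import_sites_vcf

    return TableResource(
        path=SITES_HT,
        import_func=import_sites_vcf,
        import_args=IMPORT_ARGS,
    )
```

With a working Hail session, calling import_resource(overwrite=True) on the returned object would run the import and write the Table to SITES_HT, and ht() would then read it back.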

class gnomad.resources.resource_utils.MatrixTableResource(path=None, import_args=None, import_func=None)[source]

A Hail MatrixTable resource.

Parameters:
  • path (Optional[str]) – The MatrixTable path (typically ending in .mt)

  • import_args (Optional[Dict[str, Any]]) – Any sources that are required for the import and need to be kept track of and/or passed to the import_func (e.g. .vcf path for an imported VCF)

  • import_func (Optional[Callable]) – A function used to import the MatrixTable. import_func will be passed the import_args dictionary as kwargs.

expected_file_extensions: List[str] = ['.mt']

Expected file extensions for this resource type. If path doesn’t end with one of these, a warning is logged.

mt(force_import=False, read_args=None)[source]

Read and return the Hail MatrixTable resource.

Parameters:
  • force_import (bool) – If True, force the import of the resource even if it already exists.

  • read_args (Optional[Dict[str, Any]]) – Any additional arguments to pass to hl.read_matrix_table.

Return type:

MatrixTable

Returns:

Hail MatrixTable resource

import_resource(overwrite=True, **kwargs)[source]

Import the MatrixTable resource using its import_func and write it to its path.

Parameters:
  • overwrite (bool) – If set, existing file(s) will be overwritten

  • kwargs – Any other parameters to be passed to hl.MatrixTable.write

Return type:

None

Returns:

Nothing

class gnomad.resources.resource_utils.VariantDatasetResource(path=None, import_args=None, import_func=None)[source]

A Hail VariantDataset resource.

Parameters:
  • path (Optional[str]) – The VariantDataset path (typically ending in .vds)

  • import_args (Optional[Dict[str, Any]]) – Any sources that are required for the import and need to be kept track of and/or passed to the import_func (e.g. .vcf path for an imported VCF)

  • import_func (Optional[Callable]) – A function used to import the VariantDataset. import_func will be passed the import_args dictionary as kwargs.

expected_file_extensions: List[str] = ['.vds']

Expected file extensions for this resource type. If path doesn’t end with one of these, a warning is logged.

vds(force_import=False, read_args=None)[source]

Read and return the Hail VariantDataset resource.

Parameters:
  • force_import (bool) – If True, force the import of the resource even if it already exists.

  • read_args (Optional[Dict[str, Any]]) – Any additional arguments to pass to hl.vds.read_vds.

Return type:

VariantDataset

Returns:

Hail VariantDataset resource

import_resource(overwrite=True, **kwargs)[source]

Import the VariantDataset resource using its import_func and write it to its path.

Parameters:
  • overwrite (bool) – If set, existing file(s) will be overwritten

  • kwargs – Any other parameters to be passed to hl.vds.VariantDataset.write

Return type:

None

Returns:

Nothing

class gnomad.resources.resource_utils.PedigreeResource(path=None, import_args=None, import_func=None, quant_pheno=False, delimiter='\\s+', missing='NA')[source]

A pedigree resource.

Parameters:
  • path (Optional[str]) – The Pedigree path (typically ending in .fam or .ped)

  • import_args (Optional[Dict[str, Any]]) – Any sources that are required for the import and need to be kept track of and/or passed to the import_func (e.g. .vcf path for an imported VCF)

  • import_func (Optional[Callable[..., Pedigree]]) – A function used to import the Pedigree. import_func will be passed the import_args dictionary as kwargs.

  • quant_pheno (bool) – If True, phenotype is interpreted as quantitative.

  • delimiter (str) – Field delimiter regex.

  • missing (str) – The string used to denote missing values. For case-control, 0, -9, and non-numeric are also treated as missing.

expected_file_extensions: List[str] = ['.fam', '.ped']

Expected file extensions for this resource type. If path doesn’t end with one of these, a warning is logged.

ht()[source]

Read the pedigree into a family HT using hl.import_fam().

Return type:

Table

Returns:

Family table

pedigree()[source]

Read the pedigree into an hl.Pedigree using hl.Pedigree.read().

Parameters:

delimiter – Delimiter used in the ped file. Note that pedigree() takes no arguments, so this is the delimiter given to the class constructor.

Return type:

Pedigree

Returns:

pedigree

import_resource(overwrite=True, **kwargs)[source]

Import the Pedigree resource using its import_func and write it to its path.

Parameters:
  • overwrite (bool) – If set, existing file(s) will be overwritten. IMPORTANT: Currently there is no implementation of this method when overwrite is set to False.

  • kwargs – Any other parameters to be passed to hl.Pedigree.write

Return type:

None

Returns:

Nothing
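
To make the delimiter and missing parameters more concrete, the sketch below splits one line of a .fam file on the default whitespace regex and interprets a case-control phenotype token under the rules stated above. It uses only the standard library and is not how Hail parses these files internally. The function names are hypothetical, the 1 = control / 2 = case coding follows the PLINK convention, and rejecting other numeric codes is an assumption of this sketch.

```python
import re
from typing import List, Optional


def split_fam_line(line: str, delimiter: str = r"\s+") -> List[str]:
    """Split one line of a .fam/.ped file on the delimiter regex."""
    return re.split(delimiter, line.strip())


def parse_case_control(token: str, missing: str = "NA") -> Optional[bool]:
    """Map a phenotype token to True (case), False (control) or None (missing)."""
    if token == missing:
        return None
    try:
        value = int(token)
    except ValueError:
        return None  # non-numeric tokens are treated as missing
    if value in (0, -9):
        return None  # 0 and -9 also denote missing
    if value == 1:
        return False  # PLINK convention: 1 = control
    if value == 2:
        return True  # PLINK convention: 2 = case
    # Assumption of this sketch: reject any other numeric code.
    raise ValueError(f"Unexpected case-control phenotype: {token!r}")
```

A line such as "FAM1 kid dad mom 1 2" splits into six fields (family ID, sample ID, father, mother, sex, phenotype), and the final token "2" is read as a case.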

class gnomad.resources.resource_utils.BlockMatrixResource(path=None, import_args=None, import_func=None)[source]

A Hail BlockMatrix resource.

Parameters:
  • path (Optional[str]) – The BlockMatrix path (typically ending in .bm)

  • import_args (Optional[Dict[str, Any]]) – Any sources that are required for the import and need to be kept track of and/or passed to the import_func.

  • import_func (Optional[Callable]) – A function used to import the BlockMatrix. import_func will be passed the import_args dictionary as kwargs.

expected_file_extensions: List[str] = ['.bm']

Expected file extensions for this resource type. If path doesn’t end with one of these, a warning is logged.

bm(read_args=None)[source]

Read and return the Hail BlockMatrix resource.

Parameters:

read_args (Optional[Dict[str, Any]]) – Any additional arguments to pass to BlockMatrix.read.

Return type:

BlockMatrix

Returns:

Hail BlockMatrix resource

import_resource(overwrite=True, **kwargs)[source]

Import the BlockMatrixResource using its import_func and write it to its path.

Parameters:
  • overwrite (bool) – If True, overwrite an existing file at the destination.

  • kwargs – Any additional parameters to be passed to BlockMatrix.write

Return type:

None

Returns:

Nothing

class gnomad.resources.resource_utils.ExpressionResource(path=None, import_args=None, import_func=None)[source]

A Hail Expression resource.

Parameters:
  • path (Optional[str]) – The Expression path (typically ending in .he).

  • import_args (Optional[Dict[str, Any]]) – Any sources that are required for the import and need to be kept track of and/or passed to the import_func (e.g. .vcf path for an imported VCF).

  • import_func (Optional[Callable]) – A function used to import the Expression. import_func will be passed the import_args dictionary as kwargs.

expected_file_extensions: List[str] = ['.he']

Expected file extensions for this resource type. If path doesn’t end with one of these, a warning is logged.

he(force_import=False, read_args=None)[source]

Read and return the Hail Expression resource.

Parameters:
  • force_import (bool) – If True, force the import of the resource even if it already exists.

  • read_args (Optional[Dict[str, Any]]) – Any additional arguments to pass to hl.experimental.read_expression.

Return type:

Expression

Returns:

Hail Expression resource.

import_resource(overwrite=True, **kwargs)[source]

Import the Expression resource using its import_func and write it to its path.

Parameters:
  • overwrite (bool) – If set, existing file(s) will be overwritten.

  • kwargs – Any other parameters to be passed to hl.experimental.write_expression.

Return type:

None

Returns:

Nothing.

class gnomad.resources.resource_utils.BaseVersionedResource(default_version, versions)[source]

Class for a versioned resource.

The attributes and methods of the versioned resource are those of the default version of the resource. In addition, all versions of the resource are stored in the versions attribute.

Parameters:
  • default_version (str) – The default version of this resource (must be in the versions dict)

  • versions (Dict[str, BaseResource]) – A dict of version name -> resource.

resource_class

alias of BaseResource

default_version

versions

class gnomad.resources.resource_utils.VersionedTableResource(default_version, versions)[source]

Versioned Table resource.

The attributes (path, import_args and import_func) of the versioned resource are those of the default version of the resource. In addition, all versions of the resource are stored in the versions attribute.

Parameters:
  • default_version (str) – The default version of this Table resource (must be in the versions dict)

  • versions (Dict[str, TableResource]) – A dict of version name -> TableResource.

resource_class

alias of TableResource
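
The delegation described for versioned resources (attributes come from the default version, while every version stays reachable through versions) can be modelled as follows. This is a simplified sketch rather than the gnomad implementation: SketchVersioned is a hypothetical class, and SimpleNamespace objects stand in for real TableResource instances.

```python
from types import SimpleNamespace
from typing import Any, Dict


class SketchVersioned:
    """Simplified model of a versioned resource; not the gnomad implementation."""

    def __init__(self, default_version: str, versions: Dict[str, Any]):
        if default_version not in versions:
            raise KeyError(f"Default version {default_version!r} not in versions")
        self.default_version = default_version
        self.versions = versions

    def __getattr__(self, name: str) -> Any:
        # __getattr__ runs only when normal lookup fails. The guard avoids
        # infinite recursion if these two attributes are not set yet.
        if name in ("default_version", "versions"):
            raise AttributeError(name)
        # Everything else is delegated to the default version.
        return getattr(self.versions[self.default_version], name)


# Stand-ins for two versions of the same Table resource.
example = SketchVersioned(
    "2.1",
    {
        "2.0": SimpleNamespace(path="gs://my-bucket/annotations_2.0.ht"),
        "2.1": SimpleNamespace(path="gs://my-bucket/annotations_2.1.ht"),
    },
)
```

Here example.path resolves to the 2.1 path, while example.versions["2.0"].path still gives access to the older version.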

class gnomad.resources.resource_utils.VersionedMatrixTableResource(default_version, versions)[source]

Versioned MatrixTable resource.

The attributes (path, import_args and import_func) of the versioned resource are those of the default version of the resource. In addition, all versions of the resource are stored in the versions attribute.

Parameters:
  • default_version (str) – The default version of this MatrixTable resource (must be in the versions dict)

  • versions (Dict[str, MatrixTableResource]) – A dict of version name -> MatrixTableResource.

resource_class

alias of MatrixTableResource

class gnomad.resources.resource_utils.VersionedVariantDatasetResource(default_version, versions)[source]

Versioned VariantDataset resource.

The attributes (path, import_args and import_func) of the versioned resource are those of the default version of the resource. In addition, all versions of the resource are stored in the versions attribute.

Parameters:
  • default_version (str) – The default version of this VariantDataset resource (must be in the versions dict)

  • versions (Dict[str, VariantDatasetResource]) – A dict of version name -> VariantDatasetResource.

resource_class

alias of VariantDatasetResource

class gnomad.resources.resource_utils.VersionedPedigreeResource(default_version, versions)[source]

Versioned Pedigree resource.

The attributes (path, import_args and import_func) of the versioned resource are those of the default version of the resource. In addition, all versions of the resource are stored in the versions attribute.

Parameters:
  • default_version (str) – The default version of this Pedigree resource (must be in the versions dict)

  • versions (Dict[str, PedigreeResource]) – A dict of version name -> PedigreeResource.

resource_class

alias of PedigreeResource

class gnomad.resources.resource_utils.VersionedBlockMatrixResource(default_version, versions)[source]

Versioned BlockMatrix resource.

The attributes (path, import_args and import_func) of the versioned resource are those of the default version of the resource. In addition, all versions of the resource are stored in the versions attribute.

Parameters:
  • default_version (str) – The default version of this BlockMatrix resource (must be in the versions dict)

  • versions (Dict[str, BlockMatrixResource]) – A dict of version name -> BlockMatrixResource.

resource_class

alias of BlockMatrixResource

exception gnomad.resources.resource_utils.ResourceNotAvailable[source]

Exception raised if a resource is not available from the selected source.

class gnomad.resources.resource_utils.GnomadPublicResource(path=None, import_args=None, import_func=None)[source]

Base class for the gnomAD project’s public resources.

Parameters:
  • path (Optional[str]) –

  • import_args (Optional[Dict[str, Any]]) –

  • import_func (Optional[Callable]) –

is_resource_available()[source]

Check if this resource is available from the selected source.

Return type:

bool

Returns:

True if the resource is available.
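
One way to use is_resource_available() is to pick the first usable resource from a list of candidates before attempting a read. The helper below is a hypothetical sketch that relies only on the method documented above, so it works with any object exposing is_resource_available().

```python
from typing import Iterable, Optional, TypeVar

R = TypeVar("R")


def first_available(resources: Iterable[R]) -> Optional[R]:
    """Return the first resource that reports itself available, else None."""
    for resource in resources:
        if resource.is_resource_available():
            return resource
    return None
```

Checking availability up front is an alternative to catching ResourceNotAvailable around the read call; which is preferable depends on whether a missing resource is an expected situation in the calling code.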

class gnomad.resources.resource_utils.GnomadPublicTableResource(path=None, import_args=None, import_func=None)[source]

Resource class for a public Hail Table published by the gnomAD project.

Parameters:
  • path (Optional[str]) –

  • import_args (Optional[Dict[str, Any]]) –

  • import_func (Optional[Callable]) –

ht(force_import=False, read_args=None)

Read and return the Hail Table resource.

Parameters:
  • force_import (bool) – If True, force the import of the resource even if it already exists.

  • read_args (Optional[Dict[str, Any]]) – Any additional arguments to pass to hl.read_table.

Return type:

Table

Returns:

Hail Table resource

class gnomad.resources.resource_utils.GnomadPublicMatrixTableResource(path=None, import_args=None, import_func=None)[source]

Resource class for a public Hail MatrixTable published by the gnomAD project.

Parameters:
  • path (Optional[str]) –

  • import_args (Optional[Dict[str, Any]]) –

  • import_func (Optional[Callable]) –

mt(force_import=False, read_args=None)

Read and return the Hail MatrixTable resource.

Parameters:
  • force_import (bool) – If True, force the import of the resource even if it already exists.

  • read_args (Optional[Dict[str, Any]]) – Any additional arguments to pass to hl.read_matrix_table.

Return type:

MatrixTable

Returns:

Hail MatrixTable resource

class gnomad.resources.resource_utils.GnomadPublicPedigreeResource(path=None, import_args=None, import_func=None, quant_pheno=False, delimiter='\\s+', missing='NA')[source]

Resource class for a public pedigree published by the gnomAD project.

Parameters:
  • path (Optional[str]) –

  • import_args (Optional[Dict[str, Any]]) –

  • import_func (Optional[Callable[..., Pedigree]]) –

  • quant_pheno (bool) –

  • delimiter (str) –

  • missing (str) –

ht()

Read the pedigree into a family HT using hl.import_fam().

Return type:

Table

Returns:

Family table

pedigree()

Read the pedigree into an hl.Pedigree using hl.Pedigree.read().

Parameters:

delimiter – Delimiter used in the ped file. Note that pedigree() takes no arguments, so this is the delimiter given to the class constructor.

Return type:

Pedigree

Returns:

pedigree

class gnomad.resources.resource_utils.GnomadPublicBlockMatrixResource(path=None, import_args=None, import_func=None)[source]

Resource class for a public Hail BlockMatrix published by the gnomAD project.

Parameters:
  • path (Optional[str]) –

  • import_args (Optional[Dict[str, Any]]) –

  • import_func (Optional[Callable]) –

bm(read_args=None)

Read and return the Hail BlockMatrix resource.

Parameters:

read_args (Optional[Dict[str, Any]]) – Any additional arguments to pass to BlockMatrix.read.

Return type:

BlockMatrix

Returns:

Hail BlockMatrix resource

exception gnomad.resources.resource_utils.DataException[source]

gnomad.resources.resource_utils.import_sites_vcf(**kwargs)[source]

Import site-level data from a VCF into a Hail Table.

Return type:

Table

gnomad.resources.resource_utils.import_gencode(gtf_path, **kwargs)[source]

Import GENCODE annotations GTF file as a Hail Table.

Parameters:

gtf_path (str) – Path to GENCODE GTF file.

Return type:

Table

Returns:

Table with GENCODE annotation information.