gnomad.utils.reference_genome

gnomad.utils.reference_genome.get_reference_ht(ref)

Create a reference Table with locus and alleles (containing only the reference allele by default) from the given reference genome.

gnomad.utils.reference_genome.add_reference_sequence(ref)

Add the fasta sequence to a Hail reference genome.

gnomad.utils.reference_genome.get_reference_genome(locus)

Return the reference genome associated with the input Locus expression.

gnomad.utils.reference_genome.get_reference_ht(ref, contigs=None, excluded_intervals=None, add_all_substitutions=False, filter_n=True)

Create a reference Table with locus and alleles (containing only the reference allele by default) from the given reference genome.

Note

If the contigs argument is not provided, all contigs (including obscure ones) will be added to the table. This can be slow as contigs are added one by one.

Parameters:
  • ref (ReferenceGenome) – Input reference genome

  • contigs (Optional[List[str]]) – An optional list of contigs that the Table should include

  • excluded_intervals (Optional[List[Interval]]) – An optional list of intervals to exclude

  • add_all_substitutions (bool) – If set, all possible substitutions are added to the alleles array

  • filter_n (bool) – If set, bases where the reference is unknown (N) are filtered out of the Table

Return type:

Table

Returns:

Reference Table with locus and alleles
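
As a rough illustration of the parameters above, the sketch below restricts the Table to the primary contigs (so the slow all-contigs path from the note is avoided) and spells out what the alleles array is expected to contain for a single reference base. The helpers `primary_contigs`, `expected_alleles`, and `build_reference_ht` are hypothetical names introduced here, not part of gnomad. The order of the substitutions, the handling of non-ACGT bases, and the explicit `has_sequence()` check are assumptions. `expected_alleles` is a plain-Python reading of the parameter descriptions, not the library's implementation.

```python
from typing import List, Optional

BASES = ("A", "C", "G", "T")


def primary_contigs(build: str) -> List[str]:
    """Autosomes plus X and Y, named as in Hail's built-in GRCh37/GRCh38."""
    if build not in ("GRCh37", "GRCh38"):
        raise ValueError(f"unsupported build: {build!r}")
    prefix = "chr" if build == "GRCh38" else ""
    return [f"{prefix}{name}" for name in [*range(1, 23), "X", "Y"]]


def expected_alleles(
    ref_base: str,
    add_all_substitutions: bool = False,
    filter_n: bool = True,
) -> Optional[List[str]]:
    """Sketch of one row's alleles; None means the row would be filtered."""
    if filter_n and ref_base.upper() == "N":
        return None
    alleles = [ref_base]
    # Assumption: substitutions are only generated for A/C/G/T reference bases.
    if add_all_substitutions and ref_base.upper() in BASES:
        alleles += [b for b in BASES if b != ref_base.upper()]
    return alleles


def build_reference_ht(build: str = "GRCh38"):
    """Call get_reference_ht on primary contigs only (needs Hail and gnomad)."""
    # Deferred imports so the pure-Python helpers above work without Hail.
    import hail as hl
    from gnomad.utils.reference_genome import (
        add_reference_sequence,
        get_reference_ht,
    )

    ref = hl.get_reference(build)
    # Precaution (assumption): make sure the sequence is loaded before use.
    if not ref.has_sequence():
        ref = add_reference_sequence(ref)
    return get_reference_ht(
        ref,
        contigs=primary_contigs(build),
        add_all_substitutions=True,
    )
```

For example, `expected_alleles("C", add_all_substitutions=True)` gives `["C", "A", "G", "T"]` in this sketch, and `expected_alleles("N")` gives `None`.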

gnomad.utils.reference_genome.add_reference_sequence(ref)

Add the fasta sequence to a Hail reference genome.

Only GRCh37 and GRCh38 references are supported.

Parameters:

ref (ReferenceGenome) – Input reference genome.

Return type:

ReferenceGenome

Returns:

Reference genome with the fasta sequence added
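
A minimal usage sketch, assuming Hail and gnomad are installed and the environment can reach wherever the fasta files are stored. The helpers `check_supported` and `reference_with_sequence` are hypothetical names introduced for illustration. The up-front build check mirrors the GRCh37/GRCh38 restriction stated above instead of relying on whatever error the library itself raises.

```python
SUPPORTED_BUILDS = ("GRCh37", "GRCh38")


def check_supported(build: str) -> str:
    """Return the build name unchanged, or raise if it is not supported."""
    if build not in SUPPORTED_BUILDS:
        raise ValueError(
            f"{build!r} is not supported; expected one of {SUPPORTED_BUILDS}"
        )
    return build


def reference_with_sequence(build: str):
    """Fetch a built-in Hail reference and attach its fasta sequence."""
    check_supported(build)
    # Deferred imports: these need Hail and gnomad installed.
    import hail as hl
    from gnomad.utils.reference_genome import add_reference_sequence

    ref = hl.get_reference(build)
    return add_reference_sequence(ref)
```

Calling `reference_with_sequence("GRCh38")` would return the `ReferenceGenome` described above, while a name such as `"GRCm38"` is rejected before Hail is touched.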

gnomad.utils.reference_genome.get_reference_genome(locus, add_sequence=False)

Return the reference genome associated with the input Locus expression.

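Parameters:
  • locus (LocusExpression) – Input locus expression

  • add_sequence (bool) – If set, the fasta sequence is added to the reference genome
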
Return type:

ReferenceGenome

Returns:

Reference genome
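
One way this might be used is to branch on the build of an existing Table. The sketch below assumes a Hail Table `ht` with a `locus` field. `other_build` and `table_builds` are hypothetical helpers, and pairing GRCh37 with GRCh38 as a liftover target is only an illustrative choice.

```python
def other_build(build: str) -> str:
    """Return the other human build, e.g. as a liftover target."""
    if build == "GRCh37":
        return "GRCh38"
    if build == "GRCh38":
        return "GRCh37"
    raise ValueError(f"unsupported build: {build!r}")


def table_builds(ht):
    """Return (build of ht.locus, the other build); needs Hail and gnomad."""
    # Deferred import so other_build can be used without gnomad installed.
    from gnomad.utils.reference_genome import get_reference_genome

    ref = get_reference_genome(ht.locus)  # assumes ht has a `locus` field
    return ref.name, other_build(ref.name)
```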