gnomad_toolbox.filtering.vep

Functions to filter gnomAD sites HT by VEP annotations.

Module Functions

gnomad_toolbox.filtering.vep.filter_by_consequence_category([...])

Filter gnomAD variants based on VEP consequence.

gnomad_toolbox.filtering.vep.get_gene_intervals(...)

Get the GENCODE genomic intervals for a given gene symbol.

gnomad_toolbox.filtering.vep.filter_to_high_confidence_loftee([...])

Filter gnomAD variants to high-confidence LOFTEE variants for a gene.

gnomad_toolbox.filtering.vep.filter_by_consequence_category(plof=False, missense=False, synonymous=False, other=False, pass_filters=True, **kwargs)

Filter gnomAD variants based on VEP consequence.

For background, see the gnomAD help page on the consequence category filter: https://gnomad.broadinstitute.org/help/consequence-category-filter

The [VEP](https://useast.ensembl.org/info/docs/tools/vep/index.html) consequences included in each category are:

pLoF:

  • transcript_ablation

  • splice_acceptor_variant

  • splice_donor_variant

  • stop_gained

  • frameshift_variant

Missense / Inframe indel:

  • stop_lost

  • start_lost

  • inframe_insertion

  • inframe_deletion

  • missense_variant

Synonymous:

  • synonymous_variant

Other:

  • All other consequences not included in the above categories.

Parameters:
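  • plof (bool) – Whether to include variants in the pLoF category. Default is False.

  • missense (bool) – Whether to include variants in the Missense / Inframe indel category. Default is False.

  • synonymous (bool) – Whether to include variants in the Synonymous category. Default is False.

  • other (bool) – Whether to include variants in the Other category. Default is False.

  • pass_filters (bool) – Whether to keep only variants that pass the filters. Default is True.

  • kwargs – Additional arguments to pass to _get_dataset.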

Return type:

Table

Returns:

Table of variants with the specified consequences.
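
The category assignment listed above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the names PLOF_TERMS, MISSENSE_TERMS, SYNONYMOUS_TERMS, consequence_category, and is_selected are hypothetical, and the sketch classifies a single consequence term, whereas the documented function returns a Hail Table. Which consequence of a variant the real function tests is not stated here.

```python
# Consequence terms per category, copied from the lists above.
PLOF_TERMS = {
    "transcript_ablation",
    "splice_acceptor_variant",
    "splice_donor_variant",
    "stop_gained",
    "frameshift_variant",
}
MISSENSE_TERMS = {
    "stop_lost",
    "start_lost",
    "inframe_insertion",
    "inframe_deletion",
    "missense_variant",
}
SYNONYMOUS_TERMS = {"synonymous_variant"}


def consequence_category(term: str) -> str:
    """Return the category name for one VEP consequence term (hypothetical helper)."""
    if term in PLOF_TERMS:
        return "plof"
    if term in MISSENSE_TERMS:
        return "missense"
    if term in SYNONYMOUS_TERMS:
        return "synonymous"
    # Anything not listed in the three categories above falls into "other".
    return "other"


def is_selected(
    term: str,
    plof: bool = False,
    missense: bool = False,
    synonymous: bool = False,
    other: bool = False,
) -> bool:
    """Return True if the term's category was requested (hypothetical helper)."""
    requested = {
        "plof": plof,
        "missense": missense,
        "synonymous": synonymous,
        "other": other,
    }
    return requested[consequence_category(term)]
```

With this sketch, is_selected("stop_gained", plof=True) is True, while is_selected("intron_variant", plof=True) is False because intron_variant falls into the Other category.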

gnomad_toolbox.filtering.vep.get_gene_intervals(gene_symbol, gencode_version=None)

Get the GENCODE genomic intervals for a given gene symbol.

Parameters:
  • gene_symbol (str) – Gene symbol.

  • gencode_version (Optional[str]) – Optional GENCODE version. If not provided, uses the GENCODE version associated with the gnomAD session.

Return type:

List[Interval]

Returns:

List of GENCODE intervals for the specified gene.
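
One possible use of the returned intervals is to restrict a Hail Table to a gene's region. The sketch below is illustrative only: restrict_to_gene is a hypothetical helper, and it assumes that Hail and a gnomAD session are already initialized, that ht is keyed by locus, and that the returned list can be passed directly to Hail's hl.filter_intervals.

```python
def restrict_to_gene(ht, gene_symbol, gencode_version=None):
    """Keep only rows of `ht` that fall inside the gene's GENCODE intervals.

    Hypothetical helper; `ht` is assumed to be a Hail Table keyed by locus.
    """
    # Imported lazily so that defining this helper does not require Hail.
    import hail as hl
    from gnomad_toolbox.filtering.vep import get_gene_intervals

    # Look up the intervals for the gene, using the session's GENCODE
    # version when gencode_version is None.
    intervals = get_gene_intervals(gene_symbol, gencode_version=gencode_version)

    # Drop every row whose locus lies outside those intervals.
    return hl.filter_intervals(ht, intervals)
```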

gnomad_toolbox.filtering.vep.filter_to_high_confidence_loftee(gene_symbol=None, no_lof_flags=False, mane_select_only=False, canonical_only=False, version=None, **kwargs)

Filter gnomAD variants to high-confidence LOFTEE variants for a gene.

Parameters:
  • gene_symbol (Optional[str]) – Optional gene symbol to filter by.

  • no_lof_flags (bool) – Whether to exclude variants with LOFTEE flags. Default is False.

  • mane_select_only (bool) – Whether to include only MANE Select transcripts. Default is False.

  • canonical_only (bool) – Whether to include only canonical transcripts. Default is False.

  • version (Optional[str]) – Optional version of the dataset to use.

  • kwargs – Additional arguments to pass to _get_dataset.

Return type:

Table

Returns:

Table with high-confidence LOFTEE variants.
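
How the parameters above combine can be sketched as a predicate over a single transcript consequence. This is an assumption-laden illustration, not the library's implementation: keep_transcript_consequence is a hypothetical helper, the record is a plain dict, and the field names (gene_symbol, lof, lof_flags, mane_select, canonical) and the "HC" label are assumed to follow the VEP/LOFTEE transcript consequence annotations.

```python
from typing import Any, Mapping, Optional


def keep_transcript_consequence(
    csq: Mapping[str, Any],
    gene_symbol: Optional[str] = None,
    no_lof_flags: bool = False,
    mane_select_only: bool = False,
    canonical_only: bool = False,
) -> bool:
    """Return True if one transcript consequence passes the requested filters."""
    # Only high-confidence LOFTEE calls are kept (assumed label: "HC").
    if csq.get("lof") != "HC":
        return False
    # Optionally restrict to a single gene.
    if gene_symbol is not None and csq.get("gene_symbol") != gene_symbol:
        return False
    # Optionally drop consequences that carry any LOFTEE flag.
    if no_lof_flags and csq.get("lof_flags"):
        return False
    # Optionally require a MANE Select transcript.
    if mane_select_only and not csq.get("mane_select"):
        return False
    # Optionally require a canonical transcript (assumed encoding: canonical == 1).
    if canonical_only and csq.get("canonical") != 1:
        return False
    return True
```

In this sketch every optional filter only narrows the result, so a consequence kept with all options enabled is also kept with the defaults.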