gnomad_toolbox.filtering.variant
Functions to filter the gnomAD sites HT to a specific set of variants.
Module Functions
get_single_variant | Get a single variant from the gnomAD HT.
filter_by_intervals | Filter variants by interval(s).
filter_by_gene_symbol | Filter variants by gene symbol.
get_age_distribution | Get the age distribution of a variant.
- gnomad_toolbox.filtering.variant.get_single_variant(variant=None, contig=None, position=None, ref=None, alt=None, dataset='variant', **kwargs)
Get a single variant from the gnomAD HT.
Note
Either variant, or all of contig, position, ref, and alt, must be provided. If variant is provided, contig, position, ref, and alt are ignored.
- Parameters:
variant (Optional[str]) – Variant string in the format “chr12-235245-A-C” or “chr12:235245:A:C”. If provided, contig, position, ref, and alt are ignored.
contig (Optional[str]) – Chromosome of the variant. Required if variant is not provided.
position (Optional[int]) – Variant position. Required if variant is not provided.
ref (Optional[str]) – Reference allele. Required if variant is not provided.
alt (Optional[str]) – Alternate allele. Required if variant is not provided.
kwargs – Additional arguments to pass to _get_dataset.
dataset (Optional[str]) – Default is 'variant'.
- Return type:
Table
- Returns:
Table with the single variant.
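The argument rule in the note above can be illustrated with a small pure-Python sketch. The helper name `resolve_variant` is hypothetical and is not part of gnomad_toolbox; it only mirrors the documented behavior (a variant string in either “-” or “:” form takes precedence, otherwise all four fields are required) and does not reflect how the library implements it.

```python
import re
from typing import Optional, Tuple


def resolve_variant(
    variant: Optional[str] = None,
    contig: Optional[str] = None,
    position: Optional[int] = None,
    ref: Optional[str] = None,
    alt: Optional[str] = None,
) -> Tuple[str, int, str, str]:
    """Hypothetical helper: return (contig, position, ref, alt)."""
    if variant is not None:
        # A variant string wins; the separate fields are ignored.
        parts = re.split(r"[-:]", variant)
        if len(parts) != 4:
            raise ValueError(f"Unrecognized variant string: {variant!r}")
        v_contig, v_pos, v_ref, v_alt = parts
        return v_contig, int(v_pos), v_ref, v_alt

    # Without a variant string, all four fields must be supplied.
    if any(value is None for value in (contig, position, ref, alt)):
        raise ValueError(
            "Provide either variant, or all of contig, position, ref, and alt."
        )
    return contig, position, ref, alt


print(resolve_variant("chr12-235245-A-C"))
print(resolve_variant(contig="chr12", position=235245, ref="A", alt="C"))
```

Both calls describe the same variant. With the real function, the equivalent forms would be `get_single_variant(variant="chr12-235245-A-C")` and `get_single_variant(contig="chr12", position=235245, ref="A", alt="C")`.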
- gnomad_toolbox.filtering.variant.filter_by_intervals(intervals, padding_bp=0, **kwargs)
Filter variants by interval(s).
- Parameters:
intervals (Union[str, list[str]]) – Interval string or list of interval strings. The interval string format has to be “contig:start-end”, e.g., “1:1000-2000” (GRCh37) or “chr1:1000-2000” (GRCh38).
padding_bp (int) – Number of base pairs to pad the intervals. Default is 0 bp.
kwargs – Arguments to pass to _get_dataset.
- Return type:
Table
- Returns:
Table with variants in the interval(s).
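To make the “contig:start-end” format and the effect of padding_bp concrete, here is a minimal pure-Python sketch. The helper `pad_intervals` is hypothetical and not part of gnomad_toolbox. It assumes padding is applied symmetrically to both ends, and it clamps the start at 1, which is an assumption of this sketch rather than documented behavior.

```python
import re
from typing import List, Tuple, Union

# "contig:start-end", e.g. "1:1000-2000" (GRCh37) or "chr1:1000-2000" (GRCh38).
_INTERVAL_RE = re.compile(r"^([^:]+):(\d+)-(\d+)$")


def pad_intervals(
    intervals: Union[str, List[str]], padding_bp: int = 0
) -> List[Tuple[str, int, int]]:
    """Hypothetical helper: parse interval strings and pad both ends."""
    if isinstance(intervals, str):
        # Accept a single string as well as a list of strings.
        intervals = [intervals]

    padded = []
    for text in intervals:
        match = _INTERVAL_RE.match(text)
        if match is None:
            raise ValueError(f"Expected 'contig:start-end', got {text!r}")
        contig = match.group(1)
        start, end = int(match.group(2)), int(match.group(3))
        # Assumption: clamp the padded start so it never drops below 1.
        padded.append((contig, max(1, start - padding_bp), end + padding_bp))
    return padded


print(pad_intervals("chr1:1000-2000", padding_bp=50))
```

With the real function, the corresponding call would be `filter_by_intervals("chr1:1000-2000", padding_bp=50)`.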
- gnomad_toolbox.filtering.variant.filter_by_gene_symbol(gene, exon_padding_bp=75, **kwargs)
Filter variants by gene symbol.
Note
This function is designed to return the same number of variants that the gnomAD browser shows when you search for a gene symbol. The gnomAD browser keeps only variants located in, or within 75 base pairs of, the CDS or non-coding exons of a gene.
- Parameters:
gene (str) – Gencode gene symbol.
exon_padding_bp (int) – Number of base pairs to pad the intervals. Default is 75 bp.
kwargs – Arguments to pass to _get_dataset.
- Return type:
Table
- Returns:
Table with variants in the specified gene.
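The “in or within 75 base pairs” rule from the note above can be sketched as a simple coordinate check. The helper `in_padded_exons` is hypothetical, and the exon coordinates are made-up values for illustration only; the actual function works on the gnomAD Table and Gencode annotations rather than plain tuples.

```python
from typing import List, Tuple


def in_padded_exons(
    position: int,
    exons: List[Tuple[int, int]],
    exon_padding_bp: int = 75,
) -> bool:
    """Hypothetical helper: is position inside any exon widened by the padding?"""
    return any(
        start - exon_padding_bp <= position <= end + exon_padding_bp
        for start, end in exons
    )


# Made-up exon coordinates (inclusive start and end) for a single gene.
example_exons = [(1000, 1200), (5000, 5300)]

print(in_padded_exons(925, example_exons))   # 75 bp upstream of the first exon
print(in_padded_exons(3000, example_exons))  # deep in the intron
```

With the real function, passing `exon_padding_bp=0` would presumably restrict the result to variants inside the exons themselves, while the default of 75 follows the browser's behavior.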
- gnomad_toolbox.filtering.variant.get_age_distribution(variant=None, contig=None, position=None, ref=None, alt=None, **kwargs)
Get the age distribution of a variant.
- Parameters:
variant (Optional[str]) – Variant string in the format “chr12-235245-A-C” or “chr12:235245:A:C”. If provided, contig, position, ref, and alt are ignored.
contig (Optional[str]) – Chromosome of the variant. Required if variant is not provided.
position (Optional[int]) – Variant position. Required if variant is not provided.
ref (Optional[str]) – Reference allele. Required if variant is not provided.
alt (Optional[str]) – Alternate allele. Required if variant is not provided.
kwargs – Additional arguments to pass to _get_dataset.
- Return type:
Table
- Returns:
Table with the age distribution of the variant.