Gamgee
You miserable little maggot. I'll stove your head in!
gamgee::Genotype Class Reference

Encodes a genotype.

#include <genotype.h>

Public Member Functions

 Genotype (const std::shared_ptr< bcf1_t > &body, const bcf_fmt_t *const format_ptr, const uint8_t *data_ptr)
 Constructs a genotype.

 Genotype (const Genotype &other)=delete
 copying of the Genotype object is not allowed.

Genotype & operator= (const Genotype &other)=delete
 copying of the Genotype object is not allowed.

 Genotype (Genotype &&other)=default
 Explicit default as recommended by many threads on stackoverflow.

Genotype & operator= (Genotype &&other)=default
 Explicit default as recommended by many threads on stackoverflow.

 ~Genotype ()=default
 Explicit default as recommended by many threads on stackoverflow.

bool operator!= (const Genotype &other) const
 Checks if another genotype does not equal this genotype.

bool operator== (const Genotype &other) const
 Checks if another genotype equals this genotype.

bool het () const
 Checks if this genotype vector is any type of heterozygous call.

bool non_ref_het () const
 Checks if this genotype vector is a heterozygous call and none of the alleles is the reference.

uint32_t fast_diploid_key_generation () const
 A bit encoding for the first two alleles.

bool hom_var () const
 Checks if this genotype vector is a homozygous call that is non-reference.

bool hom_ref () const
 Checks if this genotype vector is a homozygous call that is reference.

bool missing () const
 Checks if all alleles are missing.

std::vector< std::string > allele_strings () const
 Returns a vector with all the allele strings.

std::vector< int32_t > allele_keys () const
 Returns a vector with all the allele keys.

std::string allele_string (const uint32_t index) const
 Returns the allele string at index.

int32_t allele_key (const uint32_t index) const
 Returns the allele key within this line.

int32_t operator[] (const uint32_t index) const
 Returns the allele key within this line.

uint32_t size () const
 Returns the number of alleles.

bool snp (const AlleleMask &mask) const
 whether or not this genotype represents a snp

bool insertion (const AlleleMask &mask) const
 whether or not this genotype represents an insertion

bool deletion (const AlleleMask &mask) const
 whether or not this genotype represents a deletion

bool indel (const AlleleMask &mask) const
 whether or not this genotype represents an insertion or deletion

bool biallelic () const
 whether or not this genotype has at most one alternate allele

bool complex () const
 literally the negation of biallelic()

bool mixed () const
 identifies variants with two different types of alleles
 
bool variant () const
 

Static Public Member Functions

static void encode_genotype (std::vector< int32_t > &alleles)
 Converts a vector of allele indices representing a genotype into BCF-encoded format suitable for passing to htslib. No phasing is added.

static void encode_genotype (std::vector< int32_t > &alleles, bool phase_all_alleles)
 Converts a vector of allele indices representing a genotype into BCF-encoded format suitable for passing to htslib, and also allows you to phase all alleles.

static void encode_genotypes (std::vector< std::vector< int32_t >> &multiple_genotypes)
 Converts multiple vectors of allele indices representing genotypes into BCF-encoded format suitable for passing to htslib. No phasing is added.

static void encode_genotypes (VariantBuilderMultiSampleVector< int32_t > &multiple_genotypes)
 Converts multiple genotypes stored in a VariantBuilderMultiSampleVector into BCF-encoded format suitable for passing to htslib. No phasing is added.
 

Detailed Description

Encodes a genotype.

Constructor & Destructor Documentation

gamgee::Genotype::Genotype ( const std::shared_ptr< bcf1_t > &  body,
const bcf_fmt_t *const  format_ptr,
const uint8_t *  data_ptr 
)

Constructs a genotype.

Parameters
body  The shared memory variant "line" from a VCF or BCF.
format_ptr  The GT field from the line.
data_ptr  The GT for this sample.
Note
Most of the patterns here are directly copied or adapted from htslib.
gamgee::Genotype::Genotype ( const Genotype &  other)
delete

copying of the Genotype object is not allowed.

gamgee::Genotype::Genotype ( Genotype &&  other)
default

Explicit default as recommended by many threads on stackoverflow.

Parameters
other  Other genotype.
gamgee::Genotype::~Genotype ( )
default

Explicit default as recommended by many threads on stackoverflow.

Member Function Documentation

int32_t gamgee::Genotype::allele_key ( const uint32_t  index) const

Returns the allele key within this line.

Parameters
index  Zero-based allele index
Returns
the allele key within this line
std::vector< int32_t > gamgee::Genotype::allele_keys ( ) const

Returns a vector with all the allele keys.

Returns
a vector with all the allele keys
std::string gamgee::Genotype::allele_string ( const uint32_t  index) const

Returns the allele string at index.

Parameters
index  Zero-based allele index
Returns
the allele string
std::vector< std::string > gamgee::Genotype::allele_strings ( ) const

Returns a vector with all the allele strings.

Returns
a vector with all the allele strings
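
To illustrate how allele keys relate to allele strings, here is a minimal standalone sketch. It assumes that a key is simply an index into the allele list of the variant line (0 for REF, 1 and above for the ALT alleles). `keys_to_strings` and `line_alleles` are hypothetical names used only for this example, and missing alleles are deliberately not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: maps allele keys to allele strings.
// `line_alleles` stands in for the allele list of one variant line,
// with index 0 being REF and indices 1.. being the ALT alleles (assumption).
inline std::vector<std::string> keys_to_strings(
    const std::vector<std::string>& line_alleles,
    const std::vector<int32_t>& keys) {
  std::vector<std::string> result;
  result.reserve(keys.size());
  for (const auto key : keys) {
    // This sketch rejects missing (-1) and out-of-range keys.
    assert(key >= 0 && static_cast<std::size_t>(key) < line_alleles.size());
    result.push_back(line_alleles[static_cast<std::size_t>(key)]);
  }
  return result;
}
```

For a line with alleles A, T and AT, the keys {0, 2} would map to the strings A and AT under these assumptions.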
bool gamgee::Genotype::biallelic ( ) const

whether or not this genotype has at most one alternate allele

This function will check whether this is a simple heterozygous or homozygous site where both alleles are either the same, or reference and one alternate allele. A genotype with two different alternate alleles does not pass.

Warning
complexity is O(n) in the number of alleles. If you are using many of these convenience functions (snp(), insertion(), deletion(), indel(), complex(),...), you will be better off implementing one loop that makes all the checks in one pass instead of calling many O(n) functions.
Returns
whether or not all alleles are either the same, or there is one alt allele mixed with reference alleles.
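
The rule described above can be sketched on plain allele keys. This is an illustration only, not the library's code: `is_biallelic` and `is_complex` are hypothetical free functions, keys are assumed to be 0 for reference and positive for alternate alleles, and skipping missing alleles (-1) is a choice made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: true if the genotype contains at most one distinct
// alternate allele. Reference (0) and missing (-1) alleles are skipped.
inline bool is_biallelic(const std::vector<int32_t>& keys) {
  int32_t seen_alt = 0;  // 0 means no alternate allele seen yet
  for (const auto key : keys) {
    assert(key >= -1);  // this sketch does not handle other special values
    if (key <= 0) continue;
    if (seen_alt == 0) {
      seen_alt = key;          // first alternate allele encountered
    } else if (key != seen_alt) {
      return false;            // a second, different alternate allele
    }
  }
  return true;
}

// Negation of the check above.
inline bool is_complex(const std::vector<int32_t>& keys) {
  return !is_biallelic(keys);
}
```

Under these assumptions 0/0, 0/1 and 1/1 pass, while 1/2 does not.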
bool gamgee::Genotype::complex ( ) const
inline

literally the negation of biallelic()

Warning
complexity is O(n) in the number of alleles. If you are using many of these convenience functions (snp(), insertion(), deletion(), indel(), complex(),...), you will be better off implementing one loop that makes all the checks in one pass instead of calling many O(n) functions.
Returns
whether or not there is more than one alt allele in this record
bool gamgee::Genotype::deletion ( const AlleleMask &  mask) const

whether or not this genotype represents a deletion

This function will check that at least one deletion exists and that no other type of allele (snp or insertion) is present. Multiple deletions will return true.

Warning
complexity is O(n) in the number of alleles. If you are using many of these convenience functions (snp(), insertion(), deletion(), indel(), complex(),...), you will be better off implementing one loop that makes all the checks in one pass instead of calling many O(n) functions.
Returns
whether or not there is at least one deletion in this genotype and nothing else but deletions and reference alleles
static void gamgee::Genotype::encode_genotype ( std::vector< int32_t > &  alleles)
inline static

Converts a vector of allele indices representing a genotype into BCF-encoded format suitable for passing to htslib. No phasing is added.

Example: if you want to BCF-encode the genotype 0/1, create a vector with {0, 1} and then pass it to this function

Note
Do not call this function yourself before passing genotypes into VariantBuilder – the builder will call it for you as necessary. Unless you are working with low-level BCF data you probably do not ever need to call this function.
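
As a rough illustration of what "BCF-encoded" means here: htslib stores each allele as `(index + 1) << 1`, keeps the lowest bit for phasing, and uses 0 for a missing allele. The sketch below mirrors that convention in standalone C++. `encode_unphased` and `decode_allele` are hypothetical helpers written for this example, not the library's implementation, and special values such as vector-end markers are ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: encodes allele indices in place using the htslib
// convention (index + 1) << 1, leaving the phase bit (bit 0) cleared.
// A missing allele is passed as -1 and encodes to 0.
inline void encode_unphased(std::vector<int32_t>& alleles) {
  for (auto& allele : alleles) {
    assert(allele >= -1);  // vector-end markers are not handled in this sketch
    allele = (allele + 1) << 1;
  }
}

// Inverse mapping, following htslib's bcf_gt_allele macro: (value >> 1) - 1.
inline int32_t decode_allele(const int32_t encoded) {
  return (encoded >> 1) - 1;
}
```

With this helper the vector {0, 1} becomes {2, 4}, which decodes back to the genotype 0/1.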
static void gamgee::Genotype::encode_genotype ( std::vector< int32_t > &  alleles,
bool  phase_all_alleles 
)
inline static

Converts a vector of allele indices representing a genotype into BCF-encoded format suitable for passing to htslib, and also allows you to phase all alleles.

Example: if you want to BCF-encode the genotype 0|1, create a vector with {0, 1} and then pass it to this function with phase_all_alleles set to true

Note
Do not call this function yourself before passing genotypes into VariantBuilder – the builder will call it for you as necessary. Unless you are working with low-level BCF data you probably do not ever need to call this function.
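
The phase information lives in the lowest bit of each encoded allele, as in htslib's `bcf_gt_phased` macro. The following standalone sketch shows one way to apply it. `encode_with_phase` and `is_phased` are hypothetical helpers, and setting the bit on every allele except the first is an assumption of this example (the bit describes the separator preceding an allele); this page does not state which alleles the library marks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: encodes allele indices in place as (index + 1) << 1
// and, when requested, sets the phase bit on every allele after the first
// (assumption made for this sketch).
inline void encode_with_phase(std::vector<int32_t>& alleles,
                              const bool phase_all_alleles) {
  for (std::size_t i = 0; i < alleles.size(); ++i) {
    assert(alleles[i] >= -1);  // vector-end markers are not handled here
    const int32_t phase_bit = (phase_all_alleles && i > 0) ? 1 : 0;
    alleles[i] = ((alleles[i] + 1) << 1) | phase_bit;
  }
}

// Mirrors htslib's bcf_gt_is_phased: the lowest bit marks a phased allele.
inline bool is_phased(const int32_t encoded) {
  return (encoded & 1) != 0;
}
```

Under these assumptions {0, 1} with phasing enabled becomes {2, 5}, and without phasing it becomes {2, 4}.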
static void gamgee::Genotype::encode_genotypes ( std::vector< std::vector< int32_t >> &  multiple_genotypes)
inline static

Converts multiple vectors of allele indices representing genotypes into BCF-encoded format suitable for passing to htslib. No phasing is added.

Example: if you want to BCF-encode the genotypes 0/1 and 1/1, create a vector with { {0, 1}, {1, 1} } and pass it to this function

Note
Do not call this function yourself before passing genotypes into VariantBuilder – the builder will call it for you as necessary. Unless you are working with low-level BCF data you probably do not ever need to call this function.
static void gamgee::Genotype::encode_genotypes ( VariantBuilderMultiSampleVector< int32_t > &  multiple_genotypes)
inline static

Converts multiple genotypes stored in a VariantBuilderMultiSampleVector into BCF-encoded format suitable for passing to htslib. No phasing is added.

Note
Do not call this function yourself before passing genotypes into VariantBuilder – the builder will call it for you as necessary. Unless you are working with low-level BCF data you probably do not ever need to call this function.
uint32_t gamgee::Genotype::fast_diploid_key_generation ( ) const

A bit encoding for the first two alleles.

Returns
A bit encoding for the first two alleles.
Note
only for diploids, returns false otherwise.
bool gamgee::Genotype::het ( ) const

Checks if this genotype vector is any type of heterozygous call.

Returns
True if this GT is a het.
Note
only for diploids.
bool gamgee::Genotype::hom_ref ( ) const

Checks if this genotype vector is a homozygous call that is reference.

Returns
True if this GT is a hom ref.
bool gamgee::Genotype::hom_var ( ) const

Checks if this genotype vector is a homozygous call that is non-reference.

Returns
True if this GT is a hom var.
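
The zygosity checks can be pictured on a pair of allele keys. The sketch below is a standalone illustration, assuming diploid genotypes with key 0 for the reference allele and positive keys for alternate alleles. The free functions are hypothetical and do not treat missing alleles specially in `is_het`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// All helpers assume a diploid call: exactly two allele keys,
// 0 for the reference allele, 1 and above for alternate alleles.

// Heterozygous: the two alleles differ.
inline bool is_het(const std::vector<int32_t>& keys) {
  assert(keys.size() == 2);
  return keys[0] != keys[1];
}

// Heterozygous with no reference allele, e.g. 1/2.
inline bool is_non_ref_het(const std::vector<int32_t>& keys) {
  assert(keys.size() == 2);
  return keys[0] != keys[1] && keys[0] > 0 && keys[1] > 0;
}

// Homozygous reference: 0/0.
inline bool is_hom_ref(const std::vector<int32_t>& keys) {
  assert(keys.size() == 2);
  return keys[0] == 0 && keys[1] == 0;
}

// Homozygous for an alternate allele, e.g. 1/1.
inline bool is_hom_var(const std::vector<int32_t>& keys) {
  assert(keys.size() == 2);
  return keys[0] == keys[1] && keys[0] > 0;
}
```

For example, 0/1 is a het but not a non-reference het, 1/2 is both, 0/0 is hom ref and 1/1 is hom var.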
bool gamgee::Genotype::indel ( const AlleleMask &  mask) const

whether or not this genotype represents an insertion or deletion

This function will check that at least one insertion or deletion exists and that no other type of allele (snp) is present. Multiple insertions or deletions will return true.

Note
this is not the same as insertion(mask) || deletion(mask) because it will also tolerate sites with insertions and deletions, while both other functions would return false to such a site.
Warning
complexity is O(n) in the number of alleles. If you are using many of these convenience functions (snp(), insertion(), deletion(), indel(), complex(),...), you will be better off implementing one loop that makes all the checks in one pass instead of calling many O(n) functions.
Returns
whether or not there is at least one insertion or deletion in this genotype and nothing else but insertions, deletions and reference alleles
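
The difference between indel(mask) and insertion(mask) || deletion(mask) is easiest to see on a genotype that carries one insertion and one deletion. The sketch below is a standalone illustration: the `AlleleType` enum, the mask layout (one type per allele of the line, indexed by allele key) and the free functions are assumptions made for this example, not the library's definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed mask layout: one allele type per allele of the variant line,
// indexed by allele key (0 = reference). Names are illustrative.
enum class AlleleType { REFERENCE, SNP, INSERTION, DELETION };
using AlleleMask = std::vector<AlleleType>;

// True if at least one allele has type `wanted` and every other allele
// is either `wanted` or reference.
inline bool only_type(const std::vector<int32_t>& keys, const AlleleMask& mask,
                      const AlleleType wanted) {
  bool found = false;
  for (const auto key : keys) {
    assert(key >= 0 && static_cast<std::size_t>(key) < mask.size());
    const auto type = mask[static_cast<std::size_t>(key)];
    if (type == wanted) {
      found = true;
    } else if (type != AlleleType::REFERENCE) {
      return false;
    }
  }
  return found;
}

inline bool is_insertion(const std::vector<int32_t>& keys, const AlleleMask& mask) {
  return only_type(keys, mask, AlleleType::INSERTION);
}

inline bool is_deletion(const std::vector<int32_t>& keys, const AlleleMask& mask) {
  return only_type(keys, mask, AlleleType::DELETION);
}

// Tolerates insertions and deletions together, but no SNPs.
inline bool is_indel(const std::vector<int32_t>& keys, const AlleleMask& mask) {
  bool found = false;
  for (const auto key : keys) {
    assert(key >= 0 && static_cast<std::size_t>(key) < mask.size());
    const auto type = mask[static_cast<std::size_t>(key)];
    if (type == AlleleType::INSERTION || type == AlleleType::DELETION) {
      found = true;
    } else if (type != AlleleType::REFERENCE) {
      return false;
    }
  }
  return found;
}
```

With a mask of {REFERENCE, INSERTION, DELETION, SNP}, the genotype 1/2 is an indel in this sketch even though it is neither a pure insertion nor a pure deletion.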
bool gamgee::Genotype::insertion ( const AlleleMask &  mask) const

whether or not this genotype represents an insertion

This function will check that at least one insertion exists and that no other type of allele (snp or deletion) is present. Multiple insertions will return true.

Warning
complexity is O(n) in the number of alleles. If you are using many of these convenience functions (snp(), insertion(), deletion(), indel(), complex(),...), you will be better off implementing one loop that makes all the checks in one pass instead of calling many O(n) functions.
Returns
whether or not there is at least one insertion in this genotype and nothing else but insertions and reference alleles
bool gamgee::Genotype::missing ( ) const

Checks if all alleles are missing.

Returns
True if all alleles are missing.
Warning
Missing GT fields are untested.
bool gamgee::Genotype::mixed ( ) const

identifies variants with two different types of alleles

Warning
complexity is O(n) in the number of alleles. If you are using many of these convenience functions (snp(), insertion(), deletion(), indel(), complex(),...), you will be better off implementing one loop that makes all the checks in one pass instead of calling many O(n) functions.
Returns
whether or not there is more than one type of alt allele in this record
bool gamgee::Genotype::non_ref_het ( ) const

Checks if this genotype vector is a heterozygous call and none of the alleles is the reference.

Returns
True if this GT is a het and none of the alleles is the reference.
Note
only for diploids.
bool gamgee::Genotype::operator!= ( const Genotype &  other) const

Checks if another genotype does not equal this genotype.

Parameters
other  The other genotype to compare.
Returns
False if other genotype equals this genotype.
Genotype& gamgee::Genotype::operator= ( const Genotype &  other)
delete

copying of the Genotype object is not allowed.

Parameters
other  Other genotype.
Genotype& gamgee::Genotype::operator= ( Genotype &&  other)
default

Explicit default as recommended by many threads on stackoverflow.

Parameters
other  Other genotype.
bool gamgee::Genotype::operator== ( const Genotype &  other) const

Checks if another genotype equals this genotype.

Parameters
other  The other genotype to compare.
Returns
True if other genotype equals this genotype.
Note
No string comparison is done. This operation is supposed to be fast.
int32_t gamgee::Genotype::operator[] ( const uint32_t  index) const

Returns the allele key within this line.

Parameters
index  Zero-based allele index
Returns
the allele key within this line
uint32_t gamgee::Genotype::size ( ) const

Returns the number of alleles.

Returns
the number of alleles
bool gamgee::Genotype::snp ( const AlleleMask &  mask) const

whether or not this genotype represents a snp

In the true sense of the word, single nucleotide polymorphism restricts the number of loci (nucleotides) that are polymorphic, but not the number of potentially different alleles it may have (as long as they are all single nucleotide polymorphisms). This function will check that at least one of these SNPs exists and that no other type of allele (insertion or deletion) is present. Multiple SNPs will return true.

Warning
complexity is O(n) in the number of alleles. If you are using many of these convenience functions (snp(), insertion(), deletion(), indel(), complex(),...), you will be better off implementing one loop that makes all the checks in one pass instead of calling many O(n) functions.
Returns
whether or not there is at least one snp in this genotype and nothing else but snps and reference alleles
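
The warning repeated on these convenience functions suggests doing all checks in one loop. A minimal sketch of that idea follows: one pass collects flags, and each predicate then becomes a constant-time test on those flags. The `AlleleType` enum, the mask layout (one type per allele of the line, indexed by allele key), `GenotypeSummary` and the free functions are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed mask layout: one allele type per allele of the variant line,
// indexed by allele key (0 = reference). Names are illustrative.
enum class AlleleType { REFERENCE, SNP, INSERTION, DELETION };
using AlleleMask = std::vector<AlleleType>;

// Flags collected in a single pass over the genotype's allele keys.
struct GenotypeSummary {
  bool has_snp = false;
  bool has_insertion = false;
  bool has_deletion = false;
};

inline GenotypeSummary summarize(const std::vector<int32_t>& keys,
                                 const AlleleMask& mask) {
  GenotypeSummary summary;
  for (const auto key : keys) {
    assert(key >= 0 && static_cast<std::size_t>(key) < mask.size());
    switch (mask[static_cast<std::size_t>(key)]) {
      case AlleleType::SNP:       summary.has_snp = true; break;
      case AlleleType::INSERTION: summary.has_insertion = true; break;
      case AlleleType::DELETION:  summary.has_deletion = true; break;
      case AlleleType::REFERENCE: break;
    }
  }
  return summary;
}

// Each predicate is O(1) on the precomputed summary.
inline bool is_snp(const GenotypeSummary& s) {
  return s.has_snp && !s.has_insertion && !s.has_deletion;
}
inline bool is_insertion(const GenotypeSummary& s) {
  return s.has_insertion && !s.has_snp && !s.has_deletion;
}
inline bool is_deletion(const GenotypeSummary& s) {
  return s.has_deletion && !s.has_snp && !s.has_insertion;
}
inline bool is_indel(const GenotypeSummary& s) {
  return (s.has_insertion || s.has_deletion) && !s.has_snp;
}
inline bool is_mixed(const GenotypeSummary& s) {
  return (s.has_snp + s.has_insertion + s.has_deletion) > 1;
}
```

A caller would run `summarize` once per genotype and then query as many predicates as needed without walking the alleles again.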
bool gamgee::Genotype::variant ( ) const
inline
