Gamgee
You miserable little maggot. I'll stove your head in!
gamgee Namespace Reference

Namespaces

 missing_values
 
 utils
 utility functions for the gamgee library
 

Classes

class  BaseQuals
 Utility class to handle the memory management of the sam record object for a read base qualities. More...
 
class  ChromosomeNotFoundException
 an exception class for the case where a chromosome is not found in the reference More...
 
class  ChromosomeSizeException
 an exception class for the case where a requested location lies beyond the size of a chromosome in the reference More...
 
class  Cigar
 Utility class to manage the memory of the cigar structure. More...
 
class  Fastq
 Utility class to hold one FastA or FastQ record. More...
 
class  FastqIterator
 Utility class to enable for-each style iteration in the FastqReader class. More...
 
class  FastqReader
 Utility class to read many Fastq records from a stream (e.g. Fastq file, stdin, ...) in a for-each loop. More...
 
class  FileOpenException
 Exception for the case where there is an error opening a file for reading/writing. More...
 
class  Genotype
 Encodes a genotype. More...
 
class  HeaderCompatibilityException
 Exception for the case where multiple headers are incompatible in some way. More...
 
class  HeaderReadException
 Exception for the case where a file header could not be read. More...
 
class  HtslibException
 a catchall exception class for htslib errors More...
 
class  IndexedSamIterator
 Utility class to enable for-each style iteration in the IndexedSamReader class. More...
 
class  IndexedSamReader
 Utility class to read a BAM/CRAM file with an appropriate Sam iterator from an indexed file in a for-each loop. Intervals are passed in using a vector of string coordinates compatible with Samtools. When iteration begins, it (re-)starts at the beginning of the first interval. More...
 
class  IndexedVariantIterator
 
class  IndexedVariantReader
 Utility class to read an indexed BCF file by intervals using an appropriate Variant iterator in a for-each loop. More...
 
class  IndexLoadException
 Exception for the case where an index file cannot be opened for a particular file (e.g., bam/vcf/bcf) More...
 
class  IndividualField
 A class template to hold the values of a specific Variant's format field for all samples. More...
 
class  IndividualFieldIterator
 iterator for IndividualField objects. More...
 
class  IndividualFieldValue
 A class template to hold the values of a format field for a particular sample. More...
 
class  IndividualFieldValueIterator
 iterator for IndividualFieldValue objects. More...
 
class  Interval
 Utility class to store a genomic location (Interval). More...
 
class  MultipleVariantIterator
 Utility class to enable for-each style iteration in the MultipleVariantReader class. More...
 
class  MultipleVariantReader
 Utility class to read multiple VCF/BCF files with an appropriate iterator in a for-each loop. More...
 
class  ReadBases
 Utility class to handle the memory management of the sam record object for read bases. More...
 
class  ReadGroup
 Helper struct to hold one read group record from a sam file header. More...
 
class  ReferenceBlockSplittingVariantIterator
 Utility class to handle reference blocks while iterating over multiple variant files. More...
 
class  ReferenceIterator
 Utility class to access reference bases in a FastA-formatted reference genome. More...
 
class  ReferenceMap
 Utility class to create a reference object for all reference operations in Foghorn. More...
 
class  Sam
 Utility class to manipulate a Sam record. More...
 
class  SamBuilder
 class to build Sam objects from existing data or from scratch More...
 
class  SamBuilderDataField
 class to hold encoded byte arrays for individual data fields (cigar, bases, etc.) during building of a Sam More...
 
class  SamHeader
 Utility class to hold the header of a sam file. More...
 
class  SamIterator
 Utility class to enable for-each style iteration in the SamReader class. More...
 
class  SamPairIterator
 Utility class to enable for-each style iteration by pairs in the SamReader class. More...
 
class  SamReader
 Utility class to read a SAM/BAM/CRAM file with an appropriate Sam iterator from a stream (e.g. file, stdin, ...) in a for-each loop. More...
 
class  SamTag
 class to represent a Sam TAG:TYPE:VALUE entry More...
 
class  SamWriter
 utility class to write out a SAM/BAM/CRAM file to any stream More...
 
struct  SharedBufferSpan
 Represents a section (range of bytes) in the shared memory pool VariantBuilderSharedRegion::m_shared_buffer that is currently in use to store encoded field data. More...
 
class  SharedField
 A class template to hold the values of a specific Variant's shared field. More...
 
class  SharedFieldIterator
 iterator for SharedField objects. More...
 
class  SingleInputException
 an exception class for the case where a single input is required, but more is provided More...
 
class  SyncedVariantIterator
 Utility class to enable for-each style iteration in the SyncedVariantReader class. More...
 
class  SyncedVariantReader
 Utility class to read multiple VCF.GZ/BCF files with an appropriate iterator in a for-each loop. More...
 
class  Variant
 Utility class to manipulate a Variant record. More...
 
class  VariantBuilder
 VariantBuilder: construct Variant records from scratch (and, coming soon, from existing Variant records) More...
 
class  VariantBuilderCoreField
 Internal VariantBuilder class to represent "core" (non-data-region) fields such as the alignment start and the qual. More...
 
class  VariantBuilderIndividualField
 Helper class for VariantBuilder to manage the storage and encoding of a single multi-sample individual field. More...
 
class  VariantBuilderIndividualRegion
 Helper class for VariantBuilder to manage the fields belonging to the individual region of Variant records. More...
 
class  VariantBuilderMultiSampleVector
 Class that allows you to efficiently prepare multi-sample data for setting individual fields in VariantBuilder. More...
 
class  VariantBuilderSharedRegion
 Helper class for VariantBuilder to manage the fields belonging to the shared region of Variant records. More...
 
class  VariantFilters
 class to manipulate filter field objects without making copies. More...
 
class  VariantFiltersIterator
 simple random-access iterator class for VariantFilters objects More...
 
class  VariantHeader
 Utility class to hold a variant header. More...
 
class  VariantHeaderBuilder
 Utility class to build VariantHeader objects from scratch. More...
 
class  VariantHeaderMerger
 
class  VariantIterator
 Utility class to enable for-each style iteration in the VariantReader class. More...
 
class  VariantReader
 Utility class to read a VCF/BCF file with an appropriate Variant iterator from a stream (e.g. file, stdin, ...) in a for-each loop. More...
 
class  VariantWriter
 utility class to write out a VCF/BCF file to any stream More...
 

Typedefs

using CigarElement = uint32_t
 
using IndexedSingleSamReader = IndexedSamReader< IndexedSamIterator >
 
using SingleSamReader = SamReader< SamIterator >
 
using PairSamReader = SamReader< SamPairIterator >
 
using AlleleMask = std::vector< AlleleType >
 
using VariantIteratorIndexPair = std::pair< std::shared_ptr< VariantIterator >, uint32_t >
 
using VariantIndexPair = std::pair< Variant, uint32_t >
 
using InputOrderedVariantHeaderMerger = VariantHeaderMerger< true, true, true, true >
 
using FieldOrderedVariantHeaderMerger = VariantHeaderMerger< false, false, false, false >
 
using SingleVariantReader = VariantReader< VariantIterator >
 

Enumerations

enum  CigarOperator {
  CigarOperator::M, CigarOperator::I, CigarOperator::D, CigarOperator::N,
  CigarOperator::S, CigarOperator::H, CigarOperator::P, CigarOperator::EQ,
  CigarOperator::X, CigarOperator::B
}
 comprehensive list of valid cigar operators More...
 
enum  Base {
  Base::A = 1, Base::C = 2, Base::G = 4, Base::T = 8,
  Base::N = 15
}
 simple enum to hold all valid bases in the SAM format More...
 
enum  AlleleType { AlleleType::REFERENCE, AlleleType::SNP, AlleleType::INSERTION, AlleleType::DELETION }
 
enum  DiploidPLGenotype { DiploidPLGenotype::HOM_REF = 0, DiploidPLGenotype::HET = 1, DiploidPLGenotype::HOM_VAR = 2 }
 simple enum to keep the indices of the genotypes in the PL field of diploid individuals More...
 
enum  SharedFieldIndex {
  SharedFieldIndex::ID_INDEX = 0, SharedFieldIndex::REF_ALLELE_INDEX = 1, SharedFieldIndex::ALT_ALLELES_INDEX = 2, SharedFieldIndex::FILTERS_INDEX = 3,
  SharedFieldIndex::INFO_START_INDEX = 4
}
 Enum to represent the ordering of the various shared fields as they are physically laid out in the encoded data. More...
 

Functions

void skip_picard_header (istream &infile)
 
Interval parse_interval_record (const string &line)
 
vector< Interval > read_intervals (const string &intervals_file)
 
vector< Interval > read_intervals (istream &input)
 
std::vector< Interval > read_intervals (const std::string &intervals_file)
 utility function to read all Intervals from an Intervals file More...
 
std::vector< Interval > read_intervals (std::istream &input)
 utility function to read all Intervals from an Intervals file More...
 
bool missing (const bool value)
 Returns true if bool is false (missing). More...
 
bool missing (const float value)
 Returns true if float is missing. More...
 
bool missing (const int8_t value)
 Returns true if int8_t is missing. More...
 
bool missing (const int16_t value)
 Returns true if int16_t is missing. More...
 
bool missing (const int32_t value)
 Returns true if int32_t is missing. More...
 
bool missing (const std::string &value)
 Returns true if string is missing. More...
 
bool missing (const char *value)
 Returns true if char* is missing. More...
 
template<class MISSING_TYPE >
bool missing (const MISSING_TYPE &value)
 
template<class VALUE >
bool missing (const std::vector< VALUE > &v)
 
void subset_variant_samples (bcf_hdr_t *hdr_ptr, const std::vector< std::string > &samples, const bool include)
 allows the caller to include only selected samples in a Variant Reader. To create a sites only file, simply pass an empty vector of samples. More...
 
void merge_variant_headers (const std::shared_ptr< bcf_hdr_t > &dest_hdr_ptr, const std::shared_ptr< bcf_hdr_t > &src_hdr_ptr)
 merges a variant header into another More...
 

Variables

const auto PICARD_HEADER_TAG = '@'
 
const auto sep = char_separator<char>{" \t:-"}
 
constexpr auto MATE_CIGAR_TAG = "MC"
 

Typedef Documentation

using gamgee::AlleleMask = std::vector<AlleleType>
using gamgee::CigarElement = uint32_t
using gamgee::FieldOrderedVariantHeaderMerger = VariantHeaderMerger<false,false,false,false>
using gamgee::InputOrderedVariantHeaderMerger = VariantHeaderMerger<true,true,true,true>
using gamgee::VariantIndexPair = std::pair<Variant, uint32_t>
using gamgee::VariantIteratorIndexPair = std::pair<std::shared_ptr<VariantIterator>, uint32_t>

Enumeration Type Documentation

enum gamgee::AlleleType
strong
Enumerator
REFERENCE 
SNP 
INSERTION 
DELETION 
enum gamgee::Base
strong

simple enum to hold all valid bases in the SAM format

Note
enum values used here correspond to the 4-bit base encodings in htslib so that we can cast directly to Base
Enumerator
A 
C 
G 
T 
N 
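
The note above says the enum values match htslib's 4-bit base encodings, which is what makes a direct cast possible. The following is a minimal, self-contained sketch of that idea and not gamgee's own code: the `sketch` namespace and the helpers `packed_code`, `base_at` and `example` are hypothetical names, and the packed layout (two bases per byte, even-indexed base in the high nibble) is the one htslib uses for BAM sequences.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// Same values as gamgee::Base in the enumeration summary (htslib's 4-bit codes).
enum class Base : int { A = 1, C = 2, G = 4, T = 8, N = 15 };

// Extracts the i-th 4-bit code from a packed sequence: two bases per byte,
// with the even-indexed base stored in the high nibble.
inline std::uint8_t packed_code(const std::vector<std::uint8_t>& packed, std::size_t i) {
  const std::uint8_t byte = packed[i >> 1];
  return (i & 1) == 0 ? static_cast<std::uint8_t>(byte >> 4)
                      : static_cast<std::uint8_t>(byte & 0x0F);
}

// Because the enumerator values equal the 4-bit codes, decoding is a plain cast.
inline Base base_at(const std::vector<std::uint8_t>& packed, std::size_t i) {
  return static_cast<Base>(packed_code(packed, i));
}

// Usage: the bytes 0x12 0x48 hold the codes 1, 2, 4, 8, i.e. A, C, G, T.
inline void example() {
  const std::vector<std::uint8_t> packed{0x12, 0x48};
  assert(base_at(packed, 0) == Base::A);
  assert(base_at(packed, 1) == Base::C);
  assert(base_at(packed, 2) == Base::G);
  assert(base_at(packed, 3) == Base::T);
}

}  // namespace sketch
```

The 4-bit encoding also has codes for ambiguity symbols other than N. Casting one of those yields a value that none of the five enumerators names, so callers of a helper like this may want to treat unknown values as N.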
enum gamgee::CigarOperator
strong

comprehensive list of valid cigar operators

Note
order of the operators in this enum must match the order in BAM_CIGAR_STR from htslib/sam.h
Enumerator
M 
I 
D 
N 
S 
H 
P 
EQ 
X 
B 
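
The ordering requirement in the note above matters because a BAM cigar element stores the operator as a small integer code. As an illustration only, the sketch below decodes a `uint32_t` element the way htslib's `bam_cigar_op` and `bam_cigar_oplen` macros do (operator in the low 4 bits, length in the remaining bits). The `sketch` namespace and all function names in it are hypothetical and are not gamgee's interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch {

using CigarElement = std::uint32_t;

// Same order as BAM_CIGAR_STR ("MIDNSHP=XB"), so each enumerator's integer
// value is exactly the operator code stored in a cigar element.
enum class CigarOperator { M, I, D, N, S, H, P, EQ, X, B };

// Low 4 bits: operator code.
inline CigarOperator cigar_op(CigarElement e) {
  return static_cast<CigarOperator>(e & 0xF);
}

// Remaining upper bits: operation length.
inline std::uint32_t cigar_oplen(CigarElement e) { return e >> 4; }

// Maps a valid operator back to its character via the shared ordering.
inline char cigar_op_char(CigarOperator op) {
  static const char table[] = "MIDNSHP=XB";
  return table[static_cast<std::size_t>(op)];
}

// Builds an element from a length and an operator (inverse of the decoders).
inline CigarElement make_cigar_element(std::uint32_t len, CigarOperator op) {
  return (len << 4) | static_cast<std::uint32_t>(op);
}

// Usage: round-trip "76M" and check the character for EQ.
inline void example() {
  const CigarElement e = make_cigar_element(76, CigarOperator::M);
  assert(cigar_op(e) == CigarOperator::M);
  assert(cigar_oplen(e) == 76);
  assert(cigar_op_char(CigarOperator::EQ) == '=');
}

}  // namespace sketch
```

If the enum order drifted from `BAM_CIGAR_STR`, the cast in `cigar_op` would silently return the wrong operator, which is why the two must stay in sync.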

enum gamgee::DiploidPLGenotype
strong

simple enum to keep the indices of the genotypes in the PL field of diploid individuals

Enumerator
HOM_REF 
HET 
HOM_VAR 
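
To show how these indices might be used, here is a small sketch that picks the most likely genotype from the three PL values of a biallelic diploid call. PL values are phred-scaled, so the smallest value marks the most likely genotype. The `sketch` namespace and the `most_likely` helper are hypothetical; the enum simply mirrors the values listed in the enumeration summary.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

namespace sketch {

// Mirrors gamgee::DiploidPLGenotype: positions within a three-entry PL field.
enum class DiploidPLGenotype { HOM_REF = 0, HET = 1, HOM_VAR = 2 };

// Returns the genotype whose PL is smallest (i.e. most likely).
// Assumes a biallelic diploid call, so exactly three PL entries.
// On ties, the first minimum wins.
inline DiploidPLGenotype most_likely(const std::vector<std::int32_t>& pl) {
  assert(pl.size() == 3);
  const auto best = std::min_element(pl.begin(), pl.end());
  return static_cast<DiploidPLGenotype>(std::distance(pl.begin(), best));
}

// Usage: PL = 30,0,45 describes a call that is most likely heterozygous.
inline void example() {
  assert(most_likely({30, 0, 45}) == DiploidPLGenotype::HET);
}

}  // namespace sketch
```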

enum gamgee::SharedFieldIndex
strong

Enum to represent the ordering of the various shared fields as they are physically laid out in the encoded data.

Enumerator
ID_INDEX 
REF_ALLELE_INDEX 
ALT_ALLELES_INDEX 
FILTERS_INDEX 
INFO_START_INDEX 

Function Documentation

void gamgee::merge_variant_headers ( const std::shared_ptr< bcf_hdr_t > &  dest_hdr_ptr,
const std::shared_ptr< bcf_hdr_t > &  src_hdr_ptr 
)

merges a variant header into another

Parameters
dest_hdr_ptr  a shared pointer to a bcf_hdr_t containing the header to be merged into
src_hdr_ptr  a shared pointer to a bcf_hdr_t containing the header to be merged from
bool gamgee::missing ( const bool  value)
inline

Returns true if bool is false (missing).

bool gamgee::missing ( const float  value)
inline

Returns true if float is missing.

bool gamgee::missing ( const int8_t  value)
inline

Returns true if int8_t is missing.

bool gamgee::missing ( const int16_t  value)
inline

Returns true if int16_t is missing.

bool gamgee::missing ( const int32_t  value)
inline

Returns true if int32_t is missing.

bool gamgee::missing ( const std::string &  value)
inline

Returns true if string is missing.

bool gamgee::missing ( const char *  value)
inline

Returns true if char* is missing.

template<class MISSING_TYPE >
bool gamgee::missing ( const MISSING_TYPE &  value)
inline

Returns true if value is missing.

Template Parameters
MISSING_TYPE  any class that implements missing() as a public member function.
Returns
True if value is missing.
template<class VALUE >
bool gamgee::missing ( const std::vector< VALUE > &  v)
inline

Missing overload for functions that return a vector of values. It only applies if the entire vector is missing.

Template Parameters
VALUE  any type that can fit into a container. Any type, really.
Parameters
v  any vector
Returns
true if the vector is empty (therefore the value that was returned is missing)
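
The two template overloads documented above can be pictured with the following stand-alone sketch. It is written from the descriptions on this page only and may differ from the library's actual implementation: the `sketch` namespace and the `OptionalName` type are hypothetical and exist purely for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical type exposing a public missing() member function.
struct OptionalName {
  std::string name;
  bool missing() const { return name.empty(); }
};

// Generic overload: delegates to the value's own missing() member.
template <class MISSING_TYPE>
bool missing(const MISSING_TYPE& value) {
  return value.missing();
}

// Vector overload: only an empty vector counts as missing. The elements
// themselves are not inspected.
template <class VALUE>
bool missing(const std::vector<VALUE>& v) {
  return v.empty();
}

// Usage: the vector overload is the more specialized template, so it is
// selected for vectors even though the generic overload also matches.
inline void example() {
  assert(missing(OptionalName{""}));
  assert(!missing(OptionalName{"NA12878"}));
  assert(missing(std::vector<int>{}));
  assert(!missing(std::vector<OptionalName>{OptionalName{""}}));
}

}  // namespace sketch
```

The last assertion illustrates the "entire vector" wording: a non-empty vector is not reported as missing even when every element in it is.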
Interval gamgee::parse_interval_record ( const string &  line)
vector<Interval> gamgee::read_intervals ( const string &  intervals_file)
vector<Interval> gamgee::read_intervals ( istream &  input)
std::vector<Interval> gamgee::read_intervals ( const std::string &  intervals_file)

utility function to read all Intervals from an Intervals file

Parameters
intervals_file  the file with the Intervals in one of the supported formats
Returns
a vector of Interval objects
std::vector<Interval> gamgee::read_intervals ( std::istream &  input)

utility function to read all Intervals from an Intervals file

Parameters
input  an input stream (e.g. std::cin)
Returns
a vector of Interval objects
void gamgee::skip_picard_header ( istream &  infile)
void gamgee::subset_variant_samples ( bcf_hdr_t *  hdr_ptr,
const std::vector< std::string > &  samples,
const bool  include 
)

allows the caller to include only selected samples in a Variant Reader. To create a sites only file, simply pass an empty vector of samples.

Parameters
samples  the list of samples you want included/excluded from your iteration
include  whether you want these samples to be included or excluded from your iteration.

Variable Documentation

constexpr auto gamgee::MATE_CIGAR_TAG = "MC"
const auto gamgee::PICARD_HEADER_TAG = '@'
const auto gamgee::sep = char_separator<char>{" \t:-"}
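
The separator set `" \t:-"` and the `'@'` header tag suggest how an interval line could be split into fields. This page does not document the supported formats, so the sketch below is an assumption and not the library's parser: it uses only the standard library instead of the Boost tokenizer, and `Record`, `tokenize`, `parse_record` and `read_records` are hypothetical names. It assumes each record carries a contig name followed by a start and a stop, and that lines beginning with `'@'` are header lines to skip.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-in for an interval record.
struct Record {
  std::string chr;
  std::uint32_t start;
  std::uint32_t stop;
};

// Splits a line on any separator character, dropping empty tokens.
inline std::vector<std::string> tokenize(const std::string& line,
                                         const std::string& seps = " \t:-") {
  std::vector<std::string> tokens;
  auto begin = line.find_first_not_of(seps);
  while (begin != std::string::npos) {
    const auto end = line.find_first_of(seps, begin);
    tokens.push_back(line.substr(
        begin, end == std::string::npos ? std::string::npos : end - begin));
    begin = line.find_first_not_of(seps, end);
  }
  return tokens;
}

// Assumed layout: contig, start, stop; any further tokens are ignored.
inline Record parse_record(const std::string& line) {
  const auto tokens = tokenize(line);
  if (tokens.size() < 3) throw std::invalid_argument("malformed interval: " + line);
  return Record{tokens[0],
                static_cast<std::uint32_t>(std::stoul(tokens[1])),
                static_cast<std::uint32_t>(std::stoul(tokens[2]))};
}

// Reads every non-header, non-empty line from the stream.
inline std::vector<Record> read_records(std::istream& in) {
  std::vector<Record> records;
  std::string line;
  while (std::getline(in, line)) {
    if (line.empty() || line[0] == '@') continue;  // skip header lines
    records.push_back(parse_record(line));
  }
  return records;
}

}  // namespace sketch
```

Under these assumptions both `chr1:100-200` and a tab-separated `chr1 100 200` line tokenize to the same three fields. One consequence of using `-` as a separator is that contig names containing a hyphen would be split, so this sketch does not handle them.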