Gamgee
You miserable little maggot. I'll stove your head in!
gamgee::VariantHeader Class Reference

Utility class to hold a variant header. More...

#include <variant_header.h>

Public Member Functions

 VariantHeader ()=default
 initializes a null VariantHeader
 
 VariantHeader (const std::shared_ptr< bcf_hdr_t > &header)
 creates a VariantHeader given an htslib object.
 
 VariantHeader (const VariantHeader &other)
 makes a deep copy of a VariantHeader. Shared pointers maintain state to all other associated objects correctly.
 
 VariantHeader (VariantHeader &&other) noexcept
 moves VariantHeader accordingly. Shared pointers maintain state to all other associated objects correctly.
 
VariantHeader & operator= (const VariantHeader &other)
 deep copy assignment of a VariantHeader. Shared pointers maintain state to all other associated objects correctly.
 
VariantHeader & operator= (VariantHeader &&other) noexcept
 move assignment of a VariantHeader. Shared pointers maintain state to all other associated objects correctly.
 
 ~VariantHeader ()=default
 
bool operator== (const VariantHeader &rhs) const
 equality operators
 
bool operator!= (const VariantHeader &rhs) const
 
std::vector< std::string > samples () const
 builds a vector with the names of the samples
 
uint32_t n_samples () const
 returns the number of samples in the header
 
std::vector< std::string > chromosomes () const
 builds a vector with the contigs
 
uint32_t n_chromosomes () const
 returns the number of chromosomes declared in this header
 
uint32_t field_index_end () const
 returns the last valid field index + 1, to indicate the end of field iteration
 
uint32_t n_filters () const
 returns the number of filters declared in this header. Do not use for iteration over filter indices; use field_index_end() instead.
 
std::vector< std::string > filters () const
 returns a vector of filter names
 
uint32_t n_shared_fields () const
 returns the number of shared fields declared in this header. Do not use for iteration over shared field indices; use field_index_end() instead.
 
std::vector< std::string > shared_fields () const
 returns a vector of shared field names
 
uint32_t n_individual_fields () const
 returns the number of individual fields declared in this header. Do not use for iteration over individual field indices; use field_index_end() instead.
 
std::vector< std::string > individual_fields () const
 returns a vector of individual field names
 
uint8_t shared_field_type (const std::string &tag) const
 returns the type of this shared (INFO) field
 
uint8_t shared_field_type (const int32_t index) const
 returns the type of this shared (INFO) field
 
uint8_t individual_field_type (const std::string &tag) const
 returns the type of this individual (FORMAT) field
 
uint8_t individual_field_type (const int32_t index) const
 returns the type of this individual (FORMAT) field
 
uint8_t field_type (const std::string &tag, const int32_t field_category) const
 
uint8_t field_type (const int32_t index, const int32_t field_category) const
 
uint32_t field_length_descriptor (const std::string &tag, const int32_t field_category) const
 
uint32_t field_length_descriptor (const int32_t index, const int32_t field_category) const
 
uint32_t field_length (const std::string &tag, const int32_t field_category) const
 
uint32_t field_length (const int32_t index, const int32_t field_category) const
 
bool has_filter (const std::string &filter_name) const
 checks whether the given filter is present given the filter name
 
bool has_filter (const int32_t filter_index) const
 checks whether the given filter is present given the filter index
 
bool has_shared_field (const std::string &field_name) const
 checks whether the given shared (INFO) field is present given the field name
 
bool has_shared_field (const int32_t field_index) const
 checks whether the given shared (INFO) field is present given the field index
 
bool has_individual_field (const std::string &field_name) const
 checks whether the given individual (FORMAT) field is present given the field name
 
bool has_individual_field (const int32_t field_index) const
 checks whether the given individual (FORMAT) field is present given the field index
 
bool has_field (const std::string &field_name, const int32_t field_category) const
 checks whether the given field is present given the field name and field category (which must be one of BCF_HL_FMT, BCF_HL_INFO, or BCF_HL_FLT)
 
bool has_field (const int32_t field_index, const int32_t field_category) const
 checks whether the given field is present given the field index and field category (one of BCF_HL_FMT, BCF_HL_INFO, or BCF_HL_FLT)
 
bool has_sample (const std::string &sample_name) const
 checks whether the given sample is present given the sample name
 
bool has_sample (const int32_t sample_index) const
 checks whether the given sample is present given the sample index
 
int32_t field_index (const std::string &tag) const
 looks up the index of a particular filter, shared or individual field tag, enabling subsequent O(1) random-access lookups for that field throughout the iteration.
 
int32_t sample_index (const std::string &sample) const
 looks up the index of a particular sample, enabling subsequent O(1) random-access lookups for that sample throughout the iteration.
 
std::string get_field_name (const int32_t field_idx) const
 
std::string get_sample_name (const int32_t sample_idx) const
 

Friends

class Variant
 
class VariantWriter
 
class VariantHeaderBuilder
 
class VariantBuilder
 builder needs access to the internals in order to build efficiently
 
class VariantBuilderSharedRegion
 
class VariantBuilderIndividualRegion
 
template<bool fields_forward_LUT_ordering, bool fields_reverse_LUT_ordering, bool samples_forward_LUT_ordering, bool samples_reverse_LUT_ordering>
class VariantHeaderMerger
 

Detailed Description

Utility class to hold a variant header.

It can be used to read headers from a VCF/BCF file. To create a header from scratch, use the VariantHeaderBuilder instead.

Note: filters, shared fields, and individual fields all occupy the same index space, and users must be careful to access these values appropriately. As an example, consider one possible indexing scheme for a simple header with two of each:

  0  FILTER_1
  1  SHARED_FIELD_1
  2  SHARED_FIELD_2
  3  INDIVIDUAL_FIELD_1
  4  FILTER_2
  5  INDIVIDUAL_FIELD_2

Counters and string accessors work intuitively:

  n_filters() returns 2
  n_shared_fields() returns 2
  n_individual_fields() returns 2
  filters() returns { "FILTER_1", "FILTER_2" }
  shared_fields() returns { "SHARED_FIELD_1", "SHARED_FIELD_2" }
  individual_fields() returns { "INDIVIDUAL_FIELD_1", "INDIVIDUAL_FIELD_2" }

Iterating through these filters/fields requires more care. The string vectors can be used for this, but retrieving a field by name requires an expensive field index lookup each time. The counters are not usable for iteration because they don't tell you which indices belong to your desired type.

Instead, use field_index_end() and the has_filter() / has_*_field() method appropriate for the type on each index.

Sample usage:

  for (auto idx = 0u; idx < header.field_index_end(); ++idx)
    if (header.has_individual_field(idx))
      do_something_with_field(variant.integer_individual_field(idx));
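
The index layout described above can be modelled without the library. The following is a minimal sketch, assuming a hypothetical MockHeader type (not part of gamgee) in which each index holds exactly one kind of entry; the real header can reuse one index for a tag that is both shared and individual, which this model leaves out. It only illustrates why iteration runs up to field_index_end() and filters with a has_* check rather than counting up to n_individual_fields().

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the header's shared index space.
// This is NOT gamgee's implementation, only a model of the layout above.
enum class Kind { Filter, Shared, Individual };

struct MockEntry {
  std::string name;
  Kind kind;
};

struct MockHeader {
  std::vector<MockEntry> entries;  // position in the vector is the field index

  // One past the last valid index, mirroring field_index_end().
  uint32_t field_index_end() const {
    return static_cast<uint32_t>(entries.size());
  }
  bool has_filter(uint32_t idx) const { return kind_is(idx, Kind::Filter); }
  bool has_shared_field(uint32_t idx) const { return kind_is(idx, Kind::Shared); }
  bool has_individual_field(uint32_t idx) const { return kind_is(idx, Kind::Individual); }

  bool kind_is(uint32_t idx, Kind k) const {
    return idx < entries.size() && entries[idx].kind == k;
  }
};

// Builds the six-entry example layout from the text.
inline MockHeader example_header() {
  MockHeader h;
  h.entries = {{"FILTER_1", Kind::Filter},
               {"SHARED_FIELD_1", Kind::Shared},
               {"SHARED_FIELD_2", Kind::Shared},
               {"INDIVIDUAL_FIELD_1", Kind::Individual},
               {"FILTER_2", Kind::Filter},
               {"INDIVIDUAL_FIELD_2", Kind::Individual}};
  return h;
}

// Walks the whole index space and keeps only the individual-field indices.
inline std::vector<uint32_t> individual_field_indices(const MockHeader& h) {
  std::vector<uint32_t> out;
  for (auto idx = 0u; idx < h.field_index_end(); ++idx)
    if (h.has_individual_field(idx)) out.push_back(idx);
  return out;
}
```

In this model the individual fields sit at indices 3 and 5, so a loop bounded by a count of 2 would visit indices 0 and 1 and miss both of them.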

Constructor & Destructor Documentation

gamgee::VariantHeader::VariantHeader ( )
default

initializes a null VariantHeader

Warning
if you need to create a VariantHeader from scratch, use the builder instead
gamgee::VariantHeader::VariantHeader ( const std::shared_ptr< bcf_hdr_t > &  header)
inline explicit

creates a VariantHeader given htslib object.

Note
used by all iterators
gamgee::VariantHeader::VariantHeader ( const VariantHeader &  other)

makes a deep copy of a VariantHeader. Shared pointers maintain state to all other associated objects correctly.

gamgee::VariantHeader::VariantHeader ( VariantHeader &&  other)
noexcept

moves VariantHeader accordingly. Shared pointers maintain state to all other associated objects correctly.

gamgee::VariantHeader::~VariantHeader ( )
default

Member Function Documentation

vector< string > gamgee::VariantHeader::chromosomes ( ) const

builds a vector with the contigs

int32_t gamgee::VariantHeader::field_index ( const std::string &  tag) const
inline

looks up the index of a particular filter, shared or individual field tag, enabling subsequent O(1) random-access lookups for that field throughout the iteration.

Returns
missing_values::int32_t if the tag is not present in the header (you can use missing() on the return value to check)
Note
prefer this to looking up tag names during the iteration if you are looking for shared fields multiple times.
if multiple fields (e.g. shared and individual) have the same tag (e.g. "DP"), they will also have the same index internally, so this function will do the right thing. The accessors for individual and shared field will know how to use the index to retrieve the correct field.
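
The lookup-once pattern recommended in the note can be sketched as follows. This is an illustration under stated assumptions: MockHeader, MockRecord, kMissingIndex and sum_field are hypothetical names invented here, and the sentinel merely stands in for the library's missing value. The point is only that the string lookup happens once, before the loop, and each record is then accessed by integer index.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical sentinel standing in for the library's missing int32 value.
constexpr int32_t kMissingIndex = std::numeric_limits<int32_t>::min();

// Hypothetical header model: maps a tag to its index.
struct MockHeader {
  std::unordered_map<std::string, int32_t> tag_to_index;

  int32_t field_index(const std::string& tag) const {
    auto it = tag_to_index.find(tag);
    return it == tag_to_index.end() ? kMissingIndex : it->second;
  }
};

// Hypothetical record model: one integer value per field index.
using MockRecord = std::vector<int32_t>;

// Sums one field over all records, resolving the tag to an index only once.
inline long sum_field(const MockHeader& header,
                      const std::vector<MockRecord>& records,
                      const std::string& tag) {
  const int32_t idx = header.field_index(tag);  // single string lookup
  if (idx == kMissingIndex) return 0;           // tag not declared in the header
  long total = 0;
  for (const auto& rec : records)
    total += rec[static_cast<std::size_t>(idx)];  // O(1) access per record
  return total;
}
```

Calling field_index() inside the loop would give the same result but repeat the string lookup for every record.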
uint32_t gamgee::VariantHeader::field_index_end ( ) const
inline

returns the last valid field index + 1, to indicate the end of field iteration

Sample usage:

  for (auto idx = 0u; idx < header.field_index_end(); ++idx)
    if (header.has_individual_field(idx))
      do_something_with_field(variant.integer_individual_field(idx));

uint32_t gamgee::VariantHeader::field_length ( const std::string &  tag,
const int32_t  field_category 
) const
inline

returns number of values for the field with the specified name and category (one of BCF_HL_FMT, BCF_HL_INFO, or BCF_HL_FLT), 0xfffff for variable length fields

Note
must check whether the field exists before calling this function, as it doesn't check for you
uint32_t gamgee::VariantHeader::field_length ( const int32_t  index,
const int32_t  field_category 
) const
inline

returns number of values for the field with the specified index and category (one of BCF_HL_FMT, BCF_HL_INFO, or BCF_HL_FLT), 0xfffff for variable length fields

Note
must check whether the field exists before calling this function, as it doesn't check for you
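
Because field_length() does not check for existence and uses 0xfffff to signal a variable-length field, callers typically need two guards. The sketch below shows one way to wrap them, assuming a hypothetical MockHeader whose field_length() takes only a tag (the documented accessor also takes a field category) and a hypothetical describe_length() helper; neither is part of gamgee.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Value the text above gives for variable-length fields.
constexpr uint32_t kVariableLength = 0xfffff;

// Hypothetical header model; the field category parameter is omitted.
struct MockHeader {
  std::unordered_map<std::string, uint32_t> lengths;

  bool has_field(const std::string& tag) const {
    return lengths.count(tag) != 0;
  }
  // Like the documented accessor, the caller must ensure the field exists.
  // In this mock a missing tag makes at() throw std::out_of_range.
  uint32_t field_length(const std::string& tag) const {
    return lengths.at(tag);
  }
};

enum class LengthKind { Absent, Variable, Fixed };

struct LengthInfo {
  LengthKind kind;
  uint32_t count;  // meaningful only when kind == LengthKind::Fixed
};

// Checks existence first, then separates the variable-length sentinel
// from a real fixed count.
inline LengthInfo describe_length(const MockHeader& h, const std::string& tag) {
  if (!h.has_field(tag)) return {LengthKind::Absent, 0};
  const uint32_t n = h.field_length(tag);
  if (n == kVariableLength) return {LengthKind::Variable, 0};
  return {LengthKind::Fixed, n};
}
```

Treating the sentinel as a count would make a caller try to size a buffer for 1048575 values, so it is worth handling as a separate case.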
uint32_t gamgee::VariantHeader::field_length_descriptor ( const std::string &  tag,
const int32_t  field_category 
) const
inline

returns one of BCF_VL_* values for the field with the specified name and category (one of BCF_HL_FMT, BCF_HL_INFO, or BCF_HL_FLT)

Note
must check whether the field exists before calling this function, as it doesn't check for you
uint32_t gamgee::VariantHeader::field_length_descriptor ( const int32_t  index,
const int32_t  field_category 
) const
inline

returns one of BCF_VL_* values for the field with the specified index and category (one of BCF_HL_FMT, BCF_HL_INFO, or BCF_HL_FLT)

Note
must check whether the field exists before calling this function, as it doesn't check for you
uint8_t gamgee::VariantHeader::field_type ( const std::string &  tag,
const int32_t  field_category 
) const
inline

returns the type of the field with the specified name and category (one of BCF_HL_FMT, BCF_HL_INFO, or BCF_HL_FLT)

Note
must check whether the field exists before calling this function, as it doesn't check for you
uint8_t gamgee::VariantHeader::field_type ( const int32_t  index,
const int32_t  field_category 
) const
inline

returns the type of the field with the specified index and category (one of BCF_HL_FMT, BCF_HL_INFO, or BCF_HL_FLT)

Note
must check whether the field exists before calling this function, as it doesn't check for you
vector< string > gamgee::VariantHeader::filters ( ) const

returns a vector of filter names

std::string gamgee::VariantHeader::get_field_name ( const int32_t  field_idx) const
inline
std::string gamgee::VariantHeader::get_sample_name ( const int32_t  sample_idx) const
inline
bool gamgee::VariantHeader::has_field ( const std::string &  field_name,
const int32_t  field_category 
) const
inline

checks whether the given field is present given the field name and field category (which must be one of BCF_HL_FMT, BCF_HL_INFO, or BCF_HL_FLT)

bool gamgee::VariantHeader::has_field ( const int32_t  field_index,
const int32_t  field_category 
) const
inline

checks whether the given field is present given the field index and field category (one of BCF_HL_FMT, BCF_HL_INFO, or BCF_HL_FLT)

bool gamgee::VariantHeader::has_filter ( const std::string &  filter_name) const
inline

checks whether the given filter is present given the filter name

bool gamgee::VariantHeader::has_filter ( const int32_t  filter_index) const
inline

checks whether the given filter is present given the filter index

bool gamgee::VariantHeader::has_individual_field ( const std::string &  field_name) const
inline

checks whether the given individual (FORMAT) field is present given the field name

bool gamgee::VariantHeader::has_individual_field ( const int32_t  field_index) const
inline

checks whether the given individual (FORMAT) field is present given the field index

bool gamgee::VariantHeader::has_sample ( const std::string &  sample_name) const
inline

checks whether the given sample is present given the sample name

bool gamgee::VariantHeader::has_sample ( const int32_t  sample_index) const
inline

checks whether the given sample is present given the sample index

bool gamgee::VariantHeader::has_shared_field ( const std::string &  field_name) const
inline

checks whether the given shared (INFO) field is present given the field name

bool gamgee::VariantHeader::has_shared_field ( const int32_t  field_index) const
inline

checks whether the given shared (INFO) field is present given the field index

uint8_t gamgee::VariantHeader::individual_field_type ( const std::string &  tag) const
inline

returns the type of this individual (FORMAT) field

Note
must check whether the field exists before calling this function, as it doesn't check for you
uint8_t gamgee::VariantHeader::individual_field_type ( const int32_t  index) const
inline

returns the type of this individual (FORMAT) field

Note
must check whether the field exists before calling this function, as it doesn't check for you
vector< string > gamgee::VariantHeader::individual_fields ( ) const

returns a vector of individual field names

uint32_t gamgee::VariantHeader::n_chromosomes ( ) const

returns the number of chromosomes declared in this header

uint32_t gamgee::VariantHeader::n_filters ( ) const

returns the number of filters declared in this header. Do not use for iteration over filter indices; use field_index_end() instead.

uint32_t gamgee::VariantHeader::n_individual_fields ( ) const

returns the number of individual fields declared in this header. Do not use for iteration over individual field indices; use field_index_end() instead.

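uint32_t gamgee::VariantHeader::n_samples ( ) const
inline

returns the number of samples in the header

Note
much faster than getting the actual list of samples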
uint32_t gamgee::VariantHeader::n_shared_fields ( ) const

returns the number of shared fields declared in this header. Do not use for iteration over shared field indices; use field_index_end() instead.

bool gamgee::VariantHeader::operator!= ( const VariantHeader &  rhs) const
inline
VariantHeader & gamgee::VariantHeader::operator= ( const VariantHeader &  other)

deep copy assignment of a VariantHeader. Shared pointers maintain state to all other associated objects correctly.

shared_ptr assignment will take care of deallocating the old header if necessary

VariantHeader & gamgee::VariantHeader::operator= ( VariantHeader &&  other)
noexcept

move assignment of a VariantHeader. Shared pointers maintain state to all other associated objects correctly.

other is an r-value reference, so it will disappear into the nether right after the swap

bool gamgee::VariantHeader::operator== ( const VariantHeader &  rhs) const

equality operators

Parameters
rhs  the other VariantHeader to compare to
Note
work in progress: does not fully compare underlying structures
Returns
whether these variant headers are equal
int32_t gamgee::VariantHeader::sample_index ( const std::string &  sample) const
inline

looks up the index of a particular sample, enabling subsequent O(1) random-access lookups for that sample throughout the iteration.

Returns
missing_values::int32_t if the sample is not present in the header (you can use missing() on the return value to check)
Note
prefer this to looking up sample names during the iteration if you are looking for samples multiple times.
vector< string > gamgee::VariantHeader::samples ( ) const

builds a vector with the names of the samples

This implementation is simply transforming the char ** representation of the sample names into a contiguous vector<string>. As efficient as it gets.
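
The char ** to vector<string> transformation mentioned above can be done with the vector range constructor. This is a minimal sketch, not the library's code: to_string_vector and its parameters are hypothetical names, and the array is assumed to hold count valid NUL-terminated strings.

```cpp
#include <string>
#include <vector>

// Copies `count` C strings from a char** style array into a vector<string>.
// The range constructor builds each std::string directly from a const char*.
inline std::vector<std::string> to_string_vector(const char* const* names,
                                                 int count) {
  return std::vector<std::string>(names, names + count);
}
```

Each element is still copied into its own std::string, so the cost is one pass over the names.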

uint8_t gamgee::VariantHeader::shared_field_type ( const std::string &  tag) const
inline

returns the type of this shared (INFO) field

Note
must check whether the field exists before calling this function, as it doesn't check for you
uint8_t gamgee::VariantHeader::shared_field_type ( const int32_t  index) const
inline

returns the type of this shared (INFO) field

Note
must check whether the field exists before calling this function, as it doesn't check for you
vector< string > gamgee::VariantHeader::shared_fields ( ) const

returns a vector of shared field names

Friends And Related Function Documentation

friend class Variant
friend
friend class VariantBuilder
friend

builder needs access to the internals in order to build efficiently

friend class VariantBuilderIndividualRegion
friend
friend class VariantBuilderSharedRegion
friend
friend class VariantHeaderBuilder
friend
template<bool fields_forward_LUT_ordering, bool fields_reverse_LUT_ordering, bool samples_forward_LUT_ordering, bool samples_reverse_LUT_ordering>
friend class VariantHeaderMerger
friend
friend class VariantWriter
friend
