Gamgee
You miserable little maggot. I'll stove your head in!
gamgee::VariantBuilder Class Reference

VariantBuilder: construct Variant records from scratch (and, coming soon, from existing Variant records)

#include <variant_builder.h>

Public Member Functions

VariantBuilder (const VariantHeader &header)
 Construct a new VariantBuilder given a VariantHeader.

VariantBuilder (VariantBuilder &&other)=default

VariantBuilder & operator= (VariantBuilder &&other)=default

VariantBuilder (const VariantBuilder &other)=delete

VariantBuilder & operator= (const VariantBuilder &other)=delete

~VariantBuilder ()=default

VariantHeader header () const
 Return the Variant header for this builder.

VariantBuilder & set_enable_validation (const bool enable_validation)
 Disable or enable all validation checks in this builder.
 
VariantBuilder & set_chromosome (const uint32_t chromosome)
 Set the chromosome by index.

VariantBuilder & set_chromosome (const std::string &chromosome)
 Set the chromosome by name.

VariantBuilder & set_alignment_start (const uint32_t alignment_start)
 Set the alignment start position.

VariantBuilder & set_alignment_stop (const uint32_t alignment_stop)
 Set the alignment stop position.

VariantBuilder & set_qual (const float qual)
 Set the Phred-scaled site quality (probability that the site is not reference)

VariantBuilder & set_id (const std::string &id)
 Set the variant ID field.

VariantBuilder & set_ref_allele (const std::string &ref_allele)
 Set the reference allele.

VariantBuilder & set_alt_allele (const std::string &alt_allele)
 Set the alt allele.

VariantBuilder & set_alt_alleles (const std::vector< std::string > &alt_alleles)
 Set the alt alleles.

VariantBuilder & set_filters (const std::vector< std::string > &filters)
 Set the filters using filter names.

VariantBuilder & set_filters (const std::vector< int32_t > &filters)
 Set the filters using filter indices.

VariantBuilder & remove_alignment_stop ()
 Clear the alignment stop value (if set)

VariantBuilder & remove_qual ()
 Clear the qual field (if set)

VariantBuilder & remove_id ()
 Clear the ID field (if set)

VariantBuilder & remove_alt_alleles ()
 Clear alt alleles (if set)

VariantBuilder & remove_filters ()
 Clear filters (if set)
 
VariantBuilder & set_integer_shared_field (const std::string &tag, const int32_t value)
 Set a single-valued integer shared field by field name.

VariantBuilder & set_integer_shared_field (const std::string &tag, const std::vector< int32_t > &values)
 Set a multi-valued integer shared field by field name.

VariantBuilder & set_integer_shared_field (const uint32_t index, const int32_t value)
 Set a single-valued integer shared field by field index.

VariantBuilder & set_integer_shared_field (const uint32_t index, const std::vector< int32_t > &values)
 Set a multi-valued integer shared field by field index.

VariantBuilder & set_float_shared_field (const std::string &tag, const float value)
 Set a single-valued float shared field by field name.

VariantBuilder & set_float_shared_field (const std::string &tag, const std::vector< float > &values)
 Set a multi-valued float shared field by field name.

VariantBuilder & set_float_shared_field (const uint32_t index, const float value)
 Set a single-valued float shared field by field index.

VariantBuilder & set_float_shared_field (const uint32_t index, const std::vector< float > &values)
 Set a multi-valued float shared field by field index.

VariantBuilder & set_string_shared_field (const std::string &tag, const std::string &value)
 Set a string shared field by field name.

VariantBuilder & set_string_shared_field (const uint32_t index, const std::string &value)
 Set a string shared field by field index.

VariantBuilder & set_boolean_shared_field (const std::string &tag)
 Set a boolean (flag) shared field by field name.

VariantBuilder & set_boolean_shared_field (const uint32_t index)
 Set a boolean (flag) shared field by field index.

VariantBuilder & remove_shared_field (const std::string &tag)
 Remove a shared field by field name.

VariantBuilder & remove_shared_field (const uint32_t field_index)
 Remove a shared field by field index.

VariantBuilder & remove_shared_fields (const std::vector< std::string > &tags)
 Remove multiple shared fields by field name.

VariantBuilder & remove_shared_fields (const std::vector< uint32_t > &field_indices)
 Remove multiple shared fields by field index.
 
VariantBuilder & set_genotypes (const VariantBuilderMultiSampleVector< int32_t > &genotypes_for_all_samples)
 Set the genotypes (GT) field for all samples at once using an efficient flattened (one-dimensional) vector, COPYING the provided values.

VariantBuilder & set_genotypes (VariantBuilderMultiSampleVector< int32_t > &&genotypes_for_all_samples)
 Set the genotypes (GT) field for all samples at once using an efficient flattened (one-dimensional) vector, MOVING the provided values.

VariantBuilder & set_genotypes (const std::vector< std::vector< int32_t >> &genotypes_for_all_samples)
 Set the genotypes (GT) field for all samples at once by nested vector, COPYING the provided values.

VariantBuilder & set_genotypes (std::vector< std::vector< int32_t >> &&genotypes_for_all_samples)
 Set the genotypes (GT) field for all samples at once by nested vector, MOVING the provided values.
 
VariantBuilder & set_integer_individual_field (const std::string &tag, const VariantBuilderMultiSampleVector< int32_t > &values_for_all_samples)
 Set an integer individual field for all samples at once by name using an efficient flattened (one-dimensional) vector, COPYING the provided values.

VariantBuilder & set_integer_individual_field (const std::string &tag, VariantBuilderMultiSampleVector< int32_t > &&values_for_all_samples)
 Set an integer individual field for all samples at once by name using an efficient flattened (one-dimensional) vector, MOVING the provided values.

VariantBuilder & set_integer_individual_field (const std::string &tag, const std::vector< std::vector< int32_t >> &values_for_all_samples)
 Set an integer individual field for all samples at once by name using a nested vector, copying the provided values.

VariantBuilder & set_integer_individual_field (const std::string &tag, std::vector< std::vector< int32_t >> &&values_for_all_samples)
 Set an integer individual field for all samples at once by name using a nested vector, moving the provided values.

VariantBuilder & set_integer_individual_field (const uint32_t field_index, const VariantBuilderMultiSampleVector< int32_t > &values_for_all_samples)
 Set an integer individual field for all samples at once by index using an efficient flattened (one-dimensional) vector, COPYING the provided values.

VariantBuilder & set_integer_individual_field (const uint32_t field_index, VariantBuilderMultiSampleVector< int32_t > &&values_for_all_samples)
 Set an integer individual field for all samples at once by index using an efficient flattened (one-dimensional) vector, MOVING the provided values.

VariantBuilder & set_integer_individual_field (const uint32_t field_index, const std::vector< std::vector< int32_t >> &values_for_all_samples)
 Set an integer individual field for all samples at once by index using a nested vector, copying the provided values.

VariantBuilder & set_integer_individual_field (const uint32_t field_index, std::vector< std::vector< int32_t >> &&values_for_all_samples)
 Set an integer individual field for all samples at once by index using a nested vector, moving the provided values.
 
VariantBuilder & set_float_individual_field (const std::string &tag, const VariantBuilderMultiSampleVector< float > &values_for_all_samples)
 Set a float individual field for all samples at once by name using an efficient flattened (one-dimensional) vector, COPYING the provided values.

VariantBuilder & set_float_individual_field (const std::string &tag, VariantBuilderMultiSampleVector< float > &&values_for_all_samples)
 Set a float individual field for all samples at once by name using an efficient flattened (one-dimensional) vector, MOVING the provided values.

VariantBuilder & set_float_individual_field (const std::string &tag, const std::vector< std::vector< float >> &values_for_all_samples)
 Set a float individual field for all samples at once by name using a nested vector, copying the provided values.

VariantBuilder & set_float_individual_field (const std::string &tag, std::vector< std::vector< float >> &&values_for_all_samples)
 Set a float individual field for all samples at once by name using a nested vector, moving the provided values.

VariantBuilder & set_float_individual_field (const uint32_t field_index, const VariantBuilderMultiSampleVector< float > &values_for_all_samples)
 Set a float individual field for all samples at once by index using an efficient flattened (one-dimensional) vector, COPYING the provided values.

VariantBuilder & set_float_individual_field (const uint32_t field_index, VariantBuilderMultiSampleVector< float > &&values_for_all_samples)
 Set a float individual field for all samples at once by index using an efficient flattened (one-dimensional) vector, MOVING the provided values.

VariantBuilder & set_float_individual_field (const uint32_t field_index, const std::vector< std::vector< float >> &values_for_all_samples)
 Set a float individual field for all samples at once by index using a nested vector, copying the provided values.

VariantBuilder & set_float_individual_field (const uint32_t field_index, std::vector< std::vector< float >> &&values_for_all_samples)
 Set a float individual field for all samples at once by index using a nested vector, moving the provided values.
 
VariantBuilder & set_string_individual_field (const std::string &tag, const std::vector< std::string > &values_for_all_samples)
 Set a string individual field for all samples at once by name, copying the provided values.

VariantBuilder & set_string_individual_field (const std::string &tag, std::vector< std::string > &&values_for_all_samples)
 Set a string individual field for all samples at once by name, moving the provided values.

VariantBuilder & set_string_individual_field (const uint32_t field_index, const std::vector< std::string > &values_for_all_samples)
 Set a string individual field for all samples at once by index, copying the provided values.

VariantBuilder & set_string_individual_field (const uint32_t field_index, std::vector< std::string > &&values_for_all_samples)
 Set a string individual field for all samples at once by index, moving the provided values.
 
VariantBuilder & set_genotype (const std::string &sample, const std::vector< int32_t > &genotype)
 Set the genotypes (GT) field for a single sample by sample name, copying the genotype before encoding.

VariantBuilder & set_genotype (const std::string &sample, std::vector< int32_t > &&genotype)
 Set the genotypes (GT) field for a single sample by sample name, moving the genotype into the builder and encoding in-place.

VariantBuilder & set_genotype (const uint32_t sample_index, const std::vector< int32_t > &genotype)
 Set the genotypes (GT) field for a single sample by sample index, copying the genotype before encoding.

VariantBuilder & set_genotype (const uint32_t sample_index, std::vector< int32_t > &&genotype)
 Set the genotypes (GT) field for a single sample by sample index, moving the genotype into the builder and encoding in-place.
 
VariantBuilder & set_integer_individual_field (const std::string &tag, const std::string &sample, const int32_t value)
 Set a single-valued integer individual field for a single sample by field and sample name.

VariantBuilder & set_integer_individual_field (const std::string &tag, const std::string &sample, const std::vector< int32_t > &values)
 Set a multi-valued integer individual field for a single sample by field and sample name.

VariantBuilder & set_integer_individual_field (const uint32_t field_index, const uint32_t sample_index, const int32_t value)
 Set a single-valued integer individual field for a single sample by field and sample index.

VariantBuilder & set_integer_individual_field (const uint32_t field_index, const uint32_t sample_index, const std::vector< int32_t > &values)
 Set a multi-valued integer individual field for a single sample by field and sample index.

VariantBuilder & set_float_individual_field (const std::string &tag, const std::string &sample, const float value)
 Set a single-valued float individual field for a single sample by field and sample name.

VariantBuilder & set_float_individual_field (const std::string &tag, const std::string &sample, const std::vector< float > &values)
 Set a multi-valued float individual field for a single sample by field and sample name.

VariantBuilder & set_float_individual_field (const uint32_t field_index, const uint32_t sample_index, const float value)
 Set a single-valued float individual field for a single sample by field and sample index.

VariantBuilder & set_float_individual_field (const uint32_t field_index, const uint32_t sample_index, const std::vector< float > &values)
 Set a multi-valued float individual field for a single sample by field and sample index.

VariantBuilder & set_string_individual_field (const std::string &tag, const std::string &sample, const std::string &value)
 Set a string individual field for a single sample by field and sample name.

VariantBuilder & set_string_individual_field (const uint32_t field_index, const uint32_t sample_index, const std::string &value)
 Set a string individual field for a single sample by field and sample index.
 
VariantBuilder & remove_individual_field (const std::string &tag)
 Remove an individual field by field name.

VariantBuilder & remove_individual_field (const uint32_t field_index)
 Remove an individual field by field index.

VariantBuilder & remove_individual_fields (const std::vector< std::string > &tags)
 Remove multiple individual fields by field name.

VariantBuilder & remove_individual_fields (const std::vector< uint32_t > &field_indices)
 Remove multiple individual fields by field index.
 
VariantBuilderMultiSampleVector< int32_t > get_genotype_multi_sample_vector (const uint32_t num_samples, const uint32_t max_values_per_sample) const
 Get a pre-initialized/padded VariantBuilderMultiSampleVector for use with the more-efficient GT field bulk setters.

VariantBuilderMultiSampleVector< int32_t > get_integer_multi_sample_vector (const uint32_t num_samples, const uint32_t max_values_per_sample) const
 Get a pre-initialized/padded VariantBuilderMultiSampleVector for use with the more-efficient integer individual field bulk setters.

VariantBuilderMultiSampleVector< float > get_float_multi_sample_vector (const uint32_t num_samples, const uint32_t max_values_per_sample) const
 Get a pre-initialized/padded VariantBuilderMultiSampleVector for use with the more-efficient float individual field bulk setters.
 
Variant build () const
 Create a new Variant record using the current state of the builder.

VariantBuilder & clear ()
 Clear all field values in this builder to prepare it for the next build operation.
 

Detailed Description

VariantBuilder: construct Variant records from scratch (and, coming soon, from existing Variant records)

To use, first create a VariantHeader appropriate for the final record(s) you intend to create (with all shared/individual fields, samples, and contigs pre-declared), then use that header to instantiate a builder:

auto builder = VariantBuilder{header};

You should create ONE builder per file you intend to output, and re-use it across records, calling clear() between records. Do NOT create a new builder for each record: creating and destroying builders is expensive, involving many memory allocations and deallocations of internal data structures as well as costly header lookups. That is, the correct way to use a builder is:

auto builder = VariantBuilder{header};
for ( each record you want to create ) {
  // use existing builder to build new record
  builder.clear();
}

Once you have a builder you can call setter functions in a chained fashion as follows:

auto variant = builder.set_chromosome(0).set_alignment_start(5).set_ref_allele("A")
                      .set_alt_alleles({"C", "T"}).set_genotypes(std::move(my_genotypes))
                      .build();

See the comments further down for instructions and tips on using the various kinds of setters.

You must at a minimum set the required chromosome, alignment start, and ref allele fields (unless you have disabled validation, but then you will just get an invalid Variant record).
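
The chaining, required-field check, and clear()-between-records pattern described above can be sketched with a small stand-in class. MiniBuilder and MiniRecord are hypothetical names invented for this illustration; they are not part of gamgee and only mimic the shape of the API (setters returning a reference to the builder, a const build(), and a clear() that resets state).

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a built record; NOT gamgee::Variant.
struct MiniRecord {
  uint32_t chromosome;
  uint32_t start;
  std::string ref;
  std::vector<std::string> alts;
};

// Hypothetical stand-in for a builder; NOT gamgee::VariantBuilder.
class MiniBuilder {
 public:
  // Each setter returns *this so that calls can be chained.
  MiniBuilder& set_chromosome(uint32_t c) { m_rec.chromosome = c; m_has_chromosome = true; return *this; }
  MiniBuilder& set_alignment_start(uint32_t s) { m_rec.start = s; m_has_start = true; return *this; }
  MiniBuilder& set_ref_allele(const std::string& r) { m_rec.ref = r; m_has_ref = true; return *this; }
  MiniBuilder& set_alt_alleles(const std::vector<std::string>& a) { m_rec.alts = a; return *this; }

  // build() is const: it copies the current state out without consuming it,
  // so it can be called more than once.
  MiniRecord build() const {
    if (!(m_has_chromosome && m_has_start && m_has_ref))
      throw std::logic_error("chromosome, alignment start and ref allele are required");
    return m_rec;
  }

  // clear() resets all field values but keeps the builder object alive for re-use.
  MiniBuilder& clear() {
    m_rec = MiniRecord{};
    m_has_chromosome = m_has_start = m_has_ref = false;
    return *this;
  }

 private:
  MiniRecord m_rec{};
  bool m_has_chromosome = false;
  bool m_has_start = false;
  bool m_has_ref = false;
};

// Builds one record per start position using a single, re-used builder.
inline std::vector<MiniRecord> build_records(const std::vector<uint32_t>& starts) {
  auto builder = MiniBuilder{};
  auto records = std::vector<MiniRecord>{};
  for (const auto start : starts) {
    records.push_back(builder.set_chromosome(0).set_alignment_start(start)
                             .set_ref_allele("A").set_alt_alleles({"C", "T"})
                             .build());
    builder.clear();  // reset state before the next record
  }
  return records;
}
```

In this sketch a missing required field raises std::logic_error; how the real builder reports the error is not specified here.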

The VariantBuilder API is designed to allow you to be efficient when you want to be (e.g., moving existing data into the builder, setting individual fields in bulk rather than by sample, etc.), and lazy when you don't care about efficiency. In general, the more efficient API functions require a bit more work to use than the less efficient ones. See the discussions below of the efficiency of the various options available to you for setting fields.

Some general guidelines:

-Setting by field/sample index is more efficient than setting by field/sample name, provided that you look up the index for each field/sample ONCE in the header and cache it at traversal start instead of looking it up for every record.

-For setting single-valued fields it's more efficient to use the functions that take a scalar value (int, float, etc.) instead of a vector.

-For removing fields you generally have the option of either calling the appropriate remove_* API function, OR passing in a missing/empty value to the appropriate set_* function. Both options are equivalent and will result in the field being removed.

-Disabling validation is possible and will certainly improve performance. However, if you attempt to perform an action that would have been prevented by validation checks (such as setting a non-existent field), you WILL get undefined behavior in your program. You should ONLY disable validation if you're extremely confident that your program will not take any invalid/incorrect actions, and that the data you pass to the builder will always be valid.
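
The first guideline (look up each index once and cache it) can be illustrated with a minimal sketch. MiniHeader and its field_index() function are hypothetical stand-ins written for this example, not the gamgee header API; the lookup counter exists only to make the difference between the two strategies visible.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical header; NOT gamgee::VariantHeader. It counts string lookups
// so the cost of each strategy can be compared.
class MiniHeader {
 public:
  explicit MiniHeader(std::unordered_map<std::string, uint32_t> fields)
      : m_fields(std::move(fields)) {}

  // Name-to-index lookup: hashes the string on every call.
  uint32_t field_index(const std::string& tag) const {
    ++m_lookups;
    return m_fields.at(tag);
  }

  uint32_t lookups() const { return m_lookups; }

 private:
  std::unordered_map<std::string, uint32_t> m_fields;
  mutable uint32_t m_lookups = 0;
};

// Strategy 1: repeat the name lookup for every record.
inline uint32_t lookups_by_name(const MiniHeader& header, uint32_t num_records) {
  const auto before = header.lookups();
  for (uint32_t i = 0; i < num_records; ++i) {
    const auto index = header.field_index("DP");  // repeated per record
    (void)index;  // a real loop would pass the index to a setter
  }
  return header.lookups() - before;
}

// Strategy 2: look the index up ONCE at traversal start and re-use it.
inline uint32_t lookups_cached(const MiniHeader& header, uint32_t num_records) {
  const auto before = header.lookups();
  const auto dp_index = header.field_index("DP");  // cached
  for (uint32_t i = 0; i < num_records; ++i) {
    (void)dp_index;  // a real loop would pass the cached index to a setter
  }
  return header.lookups() - before;
}
```

With 100 records, the first function performs 100 name lookups and the second performs one, which is the saving the guideline refers to.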

Notes on setting individual fields:

-Setting individual fields by move/r-value is more efficient than setting by l-value, as the functions that take an l-value reference assume that you want to keep your data and therefore make a copy of it. Note that you will usually have to invoke std::move() explicitly to avoid a copy, e.g.,

builder.set_integer_individual_field(field_index, std::move(my_vector));

-Setting individual fields in bulk (i.e., all samples at once) is more efficient than setting one sample at a time.

-It is an error to request both bulk and per-sample changes to the same field (without calling clear() in between). This is because it would be too costly to reconcile the two kinds of changes.

-The bulk-setting functions that take a VariantBuilderMultiSampleVector (which internally is a pre-padded one-dimensional vector of values) are much more efficient than the functions that take a two-dimensional vector, but the VariantBuilderMultiSampleVector approach requires a bit more work to use, and also requires that you know the maximum number of values per sample for the field in advance.

For example, with 4 samples and an integer individual field with a varied number of values per sample, you could pass in the following two-dimensional vector:

{ {1, 2}, {3}, {}, {5, 6, 7} }

with each inner vector representing the values for one sample. However, this nested vector is a fairly inefficient data structure with poor data locality/cache performance.

If high performance is desired, you can use a VariantBuilderMultiSampleVector instead of a two-dimensional vector:

First determine the number of samples and the maximum number of values per sample for the field. Then get a pre-initialized VariantBuilderMultiSampleVector from the builder:

auto multi_sample_vector = builder.get_integer_multi_sample_vector(num_samples, max_values_per_sample);

This vector will have missing values for all samples, with appropriate padding to the maximum field width.

Then, fill in the values for each non-missing sample by invoking the set_sample_value() and/or set_sample_values() functions on your multi-sample vector (NOTE: set_sample_value() is MUCH more efficient than set_sample_values() since it doesn't require a vector construction/destruction for each call). You don't have to worry about samples with no values, since all samples start out with missing values.

Finally, pass your multi-sample vector into a setter function (favoring the functions that take field indices and use move semantics for high performance):

builder.set_integer_individual_field(field_index, std::move(multi_sample_vector));

The advantage of the VariantBuilderMultiSampleVector approach is much greater efficiency in terms of data locality and memory allocations.

-The genotype setter functions DO NOT require you to encode your genotype data with one of the Genotype::encode_genotype() functions before passing it in: they will call the appropriate encode function for you. Pass your genotypes by rvalue (i.e., use std::move()) if you want to avoid an extra copy during genotype encoding and don't need to re-use your genotype data. See the comments to the Genotype setter functions for examples of how to construct and pass in genotypes.
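
To make the "pre-padded one-dimensional vector" idea from the notes above concrete, here is a minimal sketch of such a layout. FlatMultiSampleVector is a hypothetical class written for this illustration and is not the actual VariantBuilderMultiSampleVector implementation; in particular, the sentinel used for missing values is an assumption of the sketch. It stores the four-sample example { {1, 2}, {3}, {}, {5, 6, 7} } in one contiguous buffer of num_samples * max_values_per_sample slots.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical illustration of a pre-padded flattened layout; NOT the actual
// gamgee::VariantBuilderMultiSampleVector.
class FlatMultiSampleVector {
 public:
  // Sentinel standing in for a "missing" value (an assumption of this sketch).
  static int32_t missing() { return std::numeric_limits<int32_t>::min(); }

  // One allocation up front; every slot starts out missing.
  FlatMultiSampleVector(uint32_t num_samples, uint32_t max_values_per_sample)
      : m_width(max_values_per_sample),
        m_data(static_cast<std::size_t>(num_samples) * max_values_per_sample, missing()) {}

  // Writes one value in place: no temporary vector is created per call.
  void set_sample_value(uint32_t sample, uint32_t value_index, int32_t value) {
    assert(value_index < m_width);
    m_data.at(static_cast<std::size_t>(sample) * m_width + value_index) = value;
  }

  const std::vector<int32_t>& data() const { return m_data; }

 private:
  uint32_t m_width;             // maximum number of values per sample
  std::vector<int32_t> m_data;  // sample-major, padded to m_width per sample
};

// Fills in the example from the text: { {1, 2}, {3}, {}, {5, 6, 7} }.
inline FlatMultiSampleVector example_from_text() {
  auto v = FlatMultiSampleVector{4, 3};
  v.set_sample_value(0, 0, 1);
  v.set_sample_value(0, 1, 2);
  v.set_sample_value(1, 0, 3);
  // sample 2 is never touched, so all of its slots stay missing
  v.set_sample_value(3, 0, 5);
  v.set_sample_value(3, 1, 6);
  v.set_sample_value(3, 2, 7);
  return v;
}
```

Compared with the nested vector, this layout needs a single allocation and keeps all samples adjacent in memory, which is where the data-locality benefit comes from.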

Constructor & Destructor Documentation

gamgee::VariantBuilder::VariantBuilder ( const VariantHeader &  header)
explicit

Construct a new VariantBuilder given a VariantHeader.

Parameters
header    VariantHeader to use for constructing and validating new Variant records
Note
header will become the header for all Variant records created, and so must be appropriate for the final Variant records you intend to create (with all shared/individual fields, samples, and contigs pre-declared)
It's much more efficient to create a single VariantBuilder and re-use it across records (calling clear() as needed) instead of creating a new VariantBuilder for each record. VariantBuilder construction involves many memory allocations and is not cheap!
gamgee::VariantBuilder::VariantBuilder ( VariantBuilder &&  other)
default
gamgee::VariantBuilder::VariantBuilder ( const VariantBuilder &  other)
delete
gamgee::VariantBuilder::~VariantBuilder ( )
default

Member Function Documentation

Variant gamgee::VariantBuilder::build ( ) const

Create a new Variant record using the current state of the builder.

Returns
a new Variant object reflecting the field values currently set in this builder
Note
Can be called multiple times
It is an error to attempt to build a Variant without having set at least the minimum required fields (chromosome, alignment start, and reference allele)
VariantBuilder & gamgee::VariantBuilder::clear ( )

Clear all field values in this builder to prepare it for the next build operation.

Note
It's much more efficient to create one VariantBuilder and clear() it between records instead of creating a new VariantBuilder for each record
VariantBuilderMultiSampleVector< float > gamgee::VariantBuilder::get_float_multi_sample_vector ( const uint32_t  num_samples,
const uint32_t  max_values_per_sample 
) const

Get a pre-initialized/padded VariantBuilderMultiSampleVector for use with the more-efficient float individual field bulk setters.

Parameters
num_samples    number of samples whose values will be stored in the vector
max_values_per_sample    maximum number of values per sample (field width)
Returns
a pre-initialized/pre-padded VariantBuilderMultiSampleVector<float> with all samples set to a missing value, ready for sample values to be set for non-missing samples
Note
For usage, see comments to the float individual field setters that take a VariantBuilderMultiSampleVector
VariantBuilderMultiSampleVector< int32_t > gamgee::VariantBuilder::get_genotype_multi_sample_vector ( const uint32_t  num_samples,
const uint32_t  max_values_per_sample 
) const

Get a pre-initialized/padded VariantBuilderMultiSampleVector for use with the more-efficient GT field bulk setters.

Parameters
num_samples    number of samples whose genotypes will be stored in the vector
max_values_per_sample    maximum ploidy (field width)
Returns
a pre-initialized/pre-padded VariantBuilderMultiSampleVector<int32_t> with all samples set to a missing value, ready for genotypes to be set for non-missing samples
Note
For usage, see comments to the GT field setters that take a VariantBuilderMultiSampleVector
VariantBuilderMultiSampleVector< int32_t > gamgee::VariantBuilder::get_integer_multi_sample_vector ( const uint32_t  num_samples,
const uint32_t  max_values_per_sample 
) const

Get a pre-initialized/padded VariantBuilderMultiSampleVector for use with the more-efficient integer individual field bulk setters.

Parameters
num_samples    number of samples whose values will be stored in the vector
max_values_per_sample    maximum number of values per sample (field width)
Returns
a pre-initialized/pre-padded VariantBuilderMultiSampleVector<int32_t> with all samples set to a missing value, ready for sample values to be set for non-missing samples
Note
For usage, see comments to the integer individual field setters that take a VariantBuilderMultiSampleVector
VariantHeader gamgee::VariantBuilder::header ( ) const
inline

Return the Variant header for this builder.

Returns
Variant header this builder was constructed with (and which becomes the header for any Variant records built)
VariantBuilder& gamgee::VariantBuilder::operator= ( VariantBuilder &&  other)
default
VariantBuilder& gamgee::VariantBuilder::operator= ( const VariantBuilder &  other)
delete
VariantBuilder & gamgee::VariantBuilder::remove_alignment_stop ( )

Clear the alignment stop value (if set)

VariantBuilder & gamgee::VariantBuilder::remove_alt_alleles ( )

Clear alt alleles (if set)

VariantBuilder & gamgee::VariantBuilder::remove_filters ( )

Clear filters (if set)

VariantBuilder & gamgee::VariantBuilder::remove_id ( )

Clear the ID field (if set)

VariantBuilder & gamgee::VariantBuilder::remove_individual_field ( const std::string &  tag)

Remove an individual field by field name.

Parameters
tag    name of the individual field to remove
Note
Less efficient than removing using the field index
VariantBuilder & gamgee::VariantBuilder::remove_individual_field ( const uint32_t  field_index)

Remove an individual field by field index.

Parameters
field_index    index of the individual field to remove (from a header lookup)
VariantBuilder & gamgee::VariantBuilder::remove_individual_fields ( const std::vector< std::string > &  tags)

Remove multiple individual fields by field name.

Parameters
tags    names of the individual fields to remove
Note
Less efficient than removing using the field indices
VariantBuilder & gamgee::VariantBuilder::remove_individual_fields ( const std::vector< uint32_t > &  field_indices)

Remove multiple individual fields by field index.

Parameters
field_indices    indices of the individual fields to remove (from header lookups)
VariantBuilder & gamgee::VariantBuilder::remove_qual ( )

Clear the qual field (if set)

VariantBuilder & gamgee::VariantBuilder::remove_shared_field ( const std::string &  tag)

Remove a shared field by field name.

Parameters
tag    name of the shared field to remove
Note
Less efficient than removing using the field index
VariantBuilder & gamgee::VariantBuilder::remove_shared_field ( const uint32_t  field_index)

Remove a shared field by field index.

Parameters
field_index    index of the shared field to remove (from a header lookup)
VariantBuilder & gamgee::VariantBuilder::remove_shared_fields ( const std::vector< std::string > &  tags)

Remove multiple shared fields by field name.

Parameters
tags    names of the shared fields to remove
Note
Less efficient than removing using the field indices
VariantBuilder & gamgee::VariantBuilder::remove_shared_fields ( const std::vector< uint32_t > &  field_indices)

Remove multiple shared fields by field index.

Parameters
field_indices    indices of the shared fields to remove (from header lookups)
VariantBuilder & gamgee::VariantBuilder::set_alignment_start ( const uint32_t  alignment_start)

Set the alignment start position.

Parameters
alignment_start    1-based alignment start position (as you would see in a VCF file)
Note
The internal encoding is 0-based, to mimic that of BCF files
VariantBuilder & gamgee::VariantBuilder::set_alignment_stop ( const uint32_t  alignment_stop)

Set the alignment stop position.

Parameters
alignment_stop    1-based alignment stop position (as you would see in a VCF INFO END tag)
Note
The internal encoding is 0-based, to mimic that of BCF files
VariantBuilder & gamgee::VariantBuilder::set_alt_allele ( const std::string &  alt_allele)

Set the alt allele.

Parameters
alt_allele    alt allele as a string
Note
Alt allele passed in replaces any previous value(s) for the alt field
It's more efficient to use this function when there's only a single alt allele instead of the vector-based setter
VariantBuilder & gamgee::VariantBuilder::set_alt_alleles ( const std::vector< std::string > &  alt_alleles)

Set the alt alleles.

Parameters
alt_alleles    one string per alt allele
Note
Alt alleles passed in replace any previous value(s) for the alt field
It's more efficient to use the setter that takes a single string when there's only one alt allele
VariantBuilder & gamgee::VariantBuilder::set_boolean_shared_field ( const std::string &  tag)

Set a boolean (flag) shared field by field name.

Parameters
tag    name of the shared field to set
Note
Field is set to true/present. To set a boolean field to false call one of the remove_shared_field() functions.
Less efficient than setting using the field index
VariantBuilder & gamgee::VariantBuilder::set_boolean_shared_field ( const uint32_t  index)

Set a boolean (flag) shared field by field index.

Parameters
index    index of the shared field to set (from a header lookup)
Note
Field is set to true/present. To set a boolean field to false call one of the remove_shared_field() functions.
VariantBuilder & gamgee::VariantBuilder::set_chromosome ( const uint32_t  chromosome)

Set the chromosome by index.

Parameters
chromosome    chromosome index (from a header lookup)
VariantBuilder & gamgee::VariantBuilder::set_chromosome ( const std::string &  chromosome)

Set the chromosome by name.

Parameters
chromosome    chromosome name
VariantBuilder & gamgee::VariantBuilder::set_enable_validation ( const bool  enable_validation)

Disable or enable all validation checks in this builder.

Passing in false disables validation, passing in true enables it. Validation is on by default in new builders.

Warning
Disabling validation will improve performance at the cost of safety. Only do so if you're extremely confident that your program will not take any invalid/incorrect actions, and that the data you pass to the builder will always be valid.
VariantBuilder & gamgee::VariantBuilder::set_filters ( const std::vector< std::string > &  filters)

Set the filters using filter names.

Parameters
filters    vector of filter names
Note
It's more efficient to set using filter indices
VariantBuilder & gamgee::VariantBuilder::set_filters ( const std::vector< int32_t > &  filters)

Set the filters using filter indices.

Parameters
filters    vector of filter indices (from header lookups)
VariantBuilder & gamgee::VariantBuilder::set_float_individual_field ( const std::string &  tag,
const VariantBuilderMultiSampleVector< float > &  values_for_all_samples 
)

Set a float individual field for all samples at once by name using an efficient flattened (one-dimensional) vector, COPYING the provided values.

Parameters
tag: name of the individual field to set
values_for_all_samples: field values for all samples as a VariantBuilderMultiSampleVector (see note below)
Note
To create a multi-sample flattened vector for use with this function, first determine the number of samples and the maximum number of values per sample for this field, then get a pre-initialized vector from the builder:

auto multi_sample_vector = builder.get_float_multi_sample_vector(num_samples, max_values_per_sample);

This vector will have missing values for all samples, with appropriate padding to the maximum field width.

Then, fill in the values for each non-missing sample by invoking the set_sample_value() and/or set_sample_values() functions on your multi-sample vector (set_sample_value() is more efficient than set_sample_values() since it doesn't require a vector construction/destruction for each call). You don't have to worry about samples with no values, since all samples start out with missing values.

Finally, pass your multi-sample vector into this function:

builder.set_float_individual_field(field_name, multi_sample_vector);

If this process is too inconvenient, or you can't know the maximum number of values per sample in advance, use a less-efficient function that takes a nested vector.

Note
Less efficient than moving the values into the builder
Less efficient than setting using the field index
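The flattened layout described in the note above can be pictured with a small stand-in class. This is only a sketch under stated assumptions: FlatSampleVector and its caller-supplied missing value are hypothetical, and the real VariantBuilderMultiSampleVector uses its own missing and padding encodings. The sketch shows one contiguous buffer with a fixed-width slot per sample, pre-filled with the missing value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a flattened multi-sample vector (NOT the gamgee class).
// All samples share one contiguous buffer; each sample owns a fixed-width slot.
template <typename T>
class FlatSampleVector {
 public:
  // `missing` is an assumed sentinel chosen by the caller for this sketch.
  FlatSampleVector(std::size_t num_samples, std::size_t max_values_per_sample, T missing)
      : width_(max_values_per_sample),
        data_(num_samples * max_values_per_sample, missing) {}

  // Set one value in a sample's slot; no temporary vector is needed.
  void set_sample_value(std::size_t sample, std::size_t value_index, T value) {
    assert(value_index < width_);
    data_[sample * width_ + value_index] = value;
  }

  // Set several values at once; the caller has to build a vector first.
  void set_sample_values(std::size_t sample, const std::vector<T>& values) {
    assert(values.size() <= width_);
    for (std::size_t i = 0; i < values.size(); ++i) {
      data_[sample * width_ + i] = values[i];
    }
  }

  const std::vector<T>& data() const { return data_; }

 private:
  std::size_t width_;
  std::vector<T> data_;
};
```

Samples that are never touched keep the missing value in every position, which mirrors the statement that all samples start out missing.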
VariantBuilder & gamgee::VariantBuilder::set_float_individual_field ( const std::string &  tag,
VariantBuilderMultiSampleVector< float > &&  values_for_all_samples 
)

Set a float individual field for all samples at once by name using an efficient flattened (one-dimensional) vector, MOVING the provided values.

Parameters
tag: name of the individual field to set
values_for_all_samples: field values for all samples as a VariantBuilderMultiSampleVector (see note below)
Note
To create a multi-sample flattened vector for use with this function, first determine the number of samples and the maximum number of values per sample for this field, then get a pre-initialized vector from the builder:

auto multi_sample_vector = builder.get_float_multi_sample_vector(num_samples, max_values_per_sample);

This vector will have missing values for all samples, with appropriate padding to the maximum field width.

Then, fill in the values for each non-missing sample by invoking the set_sample_value() and/or set_sample_values() functions on your multi-sample vector (set_sample_value() is more efficient than set_sample_values() since it doesn't require a vector construction/destruction for each call). You don't have to worry about samples with no values, since all samples start out with missing values.

Finally, MOVE your multi-sample vector into this function:

builder.set_float_individual_field(field_name, std::move(multi_sample_vector));

If this process is too inconvenient, or you can't know the maximum number of values per sample in advance, use a less-efficient function that takes a nested vector.

Note
Less efficient than setting using the field index
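The difference between the COPYING and MOVING overloads is ordinary C++ move semantics. The sketch below uses a hypothetical FieldHolder class (not gamgee code) with the same pair of overloads on a plain std::vector, and a helper that checks whether the moved-in buffer is the one the caller originally allocated.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical holder with the same copy/move overload pair (NOT gamgee code).
class FieldHolder {
 public:
  // Copying overload: the caller keeps its vector; a second buffer is allocated.
  FieldHolder& set_values(const std::vector<float>& values) {
    values_ = values;
    return *this;
  }
  // Moving overload: the holder takes over the caller's buffer.
  FieldHolder& set_values(std::vector<float>&& values) {
    values_ = std::move(values);
    return *this;
  }
  const std::vector<float>& values() const { return values_; }

 private:
  std::vector<float> values_;
};

// Returns true if moving `source` into a holder reused the original buffer.
bool move_reuses_buffer(std::vector<float> source) {
  assert(!source.empty());
  const float* original = source.data();
  FieldHolder holder;
  holder.set_values(std::move(source));
  return holder.values().data() == original;
}
```

After a move, the caller's vector is left in a valid but unspecified state, so it should not be read again without being reassigned.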
VariantBuilder & gamgee::VariantBuilder::set_float_individual_field ( const std::string &  tag,
const std::vector< std::vector< float >> &  values_for_all_samples 
)

Set a float individual field for all samples at once by name using a nested vector, copying the provided values.

Parameters
tag: name of the individual field to set
values_for_all_samples: field values for all samples in order of sample index, with one inner vector per sample (no special padding necessary)
Note
With a nested vector, each inner vector represents the values for the sample with the corresponding index. There is no need to manually pad with missing/vector end values. For example:

{ {1.5, 2.5}, {3.5}, {}, {5.5, 6.5, 7.5} }

Note
Less efficient than using a flattened vector
Less efficient than moving the values into the builder
Less efficient than setting using the field index
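One way to see why the nested form is the less efficient option is to look at the work needed to turn it into a padded, flat buffer. The function below is an illustrative sketch only: flatten_with_padding and its pad argument are hypothetical, and gamgee's actual conversion and padding values may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Pad a nested per-sample vector into one flat buffer (illustrative only).
// `pad` is a hypothetical filler value for unused positions.
std::vector<float> flatten_with_padding(const std::vector<std::vector<float>>& nested,
                                        float pad) {
  // First pass: find the widest sample, which fixes the slot width.
  std::size_t width = 0;
  for (const auto& sample : nested) {
    width = std::max(width, sample.size());
  }
  // Second pass: copy each sample into its slot; the rest stays padded.
  std::vector<float> flat(nested.size() * width, pad);
  for (std::size_t s = 0; s < nested.size(); ++s) {
    std::copy(nested[s].begin(), nested[s].end(),
              flat.begin() + static_cast<std::ptrdiff_t>(s * width));
  }
  assert(flat.size() == nested.size() * width);
  return flat;
}
```

Applied to the nested example above, the slot width is 3 and the empty third sample becomes three padding values.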
VariantBuilder & gamgee::VariantBuilder::set_float_individual_field ( const std::string &  tag,
std::vector< std::vector< float >> &&  values_for_all_samples 
)

Set a float individual field for all samples at once by name using a nested vector, moving the provided values.

Parameters
tag: name of the individual field to set
values_for_all_samples: field values for all samples in order of sample index, with one inner vector per sample (no special padding necessary)
Note
With a nested vector, each inner vector represents the values for the sample with the corresponding index. There is no need to manually pad with missing/vector end values. For example:

{ {1.5, 2.5}, {3.5}, {}, {5.5, 6.5, 7.5} }

Note
Less efficient than using a flattened vector
Less efficient than setting using the field index
VariantBuilder & gamgee::VariantBuilder::set_float_individual_field ( const uint32_t  field_index,
const VariantBuilderMultiSampleVector< float > &  values_for_all_samples 
)

Set a float individual field for all samples at once by index using an efficient flattened (one-dimensional) vector, COPYING the provided values.

Parameters
field_index: index of the individual field to set (from a header lookup)
values_for_all_samples: field values for all samples as a VariantBuilderMultiSampleVector (see note below)
Note
To create a multi-sample flattened vector for use with this function, first determine the number of samples and the maximum number of values per sample for this field, then get a pre-initialized vector from the builder:

auto multi_sample_vector = builder.get_float_multi_sample_vector(num_samples, max_values_per_sample);

This vector will have missing values for all samples, with appropriate padding to the maximum field width.

Then, fill in the values for each non-missing sample by invoking the set_sample_value() and/or set_sample_values() functions on your multi-sample vector (set_sample_value() is more efficient than set_sample_values() since it doesn't require a vector construction/destruction for each call). You don't have to worry about samples with no values, since all samples start out with missing values.

Finally, pass your multi-sample vector into this function:

builder.set_float_individual_field(field_index, multi_sample_vector);

If this process is too inconvenient, or you can't know the maximum number of values per sample in advance, use a less-efficient function that takes a nested vector.

Note
Less efficient than moving the values into the builder
VariantBuilder & gamgee::VariantBuilder::set_float_individual_field ( const uint32_t  field_index,
VariantBuilderMultiSampleVector< float > &&  values_for_all_samples 
)

Set a float individual field for all samples at once by index using an efficient flattened (one-dimensional) vector, MOVING the provided values.

Parameters
field_index: index of the individual field to set (from a header lookup)
values_for_all_samples: field values for all samples as a VariantBuilderMultiSampleVector (see note below)
Note
To create a multi-sample flattened vector for use with this function, first determine the number of samples and the maximum number of values per sample for this field, then get a pre-initialized vector from the builder:

auto multi_sample_vector = builder.get_float_multi_sample_vector(num_samples, max_values_per_sample);

This vector will have missing values for all samples, with appropriate padding to the maximum field width.

Then, fill in the values for each non-missing sample by invoking the set_sample_value() and/or set_sample_values() functions on your multi-sample vector (set_sample_value() is more efficient than set_sample_values() since it doesn't require a vector construction/destruction for each call). You don't have to worry about samples with no values, since all samples start out with missing values.

Finally, MOVE your multi-sample vector into this function:

builder.set_float_individual_field(field_index, std::move(multi_sample_vector));

If this process is too inconvenient, or you can't know the maximum number of values per sample in advance, use a less-efficient function that takes a nested vector.

VariantBuilder & gamgee::VariantBuilder::set_float_individual_field ( const uint32_t  field_index,
const std::vector< std::vector< float >> &  values_for_all_samples 
)

Set a float individual field for all samples at once by index using a nested vector, copying the provided values.

Parameters
field_index: index of the individual field to set (from a header lookup)
values_for_all_samples: field values for all samples in order of sample index, with one inner vector per sample (no special padding necessary)
Note
With a nested vector, each inner vector represents the values for the sample with the corresponding index. There is no need to manually pad with missing/vector end values. For example:

{ {1.5, 2.5}, {3.5}, {}, {5.5, 6.5, 7.5} }

Note
Less efficient than using a flattened vector
Less efficient than moving the values into the builder
VariantBuilder & gamgee::VariantBuilder::set_float_individual_field ( const uint32_t  field_index,
std::vector< std::vector< float >> &&  values_for_all_samples 
)

Set a float individual field for all samples at once by index using a nested vector, moving the provided values.

Parameters
field_index: index of the individual field to set (from a header lookup)
values_for_all_samples: field values for all samples in order of sample index, with one inner vector per sample (no special padding necessary)
Note
With a nested vector, each inner vector represents the values for the sample with the corresponding index. There is no need to manually pad with missing/vector end values. For example:

{ {1.5, 2.5}, {3.5}, {}, {5.5, 6.5, 7.5} }

Note
Less efficient than using a flattened vector
VariantBuilder & gamgee::VariantBuilder::set_float_individual_field ( const std::string &  tag,
const std::string &  sample,
const float  value 
)

Set a single-valued float individual field for a single sample by field and sample name.

Parameters
tag: name of the individual field to set
sample: name of the sample whose value to set
value: field value for the specified sample
Note
It's more efficient to use this setter instead of a vector-based setter when a sample has just a single value for a field
Less efficient than setting using the field/sample indices
VariantBuilder & gamgee::VariantBuilder::set_float_individual_field ( const std::string &  tag,
const std::string &  sample,
const std::vector< float > &  values 
)

Set a multi-valued float individual field for a single sample by field and sample name.

Parameters
tag: name of the individual field to set
sample: name of the sample whose value to set
values: field values for the specified sample
Note
Less efficient than setting using the field/sample indices
VariantBuilder & gamgee::VariantBuilder::set_float_individual_field ( const uint32_t  field_index,
const uint32_t  sample_index,
const float  value 
)

Set a single-valued float individual field for a single sample by field and sample index.

Parameters
field_index: index of the individual field to set (from a header lookup)
sample_index: index of the sample whose value to set (from a header lookup)
value: field value for the specified sample
Note
It's more efficient to use this setter instead of a vector-based setter when a sample has just a single value for a field
VariantBuilder & gamgee::VariantBuilder::set_float_individual_field ( const uint32_t  field_index,
const uint32_t  sample_index,
const std::vector< float > &  values 
)

Set a multi-valued float individual field for a single sample by field and sample index.

Parameters
field_index: index of the individual field to set (from a header lookup)
sample_index: index of the sample whose value to set (from a header lookup)
values: field values for the specified sample
VariantBuilder & gamgee::VariantBuilder::set_float_shared_field ( const std::string &  tag,
const float  value 
)

Set a single-valued float shared field by field name.

Parameters
tag: name of the shared field to set
value: new value for the field
Note
It's more efficient to use this setter instead of a vector-based setter when a field has just a single value
Less efficient than setting using the field index
VariantBuilder & gamgee::VariantBuilder::set_float_shared_field ( const std::string &  tag,
const std::vector< float > &  values 
)

Set a multi-valued float shared field by field name.

Parameters
tag: name of the shared field to set
values: new values for the field
Note
Less efficient than setting using the field index
VariantBuilder & gamgee::VariantBuilder::set_float_shared_field ( const uint32_t  index,
const float  value 
)

Set a single-valued float shared field by field index.

Parameters
index: index of the shared field to set (from a header lookup)
value: new value for the field
Note
It's more efficient to use this setter instead of a vector-based setter when a field has just a single value
VariantBuilder & gamgee::VariantBuilder::set_float_shared_field ( const uint32_t  index,
const std::vector< float > &  values 
)

Set a multi-valued float shared field by field index.

Parameters
index: index of the shared field to set (from a header lookup)
values: new values for the field
VariantBuilder & gamgee::VariantBuilder::set_genotype ( const std::string &  sample,
const std::vector< int32_t > &  genotype 
)

Set the genotypes (GT) field for a single sample by sample name, copying the genotype before encoding.

Parameters
sample: name of the sample whose genotype to set
genotype: genotype for the specified sample (see notes below)

Examples:
For genotype 0/1, create vector {0, 1}
For genotype 1/., create vector {1, -1}
For genotype ./., create vector {-1, -1}

Note
Does not support genotypes with phased alleles
Less efficient than moving the genotype into the builder
Less efficient than setting using the sample index
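The "encoding" mentioned in the description is not spelled out here. As a hedged illustration, the sketch below applies the BCF convention for unphased alleles used by htslib, where an allele index a is stored as (a + 1) << 1 and a no-call (-1) becomes 0. Whether gamgee performs exactly this step internally is an assumption, and encode_unphased_genotype is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode allele indices in place using the BCF convention for unphased alleles:
// encoded = (allele + 1) << 1, so a no-call (-1) becomes 0.
// Assumption: this mirrors what a builder might do; it is not gamgee's code.
void encode_unphased_genotype(std::vector<int32_t>& genotype) {
  for (auto& allele : genotype) {
    assert(allele >= -1);  // -1 is the only negative value with a meaning here
    allele = (allele + 1) << 1;
  }
}
```

Because the transformation overwrites the vector, a caller that passes a const reference forces a copy first, while a moved-in vector can be rewritten directly.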
VariantBuilder & gamgee::VariantBuilder::set_genotype ( const std::string &  sample,
std::vector< int32_t > &&  genotype 
)

Set the genotypes (GT) field for a single sample by sample name, moving the genotype into the builder and encoding in-place.

Parameters
sample: name of the sample whose genotype to set
genotype: genotype for the specified sample (see notes below)

Examples:
For genotype 0/1, create vector {0, 1}
For genotype 1/., create vector {1, -1}
For genotype ./., create vector {-1, -1}

Note
Does not support genotypes with phased alleles
Less efficient than setting using the sample index
VariantBuilder & gamgee::VariantBuilder::set_genotype ( const uint32_t  sample_index,
const std::vector< int32_t > &  genotype 
)

Set the genotypes (GT) field for a single sample by sample index, copying the genotype before encoding.

Parameters
sample_index: index of the sample whose genotype to set (from a header lookup)
genotype: genotype for the specified sample (see notes below)

Examples:
For genotype 0/1, create vector {0, 1}
For genotype 1/., create vector {1, -1}
For genotype ./., create vector {-1, -1}

Note
Does not support genotypes with phased alleles
Less efficient than moving the genotype into the builder
VariantBuilder & gamgee::VariantBuilder::set_genotype ( const uint32_t  sample_index,
std::vector< int32_t > &&  genotype 
)

Set the genotypes (GT) field for a single sample by sample index, moving the genotype into the builder and encoding in-place.

Parameters
sample_index: index of the sample whose genotype to set (from a header lookup)
genotype: genotype for the specified sample (see notes below)

Examples:
For genotype 0/1, create vector {0, 1}
For genotype 1/., create vector {1, -1}
For genotype ./., create vector {-1, -1}

Note
Does not support genotypes with phased alleles
VariantBuilder & gamgee::VariantBuilder::set_genotypes ( const VariantBuilderMultiSampleVector< int32_t > &  genotypes_for_all_samples)

Set the genotypes (GT) field for all samples at once using an efficient flattened (one-dimensional) vector, COPYING the provided values.

Parameters
genotypes_for_all_samples: genotypes for all samples as a VariantBuilderMultiSampleVector (see note below)
Note
To create a multi-sample flattened vector for use with this function, first determine the number of samples and the maximum ploidy across all samples, then get a pre-initialized vector from the builder:

auto multi_sample_vector = builder.get_genotype_multi_sample_vector(num_samples, max_ploidy);

This vector will have missing values for all samples, with appropriate padding to the maximum ploidy.

Then, fill in the values for each non-missing sample by invoking the set_sample_value() and/or set_sample_values() functions on your multi-sample vector (set_sample_value() is more efficient than set_sample_values() since it doesn't require a vector construction/destruction for each call). You don't have to worry about samples with no values, since all samples start out with missing values. However, you should represent each no-call allele with -1.

For example, to set sample 0's genotype to 0/1, you could call multi_sample_vector.set_sample_values(0, {0, 1}), or to set it to 1/., you could call multi_sample_vector.set_sample_values(0, {1, -1}). As noted above, setting one allele at a time via set_sample_value() will be more efficient than set_sample_values() since it doesn't require a vector allocation.

Finally, pass your multi-sample vector into this function:

builder.set_genotypes(multi_sample_vector);

If this process is too inconvenient, or you can't know the maximum ploidy in advance, use a less-efficient function that takes a nested vector.

Note
Does not support genotypes with phased alleles
Less efficient than moving the values into the builder
VariantBuilder & gamgee::VariantBuilder::set_genotypes ( VariantBuilderMultiSampleVector< int32_t > &&  genotypes_for_all_samples)

Set the genotypes (GT) field for all samples at once using an efficient flattened (one-dimensional) vector, MOVING the provided values.

Parameters
genotypes_for_all_samples: genotypes for all samples as a VariantBuilderMultiSampleVector (see note below)
Note
To create a multi-sample flattened vector for use with this function, first determine the number of samples and the maximum ploidy across all samples, then get a pre-initialized vector from the builder:

auto multi_sample_vector = builder.get_genotype_multi_sample_vector(num_samples, max_ploidy);

This vector will have missing values for all samples, with appropriate padding to the maximum ploidy.

Then, fill in the values for each non-missing sample by invoking the set_sample_value() and/or set_sample_values() functions on your multi-sample vector (set_sample_value() is more efficient than set_sample_values() since it doesn't require a vector construction/destruction for each call). You don't have to worry about samples with no values, since all samples start out with missing values. However, you should represent each no-call allele with -1.

For example, to set sample 0's genotype to 0/1, you could call multi_sample_vector.set_sample_values(0, {0, 1}), or to set it to 1/., you could call multi_sample_vector.set_sample_values(0, {1, -1}). As noted above, setting one allele at a time via set_sample_value() will be more efficient than set_sample_values() since it doesn't require a vector allocation.

Finally, MOVE your multi-sample vector into this function:

builder.set_genotypes(std::move(multi_sample_vector));

If this process is too inconvenient, or you can't know the maximum ploidy in advance, use a less-efficient function that takes a nested vector.

Note
Does not support genotypes with phased alleles
VariantBuilder & gamgee::VariantBuilder::set_genotypes ( const std::vector< std::vector< int32_t >> &  genotypes_for_all_samples)

Set the genotypes (GT) field for all samples at once by nested vector, COPYING the provided values.

Parameters
genotypes_for_all_samples: genotypes for all samples in order of sample index, with one inner vector per sample (no padding necessary)
Note
With a nested vector, each inner vector represents the genotypes for the sample with the corresponding index. There is no need to manually pad to the maximum ploidy, but you do need to add a missing value (-1) for each missing/no-call allele.

For example, if you had Sample1=0/1, Sample2=./., Sample3=., and Sample4=0/1/2, you would need to create the following nested vector:

{ {0, 1}, {-1, -1}, {}, {0, 1, 2} }

Note
Does not support genotypes with phased alleles
Less efficient than using a flattened vector
Less efficient than moving the values into the builder
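The nested example above distinguishes a fully missing sample ({}) from a called sample with no-call alleles (-1 entries). The two helper functions below are a hypothetical sketch, not part of gamgee; they show how the maximum ploidy needed by the flattened form could be derived from a nested vector, and how no-call alleles differ from an empty sample.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Largest number of alleles in any sample; this is the value a flattened
// genotype vector would need as its per-sample width.
std::size_t max_ploidy(const std::vector<std::vector<int32_t>>& genotypes) {
  std::size_t ploidy = 0;
  for (const auto& sample : genotypes) {
    ploidy = std::max(ploidy, sample.size());
  }
  return ploidy;
}

// Count no-call alleles (-1) across all samples; samples given as {} add none.
std::size_t count_no_call_alleles(const std::vector<std::vector<int32_t>>& genotypes) {
  std::size_t count = 0;
  for (const auto& sample : genotypes) {
    count += static_cast<std::size_t>(std::count(sample.begin(), sample.end(), -1));
  }
  return count;
}
```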
VariantBuilder & gamgee::VariantBuilder::set_genotypes ( std::vector< std::vector< int32_t >> &&  genotypes_for_all_samples)

Set the genotypes (GT) field for all samples at once by nested vector, MOVING the provided values.

Parameters
genotypes_for_all_samples: genotypes for all samples in order of sample index, with one inner vector per sample (no padding necessary)
Note
With a nested vector, each inner vector represents the genotypes for the sample with the corresponding index. There is no need to manually pad to the maximum ploidy, but you do need to add a missing value (-1) for each missing/no-call allele.

For example, if you had Sample1=0/1, Sample2=./., Sample3=., and Sample4=0/1/2, you would need to create the following nested vector:

{ {0, 1}, {-1, -1}, {}, {0, 1, 2} }

Note
Does not support genotypes with phased alleles
Less efficient than using a flattened vector
VariantBuilder & gamgee::VariantBuilder::set_id ( const std::string &  id)

Set the variant ID field.

Parameters
id: variant ID (typically a dbSNP ID)
VariantBuilder & gamgee::VariantBuilder::set_integer_individual_field ( const std::string &  tag,
const VariantBuilderMultiSampleVector< int32_t > &  values_for_all_samples 
)

Set an integer individual field for all samples at once by name using an efficient flattened (one-dimensional) vector, COPYING the provided values.

Parameters
tag: name of the individual field to set
values_for_all_samples: field values for all samples as a VariantBuilderMultiSampleVector (see note below)
Note
To create a multi-sample flattened vector for use with this function, first determine the number of samples and the maximum number of values per sample for this field, then get a pre-initialized vector from the builder:

auto multi_sample_vector = builder.get_integer_multi_sample_vector(num_samples, max_values_per_sample);

This vector will have missing values for all samples, with appropriate padding to the maximum field width.

Then, fill in the values for each non-missing sample by invoking the set_sample_value() and/or set_sample_values() functions on your multi-sample vector (set_sample_value() is more efficient than set_sample_values() since it doesn't require a vector construction/destruction for each call). You don't have to worry about samples with no values, since all samples start out with missing values.

Finally, pass your multi-sample vector into this function:

builder.set_integer_individual_field(field_name, multi_sample_vector);

If this process is too inconvenient, or you can't know the maximum number of values per sample in advance, use a less-efficient function that takes a nested vector.

Note
Less efficient than moving the values into the builder
Less efficient than setting using the field index
VariantBuilder & gamgee::VariantBuilder::set_integer_individual_field ( const std::string &  tag,
VariantBuilderMultiSampleVector< int32_t > &&  values_for_all_samples 
)

Set an integer individual field for all samples at once by name using an efficient flattened (one-dimensional) vector, MOVING the provided values.

Parameters
tag: name of the individual field to set
values_for_all_samples: field values for all samples as a VariantBuilderMultiSampleVector (see note below)
Note
To create a multi-sample flattened vector for use with this function, first determine the number of samples and the maximum number of values per sample for this field, then get a pre-initialized vector from the builder:

auto multi_sample_vector = builder.get_integer_multi_sample_vector(num_samples, max_values_per_sample);

This vector will have missing values for all samples, with appropriate padding to the maximum field width.

Then, fill in the values for each non-missing sample by invoking the set_sample_value() and/or set_sample_values() functions on your multi-sample vector (set_sample_value() is more efficient than set_sample_values() since it doesn't require a vector construction/destruction for each call). You don't have to worry about samples with no values, since all samples start out with missing values.

Finally, MOVE your multi-sample vector into this function:

builder.set_integer_individual_field(field_name, std::move(multi_sample_vector));

If this process is too inconvenient, or you can't know the maximum number of values per sample in advance, use a less-efficient function that takes a nested vector.

Note
Less efficient than setting using the field index
VariantBuilder & gamgee::VariantBuilder::set_integer_individual_field ( const std::string &  tag,
const std::vector< std::vector< int32_t >> &  values_for_all_samples 
)

Set an integer individual field for all samples at once by name using a nested vector, copying the provided values.

Parameters
tag: name of the individual field to set
values_for_all_samples: field values for all samples in order of sample index, with one inner vector per sample (no special padding necessary)
Note
With a nested vector, each inner vector represents the values for the sample with the corresponding index. There is no need to manually pad with missing/vector end values. For example:

{ {1, 2}, {3}, {}, {5, 6, 7} }

Note
Less efficient than using a flattened vector
Less efficient than moving the values into the builder
Less efficient than setting using the field index
VariantBuilder & gamgee::VariantBuilder::set_integer_individual_field ( const std::string &  tag,
std::vector< std::vector< int32_t >> &&  values_for_all_samples 
)

Set an integer individual field for all samples at once by name using a nested vector, moving the provided values.

Parameters
tag: name of the individual field to set
values_for_all_samples: field values for all samples in order of sample index, with one inner vector per sample (no special padding necessary)
Note
With a nested vector, each inner vector represents the values for the sample with the corresponding index. There is no need to manually pad with missing/vector end values. For example:

{ {1, 2}, {3}, {}, {5, 6, 7} }

Note
Less efficient than using a flattened vector
Less efficient than setting using the field index
VariantBuilder & gamgee::VariantBuilder::set_integer_individual_field ( const uint32_t  field_index,
const VariantBuilderMultiSampleVector< int32_t > &  values_for_all_samples 
)

Set an integer individual field for all samples at once by index using an efficient flattened (one-dimensional) vector, COPYING the provided values.

Parameters
field_index: index of the individual field to set (from a header lookup)
values_for_all_samples: field values for all samples as a VariantBuilderMultiSampleVector (see note below)
Note
To create a multi-sample flattened vector for use with this function, first determine the number of samples and the maximum number of values per sample for this field, then get a pre-initialized vector from the builder:

auto multi_sample_vector = builder.get_integer_multi_sample_vector(num_samples, max_values_per_sample);

This vector will have missing values for all samples, with appropriate padding to the maximum field width.

Then, fill in the values for each non-missing sample by invoking the set_sample_value() and/or set_sample_values() functions on your multi-sample vector (set_sample_value() is more efficient than set_sample_values() since it doesn't require a vector construction/destruction for each call). You don't have to worry about samples with no values, since all samples start out with missing values.

Finally, pass your multi-sample vector into this function:

builder.set_integer_individual_field(field_index, multi_sample_vector);

If this process is too inconvenient, or you can't know the maximum number of values per sample in advance, use a less-efficient function that takes a nested vector.

Note
Less efficient than moving the values into the builder
VariantBuilder & gamgee::VariantBuilder::set_integer_individual_field ( const uint32_t  field_index,
VariantBuilderMultiSampleVector< int32_t > &&  values_for_all_samples 
)

Set an integer individual field for all samples at once by index using an efficient flattened (one-dimensional) vector, MOVING the provided values.

Parameters
field_index: index of the individual field to set (from a header lookup)
values_for_all_samples: field values for all samples as a VariantBuilderMultiSampleVector (see note below)
Note
To create a multi-sample flattened vector for use with this function, first determine the number of samples and the maximum number of values per sample for this field, then get a pre-initialized vector from the builder:

auto multi_sample_vector = builder.get_integer_multi_sample_vector(num_samples, max_values_per_sample);

This vector will have missing values for all samples, with appropriate padding to the maximum field width.

Then, fill in the values for each non-missing sample by invoking the set_sample_value() and/or set_sample_values() functions on your multi-sample vector (set_sample_value() is more efficient than set_sample_values() since it doesn't require a vector construction/destruction for each call). You don't have to worry about samples with no values, since all samples start out with missing values.

Finally, MOVE your multi-sample vector into this function:

builder.set_integer_individual_field(field_index, std::move(multi_sample_vector));

If this process is too inconvenient, or you can't know the maximum number of values per sample in advance, use a less-efficient function that takes a nested vector.

VariantBuilder & gamgee::VariantBuilder::set_integer_individual_field ( const uint32_t  field_index,
const std::vector< std::vector< int32_t >> &  values_for_all_samples 
)

Set an integer individual field for all samples at once by index using a nested vector, copying the provided values.

Parameters
field_index: index of the individual field to set (from a header lookup)
values_for_all_samples: field values for all samples in order of sample index, with one inner vector per sample (no special padding necessary)
Note
With a nested vector, each inner vector represents the values for the sample with the corresponding index. There is no need to manually pad with missing/vector end values. For example:

{ {1, 2}, {3}, {}, {5, 6, 7} }

Note
Less efficient than using a flattened vector
Less efficient than moving the values into the builder
VariantBuilder & gamgee::VariantBuilder::set_integer_individual_field ( const uint32_t  field_index,
std::vector< std::vector< int32_t >> &&  values_for_all_samples 
)

Set an integer individual field for all samples at once by index using a nested vector, moving the provided values.

Parameters
field_index: index of the individual field to set (from a header lookup)
values_for_all_samples: field values for all samples in order of sample index, with one inner vector per sample (no special padding necessary)
Note
With a nested vector, each inner vector represents the values for the sample with the corresponding index. There is no need to manually pad with missing/vector end values. For example:

{ {1, 2}, {3}, {}, {5, 6, 7} }

Note
Less efficient than using a flattened vector
VariantBuilder & gamgee::VariantBuilder::set_integer_individual_field ( const std::string &  tag,
const std::string &  sample,
const int32_t  value 
)

Set a single-valued integer individual field for a single sample by field and sample name.

Parameters
tag: name of the individual field to set
sample: name of the sample whose value to set
value: field value for the specified sample
Note
It's more efficient to use this setter instead of a vector-based setter when a sample has just a single value for a field
Less efficient than setting using the field/sample indices
VariantBuilder & gamgee::VariantBuilder::set_integer_individual_field ( const std::string &  tag,
const std::string &  sample,
const std::vector< int32_t > &  values 
)

Set a multi-valued integer individual field for a single sample by field and sample name.

Parameters
tag: name of the individual field to set
sample: name of the sample whose value to set
values: field values for the specified sample
Note
Less efficient than setting using the field/sample indices
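
One plausible reason the name-based setters cost more is that each call has to resolve the field and sample names before it can store anything. The sketch below models that with hypothetical ToyIndex and ToyBuilder classes and a lookup counter. It is an assumption about the general pattern, not a description of how gamgee resolves names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a header's name-to-index table (fields or samples).
class ToyIndex {
 public:
  explicit ToyIndex(const std::vector<std::string>& names) {
    for (std::size_t i = 0; i < names.size(); ++i)
      m_index.emplace(names[i], static_cast<uint32_t>(i));
  }
  // One hash lookup per call; the counter makes the cost visible.
  uint32_t index_of(const std::string& name) const {
    ++m_lookups;
    const auto it = m_index.find(name);
    if (it == m_index.end()) throw std::out_of_range("unknown name: " + name);
    return it->second;
  }
  std::size_t lookups() const { return m_lookups; }

 private:
  std::unordered_map<std::string, uint32_t> m_index;
  mutable std::size_t m_lookups = 0;
};

// Hypothetical stand-in for a builder offering both setter flavours.
class ToyBuilder {
 public:
  ToyBuilder(const ToyIndex& fields, const ToyIndex& samples)
      : m_fields(fields), m_samples(samples) {}

  // Index-based setter: no name resolution at all.
  ToyBuilder& set(uint32_t field_index, uint32_t sample_index, int32_t value) {
    m_values[{field_index, sample_index}] = value;
    return *this;
  }
  // Name-based setter: resolves both names, then forwards to the index-based one.
  ToyBuilder& set(const std::string& tag, const std::string& sample, int32_t value) {
    return set(m_fields.index_of(tag), m_samples.index_of(sample), value);
  }
  int32_t get(uint32_t field_index, uint32_t sample_index) const {
    return m_values.at({field_index, sample_index});
  }

 private:
  const ToyIndex& m_fields;
  const ToyIndex& m_samples;
  std::map<std::pair<uint32_t, uint32_t>, int32_t> m_values;
};

// Usage: resolve the field name once, then reuse the index for every sample.
inline void resolve_once_example() {
  const ToyIndex fields(std::vector<std::string>{"GQ", "DP"});
  const ToyIndex samples(std::vector<std::string>{"sample_a", "sample_b"});
  ToyBuilder builder(fields, samples);
  const uint32_t gq = fields.index_of("GQ");  // single lookup, outside the loop
  for (uint32_t s = 0; s < 2; ++s) builder.set(gq, s, 40);
  assert(fields.lookups() == 1 && samples.lookups() == 0);
}
```

When many records are built against the same header, looking up the indices once and reusing them avoids repeating this work for every record.
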
VariantBuilder & gamgee::VariantBuilder::set_integer_individual_field ( const uint32_t  field_index,
const uint32_t  sample_index,
const int32_t  value 
)

Set a single-valued integer individual field for a single sample by field and sample index.

Parameters
field_index: index of the individual field to set (from a header lookup)
sample_index: index of the sample whose value to set (from a header lookup)
value: field value for the specified sample
Note
It's more efficient to use this setter instead of a vector-based setter when a sample has just a single value for a field
VariantBuilder & gamgee::VariantBuilder::set_integer_individual_field ( const uint32_t  field_index,
const uint32_t  sample_index,
const std::vector< int32_t > &  values 
)

Set a multi-valued integer individual field for a single sample by field and sample index.

Parameters
field_index: index of the individual field to set (from a header lookup)
sample_index: index of the sample whose value to set (from a header lookup)
values: field values for the specified sample
VariantBuilder & gamgee::VariantBuilder::set_integer_shared_field ( const std::string &  tag,
const int32_t  value 
)

Set a single-valued integer shared field by field name.

Parameters
tag: name of the shared field to set
value: new value for the field
Note
It's more efficient to use this setter instead of a vector-based setter when a field has just a single value
Less efficient than setting using the field index
VariantBuilder & gamgee::VariantBuilder::set_integer_shared_field ( const std::string &  tag,
const std::vector< int32_t > &  values 
)

Set a multi-valued integer shared field by field name.

Parameters
tag: name of the shared field to set
values: new values for the field
Note
Less efficient than setting using the field index
VariantBuilder & gamgee::VariantBuilder::set_integer_shared_field ( const uint32_t  index,
const int32_t  value 
)

Set a single-valued integer shared field by field index.

Parameters
index: index of the shared field to set (from a header lookup)
value: new value for the field
Note
It's more efficient to use this setter instead of a vector-based setter when a field has just a single value
VariantBuilder & gamgee::VariantBuilder::set_integer_shared_field ( const uint32_t  index,
const std::vector< int32_t > &  values 
)

Set a multi-valued integer shared field by field index.

Parameters
index: index of the shared field to set (from a header lookup)
values: new values for the field
VariantBuilder & gamgee::VariantBuilder::set_qual ( const float  qual)

Set the Phred-scaled site quality (probability that the site is not reference)

Parameters
qual: Phred-scaled site quality (probability that the site is not reference)
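
This page does not spell out the scaling, so the sketch below uses the standard Phred definition, Q = -10 * log10(p), where p is an error probability. Under the usual VCF reading, and as an assumption here, p for a variant site is the probability that the site is actually reference, so a higher value means more confidence that the site is not reference. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Phred scaling: Q = -10 * log10(p), with p an error probability in (0, 1].
inline double phred_from_error_probability(double p_error) {
  assert(p_error > 0.0 && p_error <= 1.0);
  return -10.0 * std::log10(p_error);
}

// Inverse transform: p = 10^(-Q / 10).
inline double error_probability_from_phred(double q) {
  return std::pow(10.0, -q / 10.0);
}

// Usage: convert to the float type that set_qual takes.
inline float qual_for_error_probability(double p_error) {
  return static_cast<float>(phred_from_error_probability(p_error));
}
```

An error probability of 0.001 corresponds to a quality of about 30, and 0.01 to about 20.
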
VariantBuilder & gamgee::VariantBuilder::set_ref_allele ( const std::string &  ref_allele)

Set the reference allele.

Parameters
ref_allele: reference allele as a string
VariantBuilder & gamgee::VariantBuilder::set_string_individual_field ( const std::string &  tag,
const std::vector< std::string > &  values_for_all_samples 
)

Set a string individual field for all samples at once by name, copying the provided values.

Parameters
tag: name of the individual field to set
values_for_all_samples: field values for all samples in order of sample index
Note
Less efficient than moving the values into the builder
Less efficient than setting using the field index
VariantBuilder & gamgee::VariantBuilder::set_string_individual_field ( const std::string &  tag,
std::vector< std::string > &&  values_for_all_samples 
)

Set a string individual field for all samples at once by name, moving the provided values.

Parameters
tag: name of the individual field to set
values_for_all_samples: field values for all samples in order of sample index
Note
Less efficient than setting using the field index
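
Because values_for_all_samples must follow sample index order, callers that keep their values keyed by sample name need to reorder them first. The helper below is a hypothetical sketch of that step. Using "." for a sample with no value follows the VCF missing-value marker and is an assumption, since this page does not say how the builder expects a missing string to be written.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: arrange per-sample strings in header sample order.
// Samples without an entry get "." (assumed missing-value marker).
std::vector<std::string> in_sample_order(
    const std::vector<std::string>& sample_names,
    const std::unordered_map<std::string, std::string>& value_by_sample) {
  std::vector<std::string> ordered;
  ordered.reserve(sample_names.size());
  for (const auto& name : sample_names) {
    const auto it = value_by_sample.find(name);
    ordered.push_back(it != value_by_sample.end() ? it->second : std::string("."));
  }
  assert(ordered.size() == sample_names.size());
  return ordered;
}
```

The returned vector is a temporary, so passing the call result straight to a setter would select the moving overload rather than the copying one.
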
VariantBuilder & gamgee::VariantBuilder::set_string_individual_field ( const uint32_t  field_index,
const std::vector< std::string > &  values_for_all_samples 
)

Set a string individual field for all samples at once by index, copying the provided values.

Parameters
field_index: index of the individual field to set (from a header lookup)
values_for_all_samples: field values for all samples in order of sample index
Note
Less efficient than moving the values into the builder
VariantBuilder & gamgee::VariantBuilder::set_string_individual_field ( const uint32_t  field_index,
std::vector< std::string > &&  values_for_all_samples 
)

Set a string individual field for all samples at once by index, moving the provided values.

Parameters
field_index: index of the individual field to set (from a header lookup)
values_for_all_samples: field values for all samples in order of sample index
VariantBuilder & gamgee::VariantBuilder::set_string_individual_field ( const std::string &  tag,
const std::string &  sample,
const std::string &  value 
)

Set a string individual field for a single sample by field and sample name.

Parameters
tag: name of the individual field to set
sample: name of the sample whose value to set
value: field value for the specified sample
Note
Less efficient than setting using the field/sample indices
VariantBuilder & gamgee::VariantBuilder::set_string_individual_field ( const uint32_t  field_index,
const uint32_t  sample_index,
const std::string &  value 
)

Set a string individual field for a single sample by field and sample index.

Parameters
field_index: index of the individual field to set (from a header lookup)
sample_index: index of the sample whose value to set (from a header lookup)
value: field value for the specified sample
VariantBuilder & gamgee::VariantBuilder::set_string_shared_field ( const std::string &  tag,
const std::string &  value 
)

Set a string shared field by field name.

Parameters
tag: name of the shared field to set
value: new value for the field
Note
Less efficient than setting using the field index
VariantBuilder & gamgee::VariantBuilder::set_string_shared_field ( const uint32_t  index,
const std::string &  value 
)

Set a string shared field by field index.

Parameters
index: index of the shared field to set (from a header lookup)
value: new value for the field

The documentation for this class was generated from the following files: