Simulation

bvas.simulate.simulate_data(num_alleles=100, duration=26, num_variants=100, num_regions=10, N0=10000, N0_k=10.0, R0=1.0, mutation_density=0.25, k=0.1, seed=0, include_phi=False, sampling_rate=1, strategy='global-mean')

Simulate pandemic data using a discrete time Negative Binomial branching process.
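As a rough illustration of what a discrete time Negative Binomial branching process looks like, the sketch below simulates a single variant in a single region using only the standard library. It is not the bvas implementation: the helper names (sample_poisson, sample_offspring, branching_process) are hypothetical, and the model is simplified so that every infected individual independently produces a Negative Binomial number of new infections with mean R and dispersion k.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_offspring(R, k, rng):
    """Negative Binomial with mean R and dispersion k, drawn as a Gamma-Poisson mixture."""
    lam = rng.gammavariate(k, R / k)  # Gamma(shape=k, scale=R/k) has mean R
    return sample_poisson(lam, rng)


def branching_process(n0, R, k, duration, seed=0):
    """Return the number of infected individuals at each of `duration` time steps."""
    rng = random.Random(seed)
    counts = [n0]
    for _ in range(duration - 1):
        # Assumption: each currently infected individual reproduces independently.
        counts.append(sum(sample_offspring(R, k, rng) for _ in range(counts[-1])))
    return counts


# A small initial population keeps this pure-Python sketch fast.
trajectory = branching_process(n0=200, R=1.0, k=0.1, duration=26, seed=0)
print(trajectory)
```

With this parameterization the offspring variance is R + R**2 / k, so a small k makes most individuals infect nobody while a few infect many.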

Parameters:
  • num_alleles (int) – The number of alleles to simulate. Defaults to 100.

  • duration (int) – The number of timesteps to simulate. Defaults to 26.

  • num_variants (int) – The number of viral variants to simulate. Defaults to 100.

  • num_regions (int) – The number of geographic regions to simulate. Defaults to 10.

  • N0 (int) – The mean number of infected individuals at the first time step in each region. Defaults to 10000.

  • N0_k (float) – Controls the dispersion of the Negative Binomial distribution that is used to sample the number of infected individuals at the first time step in each region. Defaults to 10.0.

  • R0 (float) – The basic reproduction number of the wild-type variant. Defaults to 1.0.

  • mutation_density (float) – Controls the average number of non-wild-type mutations that appear in each viral variant. Defaults to 0.25.

  • k (float) – Controls the dispersion of the Negative Binomial distribution that underlies the discrete time branching process. Smaller values of k give a more overdispersed offspring distribution; the limit of small k corresponds to super-spreading. Defaults to 0.1.

  • seed (int) – Sets the random number seed. Defaults to 0.

  • include_phi (bool) – Whether to include vaccine-dependent effects in the simulation. Defaults to False.

  • sampling_rate (float) – Controls the observation sampling rate, i.e. the percentage of infected individuals whose genomes are sequenced. Defaults to 1, i.e. 1%.

  • strategy (str) – Strategy used for estimating the effective population size. Must be one of: global-mean, global-median, regional. Defaults to global-mean.
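Because sampling_rate is expressed as a percentage rather than a fraction, it can help to see the conversion spelled out. The sketch below assumes that each infected individual is sequenced independently with probability sampling_rate / 100; this is an assumption made for illustration, and sequence_counts is a hypothetical helper, not part of bvas.

```python
import random


def sequence_counts(infected_counts, sampling_rate=1.0, seed=0):
    """Thin infection counts down to the number of sequenced genomes."""
    rng = random.Random(seed)
    p = sampling_rate / 100.0  # convert the percentage to a probability
    # Assumption: every infected individual is sequenced independently.
    return [sum(rng.random() < p for _ in range(n)) for n in infected_counts]


infected = [10000, 9500, 11000]
observed = sequence_counts(infected, sampling_rate=1.0)
print(observed)
```

At the default of 1%, a region with roughly 10000 infections would contribute on the order of 100 sequenced genomes per time step.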

Returns dict:

A dictionary that contains Y and Gamma as well as the estimated effective population size. Y and Gamma are each scaled using the effective population size obtained from the estimation strategy selected by strategy.
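The three strategy names suggest how per-region estimates of the effective population size might be combined. The sketch below shows one plausible reading, assuming a per-region estimate is already available; combine_estimates is a hypothetical helper and the actual bvas behavior may differ.

```python
from statistics import mean, median


def combine_estimates(regional_estimates, strategy="global-mean"):
    """Return one effective population size per region under the given strategy."""
    if strategy == "global-mean":
        # Assumption: every region shares the mean of the regional estimates.
        return [mean(regional_estimates)] * len(regional_estimates)
    if strategy == "global-median":
        # Assumption: every region shares the median of the regional estimates.
        return [median(regional_estimates)] * len(regional_estimates)
    if strategy == "regional":
        # Assumption: each region keeps its own estimate.
        return list(regional_estimates)
    raise ValueError("strategy must be one of: global-mean, global-median, regional")


estimates = [10.0, 20.0, 60.0]  # made-up values for illustration only
for name in ("global-mean", "global-median", "regional"):
    print(name, combine_estimates(estimates, name))
```

A median is less sensitive than a mean to a single region with an unusually large or small estimate, which is one reason a caller might prefer global-median.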