Initialization¶
simuPOP provides three operators to initialize individual sex, information fields and genotype at the population level. A number of parameter are provided to cover most commonly used initialization scenarios. A Python operator can be used to intialize a population explicitly if none of the operators fits your need.
Initialize individual sex (operator InitSex
)¶
Operator InitSex
() and function initSex()
initialize individual
sex either randomly or using a given sequence. In the first case, individuals
are assigned MALE
or FEMALE
with equal probability unless parameter
maleFreq is used to specify the probability of having a male Individual.
Alternatively, parameter maleProp can be used to specify exact proportions of
male individuals so that you will have exactly 1000 males and 1000 females if
you apply InitSex
(maleProp=0.5
) to a population of 2000
individuals.
Both parameters maleFreq
and maleProp
assigns individual sex randomly.
If for some reason you need to specify individual sex explicitly, you could use
a sequence of sex (MALE
or FEMALE
) to assign sex to individuals
succesively. The list will be reused if needed. If a list of (virtual)
subpopulations are given, this operator will only initialize individuals in
these (virtual) subpopulations. Example InitSex demonstrates
how to use two InitSex
operators to initialize two subpopulations.
Example: Initialize individual sex
>>> import simuPOP as sim
>>> pop = sim.Population(size=[1000, 1000])
>>> sim.initSex(pop, maleFreq=0.3, subPops=0)
>>> sim.initSex(pop, sex=[sim.MALE, sim.FEMALE, sim.FEMALE], subPops=1)
>>> sim.stat(pop, numOfMales=True, vars='numOfMales_sp')
>>> print(pop.dvars(0).numOfMales)
290
>>> print(pop.dvars(1).numOfMales)
334
now exiting runScriptInteractively...
Initialize genotype (operator InitGenotype
)¶
Operator InitGenotype
(and its function form initGenotype
)
initializes individual genotype by allele frequency, allele proportion,
haplotype frequency, haplotype proportions or a list of genotypes:
- By frequency of alleles. For example,
InitGenotype
(freq=(0, 0.2, 0.4, 0.2)
) will assign allele 0, 1, 2, and 3 with probability 0, 0.2, 0.4 and 0.2 respectively. - By proportions of alleles. For example,
InitGenotype
(prop=(0, 0.2, 0.4, 0.2)
) will assign 400 allele 1, 800 allele 2 and 400 allele 3 to a diploid population with 800 individuals. - By frequency of haplotypes. For example,
InitGenotype
(haplotypes=[[0, 0], [1,1], [0,1],[1,1]]
) will assign four haplotypes with equal probabilities.InitGenotype
(haplotypes=[[0, 0], [1,1], [0,1],[1,1]], freq=[0.2, 0.2, 0.3, 0.3]
) will assign these haplotypes with different frequencies. If there are more than two loci, the haplotypes will be repeated. - By frequency of haplotypes. For example,
InitGenotype
(haplotypes=[[0, 0], [1,1], [0,1],[1,1]], prop=[0.2, 0.2, 0.3, 0.3]
) will assign four haplotypes with exact proportions. - By a list of genotype. For example,
InitGenotype
(genotype=[1, 2, 2, 1]
) will assign genotype1
,2
,2
,1
repeatedly to a population. If individuals in this population has two homologous copies of a chromosome with two loci, this operator will assign haplotype1
,2
to the first homologous copy of the chromosome, and2
,1
to the second copy. - By multiple allele frequencies or proportions returned by a function passed to
parameter
freq
orprop
(new in version 1.1.7). This function can accept parametersloc
,subPop
orvsp
and returns locus, subpopopulation or virtual subpopulation specific allele frequencies. For example, if you would like to initialize genotypes with random allele frequency, you can setfreq=lambda : random.random()
so that a new frequency is drawn from an uniform distribution for each new locus. Note that simuPOP expects the return value of this function to be a list of frequencies for alleles 0, 1, …, but treats a single return value x as [x, 1-x] for simplicity.
Parameter loci
and ploidy
can be used to specify a subset of loci and
homologous sets of chromosomes to initialize, and parameter subPops
can be
used to specify subsets of individuals to initialize. Example InitGenotype demonstrates how to use these the InitGenotype
operator, including examples on how to define and use virtual subpopulations to
initialize individual genotype by sex or by proportion.
Example: Initialize individual genotype
>>> import simuPOP as sim
>>> pop = sim.Population(size=[2000, 3000], loci=[5, 7])
>>> # by allele frequency
>>> def printFreq(pop, loci):
... sim.stat(pop, alleleFreq=loci)
... print(', '.join(['{:.3f}'.format(pop.dvars().alleleFreq[x][0]) for x in loci]))
...
>>> sim.initGenotype(pop, freq=[.4, .6])
>>> sim.dump(pop, max=6, structure=False)
SubPopulation 0 (), 2000 Individuals:
0: MU 11000 0011111 | 11111 0101110
1: MU 00000 1111111 | 11101 1111001
2: MU 10111 0111100 | 01111 1011111
3: MU 11011 1101010 | 11010 1011111
4: MU 11011 0011010 | 10011 1001110
5: MU 00001 1010011 | 11111 1111110
SubPopulation 1 (), 3000 Individuals:
2000: MU 10011 0010100 | 01001 0011010
>>> printFreq(pop, range(5))
0.397, 0.404, 0.400, 0.402, 0.406
>>> # by proportion
>>> sim.initGenotype(pop, prop=[0.4, 0.6])
>>> printFreq(pop, range(5))
0.400, 0.400, 0.400, 0.400, 0.400
>>> # by haplotype frequency
>>> sim.initGenotype(pop, freq=[.4, .6], haplotypes=[[1, 1, 0, 1], [0, 0, 1]])
>>> sim.dump(pop, max=6, structure=False)
SubPopulation 0 (), 2000 Individuals:
0: MU 11011 1011101 | 00100 1001001
1: MU 11011 1011101 | 11011 1011101
2: MU 00100 1001001 | 00100 1001001
3: MU 00100 1001001 | 00100 1001001
4: MU 11011 1011101 | 11011 1011101
5: MU 00100 1001001 | 11011 1011101
SubPopulation 1 (), 3000 Individuals:
2000: MU 00100 1001001 | 00100 1001001
>>> printFreq(pop, range(5))
0.597, 0.597, 0.403, 0.597, 0.597
>>> # by haplotype proportion
>>> sim.initGenotype(pop, prop=[0.4, 0.6], haplotypes=[[1, 1, 0], [0, 0, 1, 1]])
>>> printFreq(pop, range(5))
0.600, 0.600, 0.400, 0.000, 0.600
>>> # by genotype
>>> pop = sim.Population(size=[2, 3], loci=[5, 7])
>>> sim.initGenotype(pop, genotype=[1]*5 + [2]*7 + [3]*5 +[4]*7)
>>> sim.dump(pop, structure=False)
SubPopulation 0 (), 2 Individuals:
0: MU 11111 2222222 | 33333 4444444
1: MU 11111 2222222 | 33333 4444444
SubPopulation 1 (), 3 Individuals:
2: MU 11111 2222222 | 33333 4444444
3: MU 11111 2222222 | 33333 4444444
4: MU 11111 2222222 | 33333 4444444
>>> #
>>> # use virtual subpopulation
>>> pop = sim.Population(size=[2000, 3000], loci=[5, 7])
>>> pop.setVirtualSplitter(sim.SexSplitter())
>>> sim.initSex(pop)
>>> sim.initGenotype(pop, genotype=range(10), loci=range(5))
>>> # initialize all males
>>> sim.initGenotype(pop, genotype=[2]*7, loci=range(5, 12),
... subPops=[(0, 0), (1, 0)])
>>> # assign genotype by proportions
>>> pop.setVirtualSplitter(sim.ProportionSplitter([0.4, 0.6]))
>>> sim.initGenotype(pop, freq=[0.2, 0.8], subPops=[(0,0)])
>>> sim.initGenotype(pop, freq=[0.5, 0.5], subPops=[(0,1)])
>>> #
>>> # initialize by random allele frequency
>>> import random
>>> sim.initGenotype(pop, freq=lambda : random.random())
>>> printFreq(pop, range(5))
0.580, 0.239, 0.100, 0.576, 0.674
>>> # initialize with loci specific frequency. here
>>> # lambda loc: 0.01*loc is equivalent to
>>> # lambda loc: [0.01*loc, 1-0.01*loc]
>>> sim.initGenotype(pop,
... freq=lambda loc: 0.01*loc)
>>> printFreq(pop, range(5))
0.000, 0.009, 0.018, 0.029, 0.041
>>> # initialize with VSP-specific frequency
>>> sim.initGenotype(pop,
... freq=lambda vsp: [[0.2, 0.8], [0.5, 0.5]][vsp[1]],
... subPops=[(0, 0), (0, 1)])
>>>
now exiting runScriptInteractively...
Initialize information fields (operator InitInfo
)¶
Operator InitInfo
and its function form initInfo
initialize one or
more information fields of all individuals or Individuals in selected (virtual)
subpopulations using either a list of values or a Python function. If a value or
a list of value is given, it will be used repeatedly to assign values of
specified information fields of all applicable individuals. For example,
initInfo(pop, values=1, infoFields='x')
will assign value 1
to
information field x
of all individuals, and
initInfo(pop, values=[1, 2, 3], infoFields='x', subPops=[(0,1)])
will assign values 1
, 2
, 3
, 1
, 2
, 3
… to information
field x
of individuals in the second virtual subpopulation of subpopulation
0.
The values
parameter also accepts a Python function. This feature is usually
used to assign random values to an information field. For example,
values=random.random
would assign a random value between 0 and 1. If a
function takes parameters, a lambda function can be used. For example,
initInfo(pop, lambda : random.randint(2, 5), infoFields=['x', 'y'])
assigns random integers between 2 and 5 to information fields x
and y
of
all individuals in pop. Example InitInfo demonstrates these
usages.
Example: initialize information fields
>>> import simuPOP as sim
>>> import random
>>> pop = sim.Population(size=[5], loci=[2], infoFields=['sex', 'age'])
>>> pop.setVirtualSplitter(sim.SexSplitter())
>>> sim.initSex(pop)
>>> sim.initInfo(pop, 0, subPops=[(0,0)], infoFields='sex')
>>> sim.initInfo(pop, 1, subPops=[(0,1)], infoFields='sex')
>>> sim.initInfo(pop, lambda: random.randint(20, 70), infoFields='age')
>>> sim.dump(pop, structure=False)
SubPopulation 0 (), 5 Individuals:
0: FU 00 | 00 | 1 39
1: FU 00 | 00 | 1 29
2: MU 00 | 00 | 0 68
3: MU 00 | 00 | 0 50
4: MU 00 | 00 | 0 21
now exiting runScriptInteractively...