Maximum likelihood estimation of effective population size from temporally-spaced genetic data

Share:

Description

The concept of effective population size is imporatnt in population genetics. This function estimates the effective population size N_e from temporally-sapced genetic data using maximum likelihood method with continuous approximation.

Usage

1
2
NB.estimator(infile, alleles, sample.interval, bound = c(50, 1e+07), 
	profile.likelihood=FALSE)

Arguments

infile

The input dataset. It should be a plain text file (*.txt) containing the number of alleles of each allele, at each locus, from each temporal sample. For example, Suppose we have i temporal samples from the focal population, j loci, and K_j alleles at locus j. Denoting the number of copies of allele k at locus j from sample i as n_{k,j,i}, then the input format of data is as follows:
n_{1,1,1}~n_{2,1,1}~...~n_{K_1,1,1}
n_{1,2,1}~n_{2,2,1}~...~n_{K_2,2,1}
...
n_{1,j,1}~n_{2,j,1}~...~n_{K_j, j,1}

n_{1,1,2}~n_{2,1,2}~...~n_{K_1,1,2}
n_{1,2,2}~n_{2,2,2}~...~n_{K_2,2,2}
...
n_{1,j,2}~n_{2,j,2}~...~n_{K_j, j,2}

...

Note: A space is required to separate the allele counts.

alleles

A vector containing the number of alleles at each locus. For example, c(4, 4, 4) would mean that 3 loci are sampled, with 4 alleles each.

sample.interval

A vector stating at which generations the samples were taken. For example, c(0, 8) would indicate that two samples were collected from the 0th and 8th generation.

bound

Lower and upper bound for searching for the effective population size. Default values are c(50, 1e7).

profile.likelihood

TRUE if you would like to plot the log-likelihood function, or to look at the details of the log-likelihood values between your confidence interval. FALSE will return you the lower and upper 95% confidence interval only.

Details

The input arguments above largely follow the input format of MLNE (Wang, 2001; Wang and Whitlock, 2003) to allow users to switch between platforms with the minimal effort. The infile should be in a plain text (*.txt) file format. It contains the same information as the input data as MLNE does. The built-in optim optimiser is wrapped within this function.

Value

N

The point estimate of the effective population size N_e.

CI

The approximate 95% confidence interval calculated from the log-likelihood. They are the region where the log-likelihood is 2 units below the maxima.

log.like

The value of the maximised log-likelihood.

profile.CI

A list of log-likelihood values as a function of N_e within the 95% confidence interval.

Note

If you would like to use your own optimisation algorithm or customerise the parameters within the optim function, please consider using NB.likelihood from this package.

See Also

NB.likelihood. NB.example.dataset.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
## CREATE SAMPLE DATASET
NB.example.dataset()

## RUN THE FUNCTION
NB.estimator(infile='sample_data.txt', alleles=rep(4, 50), 
	sample.interval=c(0, 8), profile.likelihood=FALSE)

	#####
	## NUMERICAL RESULTS
	#$N
	# [1] 1241.079
	#
	#$CI
	#[1]  594.195 6375.933
	#
	#$log.like
	#[1] -543.9159
	#####