Demerelate: Demerelate - Algorithms to estimate pairwise relatedness...




Main function of Demerelate. Call this function for any estimation of relatedness. Additionally, some F-statistics can be calculated. Default parameters are set for convenient usage: only an input dataframe containing allelic information is required. Geographical distances, reference populations and changes to the statistics can be set through the remaining parameters.


     Demerelate(inputdata, tab.dist = "NA", ref.pop = "NA", 
                object = FALSE, value = "Mxy", Fis = FALSE,
                file.output = FALSE, p.correct = FALSE,
                iteration = 1000, pairs = 1000, 
                dis.data = "relative", NA.rm = TRUE,
                genotype.ref = TRUE)



inputdata
R object or external file to be read internally with standard Demerelate inputformat. The dataframe is split by population information and calculations run separately for each population. If no reference population is specified (ref.pop = "NA"), all information on loci is used as reference by omitting population information.


tab.dist
R object or external file to be read internally with standard Demerelate inputformat. Geographic distances can be defined and will be analysed in combination with the genetic data. Columns three and four of the standard inputformat are used for x and y coordinates.


ref.pop
R object or external file to be read internally with standard Demerelate inputformat. Custom reference populations are loaded for the analysis. Population information of the reference file is omitted so that allele frequencies are calculated from the whole dataset. Optionally, allele frequencies can be loaded as reference. The object should then be a list of allele frequencies: for each locus, a vector of allele frequencies p with allele names as vector names; these vectors are combined into a list. The last list element is a vector of sample sizes for each locus.
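The list form of ref.pop described above might be assembled as in the following minimal sketch. Locus names, allele names, frequencies and sample sizes are made up, and the element names of the list are an assumption for illustration; check a reference object produced by the package before relying on the exact layout.

```r
# Hypothetical allele frequencies for two loci; vector names are allele names.
locus1 <- c("120" = 0.50, "124" = 0.30, "128" = 0.20)
locus2 <- c("201" = 0.65, "205" = 0.35)

# Last list element: sample sizes per locus (element names are illustrative).
sample.sizes <- c(locus1 = 40, locus2 = 38)

ref.freqs <- list(locus1 = locus1, locus2 = locus2, n = sample.sizes)

# Sanity check: frequencies at each locus should sum to 1.
freq.sums <- sapply(ref.freqs[-length(ref.freqs)], sum)
```

A check like freq.sums is cheap and catches rounding or data-entry errors before the list is passed on.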


object
logical. Whether the input data are R objects (TRUE) or should be read in as files (FALSE).


value
String defining the method used to calculate allele sharing or similarity estimates. Can be set as "Bxy", "Sxy", "Mxy", "Li", "lxy", "rxy", "loiselle", "wang.fin", "wang", "ritland", "morans.fin" or "morans"; see allele.sharing.
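To give a feel for what a sharing index measures, here is a minimal sketch of genotype sharing in the spirit of Blouin et al. (1996): per locus, the number of alleles of one individual that can be paired one-to-one with alleles of the other, divided by two, then averaged over loci. The function names share.locus and share.pair are hypothetical, and this is not the package's implementation.

```r
# Genotype sharing at one locus for two diploid individuals:
# count alleles of x that can be paired one-to-one with alleles of y,
# then divide by 2.
share.locus <- function(x, y) {
  matched <- 0
  for (a in x) {
    hit <- match(a, y)
    if (!is.na(hit)) {
      matched <- matched + 1
      y <- y[-hit]          # each allele of y can be matched only once
    }
  }
  matched / 2
}

# Average over loci; each individual is a list with one
# two-allele vector per locus.
share.pair <- function(ind1, ind2) {
  mean(mapply(share.locus, ind1, ind2))
}

ind1 <- list(c(120, 124), c(201, 201))
ind2 <- list(c(120, 128), c(201, 201))
share.pair(ind1, ind2)   # locus 1 gives 0.5, locus 2 gives 1, mean 0.75
```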


Fis
logical. Should F_{is} values be calculated for each population?


iteration
Number of bootstrap iterations in F_{is} calculations.
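The role of the bootstrap iterations can be illustrated with a minimal single-locus sketch. It uses the simple estimator F_{is} = 1 - Ho/He and resamples individuals with replacement; the estimators and resampling scheme actually used by the package may differ, and the genotypes below are made up.

```r
set.seed(1)
# Hypothetical single-locus genotypes: one row per individual, two alleles.
geno <- cbind(a1 = c(1, 1, 2, 2, 1, 3, 2, 1, 3, 2),
              a2 = c(1, 2, 2, 3, 1, 3, 2, 2, 3, 1))

fis <- function(g) {
  ho <- mean(g[, 1] != g[, 2])            # observed heterozygosity
  p  <- table(as.vector(g)) / length(g)   # allele frequencies
  he <- 1 - sum(p^2)                      # expected heterozygosity
  1 - ho / he
}

iteration <- 1000
boot <- replicate(iteration, {
  idx <- sample(nrow(geno), replace = TRUE)   # resample individuals
  fis(geno[idx, , drop = FALSE])
})

fis.obs <- fis(geno)
fis.ci  <- quantile(boot, c(0.025, 0.975), na.rm = TRUE)
```

More iterations give a smoother bootstrap distribution at the cost of run time.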


pairs
Number of pairs calculated from reference populations for randomized full siblings, half siblings and non-related individuals.
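How randomized pairs of known relationship could be drawn from a reference population is sketched below for a single locus. All helper names (gamete, offspring, random.parent) and the reference genotypes are hypothetical; the package's own randomization procedure is not reproduced here.

```r
set.seed(42)
# Hypothetical reference genotypes at one locus (rows = individuals).
ref <- cbind(c(1, 2, 3, 1, 2, 4), c(2, 3, 3, 4, 2, 1))

# One gamete: pick one of the parent's two alleles at random.
gamete <- function(parent) parent[sample(2, 1)]
offspring <- function(mother, father) c(gamete(mother), gamete(father))
random.parent <- function() ref[sample(nrow(ref), 1), ]

n.pairs <- 10
full.sibs <- replicate(n.pairs, {
  m <- random.parent(); f <- random.parent()
  rbind(offspring(m, f), offspring(m, f))     # same mother, same father
}, simplify = FALSE)
half.sibs <- replicate(n.pairs, {
  m <- random.parent()
  rbind(offspring(m, random.parent()),        # same mother,
        offspring(m, random.parent()))        # independently drawn fathers
}, simplify = FALSE)
non.related <- replicate(n.pairs, {
  rbind(random.parent(), random.parent())     # two independent draws
}, simplify = FALSE)
```

Each list element is a 2 x 2 matrix holding the genotypes of one simulated pair; relatedness estimates on such pairs provide the reference distributions.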


file.output
logical. Should a cluster dendrogram, histograms and .txt files be written as standard output to your working directory? In some cases (many NA values) it may be necessary to set this value to FALSE because clusters cannot be calculated on pairwise NA values.


p.correct
logical. Should the Yates correction from prop.test(...) be used in the χ^2 statistics when calculating p-values for differences between empirical and randomized relatedness in populations?
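The effect of the Yates correction can be seen directly with prop.test from the stats package. The counts below are made up and only illustrate the comparison of two proportions; how the package assembles its counts is not shown here.

```r
# Hypothetical counts: pairs classified as full siblings among all pairs,
# in the empirical population and in the randomized reference.
sibs  <- c(empirical = 30, randomized = 12)
total <- c(empirical = 200, randomized = 200)

with.yates    <- prop.test(sibs, total, correct = TRUE)
without.yates <- prop.test(sibs, total, correct = FALSE)

c(corrected = with.yates$p.value, uncorrected = without.yates$p.value)
```

The continuity correction lowers the test statistic, so the corrected p-value is the more conservative of the two.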

dis.data
The kind of data to be used as distance measure. Can be "relative", in which case relative x and y coordinates should be given in tab.dist, or "decimal" for geographic coordinates in decimal degrees.


NA.rm
logical. If set as TRUE, samples with NA in any position are removed from the calculation. If set as FALSE, you may get an error message telling you to remove some individuals before the procedure can run. Be aware that if your calculations succeed although there are NA values in your populations, your results may be biased by missing data.
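Removing incomplete samples by hand before the analysis can be sketched with complete.cases from base R. The data frame below is made up and only assumes the general shape of the input (sample ID, population, then allele columns); column names are hypothetical.

```r
# Hypothetical input: sample ID, population, two allele columns for one locus.
dat <- data.frame(id     = c("s1", "s2", "s3"),
                  pop    = c("A", "A", "B"),
                  loc1.a = c(120, NA, 124),
                  loc1.b = c(124, 120, 124))

# Keep only samples without any missing value.
clean   <- dat[complete.cases(dat), ]
dropped <- setdiff(dat$id, clean$id)   # samples removed because of NA
```

Keeping track of the dropped samples makes it easier to judge how much missing data may bias the result.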


genotype.ref
logical. If set as TRUE, random non-related populations are generated from genotypes of the reference population. If set as FALSE, allele frequencies are used to generate the reference population. If ref.pop is given as a list of allele frequencies, genotype.ref = FALSE is forced.


Pairwise relatedness is calculated from inputdata. Be sure to match the inputformat exactly. Missing values are omitted when flagged as NA. If no additional reference populations are defined, inputdata without population information is used to calculate references. If no good reference populations are available, you need to take care of bias in the calculations. In any case, consult for example Oliehoek et al. (2006) to get an idea of bias in relatedness calculations.
Geographic distances between pairs of individuals are calculated when tab.dist is supplied. Distances calculated from x-y coordinates by the Pythagorean theorem can be applied to any metric sampling positions. Geographic coordinates, e.g. from GPS, need to be transformed to decimal degrees. Be sure to have positions for each individual or remove missing values from inputdata.
Each calculation has its unique bar-code and is named with the date and population name. Calculations are performed for each population in inputdata.
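The two kinds of distance data mentioned above can be sketched in base R. For relative coordinates, dist gives Pythagorean distances; for decimal degrees, a great-circle (haversine) distance is one common choice. Whether the package uses this exact formula for decimal degrees is not stated here, so treat haversine.km, the Earth radius and the coordinates as assumptions.

```r
# Relative x-y coordinates: Euclidean distance between all pairs.
coords <- data.frame(id = c("s1", "s2", "s3"),
                     x  = c(0, 3, 0),
                     y  = c(0, 4, 8))
d <- as.matrix(dist(coords[, c("x", "y")]))   # Euclidean by default
dimnames(d) <- list(coords$id, coords$id)
d["s1", "s2"]   # sqrt(3^2 + 4^2) = 5

# Decimal degrees: great-circle distance in km (haversine formula).
haversine.km <- function(lat1, lon1, lat2, lon2, radius = 6371) {
  to.rad <- pi / 180
  dlat <- (lat2 - lat1) * to.rad
  dlon <- (lon2 - lon1) * to.rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to.rad) * cos(lat2 * to.rad) * sin(dlon / 2)^2
  2 * radius * asin(sqrt(pmin(1, a)))
}
haversine.km(50, 8, 51, 8)   # one degree of latitude, about 111 km
```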


If file.output is set as TRUE, the function writes the following files to a folder named with a bar-code and the date of analysis:


Matrix of relatedness values for each population.


Matrix of geographic distances for each population.

Depends on selected estimators and mode of analysis. Either a summary of correlation of relatedness with geographic distance for each population or a summary of tests for relatedness within populations compared to reference populations is written to the file.


Matrix of relatedness values calculated from randomized reference population for half siblings.


Matrix of relatedness values calculated from randomized reference population for non related individuals.


Matrix of relatedness values calculated from randomized reference population for full siblings.

Contains a UPGMA cluster dendrogram of relatedness values and a histogram of relatedness values per locus and for loci overall.


Contains a regression plot and linear fit for geographic distance and genetic relatedness.

Summary of analysis of F statistics and allele/genotype frequencies.
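A UPGMA dendrogram of the kind listed among the output files can be sketched with hclust from the stats package, where average linkage corresponds to UPGMA. The relatedness matrix is made up, and converting similarity to dissimilarity as 1 - relatedness is an assumption for illustration, not necessarily what the package does.

```r
# Hypothetical symmetric relatedness matrix for four individuals.
ids <- c("s1", "s2", "s3", "s4")
rel <- matrix(c(1.0, 0.5, 0.1, 0.0,
                0.5, 1.0, 0.2, 0.1,
                0.1, 0.2, 1.0, 0.6,
                0.0, 0.1, 0.6, 1.0),
              nrow = 4, dimnames = list(ids, ids))

# Turn similarity into dissimilarity; "average" linkage is UPGMA.
tree   <- hclust(as.dist(1 - rel), method = "average")
groups <- cutree(tree, k = 2)   # split the tree into two clusters
# plot(tree) would draw the dendrogram.
```

NA entries in the matrix would make hclust fail, which is consistent with the note that file output may need to be switched off when many pairwise values are NA.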

The function returns the following objects as one list:


Settings of the calculation are passed to this list object.


Mean relatedness for empirical population over all loci.


Summarized relatedness statistics with thresholds and randomized populations from the dataset.


Statistical analysis of the number of siblings found for each population.


Thresholds for relatedness if "Bxy" or "Mxy" is selected as estimator.


F_{is} values and statistics for each population if Fis = TRUE.


Summary of the linear regression on distance data.
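A regression of relatedness on geographic distance can be sketched with lm from the stats package. The pairwise values are made up and the column names are hypothetical; the structure of the summary object returned by Demerelate itself is not reproduced here.

```r
# Hypothetical pairwise values: geographic distance and relatedness.
pairs.df <- data.frame(distance    = c(1, 2, 3, 4, 5, 6),
                       relatedness = c(0.52, 0.45, 0.41, 0.30, 0.26, 0.20))

fit <- lm(relatedness ~ distance, data = pairs.df)
fit.summary <- summary(fit)
fit.summary$coefficients   # estimate, standard error, t and p value
# plot(pairs.df); abline(fit) would draw the regression plot.
```

A negative slope indicates that relatedness declines with distance in these illustrative data.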


Philipp Kraemer


Armstrong, W. (2012) fts: R interface to tslib (a time series library in C++). R package version 0.7.7.
Blouin, M., Parsons, M., Lacaille, V. and Lotz, S. (1996) Use of microsatellite loci to classify individuals by relatedness. Molecular Ecology, 5, 393-401.
Hardy, O.J. and Vekemans, X. (1999) Isolation by distance in a continuous population: reconciliation between spatial autocorrelation analysis and population genetics models. Heredity, 83, 145-154.
Li, C.C., Weeks, D.E. and Chakravarti, A. (1993) Similarity of DNA fingerprints due to chance and relatedness. Human Heredity, 43, 45-52.
Li, C.C. and Horvitz, D.G. (1953) Some methods of estimating the inbreeding coefficient. American Journal of Human Genetics, 5, 107-17.
Loiselle, B.A., Sork, V.L., Nason, J. and Graham, C. (1995) Spatial genetic structure of a tropical understory shrub, Psychotria officinalis (Rubiaceae). American Journal of Botany, 82, 1420-1425.
Lynch, M. (1988) Estimation of relatedness by DNA fingerprinting. Molecular Biology and Evolution, 5(5), 584-599.
Lynch, M. and Ritland, K. (1999) Estimation of pairwise relatedness with molecular markers. Genetics, 152, 1753-1766.
Oliehoek, P. A. et al. (2006) Estimating relatedness between individuals in general populations with a focus on their use in conservation programs. Genetics, 173, 483-496.
Queller, D.C. and Goodnight, K.F. (1989) Estimating relatedness using genetic markers. Evolution, 43, 258-275.
Ritland, K. (1999) Estimators for pairwise relatedness and individual inbreeding coefficients. Genetics Research, 67, 175-185.
Wang, J. (2002) An estimator for pairwise relatedness using molecular markers. Genetics, 160, 1203-1215.

See Also

inputformat, Emp.calc, stat.pops, F.stat



     ## The data set demerelpop is used to calculate Blouin's allele
     ## sharing index (Mxy) on population data. Pairs are set to 10 to
     ## keep the run time short. For statistical reasons you may want to
     ## use more pairs to model relatedness in your final results
     ## (1000 pairs are recommended).

     library(Demerelate)
     data(demerelpop)
     dem.results <- Demerelate(demerelpop[,1:6], value="Mxy", 
                    file.output=FALSE, object=TRUE, pairs=10)

     ## Demerelate can be executed with several different estimators
     ## (argument value). You should consult the references to decide
     ## which estimator may be useful in your case.
     ## Be careful: some estimators may be biased in situations with
     ## no reference populations or violation of Hardy-Weinberg
     ## equilibrium.

Demerelate documentation built on May 30, 2017, 8:14 a.m.