knitr::opts_chunk$set(collapse = TRUE, comment = ">", dev = 'pdf')

"Eager Beginners" Manual for RClone package

RClone data format: one population


 

A. Introduction to RClone

 

RClone is an R package version of the GenClone program (Arnaud-Haond & Belkhir 2007). It analyses genotype data (SSR, SNP, ...), tests for clonality and describes spatial clonal organisation. Its major improvements are the handling of multiple populations and the definition of MLLs (MultiLocus Lineages, i.e. slightly distinct MultiLocus Genotypes) through simulations.

 

RClone allows:

  1. Description of data set:
     - discrimination of MLG (MultiLocus Genotypes);
     - test for reliability of data (in terms of loci and sampling).
  2. Determination of MLL (MultiLocus Lineages):
     - psex/psex Fis with p-value computation;
     - genetic distance matrix computation and threshold definition.
  3. Genotypic diversity and evenness indices calculation:
     - Simpson complement;
     - Shannon-Wiener diversity and evenness indices;
     - Hill's Simpson reciprocal;
     - Pareto index.
  4. Spatial organisation of MLG/MLL:
     - spatial autocorrelation methods;
     - clonal subrange estimation;
     - Aggregation and Edge Effect indices estimation.

Some of these analyses can be applied to datasets without clones.

 

B. RClone data format: one population

RClone functions work on diploid or haploid datasets containing one or several populations.

If you have several populations in your dataset, see the other vignette, RClone_qmsevpops.

 

C. General format

If you have haploid data, you can skip to "4, For GenClone users" or to "D. Description of data set".

To use RClone functions, your data table must look like:

library(RClone)
data(posidonia)
knitr::kable(posidonia[1:10,1:8], align = "c")

There is only one allele per column and, within each locus, alleles are sorted in increasing order.

This is mandatory for all RClone functions.
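To make the required layout concrete, here is a minimal base R sketch, independent of RClone, that sorts the two alleles of each locus within every row. The table `toy`, its column names and the helper `sort_pairs` are hypothetical and only illustrate the rule; they are not part of the package.

```r
# Hypothetical diploid table: two loci, two allele columns per locus
toy <- data.frame(locus1_1 = c(232, 228), locus1_2 = c(228, 230),
                  locus2_1 = c(150, 152), locus2_2 = c(150, 148))

# Sort the two alleles of each locus (columns j and j + 1) within every row
sort_pairs <- function(tab) {
  out <- tab
  for (j in seq(1, ncol(tab), by = 2)) {
    out[[j]]     <- pmin(tab[[j]], tab[[j + 1]])  # smaller allele first
    out[[j + 1]] <- pmax(tab[[j]], tab[[j + 1]])  # larger allele second
  }
  out
}

sorted_toy <- sort_pairs(toy)
sorted_toy
```

After this step the first allele of each locus is never larger than the second, which is the ordering described above.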

As formatting can be a source of error, we included functions to help you format your diploid data:

 

1, The simple case: you already have a one-allele per column table

data(posidonia)

sort_all(posidonia)

2, The classic infile you could have: one locus per column

#Let's create your example table:
test <- matrix("232/231", ncol = 2, nrow = 2)
colnames(test) <- paste("locus", 1:2, sep = "_")


#Use :
data1 <- convert_GC(as.data.frame(test), 3, "/")
data1
knitr::kable(data1, align = "c")

We used "3" because each allele is coded with 3 digits, and "/" because it is the character separating the two alleles of a locus in the input table.
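The role of the separator can be illustrated with a few lines of base R. This is only a sketch of the idea, not the code used by `convert_GC`; the vector `geno` and the column names are made up for the example.

```r
# Hypothetical one-locus-per-column genotypes, alleles of length 3 separated by "/"
geno <- c("232/231", "228/230")

# Split each genotype on the separator, then sort the two alleles
parts <- strsplit(geno, "/", fixed = TRUE)
alleles <- t(vapply(parts, function(p) sort(p), character(2)))
colnames(alleles) <- c("locus_1_1", "locus_1_2")
alleles
```

Because all alleles have the same length, sorting them as character strings gives the same order as sorting them as numbers.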

3, You already work with Adegenet

Similar to case number 2, except you have to export your genind data into table first:

#library(adegenet)
#with data1, a genind object from Adegenet:

test <- genind2df(data1, sep = "/") #alleles of a locus separated by "/"
data2 <- convert_GC(test, 3, "/") 
#only if your alleles are of length "3"

4, For GenClone users

Warning: your infile must include all the available information, such as locus names and ploidy level (which are not mandatory for GenClone).

data(infile)

#This is nearly a GenClone file, type:
write.table(infile, "infile.csv", col.names = FALSE, row.names = FALSE, sep = ";")

#Now you have a formatted GenClone file:
res <- transcript_GC("infile.csv", ";", 2, 7, 3)
posidonia <- res$data_genet
coord_posidonia <- res$data_coord

You might need to save your "infile.txt" as "infile.csv", check that geographic coordinates use "." and not "," as the decimal mark, and use ";" as the field separator. In the call above:
- "2" is the ploidy level; it would be "1" for haploid data;
- "7" is the number of loci;
- "3" is the allele length. Posidonia alleles are always of length "3".

 

D. Description of data set

D.1 Discrimination of MLG

List unique alleles per locus:

Basic commands:

data(posidonia)

list_all_tab(posidonia)

or, for haploid data:

list_all_tab(haplodata, haploid = TRUE)

Results:

list_all_tab(posidonia)
data(posidonia)
knitr::kable(list_all_tab(posidonia), align = "c")

List MLG:

Basic commands:

MLG_tab(posidonia)

or, for haploid data:

MLG_tab(haplodata)

Results:

MLG_tab(posidonia)
knitr::kable(MLG_tab(posidonia)[1:5,], align = "c")
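As a rough illustration of what discriminating MLGs means, the base R sketch below groups sampling units that share exactly the same alleles at every locus. The table `toy` is hypothetical and the approach is a simplification; it does not reproduce the output format of `MLG_tab`.

```r
# Hypothetical table: 3 units, 2 loci, alleles already sorted within each locus
toy <- data.frame(l1_1 = c(228, 228, 230), l1_2 = c(232, 232, 232),
                  l2_1 = c(150, 150, 150), l2_2 = c(152, 152, 152))

# Build one key per unit by pasting all its alleles together
key <- do.call(paste, c(toy, sep = "_"))

# Units with the same key belong to the same multilocus genotype
mlg_id <- match(key, unique(key))
table(mlg_id)
```

Here units 1 and 2 form one MLG and unit 3 a second one.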

Allelic frequencies:

Basic commands:

freq_RR(posidonia)

or, for haploid data:

freq_RR(haplodata, haploid = TRUE)

Options:

freq_RR(posidonia) #on ramets
freq_RR(posidonia, genet = TRUE) #on genets
freq_RR(posidonia, RR = TRUE) #Round-Robin methods

Results:

freq_RR(posidonia)
res <- cbind(freq_RR(posidonia), freq_RR(posidonia, genet = TRUE)[,3], freq_RR(posidonia, RR = TRUE)[,3])[1:7,]
colnames(res)[3:5] <- c("freq_ramet", "freq_genet", "freq_RR")
knitr::kable(res, align = "c")
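The difference between frequencies computed on ramets and on genets can be seen on a toy locus. The sketch below uses base R only and hypothetical data; it keeps every sampling unit for the ramet estimate and one copy of each distinct genotype for the genet estimate. The Round-Robin method is not covered here.

```r
# Hypothetical single locus: 4 ramets, the first 3 share the same genotype
toy <- data.frame(l1_1 = c(228, 228, 228, 230), l1_2 = c(232, 232, 232, 232))

# Allele frequencies as a plain named numeric vector
allele_freq <- function(tab) {
  counts <- table(unlist(tab))
  setNames(as.numeric(counts) / sum(counts), names(counts))
}

freq_ramet <- allele_freq(toy)          # every sampled unit counts
freq_genet <- allele_freq(unique(toy))  # one copy per distinct genotype
rbind(freq_ramet, freq_genet)
```

The repeated genotype inflates the frequency of allele 228 on ramets (0.375) compared with genets (0.25).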

 

D.2 Tests for reliability of loci and subsampling of individuals

On loci

Basic commands:

sample_loci(posidonia, nbrepeat = 1000)

or, for haploid data:

sample_loci(haplodata, haploid = TRUE, nbrepeat = 1000)

Options:

sample_loci(posidonia, nbrepeat = 1000, He = TRUE) #with He results
sample_loci(posidonia, nbrepeat = 1000, graph = TRUE) #graph displayed
sample_loci(posidonia, nbrepeat = 1000, bar = TRUE) #progression bar
                                                    #could be time consuming
sample_loci(posidonia, nbrepeat = 1000, export = TRUE) #graph export in .eps

Results:

res <- sample_loci(posidonia, nbrepeat = 1000, He = TRUE) #time consuming
names(res)
data(resvigncont)
names(resvigncont$res_SU1)
#Results: MLG
res$res_MLG
knitr::kable(resvigncont$res_SU1$res_MLG, align = "c")
#Results: alleles
res$res_alleles
knitr::kable(resvigncont$res_SU1$res_alleles, align = "c")
#Results: raw data
#res$raw_He
#res$raw_MLG
#res$raw_all
boxplot(res$raw_MLG, main = "Genotype accumulation curve",
    xlab = "Number of loci sampled", ylab = "Number of multilocus genotypes") 
boxplot(resvigncont$res_SU1$raw_MLG, main = "Genotype accumulation curve", xlab = "Number of loci sampled", ylab = "Number of multilocus genotypes") 
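The idea behind the genotype accumulation curve can be sketched in base R: draw k loci at random, count the distinct genotypes they discriminate, and repeat. The table `toy`, the helper `count_mlg` and the value of `nbrepeat` below are assumptions made for the example, not the internals of `sample_loci`.

```r
set.seed(1)

# Hypothetical table: 4 units, 3 diploid loci (2 columns per locus)
toy <- data.frame(l1_1 = c(1, 1, 1, 2), l1_2 = c(2, 2, 2, 2),
                  l2_1 = c(5, 5, 6, 5), l2_2 = c(6, 6, 6, 6),
                  l3_1 = c(3, 4, 3, 3), l3_2 = c(4, 4, 4, 4))
n_loci <- ncol(toy) / 2

# Number of distinct genotypes when only the given loci are used
count_mlg <- function(tab, loci) {
  cols <- sort(c(2 * loci - 1, 2 * loci))
  nrow(unique(tab[, cols, drop = FALSE]))
}

# For each number of loci k, resample loci nbrepeat times
nbrepeat <- 50
acc <- sapply(seq_len(n_loci), function(k) {
  replicate(nbrepeat, count_mlg(toy, sample(n_loci, k)))
})
colMeans(acc)  # mean number of MLGs for 1, 2 and 3 loci
```

If the curve reaches a plateau before all loci are used, the set of loci is likely sufficient to discriminate the genotypes present in the sample.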

Same on units

Basic commands:

sample_units(posidonia, nbrepeat = 1000)

or, for haploid data:

sample_units(haplodata, haploid = TRUE, nbrepeat = 1000)

This sub-sampling analysis delivers basic estimates of richness and diversity for an increasing number of sampling units.
These estimates can be used to standardise comparisons between populations sampled with different effort.

 

E. Determination of MLL

E.1 psex/psex Fis with pvalue computation

pgen, psex and p-values

Basic commands:

pgen(posidonia)
psex(posidonia)

or, for haploid data:

pgen(haplodata, haploid = TRUE)
psex(haplodata, haploid = TRUE)

Options (the same for psex and pgen):

#allelic frequencies computation:
psex(posidonia) #psex on ramets
psex(posidonia, genet = TRUE) #psex on genets
psex(posidonia, RR = TRUE) #psex with Round-Robin method
#psex computation
psex(posidonia) #psex with one psex per replica
psex(posidonia, MLGsim = TRUE) #psex MLGsim method
#pvalues:
psex(posidonia, nbrepeat = 100) #with p-values
psex(posidonia, nbrepeat = 1000, bar = TRUE) #with p-values and a progression bar

Results:

res <- psex(posidonia, RR = TRUE, nbrepeat = 1000)
res[[1]] #if nbrepeat != 0, res contains a table of psex values 
                                    #and a vector of sim-psex values
knitr::kable(resvigncont$res_PS2, align = "c")
res[[2]] #sim psex values
resvigncont$res_PS1[[2]]
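For readers who want to see the quantities behind these functions, here is a hedged sketch of the textbook definitions: pgen as the product of allele frequencies times 2^h (h being the number of heterozygous loci), and psex as a binomial probability of observing at least n copies of a genotype among N units. The frequencies, the genotype and the helpers `pgen_toy` and `psex_toy` are hypothetical, and the package may differ in details (for instance in how each replica is handled).

```r
# Hypothetical allele frequencies and one diploid genotype at two loci
freq <- c("228" = 0.4, "232" = 0.6, "150" = 0.5, "152" = 0.5)
genotype <- list(c("228", "232"),   # heterozygous locus
                 c("150", "150"))   # homozygous locus

# pgen: product of allele frequencies, times 2 for each heterozygous locus
pgen_toy <- function(genotype, freq) {
  h <- sum(vapply(genotype, function(g) g[1] != g[2], logical(1)))
  prod(freq[unlist(genotype)]) * 2^h
}
p <- pgen_toy(genotype, freq)

# psex: probability of at least n copies among N units under sexual reproduction
psex_toy <- function(p, n, N) pbinom(n - 1, N, p, lower.tail = FALSE)
psex_toy(p, n = 2, N = 10)
```

A small psex suggests that repeated copies of the genotype are unlikely to come from distinct sexual events.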

 

Fis, pgen Fis, psex Fis and p-values

Not for haploid data!

Fis

Basic commands:

Fis(posidonia)

Options:

Fis(posidonia) #Fis on ramets
Fis(posidonia, genet = TRUE) #Fis on genets
Fis(posidonia, RR = TRUE) #Fis with Round-Robin methods
#RR = TRUE contains two results : a table with allelic frequencies 
                             #and a table with Fis results

Results:

Fis(posidonia, RR = TRUE)[[2]]
knitr::kable(Fis(posidonia, RR = TRUE)[[2]], align = "c")
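As a reminder of what Fis measures, the sketch below computes the textbook estimate 1 - Ho/He for a single hypothetical locus in base R. It uses no small-sample correction, so values returned by the package may differ slightly.

```r
# Hypothetical single diploid locus, one row per unit
locus <- data.frame(a1 = c(228, 228, 230, 228), a2 = c(232, 228, 232, 232))

# Allele frequencies over the 2 * N gene copies
p <- table(unlist(locus)) / (2 * nrow(locus))

Ho <- mean(locus$a1 != locus$a2)  # observed heterozygosity
He <- 1 - sum(p^2)                # expected heterozygosity (uncorrected)
Fis_toy <- 1 - Ho / He
Fis_toy
```

A negative value, as here, indicates an excess of heterozygotes relative to Hardy-Weinberg expectations.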

pgen Fis, psex Fis and p-values

Basic commands (the same for pgen_Fis and psex_Fis):

pgen_Fis(posidonia)

Options:

#allelic frequencies:
psex_Fis(posidonia) #psex Fis on ramets
psex_Fis(posidonia, genet = TRUE) #psex Fis on genets
psex_Fis(posidonia, RR = TRUE) #psex Fis with Round-Robin method
#psex computation
psex_Fis(posidonia) #psex Fis, one for each replica
psex_Fis(posidonia, MLGsim = TRUE) #psex Fis with MLGsim method
#pvalues
psex_Fis(posidonia, nbrepeat = 100) #with p-values
psex_Fis(posidonia, nbrepeat = 1000, bar = TRUE) #with p-values and a progression bar

Results:

res <- psex_Fis(posidonia, RR = TRUE, nbrepeat = 1000)
res[[1]] 
#if nbrepeat != 0, res contains a table of psex values 
                           #and a vector of sim-psex Fis values
knitr::kable(resvigncont$res_PS4, align = "c")
res[[2]] #sim psex Fis values
resvigncont$res_PS3[[2]]

 

E.2 Tests for MLLs occurrence and assessment of their memberships

Genetic distance matrix computation and threshold definition

On a theoretical diploid population with c = 0.9999 (c, clonality rate).

data(popsim)

#genetic distances computation, distance on allele differences:
respop <- genet_dist(popsim)
ressim <- genet_dist_sim(popsim, nbrepeat = 1000) #theoretical distribution: 
                                                  #sexual reproduction
ressimWS <- genet_dist_sim(popsim, genet = TRUE, nbrepeat = 1000) #idem, without selfing
data(popsim)
respop <- resvigncont$respop
ressim <- resvigncont$ressim
ressimWS <- resvigncont$ressimWS
#graph prep.:
p1 <- hist(respop$distance_matrix, freq = FALSE, col = rgb(0,0.4,1,1), main = "popsim", 
            xlab = "Genetic distances", breaks = seq(0, max(respop$distance_matrix)+1, 1))
p2 <- hist(ressim$distance_matrix, freq = FALSE, col = rgb(0.7,0.9,1,0.5), main = "popSR", 
            xlab = "Genetic distances", breaks = seq(0, max(ressim$distance_matrix)+1, 1))
p3 <- hist(ressimWS$distance_matrix, freq = FALSE, col = rgb(0.9,0.5,1,0.3), 
            main = "popSRWS", xlab = "Genetic distances", 
            breaks = seq(0, max(ressimWS$distance_matrix)+1, 1))
limx <- max(max(respop$distance_matrix), max(ressim$distance_matrix), 
            max(ressimWS$distance_matrix))

#graph superposition: 
plot(p1, col = rgb(0,0.4,1,1), freq = FALSE, xlim = c(0,limx), main = "", 
        xlab = "Genetic distances")
plot(p2, col = rgb(0.7,0.9,1,0.5), freq = FALSE, add = TRUE)
plot(p3, col = rgb(0.9,0.5,1,0.3), freq = FALSE, add = TRUE)

#adding a legend:
leg.txt <- c("original data","simulated data", "without selfing")
col <- c(rgb(0,0.4,1,1), rgb(0.7,0.9,1,0.5), rgb(0.9,0.5,1,0.3))
legend("top", fill = col, leg.txt, plot = TRUE, bty = "o", box.lwd = 1.5, 
bg = "white")
#determining alpha2
table(respop$distance_matrix)
#alpha2 = 4
#creating MLL list:
MLLlist <- MLL_generator(popsim, alpha2 = 4)
#or
res <- genet_dist(popsim, alpha2 = 4)
MLLlist <- MLL_generator2(res$potential_clones, MLG_list(popsim))
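The logic of choosing a threshold can be sketched on a toy example in base R. The distance used below simply counts the allele columns that differ between two sorted genotypes; this is an assumed simplification and may not match `genet_dist` exactly. The matrix `toy` and the threshold value are hypothetical.

```r
# Hypothetical genotypes: 3 MLGs, 2 diploid loci, alleles sorted within locus
toy <- rbind(c(228, 232, 150, 152),
             c(228, 232, 150, 154),
             c(230, 234, 148, 156))

# Pairwise number of differing allele columns
n <- nrow(toy)
d <- matrix(0, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- sum(toy[i, ] != toy[j, ])
  }
}
table(d[upper.tri(d)])  # distribution of pairwise distances

# Pairs at or below the threshold are candidates for the same lineage
alpha2 <- 1
candidates <- which(d <= alpha2 & upper.tri(d), arr.ind = TRUE)
candidates
```

In this toy case genotypes 1 and 2 differ by a single allele and would be candidates for the same MLL, while genotype 3 stays apart.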

For haploid data, theoretical example:

respop <- genet_dist(haplodata, haploid = TRUE)
ressim <- genet_dist_sim(haplodata, haploid = TRUE, nbrepeat = 1000)
MLLlist <- MLL_generator(haplodata, haploid = TRUE, alpha2 = 4)
#or
res <- genet_dist(haplodata, haploid = TRUE, alpha2 = 4)
MLLlist <- MLL_generator2(res$potential_clones, haploid = TRUE, MLG_list(haplodata))

 

F. Genotypic diversity, richness and evenness indices calculation

F.1 Classic genotypic indices

Basic commands:

clonal_index(posidonia)

or, with MLL:

clonal_index(popsim, listMLL = MLLlist)

or, for haploid data:

clonal_index(haplodata)

Results:

clonal_index(posidonia)
knitr::kable(resvigncont$rescl, align = "c")
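The indices listed in the introduction can be computed by hand on a small example. The sketch below uses common textbook forms in base R with a hypothetical vector of MLG memberships; the estimators used by `clonal_index` (bias corrections, logarithm base) may differ.

```r
# Hypothetical MLG membership of 6 sampled units
mlg_id <- c(1, 1, 1, 2, 2, 3)

N <- length(mlg_id)                  # number of units
counts <- as.numeric(table(mlg_id))  # units per MLG
G <- length(counts)                  # number of MLGs
p <- counts / N

R <- (G - 1) / (N - 1)                               # clonal richness
H <- -sum(p * log(p))                                # Shannon-Wiener diversity
J <- H / log(G)                                      # Shannon-Wiener evenness
D <- 1 - sum(counts * (counts - 1)) / (N * (N - 1))  # Simpson complement
hill <- 1 / sum(p^2)                                 # reciprocal of Simpson index

c(R = R, H = H, J = J, D = D, Hill = hill)
```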
data(coord_posidonia)

 

F.2 Pareto index

Basic commands:

Pareto_index(posidonia)

or, with MLL:

Pareto_index(popsim, listMLL = MLLlist)

or, for haploid data:

Pareto_index(haplodata)

Options:

Pareto_index(posidonia, graph = TRUE) #classic graphic
Pareto_index(posidonia, legends = 2, export = TRUE) #export option
Pareto_index(posidonia, full = TRUE) #all results

Results:

res <- Pareto_index(posidonia, full = TRUE, graph = TRUE, legends = 2)
names(res)
res$Pareto
res$c_Pareto
#res$regression_results
#res$coords_Pareto #points coordinates
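One common way to describe a Pareto index is as minus the slope of a log-log regression of the proportion of clones reaching at least a given size against that size. The base R sketch below follows that description on hypothetical clone sizes; it is an assumption about the general approach and does not reproduce the exact estimator or the relation between `Pareto` and `c_Pareto` in the package.

```r
# Hypothetical clone sizes (number of ramets per genet)
sizes <- c(1, 1, 1, 1, 2, 2, 3, 5)

# Proportion of clones with at least x ramets, for each observed size x
x <- sort(unique(sizes))
ccdf <- vapply(x, function(s) mean(sizes >= s), numeric(1))

# Log-log regression; the index is minus the slope
fit <- lm(log10(ccdf) ~ log10(x))
beta <- -unname(coef(fit)[2])
beta
```

A steep slope (large index) corresponds to many small clones of similar size, a shallow slope to a few dominant clones.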

 

G. Spatial components of clonality

G.1 Spatial autocorrelation

Basic commands:

autocorrelation(posidonia, coords = coord_posidonia, Loiselle = TRUE)

or, with MLL:

autocorrelation(popsim, coords = coord_sim, Loiselle = TRUE, listMLL = MLLlist)

or, for haploid data:

autocorrelation(haplodata, haploid = TRUE, coords = coord_haplo, Loiselle = TRUE)

Lots of options:

data(posidonia)
data(coord_posidonia)

#kinship distances:
autocorrelation(posidonia, coords = coord_posidonia, Loiselle = TRUE)
autocorrelation(posidonia, coords = coord_posidonia, Ritland = TRUE)

#ramets/genets methods:
autocorrelation(posidonia, coords = coord_posidonia, Loiselle = TRUE) #ramets
autocorrelation(posidonia, coords = coord_posidonia, Loiselle = TRUE, 
                    genet = TRUE, central_coords = TRUE) 
                                            #genets, central coordinates of each MLG
autocorrelation(posidonia, coords = coord_posidonia, Loiselle = TRUE, 
                genet = TRUE, random_unit = TRUE) #genets, one random unit per MLG
autocorrelation(posidonia, coords = coord_posidonia, Loiselle = TRUE, 
                genet = TRUE, weighted = TRUE) #genets, with weighted matrix on kinships

#distance classes construction:
autocorrelation(posidonia, coords = coord_posidonia, Loiselle = TRUE) 
                                                    #10 equidistant classes
distvec <- c(0,10,15,20,30,50,70,76.0411074) 
                        #with 0, min distance and 76.0411074, max distance
autocorrelation(posidonia, coords = coord_posidonia, Loiselle = TRUE, 
                    vecdist = distvec) #custom distance vector
autocorrelation(posidonia, coords = coord_posidonia, Loiselle = TRUE, 
                    class1 = TRUE, d = 7) #7 equidistant classes
autocorrelation(posidonia, coords = coord_posidonia, Loiselle = TRUE, 
                    class2 = TRUE, d = 7) 
                    #7 distance classes with the same number of units in each

#graph options:
autocorrelation(posidonia, coords = coord_posidonia, Ritland = TRUE, graph = TRUE) 
                                                                    #displays graph
autocorrelation(posidonia, coords = coord_posidonia, Ritland = TRUE, export = TRUE) 
                                                                    #export graph

#pvalues computation
autocorrelation(posidonia, coords = coord_posidonia, Ritland = TRUE, nbrepeat = 1000)
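The two ways of building distance classes mentioned in the comments above (equidistant classes, or classes holding the same number of pairs) can be illustrated in base R. The coordinates and the number of classes `k` below are hypothetical, and the sketch does not claim to reproduce how `autocorrelation` builds its classes internally.

```r
# Hypothetical coordinates of 6 sampling units
coords <- data.frame(x = c(0, 3, 10, 20, 40, 41), y = c(0, 4, 0, 5, 10, 12))

# All pairwise geographic distances (15 pairs)
d <- as.numeric(dist(coords))
k <- 3

# Equidistant classes: breaks evenly spaced between 0 and the maximum distance
breaks_equal_width <- seq(0, max(d), length.out = k + 1)
class_width <- cut(d, breaks_equal_width, include.lowest = TRUE)

# Classes with the same number of pairs: breaks taken from quantiles
breaks_equal_count <- quantile(d, probs = seq(0, 1, length.out = k + 1))
class_count <- cut(d, breaks_equal_count, include.lowest = TRUE)

table(class_width)
table(class_count)
```

Equidistant classes are easier to read on a plot, while equal-count classes give each class the same statistical weight.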

Results:

res <- autocorrelation(posidonia, coords = coord_posidonia, Ritland = TRUE, 
                        nbrepeat = 1000, graph = TRUE)
plot(resvigncont$resauto$Main_results[,3], resvigncont$resauto$Main_results[,6], main = "Spatial autocorrelation analysis",
ylim = c(-0.2,0.2), type = "l", xlab = "Spatial distance", ylab = "Coancestry (Fij)")
points(resvigncont$resauto$Main_results[,3], resvigncont$resauto$Main_results[,6], pch = 20)
abline(h = 0, lty = 3)
names(res)
names(resvigncont$resauto)
res$Main_results #enables graph reproduction
knitr::kable(resvigncont$resauto$Main_results, align = "c")
apply(res$Main_results, 2, mean)[6] #mean Fij
apply(resvigncont$resauto$Main_results, 2, mean)[6] #mean Fij
res$Slope_and_Sp_index #gives b and Sp indices
knitr::kable(resvigncont$resauto$Slope_and_Sp_index, align = "c")
#raw data:
#res$Slope_resample
#res$Kinship_resample
#res$Matrix_kinship_results
#res$Class_kinship_results 
#res$Class_distance_results

 

G.2 Clonal subrange

Basic commands:

clonal_sub(posidonia, coords = coord_posidonia)

or, with MLL:

clonal_sub(popsim, coords = coord_sim, listMLL = MLLlist)

or, for haploid data:

clonal_sub(haplodata, haploid = TRUE, coords = coord_haplo)

Options: same distance classes definition as autocorrelation:

clonal_sub(posidonia, coords = coord_posidonia) #basic, with 10 equidistant classes
distvec <- c(0,10,15,20,30,50,70,76.0411074) 
                        #with 0, min distance and 76.0411074, max distance
clonal_sub(posidonia, coords = coord_posidonia, vecdist = distvec) 
                                                #custom distance classes
clonal_sub(posidonia, coords = coord_posidonia, class1 = TRUE, d = 7) 
                                                #7 equidistant classes
clonal_sub(posidonia, coords = coord_posidonia, class2 = TRUE, d = 7) 
                #7 distance classes with the same number of units in each

Results:

res <- clonal_sub(posidonia, coords = coord_posidonia)
res[[1]] #Global clonal subrange
resvigncont$rescs[[1]]
res$clonal_sub_tab  #details per class
knitr::kable(resvigncont$rescs$clonal_sub_tab, align ="c")
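The global clonal subrange is usually understood as the largest distance separating two units that share the same genotype. Under that assumption, a base R sketch on hypothetical coordinates and MLG memberships looks like this; it ignores the per-class details returned by `clonal_sub`.

```r
# Hypothetical coordinates and MLG membership of 4 units
coords <- data.frame(x = c(0, 3, 10, 20), y = c(0, 4, 0, 5))
mlg_id <- c(1, 1, 2, 1)

# Pairwise geographic distances
d <- as.matrix(dist(coords))

# Keep each pair once, and only pairs of units sharing the same MLG
same_clone <- outer(mlg_id, mlg_id, "==") & upper.tri(d)

# Largest distance between two units of the same MLG
clonal_subrange <- max(d[same_clone])
clonal_subrange
```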

 

G.3 Aggregation index

Basic commands:

agg_index(posidonia, coords = coord_posidonia)

or, with MLL:

agg_index(popsim, coords = coord_sim, listMLL = MLLlist)

or, for haploid data:

agg_index(haplodata, coords = coord_haplo)

Options:

agg_index(posidonia, coords = coord_posidonia, nbrepeat = 100) #pvalue computation
agg_index(posidonia, coords = coord_posidonia, nbrepeat = 1000, bar = TRUE) 
                                                            #could be time consuming

Results:

res <- agg_index(posidonia, coords = coord_posidonia, nbrepeat = 1000) 
res$results #Aggregation index
knitr::kable(resvigncont$resagg$results, align = "c")
#res$simulation #vector of sim aggregation index

 

G.4 Edge Effect

Basic commands:

#for posidonia, the center of the quadrat is at 40,10
edge_effect(posidonia, coords = coord_posidonia, center = c(40,10))

or, with MLL:

edge_effect(popsim, coords = coord_sim, center = c(40,10), listMLL = MLLlist)

or, for haploid data:

edge_effect(haplodata, coords = coord_haplo, center = c(40,10))

Options:

edge_effect(posidonia, coords = coord_posidonia, center = c(40,10), nbrepeat = 100) 
                                                                    #pvalue computation
edge_effect(posidonia, coords = coord_posidonia, center = c(40,10), nbrepeat = 1000, 
                                                    bar = TRUE) #could be time consuming

Results:

res <- edge_effect(posidonia, coords = coord_posidonia, center = c(40,10), nbrepeat = 1000)
res$results #Edge Effect index
knitr::kable(resvigncont$resee$results, align = "c")
#res$simulation #vector of sim edge effect index

 

H. BONUS: "Ready to use" Table

Summary function of main results:

Basic commands:

GenClone(posidonia, coords = coord_posidonia)

or, with MLL:

GenClone(popsim, coords = coord_sim, listMLL = MLLlist)

or, for haploid data:

GenClone(haplodata, haploid = TRUE, coords = coord_haplo)

Options:

GenClone(posidonia, coords = coord_posidonia, nbrepeat = 100) #pvalues
GenClone(posidonia, coords = coord_posidonia, nbrepeat = 1000, bar = TRUE) 
                                                    #could be time consuming

Results:

GenClone(posidonia, coords = coord_posidonia)
knitr::kable(resvigncont$resgen[,1:10], longtable = TRUE, align = "c")
knitr::kable(resvigncont$resgen[,11:17], longtable = TRUE, align = "c")
knitr::kable(resvigncont$resgen[,18:24], longtable = TRUE, align = "c")



RClone documentation built on May 2, 2019, 4:18 a.m.