This vignette detects outlier points in the time courses of an experiment in the PhenoArch greenhouse, using the locfit smoothing function from the locfit library [2]. For each time course in a dataset, a locfit smoother is fitted and a prediction confidence interval is calculated as Y_hat ± threshold × Y_hat_se, where Y_hat is the fitted value and Y_hat_se its standard error. A point is declared an outlier if it falls outside this interval. The user chooses the threshold.
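To make the flagging rule concrete, here is a minimal sketch on simulated data. It assumes the interval is built from the fitted values and standard errors returned by `predict()` with `se.fit = TRUE`; the internals of `flagPointLocfit` may differ. The `toy` data frame, its column names, and the planted outlier are hypothetical.

```r
library(locfit)

set.seed(1)

# Hypothetical time course: a smooth growth curve plus small noise
toy <- data.frame(x = seq(0, 300, by = 2))
toy$y <- 100 / (1 + exp(-(toy$x - 150) / 30)) + rnorm(nrow(toy), sd = 1)

# Plant one obvious outlier in the middle of the course (x = 150)
toy$y[76] <- toy$y[76] + 40

threshold <- 8

# Local regression with a fixed bandwidth of 30 (assumed to mirror locfit.h = 30)
fit <- locfit(y ~ lp(x, h = 30), data = toy)

# Fitted values and their standard errors at the observed x values
pred <- predict(fit, newdata = toy, se.fit = TRUE)

# Interval: Y_hat +/- threshold * Y_hat_se
lower <- pred$fit - threshold * pred$se.fit
upper <- pred$fit + threshold * pred$se.fit

# 1 = outside the interval, 0 = inside
toy$outlier <- as.integer(toy$y < lower | toy$y > upper)

toy[toy$outlier == 1, ]
```

A larger threshold widens the interval and flags fewer points; a smaller one is stricter.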
library(lubridate)
library(dplyr)
library(locfit)
library(phisStatR)
library(ggplot2)  # needed for the ggplot() call at the end of this vignette
In this vignette, we use a toy data set of the phisStatR library (anonymized real data set).
mydata <- plant1
str(mydata)
# Keep only the rows with a known thermal time
mydata <- filter(mydata, !is.na(thermalTime))
I chose a smoothing parameter (locfit.h) of 30 and a threshold of 8 to detect the outlier points.
resu1 <- flagPointLocfit(datain = mydata, trait = "biovolume", xvar = "thermalTime",
                         loopID = "Ref", locfit.h = 30, threshold = 8)
The output report can be large (more than 1 MB). To keep the package sub-directories small, I plot only the first genotypes (the first 30 here)...
myindex <- as.character(unique(resu1[[1]][, "Ref"]))
myindex <- myindex[1:30]
# Plot the selected time courses in batches of 15
for (i in seq(1, length(myindex), by = 15)) {
  myvec <- myindex[seq(i, i + 14, 1)]
  plotFlagPoint(smoothin = resu1[[1]], loopID = "Ref", myselect = myvec)
}
filter(resu1[[1]],outlier==1)
# Replace Ref with the ID column of your own data frame
if (is.null(resu1[[3]])) {
  print("All the time courses have more than 4 points.")
} else {
  ggplot(data = resu1[[3]], aes(x = x, y = y)) +
    geom_point() +
    facet_wrap(~Ref)
}
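The check above suggests that the third element of the result holds the time courses too short to be smoothed. As a rough sketch, assuming the cutoff is 4 points or fewer (inferred from the message, not confirmed by the package documentation), such time courses can be found in any data frame with dplyr. The `toy` data and its values are hypothetical.

```r
library(dplyr)

# Hypothetical data: plant "A" has 6 observations, plant "B" only 3
toy <- data.frame(
  Ref = c(rep("A", 6), rep("B", 3)),
  thermalTime = c(1:6, 1:3),
  biovolume = c(2, 4, 7, 11, 16, 22, 1, 3, 6),
  stringsAsFactors = FALSE
)

# Keep only the time courses with 4 points or fewer (assumed cutoff)
short_courses <- toy %>%
  group_by(Ref) %>%
  filter(n() <= 4) %>%
  ungroup()

unique(short_courses$Ref)
```

Inspecting these time courses before running the detection shows which plants cannot be assessed by the smoother.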
sessionInfo()