Read the signal_intensities files, calculate the beta values, and filter out unqualified samples and sites. Plot a heat map, box plot, density plot and density bean plot of the CpG sites, and a bar plot of the detection P-values of the samples.
loaddata(fileDir, is_beta = FALSE, beta_method = c("M/(M+U)", "M/(M+U+100)"),
         groupfile, samplefilter = FALSE, contin = c("ON", "OFF"),
         samplefilterperc = 0.75, XYchrom = c(FALSE, "X", "Y", c("X", "Y")),
         sitefilter = FALSE, sitefilterperc = 0.75, filterDecetP = 0.05,
         normalization = FALSE, transfm = c(FALSE, "arcsinsqr", "logit"),
         snpfilter = c(FALSE, "within_10", "prob_snp"), gcase = "case",
         gcontrol = "control", skip = 0, imputation = c("mean", "min", "knn"),
         knn.k = 10)
fileDir |
The name of the folder that contains the samples' signal_intensities files. |
is_beta |
Logical. Whether the signal_intensities files already contain beta values. |
beta_method |
The method used to calculate the beta value: 'M/(M+U)' or 'M/(M+U+100)'. |
groupfile |
The name of the phenotype file. |
samplefilter |
Logical. Whether to filter out samples in which most detection P-values are not significant. |
contin |
'ON' means the phenotype is continuous, such as age. 'OFF' means it is discrete. |
samplefilterperc |
A number in [0,1]. Samples whose proportion of significant detection P-values is less than this number are filtered out. |
XYchrom |
Whether CpG sites on the X or Y chromosome should be filtered out. |
sitefilter |
Logical. Whether to filter out sites at which most detection P-values are not significant. |
sitefilterperc |
A number in [0,1]. Sites whose proportion of significant detection P-values is less than this number are filtered out. |
filterDecetP |
Threshold for a significant detection P-value. Typically 0.05 or 0.01. |
normalization |
Logical. Whether to normalize across different chips. |
transfm |
The data transformation applied to the beta values, if any: FALSE, 'arcsinsqr' or 'logit'. |
snpfilter |
Whether CpG sites that contain SNPs within 10 bp or 50 bp should be filtered out. |
gcase |
The name of the case group when contin is 'OFF'. |
gcontrol |
The name of the control group when contin is 'OFF'. |
skip |
Integer. The number of lines of the data file to skip before beginning to read the signal_intensities data; the first row read must contain signal values. |
imputation |
The method used to fill in NA values: 'mean', 'min' or 'knn'. |
knn.k |
The number of neighbours (k) used when imputation is 'knn'. |
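The filtering and transformation arguments above can be illustrated with a minimal base R sketch. It is not the package's implementation. The toy matrix `detP`, the strict `<` comparison against `filterDecetP`, the order of the two filters (samples first, then sites), and the textbook forms of the two transformations are all assumptions made for illustration.

```r
# Toy detection P-value matrix (made-up values): rows are CpG sites, columns are samples.
detP <- matrix(
  c(0.001, 0.20, 0.01,
    0.002, 0.30, 0.02,
    0.003, 0.40, 0.03,
    0.600, 0.50, 0.70),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cg", 1:4), c("S1", "S2", "S3"))
)

filterDecetP     <- 0.05  # significance threshold for a detection P-value
samplefilterperc <- 0.75  # minimum proportion of significant sites per sample
sitefilterperc   <- 0.75  # minimum proportion of significant samples per site

# Assumption: "significant" means strictly below the threshold.
significant <- detP < filterDecetP

# Sample filter: drop samples whose proportion of significant values is below the cutoff.
keep_samples <- colMeans(significant) >= samplefilterperc
significant  <- significant[, keep_samples, drop = FALSE]

# Site filter (assumed to run on the remaining samples).
keep_sites <- rowMeans(significant) >= sitefilterperc
filtered   <- detP[keep_sites, keep_samples, drop = FALSE]

# Textbook forms of the two transformations named by transfm (assumed, not confirmed).
beta           <- c(0.1, 0.5, 0.9)
beta_arcsinsqr <- asin(sqrt(beta))
beta_logit     <- log(beta / (1 - beta))
```

In this toy data, sample S2 has no significant detection P-values and is dropped, and site cg4 is then dropped because neither remaining sample detects it.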
loaddata is designed to load and process the methylation data for the package. It provides two methods to calculate the beta value, which is the methylation ratio: M/(M+U) and M/(M+U+100), where M is the methylated signal intensity and U is the unmethylated signal intensity.

The methylation data are supplied as one file per sample. Each signal_intensities file has four columns: CpG ID, Methylated_Intensity, Unmethylated_Intensity and Detection_P_value.

The groupfile describes the phenotype of the samples and distinguishes case from control. Samples in the same group have the same label. The sample IDs are the same as the names of the corresponding signal_intensities files (without file suffixes).

loaddata also calls other functions to plot the heat map, box plot, density plot and density bean plot of the CpG sites, and a bar plot of the detection P-values of the samples.
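As a rough illustration of the two beta formulas, the following base R sketch computes both from a toy table. The helper `compute_beta`, the data frame `signal`, the column name `CpG_ID`, and all values are hypothetical and only mirror the four columns described above; the package's own code may differ.

```r
# Toy stand-in for one sample's signal_intensities table (values are made up).
signal <- data.frame(
  CpG_ID                 = c("cg0001", "cg0002", "cg0003"),
  Methylated_Intensity   = c(900, 100, 0),
  Unmethylated_Intensity = c(100, 900, 0),
  Detection_P_value      = c(0.001, 0.002, 0.80)
)

# Hypothetical helper: beta = M / (M + U), optionally with the +100 offset.
compute_beta <- function(M, U, beta_method = c("M/(M+U)", "M/(M+U+100)")) {
  beta_method <- match.arg(beta_method)
  offset <- if (beta_method == "M/(M+U+100)") 100 else 0
  M / (M + U + offset)
}

M <- signal$Methylated_Intensity
U <- signal$Unmethylated_Intensity

beta_plain  <- compute_beta(M, U, "M/(M+U)")      # 0.9, 0.1, NaN
beta_offset <- compute_beta(M, U, "M/(M+U+100)")  # third value is 0 rather than NaN
```

The third row shows one practical difference between the methods: when both intensities are zero, M/(M+U) is undefined, whereas the offset keeps the ratio finite.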
loaddata returns an object of class LincMethy450, and also produces several plots that describe the data.
Hui Zhi zhihui013201@gmail.com, Yanxun Su hmu_yanxunsu@163.com, Xin Li lixin920126@163.com
## Not run:
## The directory of the phenotype file and the 450k methylation sample data
Dir <- system.file("extdata/localdata", package = "LncDM")
setwd(Dir)
## The phenotype file's name
groupfile <- "BRCA_pheno.txt"
## The methylation data in the subdirectory "Level_2" are only example data. When you
## run this function, please prepare complete sample files and change the directory
## to your own.
loadData <- loaddata(fileDir = "Level_2", is_beta = FALSE, beta_method = "M/(M+U)",
                     groupfile = groupfile, samplefilter = TRUE, contin = "OFF",
                     samplefilterperc = 0.75, XYchrom = c(FALSE, "X", "Y"),
                     sitefilter = TRUE, sitefilterperc = 0.75, filterDecetP = 0.05,
                     normalization = FALSE, transfm = FALSE,
                     snpfilter = c(FALSE, "prob_snp"), gcase = "case",
                     gcontrol = "control", skip = 2, imputation = "knn", knn.k = 10)
## Save loadData in order to calculate dms, dmr and dme
save(loadData, file = "loadData.Rdata", compress = "xz")
## End(Not run)