meta3d: Detect rhythmic signals from time-series datasets with...

View source: R/meta3dMainF.R

Description

This function uses one of ARSER, JTK_CYCLE, or Lomb-Scargle to detect rhythmic signals in time-series datasets that contain individual information.

Usage

meta3d(datafile, designfile, outdir = "metaout", filestyle,
  design_libColm, design_subjectColm, minper = 20, maxper = 28,
  cycMethodOne = "JTK", timeUnit = "hour", design_hrColm,
  design_dayColm = NULL, design_minColm = NULL,
  design_secColm = NULL, design_groupColm = NULL,
  design_libIDrename = NULL, adjustPhase = "predictedPer",
  combinePvalue = "fisher", weightedMethod = TRUE,
  outIntegration = "both", ARSmle = "auto", ARSdefaultPer = 24,
  dayZeroBased = FALSE, outSymbol = "", parallelize = FALSE,
  nCores = 1)

Arguments

datafile

a character string. The name of the data file containing the time-series experimental values of all individuals.

designfile

a character string. The name of the experimental design file, which must contain at least the library ID (column names of datafile), the subject ID (the individual corresponding to each library ID), and the sampling time of each library ID.

outdir

a character string. The name of the directory used to store output files.

filestyle

a character vector (length 1 or 3). The data format of the input files; must be "txt", "csv", or a character vector containing the field separator character (sep), the quoting character (quote), and the character used for decimal points (dec; for details see read.table).

design_libColm

a numeric value. The index (from left to right) of the column storing the library ID in designfile.

design_subjectColm

a numeric value. The index (from left to right) of the column storing the subject ID in designfile.

minper

a numeric value. The minimum period length of the rhythms of interest. The default is 20 for circadian rhythms.

maxper

a numeric value. The maximum period length of the rhythms of interest. The default is 28 for circadian rhythms.

cycMethodOne

a character string. The method used to analyze the time-series data of each individual; must be one of "ARS" (ARSER), "JTK" (JTK_CYCLE), or "LS" (Lomb-Scargle).

timeUnit

a character string. The basic time unit; must be one of "day", "hour" (default, for circadian studies), "minute", or "second", depending on the experimental design.

design_hrColm

a numeric value. The index (from left to right) of the column storing the sampling hour of each time point in designfile. If there is no such column in designfile, set it to NULL.

design_dayColm

a numeric value. The index (from left to right) of the column storing the sampling day of each time point in designfile. If there is no such column in designfile, set it to NULL (default).

design_minColm

a numeric value. The index (from left to right) of the column storing the sampling minute of each time point in designfile. If there is no such column in designfile, set it to NULL (default).

design_secColm

a numeric value. The index (from left to right) of the column storing the sampling second of each time point in designfile. If there is no such column in designfile, set it to NULL (default).

design_groupColm

a numeric value. The index (from left to right) of the column storing the experimental group of each individual in designfile. If there is no such column in designfile, set it to NULL (default); all individuals are then treated as one group.

design_libIDrename

a character vector (length 2) containing a character string to match in each library ID of designfile and a replacement character string. If no characters in the library IDs of designfile need replacing, set it to NULL (default).

adjustPhase

a character string. The method used to adjust each calculated phase before the integrated phase is computed; must be one of "predictedPer" (adjust the phase with the predicted period length) or "notAdjusted" (do not adjust the phase).

combinePvalue

a character string. The method used to integrate the p-values of multiple individuals; currently only "fisher" (Fisher's method) can be selected.

weightedMethod

logical. If TRUE (default), a weighted score based on the p-value of each individual is used to integrate the period, phase and amplitude values of multiple individuals.

outIntegration

a character string. Controls which analysis results are written out; must be one of "both", "onlyIntegration", or "noIntegration". See meta2d for more information.

ARSmle

a character string. The strategy for using the MLE method in "ARS"; must be one of "auto", "mle", or "nomle". See meta2d for more information.

ARSdefaultPer

a numeric value. The expected period length of the rhythm of interest, which is a required parameter for ARS. See meta2d for more information.

dayZeroBased

logical. If TRUE, the first sampling day is recorded as day zero in the designfile.

outSymbol

a character string. A common prefix used in the names of output files.

parallelize

logical. If TRUE, computation will be done in parallel. This does not work on Windows machines.

nCores

an integer greater than or equal to one. The number of cores to use.

Details

This function was originally designed to analyze large-scale periodic data with individual information. Please pay attention to the data format of datafile and designfile (see the Examples section). Time-series experimental values (missing values as NA) from all individuals should be stored in datafile, with the first row containing all library IDs (a unique identification number for each sample) and the first column containing the names of all detected molecules (e.g., transcript or gene names). The designfile should have at least three columns: library ID, subject ID and sampling time. Experimental group information for each subject ID may be stored in another column. In addition, sampling time information may be split across multiple columns instead of one. For example, a sampling time of "36 hours" may be recorded as "day 2" (sampling day column, design_dayColm) plus "12 hours" (sampling hour column, design_hrColm). The library IDs in datafile and designfile should be identical. If they differ in some characters, use design_libIDrename to make them match.
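
As a rough illustration of the designfile layout described above, the following sketch builds a toy design table and converts separate day and hour columns into a single sampling time in hours. The column names, the library and subject IDs, and the helper to_hours are hypothetical and are not part of MetaCycle. The conversion assumes that the first sampling day is recorded as day 1 (dayZeroBased = FALSE), which is consistent with "day 2" plus "12 hours" giving 36 hours.

```r
# Toy design table; all IDs and column names are made up for illustration.
design <- data.frame(
  libID   = c("S1_t1", "S1_t2", "S2_t1", "S2_t2"),
  subject = c("S1", "S1", "S2", "S2"),
  group   = c("control", "control", "treated", "treated"),
  day     = c(1, 2, 1, 2),
  hour    = c(12, 12, 12, 12),
  stringsAsFactors = FALSE
)

# Hypothetical helper: combine day and hour columns into hours.
# With day_zero_based = FALSE, day 1 contributes 0 hours.
to_hours <- function(day, hour, day_zero_based = FALSE) {
  offset <- if (day_zero_based) 0 else 1
  (day - offset) * 24 + hour
}

design$time_h <- to_hours(design$day, design$hour)
print(design)
```

For a design file with this column order, the matching arguments would be design_libColm = 1, design_subjectColm = 2, design_groupColm = 3, design_dayColm = 4 and design_hrColm = 5.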

ARS, JTK or LS can be used to analyze time-series profiles individual by individual. meta3d requires that all individuals be analyzed with the same method before the calculated p-value, period, phase, baseline value, amplitude and relative amplitude are integrated group by group. However, the sampling pattern may differ among individuals, and each method has its own sampling pattern requirements (see meta2d for more information about these methods and their limitations). Please select a method carefully for the specific dataset. meta3d also helps users select a suitable method through warning notes.

P-values from different individuals are integrated with Fisher's method ("fisher") (Fisher, 1925; implementation code from MADAM). For short time-series profiles (e.g., 10 time points or fewer), p-values given by Lomb-Scargle may be overly conservative, which will also lead to conservative integrated p-values. The integrated period, baseline, amplitude and relative amplitude values are each the arithmetic mean across individuals. The integrated phase is the mean of circular quantities (adjustPhase = "predictedPer") or the arithmetic mean (adjustPhase = "notAdjusted") of the individual phases. Averaging phases from individuals with quite different period lengths is a potential problem (also mentioned in meta2d); setting minper, maxper and ARSdefaultPer to the same value may be the only known way to remove it completely. If weightedMethod = TRUE is selected, weighted scores (-log10(p-values)) are taken into account when integrating period, phase, baseline, amplitude and relative amplitude.
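
The integration steps can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the code MetaCycle runs: the input values are made up, the helper names fisher_combine and circ_mean_phase are hypothetical, and the way each phase is mapped to an angle using that individual's period is only one plausible reading of adjustPhase = "predictedPer".

```r
# Made-up per-individual results for one transcript in one group.
pvals  <- c(0.01, 0.20, 0.03)
period <- c(24, 22, 26)
phase  <- c(23, 1, 2)   # peak times in hours

# Fisher's method: -2 * sum(log(p)) follows a chi-squared distribution
# with 2k degrees of freedom under the null, for k independent p-values.
fisher_combine <- function(p) {
  stat <- -2 * sum(log(p))
  pchisq(stat, df = 2 * length(p), lower.tail = FALSE)
}

# Weights based on -log10(p), normalized to sum to one,
# so that smaller p-values count for more.
w <- -log10(pvals)
w <- w / sum(w)

# Weighted arithmetic mean of the period.
mean_period <- sum(w * period)

# Weighted circular mean of the phase (assumed scheme): map each phase to
# an angle using that individual's period, average the unit vectors,
# and map the resulting angle back to the time scale of out_period.
circ_mean_phase <- function(phase, period, w, out_period) {
  theta <- 2 * pi * phase / period
  ang <- atan2(sum(w * sin(theta)), sum(w * cos(theta)))
  (ang %% (2 * pi)) * out_period / (2 * pi)
}

p_integrated     <- fisher_combine(pvals)
phase_integrated <- circ_mean_phase(phase, period, w, mean_period)
```

The circular mean matters when phases straddle the period boundary: for two individuals with a 24-hour period and phases of 23 and 1, the arithmetic mean is 12, whereas the circular mean lies at the 0/24 boundary.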

Value

meta3d writes analysis results to outdir instead of returning them as objects. Output files with "meta3dSubjectID" in the file name contain the analysis results for each individual. Files with "meta3dGroupID" in the name store the integrated p-value, period, phase, baseline, amplitude and relative amplitude values from the individuals of each group, along with FDR values calculated from the integrated p-values.
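
Because results are written to disk, one way to pick them up afterwards is to search outdir for the two file name patterns. The sketch below is only an illustration: the directory and the placeholder file names are hypothetical and are created here so that the code runs on its own. The actual file names depend on the inputs and on outSymbol.

```r
# Hypothetical output directory with empty placeholder files; a real run
# of meta3d would create the result files itself.
outdir <- file.path(tempdir(), "example")
dir.create(outdir, showWarnings = FALSE)
file.create(file.path(outdir, c("meta3dSubjectID_demo.csv",
                                "meta3dGroupID_demo.csv")))

# Per-individual results and group-level integrated results.
subject_files <- list.files(outdir, pattern = "meta3dSubjectID",
                            full.names = TRUE)
group_files   <- list.files(outdir, pattern = "meta3dGroupID",
                            full.names = TRUE)
```

Each path in group_files could then be read with read.csv or read.table, depending on the filestyle used for the run.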

References

Glynn E. F., Chen J., and Mushegian A. R. (2006). Detecting periodic patterns in unevenly spaced gene expression time series using Lomb-Scargle periodograms. Bioinformatics, 22(3), 310–316.

Fisher, R.A. (1925). Statistical methods for research workers. Oliver and Boyd (Edinburgh).

Kugler K. G., Mueller L.A., and Graber A. (2010). MADAM - an open source toolbox for meta-analysis. Source Code for Biology and Medicine, 5, 3.

Examples

# load MetaCycle, which provides meta3d and the example datasets
library(MetaCycle)

# write 'cycHumanBloodData' and 'cycHumanBloodDesign' into two 'csv' files
write.csv(cycHumanBloodData, file="cycHumanBloodData.csv",
  row.names=FALSE)
write.csv(cycHumanBloodDesign, file="cycHumanBloodDesign.csv",
  row.names=FALSE)

# detect circadian transcripts with JTK in studied individuals
meta3d(datafile="cycHumanBloodData.csv", cycMethodOne="JTK",
  designfile="cycHumanBloodDesign.csv", outdir="example",
  filestyle="csv", design_libColm=1, design_subjectColm=2,
  design_hrColm=4, design_groupColm=3)

Example output

The 'meta3d' is processing AF0004SleepExtension, the 1 in total 42 subjects.
The 'meta3d' is processing AF0010SleepExtension, the 2 in total 42 subjects.
The 'meta3d' is processing AF0033SleepExtension, the 3 in total 42 subjects.
The 'meta3d' is processing AF0069SleepExtension, the 4 in total 42 subjects.
The 'meta3d' is processing AF0089SleepExtension, the 5 in total 42 subjects.
The 'meta3d' is processing AF0094SleepExtension, the 6 in total 42 subjects.
The 'meta3d' is processing AF0101SleepExtension, the 7 in total 42 subjects.
The 'meta3d' is processing AF0114SleepExtension, the 8 in total 42 subjects.
The 'meta3d' is processing AF0122SleepExtension, the 9 in total 42 subjects.
The 'meta3d' is processing AF0139SleepExtension, the 10 in total 42 subjects.
The 'meta3d' is processing AF0145SleepExtension, the 11 in total 42 subjects.
The 'meta3d' is processing AF0164SleepExtension, the 12 in total 42 subjects.
The 'meta3d' is processing AF0191SleepExtension, the 13 in total 42 subjects.
The 'meta3d' is processing AF0205SleepExtension, the 14 in total 42 subjects.
The 'meta3d' is processing AF0212SleepExtension, the 15 in total 42 subjects.
The 'meta3d' is processing AF0214SleepExtension, the 16 in total 42 subjects.
The 'meta3d' is processing AF0267SleepExtension, the 17 in total 42 subjects.
The 'meta3d' is processing AF0286SleepExtension, the 18 in total 42 subjects.
The 'meta3d' is processing AF0295SleepExtension, the 19 in total 42 subjects.
The 'meta3d' is processing AF0318SleepExtension, the 20 in total 42 subjects.
The 'meta3d' is processing AF0335SleepExtension, the 21 in total 42 subjects.
The 'meta3d' is processing AF0004SleepRestriction, the 22 in total 42 subjects.
The 'meta3d' is processing AF0010SleepRestriction, the 23 in total 42 subjects.
The 'meta3d' is processing AF0069SleepRestriction, the 24 in total 42 subjects.
The 'meta3d' is processing AF0079SleepRestriction, the 25 in total 42 subjects.
The 'meta3d' is processing AF0089SleepRestriction, the 26 in total 42 subjects.
The 'meta3d' is processing AF0091SleepRestriction, the 27 in total 42 subjects.
The 'meta3d' is processing AF0094SleepRestriction, the 28 in total 42 subjects.
The 'meta3d' is processing AF0101SleepRestriction, the 29 in total 42 subjects.
The 'meta3d' is processing AF0114SleepRestriction, the 30 in total 42 subjects.
The 'meta3d' is processing AF0139SleepRestriction, the 31 in total 42 subjects.
The 'meta3d' is processing AF0145SleepRestriction, the 32 in total 42 subjects.
The 'meta3d' is processing AF0164SleepRestriction, the 33 in total 42 subjects.
The 'meta3d' is processing AF0182SleepRestriction, the 34 in total 42 subjects.
The 'meta3d' is processing AF0191SleepRestriction, the 35 in total 42 subjects.
The 'meta3d' is processing AF0205SleepRestriction, the 36 in total 42 subjects.
The 'meta3d' is processing AF0212SleepRestriction, the 37 in total 42 subjects.
The 'meta3d' is processing AF0214SleepRestriction, the 38 in total 42 subjects.
The 'meta3d' is processing AF0267SleepRestriction, the 39 in total 42 subjects.
The 'meta3d' is processing AF0286SleepRestriction, the 40 in total 42 subjects.
The 'meta3d' is processing AF0318SleepRestriction, the 41 in total 42 subjects.
The 'meta3d' is processing AF0335SleepRestriction, the 42 in total 42 subjects.
DONE! The analysis about 'cycHumanBloodData.csv' and 'cycHumanBloodDesign.csv' has been finished.
                user.self     sys.self      elapsed   user.child    sys.child 
"Time used:"      "2.384"       "0.02"      "2.407"          "0"          "0" 

MetaCycle documentation built on May 2, 2019, 9:14 a.m.