DEGexp2: Identifying Differentially Expressed Genes from gene...

View source: R/MainFunction2.R

Description

This function is another (older) version of DEGexp. It takes gene expression files as input instead of gene expression matrices.

Usage

DEGexp2(geneExpFile1, geneCol1=1, expCol1=2, depth1=rep(0, length(expCol1)), groupLabel1="group1",
        geneExpFile2, geneCol2=1, expCol2=2, depth2=rep(0, length(expCol2)), groupLabel2="group2",
        header=TRUE, sep="", method=c("LRT", "CTR", "FET", "MARS", "MATR", "FC"), 
        pValue=1e-3, zScore=4, qValue=1e-3, foldChange=4, 
        thresholdKind=1, outputDir="none", normalMethod=c("none", "loess", "median"),
        replicate1="none", geneColR1=1, expColR1=2, depthR1=rep(0, length(expColR1)), replicateLabel1="replicate1",
        replicate2="none", geneColR2=1, expColR2=2, depthR2=rep(0, length(expColR2)), replicateLabel2="replicate2", rawCount=TRUE)

Arguments

geneExpFile1

file containing gene expression values for replicates of sample1 (or replicate1 when method="CTR").

geneCol1

gene id column in geneExpFile1.

expCol1

expression value columns in geneExpFile1 for replicates of sample1 (numeric vector).
Note: Each column corresponds to a replicate of sample1.

depth1

the total number of reads uniquely mapped to the genome for each replicate of sample1 (numeric vector).
Default: the total number of reads mapped to all annotated genes is taken as the depth for each replicate.
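
As a rough illustration of the default described above, the depth of each replicate can be approximated by summing that replicate's counts over all genes in the file. The sketch below uses a small made-up count matrix (gene names and counts are hypothetical) and base R only; it is not taken from the DEGseq internals.

```r
# Hypothetical read counts: rows are genes, columns are replicates of sample1.
counts <- matrix(c(73, 15, 1,
                   77, 15, 1,
                   68, 13, 3),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("rep1", "rep2", "rep3")))

# Default-style depth: total reads over all listed genes, one value per replicate.
depth <- colSums(counts)
print(depth)
```

Supplying an explicit depth1 vector with one entry per column in expCol1 is the alternative when the number of uniquely mapped reads is known from the aligner.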

groupLabel1

label of group1 on the plots.

geneExpFile2

file containing gene expression values for replicates of sample2 (or replicate2 when method="CTR").

geneCol2

gene id column in geneExpFile2.

expCol2

expression value columns in geneExpFile2 for replicates of sample2 (numeric vector).
Note: Each column corresponds to a replicate of sample2.

depth2

the total number of reads uniquely mapped to the genome for each replicate of sample2 (numeric vector).
Default: the total number of reads mapped to all annotated genes is taken as the depth for each replicate.

groupLabel2

label of group2 on the plots.

header

a logical value indicating whether geneExpFile1 and geneExpFile2 contain the names of the variables as their first line. See ?read.table.

sep

the field separator character. If sep = "" (the default for read.table) the separator is white space, that is one or more spaces, tabs, newlines or carriage returns. See ?read.table.
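
To see how header and sep behave, the following sketch reads a tiny expression file with read.table directly. The file contents and column names are hypothetical; only the read.table arguments mirror the ones documented above.

```r
# Write a tiny hypothetical expression file that mixes tabs and spaces.
tmp <- tempfile(fileext = ".txt")
writeLines(c("GeneID\tKidney  Liver",
             "geneA\t73  34",
             "geneB\t15\t8"), tmp)

# header = TRUE: the first line supplies the column names.
# sep = "": any run of whitespace (spaces or tabs) separates fields.
expr <- read.table(tmp, header = TRUE, sep = "")
print(expr)
```

With sep="\t" the same file would not parse cleanly, because some fields are separated by spaces rather than tabs.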

method

method to identify differentially expressed genes. Possible methods are:

  • "LRT": Likelihood Ratio Test (Marioni et al. 2008),

  • "CTR": Check whether the variation between Technical Replicates can be explained by the random sampling model (Wang et al. 2009),

  • "FET": Fisher's Exact Test (Bloom et al. 2009),

  • "MARS": MA-plot-based method with Random Sampling model (Wang et al. 2009),

  • "MATR": MA-plot-based method with Technical Replicates (Wang et al. 2009),

  • "FC" : Fold-Change threshold on MA-plot.
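
For intuition about the "FET" option, a per-gene Fisher's exact test can be sketched with base R's fisher.test on a 2x2 table of "reads from this gene" versus "all other reads" in each sample. This is the textbook construction and an assumption about the general idea, not necessarily how DEGseq implements the method internally; the counts and depths are illustrative values.

```r
# Illustrative counts for one gene and the sequencing depth of each sample.
geneCount1 <- 73;  depth1 <- 345504
geneCount2 <- 34;  depth2 <- 274430

# 2x2 table: reads from this gene vs. all other reads, per sample.
tab <- matrix(c(geneCount1, depth1 - geneCount1,
                geneCount2, depth2 - geneCount2),
              nrow = 2,
              dimnames = list(c("thisGene", "otherGenes"),
                              c("sample1", "sample2")))

# Two-sided exact test of equal proportions between the two samples.
res <- fisher.test(tab)
print(res$p.value)
```

The resulting p-value would then be compared with the pValue threshold, or converted to a q-value, depending on thresholdKind.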

pValue

pValue threshold (for the methods: LRT, FET, MARS, MATR).
only used when thresholdKind=1.

zScore

zScore threshold (for the methods: MARS, MATR).
only used when thresholdKind=2.

qValue

qValue threshold (for the methods: LRT, FET, MARS, MATR).
only used when thresholdKind=3 or thresholdKind=4.

thresholdKind

the kind of threshold. Possible kinds are:

  • 1: pValue threshold,

  • 2: zScore threshold,

  • 3: qValue threshold (Benjamini and Hochberg 1995),

  • 4: qValue threshold (Storey and Tibshirani 2003),

  • 5: qValue threshold (Storey and Tibshirani 2003) and Fold-Change threshold on MA-plot are both required (can be used only when method="MARS").
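
The Benjamini-Hochberg adjustment behind kind 3 is available in base R as p.adjust. The sketch below applies it to a few hypothetical p-values and filters with a q-value cutoff; it illustrates the idea only and does not reproduce DEGseq's own output columns. The Storey and Tibshirani variant (kinds 4 and 5) relies on the separate qvalue package and is not shown.

```r
# Hypothetical per-gene p-values.
pvals <- c(geneA = 0.0001, geneB = 0.0004, geneC = 0.02, geneD = 0.5)

# Benjamini-Hochberg adjusted values (false discovery rate control).
qvals <- p.adjust(pvals, method = "BH")

# Genes passing a q-value cutoff, analogous to qValue with thresholdKind=3.
qValue <- 1e-3
print(names(qvals)[qvals <= qValue])
```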

foldChange

fold change threshold on MA-plot (for the method: FC).
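
A minimal sketch of a fold-change cut on an MA-plot follows. It uses the usual MA-plot coordinates, M as the log2 ratio of the two counts and A as the mean of their log2 values, and assumes that a gene is called when |M| is at least log2(foldChange). That rule, and the absence of any depth scaling, are assumptions of this sketch rather than a description of the package internals; the counts are hypothetical.

```r
# Hypothetical total counts per gene in each group.
c1 <- c(geneA = 256, geneB = 64, geneC = 8)
c2 <- c(geneA = 64,  geneB = 64, geneC = 32)

# MA-plot coordinates: M is the log2 ratio, A the mean log2 count.
M <- log2(c1 / c2)
A <- (log2(c1) + log2(c2)) / 2

# Assumed rule: flag a gene when its fold change is at least `foldChange`
# in either direction, i.e. |M| >= log2(foldChange).
foldChange <- 4
flagged <- abs(M) >= log2(foldChange)
print(data.frame(M = M, A = A, flagged = flagged))
```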

outputDir

the output directory.

normalMethod

the normalization method: "none", "loess", "median" (Yang et al. 2002).
Recommended: "none".

replicate1

file containing gene expression values for replicate batch1 (only used when method="MATR").
Note: replicate1 and replicate2 are two (groups of) technical replicates of a sample.

geneColR1

gene id column in the expression file for replicate batch1 (only used when method="MATR").

expColR1

expression value columns in the expression file for replicate batch1 (numeric vector) (only used when method="MATR").

depthR1

the total number of reads uniquely mapped to the genome for each replicate in replicate batch1 (numeric vector).
Default: the total number of reads mapped to all annotated genes is taken as the depth for each replicate (only used when method="MATR").

replicateLabel1

label of replicate batch1 on the plots (only used when method="MATR").

replicate2

file containing gene expression values for replicate batch2 (only used when method="MATR").
Note: replicate1 and replicate2 are two (groups of) technical replicates of a sample.

geneColR2

gene id column in the expression file for replicate batch2 (only used when method="MATR").

expColR2

expression value columns in the expression file for replicate batch2 (numeric vector) (only used when method="MATR").

depthR2

the total number of reads uniquely mapped to the genome for each replicate in replicate batch2 (numeric vector).
Default: the total number of reads mapped to all annotated genes is taken as the depth for each replicate (only used when method="MATR").

replicateLabel2

label of replicate batch2 on the plots (only used when method="MATR").

rawCount

a logical value indicating whether the gene expression values are based on raw read counts or on normalized values.

References

Benjamini,Y. and Hochberg,Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B, 57, 289-300.

Jiang,H. and Wong,W.H. (2009) Statistical inferences for isoform expression in RNA-seq. Bioinformatics, 25, 1026-1032.

Bloom,J.S. et al. (2009) Measuring differential gene expression by short read sequencing: quantitative comparison to 2-channel gene expression microarrays. BMC Genomics, 10, 221.

Marioni,J.C. et al. (2008) RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res., 18, 1509-1517.

Storey,J.D. and Tibshirani,R. (2003) Statistical significance for genomewide studies. Proc. Natl. Acad. Sci., 100, 9440-9445.

Wang,L.K. et al. (2010) DEGseq: an R package for identifying differentially expressed genes from RNA-seq data. Bioinformatics, 26, 136-138.

Yang,Y.H. et al. (2002) Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Research, 30, e15.

See Also

DEGexp, DEGseq, getGeneExp, readGeneExp, GeneExpExample1000, GeneExpExample5000.

Examples

  ## kidney: R1L1Kidney, R1L3Kidney, R1L7Kidney, R2L2Kidney, R2L6Kidney 
  ## liver: R1L2Liver, R1L4Liver, R1L6Liver, R1L8Liver, R2L3Liver
  
  library(DEGseq)

  ## Example expression file shipped with the package.
  geneExpFile <- system.file("extdata", "GeneExpExample5000.txt", package="DEGseq")
  outputDir <- file.path(tempdir(), "DEGexpExample")

  ## Preview the kidney columns (rows 30 to 35).
  kidneyExp <- readGeneExp(file=geneExpFile, geneCol=1, valCol=c(7,9,12,15,18))
  kidneyExp[30:35,]

  ## Preview the liver columns (rows 30 to 35).
  liverExp <- readGeneExp(file=geneExpFile, geneCol=1, valCol=c(8,10,11,13,16))
  liverExp[30:35,]

  ## Compare kidney against liver with the MARS method; results are written to outputDir.
  DEGexp2(geneExpFile1=geneExpFile, geneCol1=1, expCol1=c(7,9,12,15,18), groupLabel1="kidney",
          geneExpFile2=geneExpFile, geneCol2=1, expCol2=c(8,10,11,13,16), groupLabel2="liver",
          method="MARS", outputDir=outputDir)
  cat("outputDir:", outputDir, "\n")

Example output

Loading required package: qvalue
Loading required package: samr
                EnsemblGeneID     R1L1Kidney R1L3Kidney R1L7Kidney R2L2Kidney
ENSG00000188976 "ENSG00000188976" "73"       "77"       "68"       "70"      
ENSG00000187961 "ENSG00000187961" "15"       "15"       "13"       "12"      
ENSG00000187583 "ENSG00000187583" "1"        "1"        "3"        "0"       
ENSG00000187642 "ENSG00000187642" "4"        "5"        "12"       "9"       
ENSG00000188290 "ENSG00000188290" "9"        "10"       "12"       "10"      
ENSG00000187608 "ENSG00000187608" "12"       "11"       "7"        "16"      
                R2L6Kidney
ENSG00000188976 "82"      
ENSG00000187961 "15"      
ENSG00000187583 "3"       
ENSG00000187642 "9"       
ENSG00000188290 "13"      
ENSG00000187608 "13"      
                EnsemblGeneID     R1L2Liver R1L4Liver R1L6Liver R1L8Liver
ENSG00000188976 "ENSG00000188976" "34"      "56"      "45"      "55"     
ENSG00000187961 "ENSG00000187961" "8"       "13"      "11"      "12"     
ENSG00000187583 "ENSG00000187583" "0"       "1"       "0"       "0"      
ENSG00000187642 "ENSG00000187642" "0"       "0"       "2"       "1"      
ENSG00000188290 "ENSG00000188290" "2"       "3"       "1"       "2"      
ENSG00000187608 "ENSG00000187608" "15"      "17"      "5"       "10"     
                R2L3Liver
ENSG00000188976 "42"     
ENSG00000187961 "20"     
ENSG00000187583 "2"      
ENSG00000187642 "4"      
ENSG00000188290 "1"      
ENSG00000187608 "16"     
Please wait...

geneExpFile1:  /usr/local/lib/R/site-library/DEGseq/extdata/GeneExpExample5000.txt 
gene id column in geneExpFile1:  1 
expression value column(s) in geneExpFile1: 7 9 12 15 18 
total number of reads uniquely mapped to genome obtained from sample1: 345504 354981 334557 366041 372895 

geneExpFile2:  /usr/local/lib/R/site-library/DEGseq/extdata/GeneExpExample5000.txt 
gene id column in geneExpFile2:  1 
expression value column(s) in geneExpFile2: 8 10 11 13 16 
total number of reads uniquely mapped to genome obtained from sample2: 274430 274486 264999 255041 284205 

method to identify differentially expressed genes:  MARS 
pValue threshold: 0.001 
output directory: /work/tmp/tmp/RtmpSIH2f5/DEGexpExample 

Please wait ...
Identifying differentially expressed genes ...
Please wait patiently ...
output ...

Done ...
The results can be observed in directory:  /work/tmp/tmp/RtmpSIH2f5/DEGexpExample 
outputDir: /work/tmp/tmp/RtmpSIH2f5/DEGexpExample 

DEGseq documentation built on Nov. 8, 2020, 5:33 p.m.