DESeq2_ysx: One-step DESeq2 differential expression (DE) gene analysis for salmon output.
In Tong-Chen/YSX: For Yishengxin Training

One-step DESeq2 differential expression (DE) gene analysis for salmon output or a reads count matrix.

DESeq2_ysx(
  file,
  sampleFile,
  design,
  type,
  covariate = NULL,
  tx2gene = NULL,
  filter = NULL,
  output_prefix = "ehbio",
  rlog = T,
  vst = F,
  comparePairFile = NULL,
  padj = 0.05,
  log2FC = 1,
  dropCol = c("lfcSE", "stat")
)

`file`	For `type = "salmon"`, a file listing the salmon output files, in the format described in `salmon2deseq`. For `type = "readscount"`, a reads count matrix file in the format described in `readscount2deseq`.
`sampleFile`	A file containing at least two columns. The first column holds the sample names, matching the first column of `salmon_file_list`. The remaining columns are sample attributes; normally one of them gives the group each sample belongs to. A simple example (`conditions` holds the group information):

    Samp            conditions
    untrt_N61311    untrt
    untrt_N052611   untrt
    untrt_N080611   untrt
    untrt_N061011   untrt
    trt_N61311      trt
    trt_N052611     trt
    trt_N080611     trt
    trt_N061011     trt

Another example (the third column indicates that the samples come from two batches):

    Samp            conditions   batch
    untrt_N61311    untrt        A
    untrt_N052611   untrt        A
    untrt_N080611   untrt        B
    untrt_N061011   untrt        B
    trt_N61311      trt          A
    trt_N052611     trt          A
    trt_N080611     trt          B
    trt_N061011     trt          B
`design`	A column name from `sampleFile`, such as "conditions" in the examples above. This column is used as the group variable for DE tests. Currently only a simple design is allowed. To model multiple variables, it may help to combine them into a single "super variable" as described in https://support.bioconductor.org/p/67600/#67612.
`type`	Input file type, either "salmon" or "readscount". `tx2gene` currently has no effect when `type = "readscount"`.
`covariate`	Names of columns holding information that may act as covariates, such as batch effects or other sample information. Supply multiple covariates as a vector.
`tx2gene`	Optional; only needed to obtain gene-level expression instead of transcript-level expression. A two-column file with transcript names in the first column and gene names in the second. A header line is required, but the column names themselves do not matter. Example file contents:

    txname            gene
    ENST00000456328   ENSG00000223972
    ENST00000450305   ENSG00000223972
    ENST00000488147   ENSG00000227232
    ENST00000619216   ENSG00000278267
    ENST00000473358   ENSG00000243485
`filter`	Threshold for filtering out genes with low read counts. By default, genes whose total read count is lower than half the number of samples are filtered out. Any number can be given here, but the default is normally fine. DESeq2 also applies its own automatic filtering; see https://www.bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html.
`output_prefix`	A string used as the prefix of output file names.
`rlog`	Whether to compute "rlog"-transformed values for downstream correlation-like analyses.
`vst`	Whether to compute "vst"-transformed values for downstream correlation-like analyses. Normally faster than "rlog".
`comparePairFile`	Optional. A file listing the sample groups to compare, one pair of groups per line. If not given, the function uses the `colData` information in `dds` and compares all possible pairs of groups. Example file contents:

    groupA   groupB
    groupA   groupC
    groupC   groupB
`padj`	Threshold on the multiple-testing-corrected p-value. Default 0.05.
`log2FC`	Threshold on the log2-transformed fold change. Default 1.
`dropCol`	Columns to drop from the final output. Default `c("lfcSE", "stat")`. Valid column names are `"ID", "baseMean", "log2FoldChange", "lfcSE", "stat", "pvalue", "padj"`. This serves no purpose other than making the table clearer.
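
Examples

The input files described above can be prepared with base R. The following is a minimal sketch, not the package's own example: it assumes the files are tab-separated, that the package is installed under the name `YSX`, and that `salmon_file_list.txt` is a hypothetical file in the format described in `salmon2deseq`. The order of the two groups in `comparePairFile` is also an assumption. The call at the end is skipped unless both the package and the salmon list are available.

```r
# Work in a temporary directory so nothing in the project is overwritten.
workdir <- tempfile("deseq2_ysx_demo_")
dir.create(workdir)

# sampleFile: first column is the sample name, other columns are attributes.
samples <- data.frame(
  Samp = c("untrt_N61311", "untrt_N052611", "untrt_N080611", "untrt_N061011",
           "trt_N61311", "trt_N052611", "trt_N080611", "trt_N061011"),
  conditions = rep(c("untrt", "trt"), each = 4),
  batch = rep(c("A", "A", "B", "B"), times = 2),
  stringsAsFactors = FALSE
)
sample_file <- file.path(workdir, "sampleFile.txt")
# Assumption: tab-separated text with a header line.
write.table(samples, sample_file, sep = "\t", quote = FALSE, row.names = FALSE)

# tx2gene: header required, transcript names first, gene names second.
tx2gene <- data.frame(
  txname = c("ENST00000456328", "ENST00000450305", "ENST00000488147"),
  gene = c("ENSG00000223972", "ENSG00000223972", "ENSG00000227232"),
  stringsAsFactors = FALSE
)
tx2gene_file <- file.path(workdir, "tx2gene.txt")
write.table(tx2gene, tx2gene_file, sep = "\t", quote = FALSE, row.names = FALSE)

# comparePairFile: one pair of groups per line, no header (as in the example).
compare_file <- file.path(workdir, "comparePair.txt")
write.table(data.frame(a = "trt", b = "untrt"), compare_file, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Hypothetical salmon file list; it is not created by this sketch.
salmon_list <- file.path(workdir, "salmon_file_list.txt")

# Run only when the package and the salmon list actually exist.
if (requireNamespace("YSX", quietly = TRUE) && file.exists(salmon_list)) {
  YSX::DESeq2_ysx(
    file = salmon_list,
    sampleFile = sample_file,
    design = "conditions",
    type = "salmon",
    covariate = "batch",
    tx2gene = tx2gene_file,
    comparePairFile = compare_file,
    output_prefix = "ehbio"
  )
}
```

Writing the inputs from data frames keeps the sample names in `sampleFile` under direct control, which makes it easier to keep them identical to the names in the salmon file list.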
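
To make the `filter`, `padj`, `log2FC` and `dropCol` arguments more concrete, the sketch below applies the described rules to made-up data in base R. It illustrates the wording above and is not the function's internal code: whether the comparisons are strict or inclusive, and whether the fold-change threshold is applied to the absolute value, are assumptions.

```r
# Toy count matrix: 4 genes x 4 samples (made-up numbers).
counts <- matrix(
  c( 0,  0, 1,  0,
    10, 12, 9, 11,
     0,  0, 0,  0,
     1,  1, 0,  0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("s", 1:4))
)

# Default filter as described: drop genes whose total read count is
# lower than half the number of samples.
min_total <- ncol(counts) / 2
keep <- rowSums(counts) >= min_total
filtered <- counts[keep, , drop = FALSE]

# Toy result table using the column names listed under `dropCol`.
res <- data.frame(
  ID = c("geneA", "geneB", "geneC"),
  baseMean = c(10.5, 0.5, 30),
  log2FoldChange = c(1.8, -0.4, -2.1),
  lfcSE = c(0.3, 0.9, 0.5),
  stat = c(6.0, -0.4, -4.2),
  pvalue = c(1e-4, 0.5, 0.01),
  padj = c(0.001, 0.6, 0.03),
  stringsAsFactors = FALSE
)

# Assumed reading of the thresholds: padj below 0.05 and an absolute
# log2 fold change of at least 1.
padj_cut <- 0.05
log2fc_cut <- 1
is_de <- !is.na(res$padj) & res$padj < padj_cut &
  abs(res$log2FoldChange) >= log2fc_cut
sig <- res[is_de, , drop = FALSE]

# Drop the default dropCol columns from the final table.
drop_col <- c("lfcSE", "stat")
sig <- sig[, !(colnames(sig) %in% drop_col), drop = FALSE]
```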
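
Because `design` accepts a single column, several variables can be merged into one "super variable" by pasting their values together, in the spirit of the linked Bioconductor support post. The sketch below shows this on the two-batch example, along with how all possible group pairs can be listed when no `comparePairFile` is given. The column name `group` and the `_` separator are illustrative choices, not something the package requires.

```r
# Sample attributes from the two-batch example above.
samples <- data.frame(
  conditions = rep(c("untrt", "trt"), each = 4),
  batch = rep(c("A", "A", "B", "B"), times = 2),
  stringsAsFactors = FALSE
)

# Combine both variables into one grouping column (hypothetical name "group").
samples$group <- paste(samples$conditions, samples$batch, sep = "_")

# List every unordered pair of groups, one pair per row.
groups <- unique(samples$group)
pairs <- t(combn(groups, 2))
```

After adding such a column to `sampleFile`, its name can be passed as `design`, and selected rows of `pairs` can be written out as a `comparePairFile` to restrict the comparisons.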