get_expr: Get counts or TPM matrix
In e-myers/rnaseq: Process, Analyze and Visualize RNA-seq Data


Read counts from featureCounts output into a gene-by-sample matrix. Optionally convert to TPM; optionally write to a csv file.
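As a rough illustration of the reading step, the sketch below builds a gene-by-sample matrix from a set of featureCounts files. It is not the package's code: `read_fc_counts` and `make_fc` are hypothetical names, and it assumes each file holds one sample, starts with a `#` comment line, has the usual `Geneid, Chr, Start, End, Strand, Length` columns followed by one count column, and lists the same genes in the same order.

```r
# Hypothetical helper (not part of the package): one featureCounts file per sample.
read_fc_counts <- function(inFiles) {
  cols <- lapply(inFiles, function(f) {
    # Skip the leading "# Program:featureCounts ..." comment line
    tab <- read.delim(f, comment.char = "#", stringsAsFactors = FALSE)
    setNames(tab[[7]], tab$Geneid)   # assumes column 7 holds the counts
  })
  mat <- do.call(cbind, cols)        # assumes identical gene order in every file
  rownames(mat) <- names(cols[[1]])
  colnames(mat) <- basename(inFiles)
  mat
}

# Toy input: write two small files in the assumed layout
make_fc <- function(counts) {
  f <- tempfile(fileext = ".txt")
  writeLines(c("# Program:featureCounts (toy example)",
               "Geneid\tChr\tStart\tEnd\tStrand\tLength\tsample.bam",
               sprintf("gene%d\tchr1\t1\t100\t+\t100\t%d",
                       seq_along(counts), counts)), f)
  f
}

files <- c(make_fc(c(5L, 7L)), make_fc(c(1L, 2L)))
mat <- read_fc_counts(files)
```

The columns of `mat` follow the order of `files`, which is why the order of the input file list matters.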

get_expr(inFiles, tpm = FALSE, outFileNamePrefix = NULL, verbose = TRUE)

`inFiles`	Character - List of featureCounts output files (not the ones that end in "summary" or "report")
`tpm`	Logical - If true, convert values to transcripts per million
`outFileNamePrefix`	String - If given, the expression matrix is written to a csv file named outFileNamePrefix.csv (or outFileNamePrefix_TPM.csv, if tpm=TRUE).
`verbose`	Logical - If true, announce output filename (if there is one).
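The output naming rule described for `outFileNamePrefix` can be sketched as follows. `expr_out_name` is a hypothetical helper written for illustration; the package may build the filename differently.

```r
# Hypothetical helper (not part of the package): derive the csv filename.
expr_out_name <- function(outFileNamePrefix, tpm = FALSE) {
  paste0(outFileNamePrefix, if (tpm) "_TPM.csv" else ".csv")
}

counts_name <- expr_out_name("~/Documents/RORb")
tpm_name <- expr_out_name("~/Documents/RORb", tpm = TRUE)
```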

Note that the featureCounts "Length" field (which is used to get TPM) appears to be the summed lengths of all the transcripts in the annotation file that are tagged as being part of that gene. So if you used a file with all exons, it's the gene's exonic length; for introns, it's the gene's intronic length.
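To make the role of the "Length" field concrete, here is a minimal sketch of the standard TPM calculation: divide each count by the gene length in kilobases, then rescale each sample so its values sum to one million. `counts_to_tpm` and the toy data are assumptions for illustration, not the package's implementation.

```r
# Hypothetical helper (not part of the package): counts -> TPM.
# counts:  gene-by-sample numeric matrix
# lengths: one length per gene (bases), e.g. the featureCounts "Length" column
counts_to_tpm <- function(counts, lengths) {
  stopifnot(nrow(counts) == length(lengths), all(lengths > 0))
  # Reads per kilobase: the vector recycles down columns, so row i is
  # divided by lengths[i]
  rpk <- counts / (lengths / 1000)
  # Scale each sample (column) so that it sums to one million
  sweep(rpk, 2, colSums(rpk), "/") * 1e6
}

counts <- matrix(c(10, 20, 30, 0, 5, 15), nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))
lengths <- c(1000, 2000, 500)
tpm <- counts_to_tpm(counts, lengths)
```

Because the lengths enter only through the per-gene division, using exonic versus intronic lengths changes the TPM values even when the raw counts are identical.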

exprMat - Gene-by-sample matrix of counts (or TPM values, if tpm=TRUE)

Emma Myers

# Building the input file list this way lets you control the column order of the matrix
comparisons = c("HTp2_", "HTp7_", "HTp30_", "KOp2_", "KOp7_", "KOp30_") # underscore at the end prevents confusing p2 with p200
countFiles = vector(mode="character")
for (prefix in comparisons) { countFiles = c(countFiles, dir("RORb/counts/counts_m20_q20", pattern=prefix, full.names=TRUE)) }
# Drop non-count files; grepl() is safe even when nothing matches, unlike -which()
countFiles = countFiles[ !grepl("summary", countFiles) ]
countFiles = countFiles[ !grepl("display", countFiles) ]
rorbTPM = get_expr(countFiles, tpm=TRUE, outFileNamePrefix="~/Documents/RORb", verbose=FALSE)