TFRankR: TFRankR
In brengong/ConservationtextmineR:


Requires a data table containing columns labeled "gene_symbol", "Species", "Targeting_Factor", and "Score". Returns a ranked list according to the specified options.
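As a rough illustration of the expected input, the sketch below builds a small data.table with the four required columns and checks that they are present. The object name `toy`, the gene and factor names, and the score values are all invented for this example and do not come from the package or its data.

```r
library(data.table)

# Hypothetical toy input: every value below is invented for illustration.
toy <- data.table(
  gene_symbol      = c("GeneA", "GeneA", "GeneB", "GeneB", "GeneC"),
  Species          = c("Human", "Mouse", "Human", "Human", "Mouse"),
  Targeting_Factor = c("TF1",   "TF1",   "TF1",   "TF2",   "TF2"),
  Score            = c(0.9,     0.8,     0.7,     0.6,     0.5)
)

# Confirm the four required columns exist before passing the table on.
required <- c("gene_symbol", "Species", "Targeting_Factor", "Score")
missing_cols <- setdiff(required, names(toy))
stopifnot(length(missing_cols) == 0)
```

A check like this can catch a misnamed column (for example "TX_Factor" instead of "Targeting_Factor") before ranking is attempted.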

TFRankR(DT, sortBy, dec, SPselect, IUPACgreat)

`DT`	the data table to query
`sortBy`	One of "species", "Target", "abundance", "score", "species & Target", "species & score & Target", "abundance & Target", "species & abundance & Target", or "species & abundance". "species & Target" ranks by the greatest number of species first and the least number of Targets second. "species & score & Target" ranks by the greatest number of species first, the greatest IUPAC consensus score second, and the least number of Targets third. "abundance & Target" ranks by the greatest abundance of consensus sequences for each promoter first and the greatest number of targets for each Targeting_Factor second. "species & abundance & Target" ranks by the number of species (greatest to least) first, the number of consensus sequences (greatest to least) second, and the number of targets (greatest to least) third.
`dec`	Used only when sortBy is "species", "promoter", or "abundance". Either TRUE or FALSE, indicating whether to sort in decreasing or increasing order, respectively.
`SPselect`	A single character string, used when sorting by "species & Target", "species & abundance", or "species & score & Target", that designates the dominant species to use.
`IUPACgreat`	A logical value, either TRUE or FALSE, indicating whether to use the greatest or the lowest IUPAC consensus score, respectively, when two or more IUPAC sequences are observed for a particular transcription factor.
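The combined sort orders can be pictured as a multi-key ordering. The sketch below shows one plausible reading of "species & Target" (greatest number of species first, least number of targets second) using plain data.table operations. It is not the package's implementation: the toy data, the column names `n_species` and `n_targets`, and the choice to count per Targeting_Factor are assumptions, and the role of `SPselect` is not modeled.

```r
library(data.table)

# Hypothetical toy input: every value below is invented for illustration.
toy <- data.table(
  gene_symbol      = c("GeneA", "GeneA", "GeneB", "GeneC", "GeneC", "GeneD"),
  Species          = c("Human", "Mouse", "Human", "Human", "Mouse", "Human"),
  Targeting_Factor = c("TF1",   "TF1",   "TF1",   "TF2",   "TF2",   "TF3")
)

# Assumption: counts are taken per targeting factor.
# n_species = distinct species, n_targets = distinct target genes.
counts <- toy[, .(n_species = uniqueN(Species),
                  n_targets = uniqueN(gene_symbol)),
              by = Targeting_Factor]

# Most species first; among ties, fewest targets first.
setorder(counts, -n_species, n_targets)
print(counts)
```

Here TF1 and TF2 are both seen in two species, so the tie is broken by the number of targets: TF2 (one target) ranks ahead of TF1 (two targets), and TF3 (one species) comes last.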

The sequences in a string format that can be added to a data.table

Brendan Gongol

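# Set the working directory to the folder holding the raw hits file (author's local path).
setwd("C:/Users/Brendan/Desktop/oxidative stress surfactant bioinformatics")
library(data.table)
library(ConservationtextmineR)  # package named at the top of this page; assumed to provide TFRankR() and IUPAC_ScoreR()
# Read the raw transcription factor hits.
DT25 <- fread("Raw Transcription factor hits.xls")
# Keep columns 1 to 6 and 9 to 10.
DT25 <- DT25[,c(1:6, 9:10), with = FALSE]
# Rename TX_Factor and promoter_name to the column names TFRankR requires.
setnames(DT25, c("Consensus_Sequence", "start", "end", "Number_Hits", "TX_Factor", "MotifMap Degenerate consensus sequence", "promoter_name", "Species"),
         c("Consensus_Sequence", "start", "end", "Number_Hits", "Targeting_Factor", "MotifMap Degenerate consensus sequence", "gene_symbol", "Species"))
# Score each degenerate consensus sequence and store the result in the Score column.
IUPAC <- DT25$`MotifMap Degenerate consensus sequence`
DT25$Score <- IUPAC_ScoreR(IUPAC, stringency = "medium")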

TFRankR(DT = DT25, sortBy = "species", dec = TRUE)
TFRankR(DT = DT25, sortBy = "species", dec = FALSE)
TFRankR(DT = DT25, sortBy = "Target", dec = TRUE)
TFRankR(DT = DT25, sortBy = "Target", dec = FALSE)
TFRankR(DT = DT25, sortBy = "abundance", dec = TRUE)
TFRankR(DT = DT25, sortBy = "abundance", dec = FALSE)
TFRankR(DT = DT25, sortBy = "score", dec = TRUE)
TFRankR(DT = DT25, sortBy = "score", dec = FALSE)
TFRankR(DT = DT25, sortBy = "species & Target", dec = FALSE, SPselect = "Human")  # Ranks greatest number of species first, least number of Targets second.
TFRankR(DT = DT25, sortBy = "species & score & Target", dec = FALSE, SPselect = "Human", IUPACgreat = TRUE) # Ranks greatest number of species first, the greatest IUPAC consensus score second, and least number of Targets third.
TFRankR(DT = DT25, sortBy = "species & score & Target", dec = FALSE, SPselect = "Human", IUPACgreat = FALSE) # Ranks greatest number of species first, the lowest IUPAC consensus score second, and least number of Targets third.
TFRankR(DT = DT25, sortBy = "species & abundance", dec = TRUE, SPselect = "Human")
TFRankR(DT = DT25, sortBy = "species & abundance", dec = FALSE, SPselect = "Human")