Remove gene information from sgRNA data.frame
This function removes genes from a data.frame containing pooled CRISPR screen data. It is meant to exclude genes from the analysis: every entry (sgRNA row) belonging to a listed gene is removed from the sgRNA data.frame.
data.frame with sgRNA read counts. Must have one column with sgRNA names and one column with read counts. The data must be formatted so that the gene name is included in the sgRNA name and can be extracted using the extractpattern expression, e.g. GENE_sgRNA1, where GENE is the gene name, _ is the separator and sgRNA1 is the sgRNA identifier.
Integer, indicates which column of the data.frame holds the sgRNA names.
Vector of gene names that will be removed from the sgRNA dataset. Each gene name must be included in the sgRNA names so that it can be extracted using the pattern defined in extractpattern, e.g. c("gene1","gene2").
Regular expression used to extract the gene name from the sgRNA name. Make sure the gene name is captured by enclosing its part of the expression in parentheses (). The default value expression("^(.+?)_.+") looks for the gene name (.+?) in front of the separator _, followed by any characters .+, e.g. gene1_anything.
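The effect of the default pattern can be checked with base R alone. The following is a minimal sketch, not the package's own code: the sgRNA names are hypothetical, and the pattern is passed as a plain character string to sub() rather than wrapped in expression().

```r
# Hypothetical sgRNA names in GENE_identifier form (illustration only)
sgrna_names <- c("AAK1_104_0", "AAK1_105_1", "gene1_anything")

# Same regular expression as the documented default, as a plain string
pattern <- "^(.+?)_.+"

# "\\1" refers to the first capture group, i.e. the gene name in parentheses.
# perl = TRUE makes the non-greedy quantifier (.+?) explicit in its behaviour:
# the gene name ends at the FIRST underscore.
genes <- sub(pattern, "\\1", sgrna_names, perl = TRUE)

print(genes)
# [1] "AAK1"  "AAK1"  "gene1"
```

Because the capture group is non-greedy, identifiers that contain further underscores (such as 104_0) do not leak into the gene name.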
In a table with
calling gene.remove(data.frame, toremove="AAK1", extractpattern = expression("^(.+?)_.+")) will remove all entries shown above, since AAK1 is the gene name, separated by an underscore _ from the sgRNA identifier.
gene.remove returns a data.frame with the same columns as the input data.frame, but without any rows whose extracted gene name matches an entry in toremove.
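The described behaviour (same columns, matching rows dropped) can be illustrated with a small base-R sketch. This is an assumption-laden stand-in, not the implementation of gene.remove: the function name remove_genes_sketch, the argument name namecolumn, and the read-count table are all hypothetical, and the pattern is a plain string instead of an expression() object.

```r
# Hypothetical read-count table; sgRNA names follow GENE_identifier
counts <- data.frame(
  sgRNA = c("AAK1_1", "AAK1_2", "CDK2_1", "CDK2_2"),
  reads = c(120L, 85L, 300L, 42L),
  stringsAsFactors = FALSE
)

# Hypothetical helper mimicking the documented behaviour of gene.remove
remove_genes_sketch <- function(data, namecolumn = 1, toremove,
                                extractpattern = "^(.+?)_.+") {
  # Extract the gene name (first capture group) from every sgRNA name
  genes <- sub(extractpattern, "\\1",
               as.character(data[[namecolumn]]), perl = TRUE)
  # Keep only rows whose gene is NOT listed in toremove;
  # drop = FALSE guarantees the result stays a data.frame
  data[!(genes %in% toremove), , drop = FALSE]
}

filtered <- remove_genes_sketch(counts, toremove = "AAK1")
print(filtered)
```

In this example both AAK1 rows are dropped, while the two CDK2 rows and both columns are kept.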