probe_ranking: Function to select for genes using one of the available gene...

View source: R/probe_ranking.R

Description

Selects gene probes from a gene expression matrix using one of the available gene probe ranking options.

Usage

probe_ranking(input, probe_number, probe_num_selection = "Fixed_Probe_Num",
  data.exp, method = "SD_Rank")

Arguments

input

String indicating the name of the text file containing the gene expression matrix.

probe_number

Positive integer indicating the number of gene probes to be selected as determined by the number_probes function.

probe_num_selection

String indicating how the number of probes was chosen. Options include "Fixed_Probe_Num", "Percent_Probe_Num", and "Adaptive_Probe_Num".

data.exp

The object containing the original gene expression matrix. This matrix is returned by the input_file function.

method

A string indicating the gene probe ranking method to use. Possible options include "CV_Rank", "CV_Guided", "SD_Rank", and "Poly". The default is set to "SD_Rank".

Value

An object containing the selected gene expression matrix for a particular ranking method. In addition, a text file containing the selected gene expression data is produced.

Note

CV_Rank is a gene probe ranking method that selects the probes with the highest coefficient of variation within the dataset.

CV_Guided also uses the coefficient of variation of the dataset to select gene probes. Every probe in the set is plotted on a graph of mean against standard deviation (with SD on the y-axis). A line is drawn from the origin with a slope equal to the coefficient of variation. The mean and standard deviation cutoff moves along this line until the number of probes above the cutoff is less than or equal to the desired number.

SD_Rank is a gene probe ranking method that selects the probes with the highest standard deviation within the dataset.

Poly is a ranking method that fits three second-degree polynomial functions of mean and standard deviation to the dataset to select the most variable probes.
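The ranking ideas described in this note can be illustrated with a small base R sketch. This is not the multiClust implementation: the helper names rank_select and cv_guided_select, the toy matrix, the steps parameter, and the choice of the mean per-probe CV as the slope of the guide line are all assumptions made for illustration. Poly is omitted because the note does not give enough detail to sketch it.

```r
# Illustrative sketch only; hypothetical helpers, not multiClust code.
# Toy expression matrix: rows are probes, columns are samples.
set.seed(1)
expr <- matrix(rnorm(20 * 6, mean = 8, sd = 1), nrow = 20,
               dimnames = list(paste0("probe", 1:20), paste0("s", 1:6)))
# Make the first three probes much more variable than the rest.
expr[1:3, ] <- expr[1:3, ] +
  matrix(rep(c(-4, 4), length.out = 18), nrow = 3, byrow = TRUE)

# SD_Rank / CV_Rank idea: keep the n probes with the highest
# standard deviation or coefficient of variation (SD / |mean|).
rank_select <- function(x, n, stat = c("sd", "cv")) {
  stat <- match.arg(stat)
  s <- apply(x, 1, sd)
  score <- if (stat == "sd") s else s / abs(rowMeans(x))
  keep <- order(score, decreasing = TRUE)[seq_len(min(n, nrow(x)))]
  x[keep, , drop = FALSE]
}

# CV_Guided idea: slide a (mean, SD) cutoff along a line through the
# origin until at most n probes lie above it on both axes.
# Assumption: the slope is the mean of the per-probe CVs.
cv_guided_select <- function(x, n, steps = 1000) {
  m <- rowMeans(x)
  s <- apply(x, 1, sd)
  slope <- mean(s / abs(m))
  for (mean_cut in seq(0, max(m), length.out = steps)) {
    above <- m > mean_cut & s > slope * mean_cut
    if (sum(above) <= n) break
  }
  x[above, , drop = FALSE]
}

sd_top <- rank_select(expr, 3, "sd")
cv_top <- rank_select(expr, 3, "cv")
guided <- cv_guided_select(expr, 5)
```

Because the cutoff in the guided variant moves in discrete steps, it can return fewer probes than requested, which matches the "less than or equal to" wording above.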

Author(s)

Peiyong Guan, Alec Fabbri, Nathan Lawlor

See Also

number_probes, input_file

Examples

# Produce a selected gene expression matrix using one of the
# probe ranking options
# Load a test file
data_file <- system.file("extdata", "GSE2034.normalized.expression.txt",
    package="multiClust")
data <- input_file(data_file)
selected_probes <- probe_ranking(input=data_file, probe_number=300,
   probe_num_selection="Fixed_Probe_Num", data.exp=data, method="CV_Rank")

multiClust documentation built on Nov. 8, 2020, 5:23 p.m.