Microarrays commonly have multiple probes per gene, which tend to
represent different isoforms.
Most gene set testing is done at the gene symbol level and ignores isoforms,
so you need to choose one probe for each gene. How? Two common approaches
are to take the most abundant probe or the most variable probe, considered
across the cohort.
I quite like computing t-stats for each probe and selecting the
best-performing probe, i.e. the one with the largest t-stat in either
direction. Why? On the Affy 133+2 array, there can be lots of poor probes
for each gene. If 5 probes for a gene have these t-stats: 1.2, 0.9, 0.1,
-0.1, -10, then IMO the one that scored -10 is the best probe, since it had
a really strong t-stat score. Hence method="maxabs", combined with a
rnk.file.
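
The "maxabs" rule described above can be illustrated with a small sketch. This is written in Python purely for illustration and is not the code of this function; the probe ids and the helper name `select_maxabs` are hypothetical, and the scores are the five t-stats from the example.

```python
def select_maxabs(scores):
    """Return the probe id whose score has the largest absolute value.

    `scores` maps probe id -> score (e.g. a t-statistic from a rnk file).
    """
    return max(scores, key=lambda probe: abs(scores[probe]))


# Hypothetical probe ids for a single gene, with the t-stats from the text.
t_stats = {
    "probe_a": 1.2,
    "probe_b": 0.9,
    "probe_c": 0.1,
    "probe_d": -0.1,
    "probe_e": -10.0,
}

best = select_maxabs(t_stats)
print(best, t_stats[best])
```

Here the probe that scored -10 is chosen, because the sign is ignored when comparing scores.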
gct.file | the path to a gct file
chip.file | the path to a chip file
gct.outfile | the path to the gct output file
rnk.file | [optional] path to a rnk file (e.g. a t-statistic for each probe, when you want to select the best probe from this score). NB: currently UNUSED
method | “mean” or “median”: select the probe with the highest mean/median level; “var”: select the probe with the highest variance across samples; “maxabs”: select the probe with the largest absolute score in the rnk file (see details).
reverse | [default=FALSE] reverse the ordering selected by the method argument, so instead of the most variable probe, it would select the least variable.
filter | Filter out (i.e. exclude) those probes that don't have a gene symbol (as determined by probes that have a gene symbol of
A gct file is created with 1 row per gene symbol; the ‘probe ids’ in column 1 are now gene symbols.
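
To make the collapsing step concrete, here is a minimal sketch of how the “mean”, “median” and “var” methods, together with the reverse and filter arguments, could produce one row per gene symbol. It is written in Python for illustration only and is not this function's implementation. The name `collapse_probes`, the in-memory dict layout, and the example data are assumptions. How unannotated probes are marked in a chip file is not specified here, so the sketch assumes a missing or empty symbol, and it assumes such probes are kept under their own id when filtering is off.

```python
from statistics import mean, median, variance

# Per-probe summary used to rank the probes of one gene.
SCORERS = {"mean": mean, "median": median, "var": variance}


def collapse_probes(expr, symbols, method="var", reverse=False,
                    filter_unannotated=True):
    """Keep one probe per gene symbol.

    expr:    probe id -> list of expression values across samples
    symbols: probe id -> gene symbol (None or "" if unannotated; assumption)
    Returns: gene symbol -> expression values of the selected probe
    """
    scorer = SCORERS[method]
    best = {}  # gene symbol -> (score, probe id)
    for probe, values in expr.items():
        symbol = symbols.get(probe)
        if not symbol:
            if filter_unannotated:
                continue  # exclude probes without a gene symbol
            symbol = probe  # assumption: keep the probe under its own id
        score = scorer(values)
        if reverse:
            score = -score  # e.g. least variable instead of most variable
        if symbol not in best or score > best[symbol][0]:
            best[symbol] = (score, probe)
    return {symbol: expr[probe] for symbol, (_, probe) in best.items()}


# Hypothetical data: two probes for GENE1, one for GENE2, one unannotated.
expr = {
    "p1": [8.0, 8.1, 7.9],   # high level, low variance
    "p2": [2.0, 6.0, 10.0],  # lower level, high variance
    "p3": [7.0, 7.0, 7.0],
    "p4": [1.0, 2.0, 3.0],
}
symbols = {"p1": "GENE1", "p2": "GENE1", "p3": "GENE2", "p4": None}

by_var = collapse_probes(expr, symbols, method="var")
by_mean = collapse_probes(expr, symbols, method="mean")
print(by_var["GENE1"], by_mean["GENE1"])
```

With this data, “var” keeps p2 for GENE1 while “mean” keeps p1, and passing reverse=True with “var” switches the choice back to p1.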
Mark Cowley, 2011-02-27