Creates a DEMIClust object

Description

A DEMIClust object clusters probes by their expression profile. The clustering is done with the function given in the clust.method parameter. Alternatively, custom clusters can be supplied as a list of probes through the cluster parameter. Either way, the resulting clusters of probes are stored in the DEMIClust object.

Usage

DEMIClust(experiment = "DEMIExperiment", group = character(),
  clust.method = function() { }, cluster = list(), cutoff.pvalue = 0.05)

Arguments

experiment

A DEMIExperiment object. Holds the DEMIExperiment object whose metadata (such as normalized expression values) is used to cluster the probes.

group

A character. Defines the groups that are used for clustering (e.g. 'group = c("TEST", "CONTROL")'). The grep function is used to locate the group names in the CEL file names and to build index vectors that determine which files belong to which group.

clust.method

A function. Defines the function used for clustering. The user can supply a custom clustering function. The custom function must take the DEMIClust object as its input and return a list of probe vectors, where each list element corresponds to one cluster. The default function is demi.wilcox.test, which uses the wilcox.test function. However, we recommend demi.wilcox.test.fast, which uses a custom implementation of the test and runs a lot faster.

cluster

A list. Holds the probes of different clusters in a list.

cutoff.pvalue

A numeric. Sets the cut-off p-value used for determining statistical significance of the probes when clustering the probes into clusters. Default is 0.05.
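The following is a minimal sketch of how a custom clust.method and cutoff.pvalue could fit together. It is not the implementation of demi.wilcox.test. A real clust.method receives a DEMIClust object; here a plain matrix stands in for the normalized expression values so that the sketch runs without the package. The names expr, index_a, index_b, cutoff_pvalue and higher_in are hypothetical.

```r
# Stand-in data (assumption): rows are probes, columns are samples.
# Columns 1-4 belong to group A, columns 5-8 to group B.
expr <- rbind(
  "101" = c(9, 8, 10, 11, 1, 2, 3, 4),    # higher in group A
  "102" = c(12, 9, 10, 11, 2, 1, 4, 3),   # higher in group A
  "103" = c(1, 2, 3, 4, 9, 8, 10, 11),    # higher in group B
  "104" = c(5, 1, 6, 2, 7, 3, 8, 4)       # no clear difference
)
index_a <- 1:4
index_b <- 5:8
cutoff_pvalue <- 0.05

# Hypothetical helper: one-sided Wilcoxon rank-sum test per probe.
# Returns the IDs of probes whose p-value is at or below the cut-off.
higher_in <- function(x, y) {
  p <- apply(expr, 1, function(row) {
    wilcox.test(row[x], row[y], alternative = "greater")$p.value
  })
  as.numeric(names(p)[p <= cutoff_pvalue])
}

# The return shape described for clust.method:
# a named list with one vector of probe IDs per cluster.
clusters <- list(higher_in_a = higher_in(index_a, index_b),
                 higher_in_b = higher_in(index_b, index_a))
```

Probe 104 ends up in neither cluster because neither one-sided test reaches the cut-off.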

Details

Instead of automatically clustered probes, a DEMIClust object can use user-defined lists of probes for the later calculation of differential expression. This is done by setting the cluster parameter. It overrides the default behaviour of the DEMIClust object, so no actual clustering occurs. Instead, the lists of probes given in the cluster parameter are treated as already clustered probes. Each probe vector in the list needs a proper name so that it can be recognized later. Alternatively, instead of using the default clustering method, the user can write his/her own function for clustering probes based on their expression values.
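As a rough illustration of the requirements on a user-defined cluster list (every element named, every probe ID available in the analysis), the check below can be run before passing the list on. It is a sketch in base R only; check_clusters and the available vector are hypothetical and are not part of the package.

```r
# Hypothetical probe IDs available in the analysis (assumption).
available <- c(1190, 1998, 2007, 2500)

# Hypothetical helper: verifies that every cluster is named and returns,
# per cluster, the probe IDs that are not available in the analysis.
check_clusters <- function(cluster, available) {
  stopifnot(is.list(cluster))
  nm <- names(cluster)
  if (is.null(nm) || any(is.na(nm)) || any(nm == "")) {
    stop("every cluster needs a non-empty name")
  }
  absent <- lapply(cluster, setdiff, y = available)
  absent[lengths(absent) > 0]
}

# All probe IDs are available: the result is an empty list.
ok <- check_clusters(list(customcluster = c(1190, 1998, 2007)), available)

# Probe 42 is not available: it is reported under the cluster's name.
bad <- check_clusters(list(a = c(1190, 42), b = c(2007)), available)
```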

Further specification of the parameters:

  • group All the CEL files used in the analysis need to contain at least one of the names specified in the group parameter, because these names determine which groups are compared against each other. It is also good practice to name the CEL files so that they include their common features. However, if the group/feature name occurs in all file names, the user can define the groups by specific file names, separating the names within one group with the "|" symbol. For example group = c( "FILENAME1|FILENAME2|FILENAME3", "FILENAME4|FILENAME5|FILENAME6" ). These two groups are then used for clustering the probes' expression values.

  • clust.method The user can write his/her own function for clustering probes according to their expression values. The custom function should take a DEMIClust object as its only parameter and return a list. The output list should contain the names of the clusters and the corresponding probe IDs. For example return( list( cluster1 = c(1:10), cluster2 = c(11:20), cluster3 = c(21:30) ) ).

  • cluster This parameter allows differential expression to be calculated on user-defined clusters of probe IDs. It needs to be a list of probe IDs where the list names correspond to the cluster names. For example list( cluster1 = c(1:10), cluster2 = c(1:10) ). When using this approach, make sure that all the probe IDs given in the clusters are available in the analysis. Otherwise an error message is produced and you need to remove the probes that have no alignment in the analysis. Setting this parameter overrides the default behaviour, so no default clustering is applied.
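The group matching described above can be sketched with grep in base R. The file names below are hypothetical, modelled on the naming used in the Examples section, and the code only illustrates the matching idea; the package's internal implementation is not shown here.

```r
# Hypothetical CEL file names (assumption).
celfiles <- c("UHR01_GSM247694.CEL", "UHR02_GSM247695.CEL",
              "BRAIN01_GSM247696.CEL", "BRAIN02_GSM247697.CEL")

# One grep() call per group name gives one index vector per group.
group <- c("BRAIN", "UHR")
index <- lapply(group, function(g) grep(g, celfiles))
names(index) <- group

# "|" is regular-expression alternation, so a group can also be
# spelled out as a set of explicit file name fragments.
group2 <- c("GSM247694|GSM247695", "GSM247696|GSM247697")
index2 <- lapply(group2, function(g) grep(g, celfiles))

# Files that match no group name; this should be empty.
unmatched <- setdiff(seq_along(celfiles), unlist(index))
```

Because the group names are treated as patterns, a name that occurs in every file name would match all files, which is the situation the "|" form is meant to avoid.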

Value

A DEMIClust object.

Author(s)

Sten Ilmjarv

See Also

DEMIExperiment, demi.wilcox.test, demi.wilcox.test.fast, demi.comp.test, wprob

Examples

## Not run: 

# To use the example we need to download a subset of CEL files from
# http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE9819 published
# by Pradervand et al. 2008.

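# Set the destination folder where the downloaded files will be located.
# It can be any folder of your choosing.
destfolder <- "demitest/testdata/"
# Create the folder if it does not exist yet, so that the downloads can be saved.
dir.create( destfolder, recursive = TRUE, showWarnings = FALSE )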

# Download packed CEL files and change the names according to the feature
# they represent (for example to include UHR or BRAIN in them to denote the
# features).
# It is good practice to name the files according to their features which
# allows easier identification of the files later.

ftpaddress <- "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM247nnn"
download.file( paste( ftpaddress, "GSM247694/suppl/GSM247694.CEL.gz", sep = "/" ),
		destfile = paste( destfolder, "UHR01_GSM247694.CEL.gz", sep = "" ) )
download.file( paste( ftpaddress, "GSM247695/suppl/GSM247695.CEL.gz", sep = "/" ),
		destfile = paste( destfolder, "UHR02_GSM247695.CEL.gz", sep = "" ) )
download.file( paste( ftpaddress, "GSM247698/suppl/GSM247698.CEL.gz", sep = "/" ),
		destfile = paste( destfolder, "UHR03_GSM247698.CEL.gz", sep = "" ) )
download.file( paste( ftpaddress, "GSM247699/suppl/GSM247699.CEL.gz", sep = "/" ),
		destfile = paste( destfolder, "UHR04_GSM247699.CEL.gz", sep = "" ) )
download.file( paste( ftpaddress, "GSM247696/suppl/GSM247696.CEL.gz", sep = "/" ),
		destfile = paste( destfolder, "BRAIN01_GSM247696.CEL.gz", sep = "" ) )
download.file( paste( ftpaddress, "GSM247697/suppl/GSM247697.CEL.gz", sep = "/" ),
		destfile = paste( destfolder, "BRAIN02_GSM247697.CEL.gz", sep = "" ) )
download.file( paste( ftpaddress, "GSM247700/suppl/GSM247700.CEL.gz", sep = "/" ),
		destfile = paste( destfolder, "BRAIN03_GSM247700.CEL.gz", sep = "" ) )
download.file( paste( ftpaddress, "GSM247701/suppl/GSM247701.CEL.gz", sep = "/" ),
		destfile = paste( destfolder, "BRAIN04_GSM247701.CEL.gz", sep = "" ) )

# We need the gunzip function (located in the R.utils package) to unpack the gz files.
# We also remove the original packed files, since we won't need them.
library( R.utils )
# Only unpack files ending in .gz, in case the folder holds other files.
for( i in list.files( destfolder, pattern = "\\.gz$" ) ) {
	gunzip( paste( destfolder, i, sep = "" ), remove = TRUE )
}

# Now we can continue the example of the function DEMIClust

# Set up an experiment.
demiexp <- DEMIExperiment(analysis = 'gene', celpath = destfolder,
			experiment = 'myexperiment', organism = 'homo_sapiens')

# Create clusters with default behaviour
demiclust <- DEMIClust( demiexp, group = c( "BRAIN", "UHR" ) )

# Create clusters with the optimized Wilcoxon rank-sum test included in demi, which
# precalculates the probabilities.
# The user can also specify his/her own function for clustering.
demiclust <- DEMIClust( demiexp, group = c( "BRAIN", "UHR" ), clust.method = demi.wilcox.test.fast )

# Create a 'DEMIClust' object with custom lists of probe IDs
demiclust <- DEMIClust( demiexp, cluster = list( customcluster = c(1190, 1998, 2007) ) )

# To retrieve the clusters use
getCluster( demiclust )

# To retrieve cluster names use
names( getCluster( demiclust ) )


## End(Not run)
