getGDI,COTAN-method | R Documentation |
A collection of functions returning various statistics associated with the genes. In particular, the discrepancy between the expected probabilities of zero and their actual occurrences, both at the single-gene level and for pairs of genes.
To make the GDI more specific, it may be desirable to restrict the set of genes against which GDI is computed to a selected subset, with the recommendation to include a consistent fraction of cell-identity genes, and possibly focusing on markers specific for the biological question of interest (for instance neural cortex layering markers). In this case we denote it as Local Differentiation Index (LDI) relative to the selected subset.
## S4 method for signature 'COTAN'
getGDI(objCOTAN)

## S4 method for signature 'COTAN'
storeGDI(objCOTAN, genesGDI)

genesCoexSpace(objCOTAN, primaryMarkers, numGenesPerMarker = 25L)

establishGenesClusters(
  objCOTAN,
  groupMarkers,
  numGenesPerMarker = 25L,
  kCuts = 6L,
  distance = "cosine",
  hclustMethod = "ward.D2"
)

calculateGenesCE(objCOTAN)

calculateGDIGivenCorr(corr, numDegreesOfFreedom, rowsFraction = 0.05)

calculateGDI(objCOTAN, statType = "S", rowsFraction = 0.05)

calculatePValue(
  objCOTAN,
  statType = "S",
  geneSubsetCol = vector(mode = "character"),
  geneSubsetRow = vector(mode = "character")
)

calculatePDI(
  objCOTAN,
  statType = "S",
  geneSubsetCol = vector(mode = "character"),
  geneSubsetRow = vector(mode = "character")
)
objCOTAN | a COTAN object |
genesGDI | the named genes' GDI array |
primaryMarkers | a vector of primary marker names |
numGenesPerMarker | the number of correlated genes to keep as other markers (default 25) |
groupMarkers | a named list of marker genes, with one element per group |
kCuts | the estimated number of clusters (this defines the height at which the tree is cut) |
distance | the type of distance to use. Default is "cosine" |
hclustMethod | the clustering method. Default is "ward.D2", but it can be any method defined by stats::hclust() |
corr | a correlation matrix |
numDegreesOfFreedom | the number of degrees of freedom to use in the \chi^{2} test |
rowsFraction | the fraction of rows that will be averaged to calculate the GDI (default 0.05) |
statType | which statistic to use to compute the p-values. By default it uses "S" (Pearson's \chi^{2} test) |
geneSubsetCol | an array of genes to be put in columns. If left empty, the function works genome-wide |
geneSubsetRow | an array of genes to be put in rows. If left empty, the function works genome-wide |
getGDI() extracts the genes' GDI array as it was stored by the method storeGDI().
storeGDI() stores an already calculated genes' GDI array in a COTAN object. It can be retrieved using the method getGDI().
genesCoexSpace() calculates gene groups based on the primary markers and uses them to prepare the genes' COEX space data.frame.
establishGenesClusters() performs the genes' clustering based on a pool of gene markers, using the genes' COEX space.
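The roles of distance, hclustMethod and kCuts can be illustrated with base R alone. The following is only a minimal sketch on a made-up matrix (coexSpace and its gene and marker names are hypothetical), not the package's implementation: it computes a cosine distance between rows, builds a tree with the "ward.D2" method and cuts it into a fixed number of clusters.

```r
# Hypothetical stand-in for a genes' COEX space: 12 genes (rows) x 4 markers (columns)
set.seed(42)
coexSpace <- matrix(rnorm(12 * 4), nrow = 12,
                    dimnames = list(sprintf("g-%06d", 1:12), paste0("m", 1:4)))

# Cosine distance between rows: 1 - cosine similarity
rowNorms <- sqrt(rowSums(coexSpace^2))
cosineSim <- (coexSpace %*% t(coexSpace)) / (rowNorms %o% rowNorms)
cosineDist <- as.dist(1 - cosineSim)

# Hierarchical clustering, then cut the tree into a fixed number of clusters
# (the k argument plays the role that kCuts plays in establishGenesClusters())
tree <- hclust(cosineDist, method = "ward.D2")
clusters <- cutree(tree, k = 3L)

table(clusters)
```

Asking cutree() for a number of clusters rather than a height is what makes the cut height depend on the data, which matches the description of kCuts above.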
calculateGenesCE() calculates, for each gene, the discrepancy between the expected probability of zero and the observed zeros across all cells as a cross-entropy: -\sum_{c}{\left(\mathbb{1}_{X_c = 0} \log(p_c) + \mathbb{1}_{X_c \neq 0} \log(1 - p_c)\right)} where X_c is the observed count and p_c the probability of zero in cell c.
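As a worked illustration of this cross-entropy, the sketch below evaluates it in base R for a single gene: each cell with a zero count contributes -\log(p_c) and each cell with a non-zero count contributes -\log(1 - p_c). The vectors counts and probZero are hypothetical values chosen for the example; in the package the probabilities of zero come from the fitted model, and the actual implementation may differ in details such as normalization.

```r
# Hypothetical inputs for a single gene across five cells
counts   <- c(0, 3, 0, 1, 0)             # observed counts X_c
probZero <- c(0.8, 0.1, 0.6, 0.3, 0.9)   # modelled probability of zero p_c

isZero <- counts == 0

# Zero cells contribute -log(p_c), non-zero cells contribute -log(1 - p_c)
crossEntropy <- -sum(log(probZero[isZero])) - sum(log(1 - probZero[!isZero]))
crossEntropy
```

The value grows when zeros are observed in cells where the model considers a zero unlikely, or when counts are observed where a zero was expected.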
calculateGDIGivenCorr() produces a vector with the GDI for each column based on the given correlation matrix, using Pearson's \chi^{2} test.
calculateGDI() produces a data.frame with the GDI for each gene based on the COEX matrix.
calculatePValue() computes the p-values for genes in the COTAN object. It can be used genome-wide or restricted to specific genes of interest. By default it computes the p-values using the S statistic (\chi^{2}).
calculatePDI() computes the p-values for genes in the COTAN object using calculatePValue() and takes their \log{({-\log{(\cdot)}})} to calculate the genes' Pair Differential Index.
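The \log{({-\log{(\cdot)}})} transform can be tried on its own in base R. This is a small sketch on a made-up symmetric p-value matrix (pValues and its entries are hypothetical, not package output), meant only to show how the transform orders gene pairs.

```r
# Hypothetical symmetric p-value matrix for three genes
genes <- c("g-000010", "g-000020", "g-000030")
pValues <- matrix(c(1e-3, 0.20, 1e-8,
                    0.20, 1e-3, 0.50,
                    1e-8, 0.50, 1e-3),
                  nrow = 3, dimnames = list(genes, genes))

# Element-wise log(-log(p)): smaller p-values map to larger values
pdi <- log(-log(pValues))
pdi
```

The transform is monotonically decreasing in the p-value, and p-values above 1/e map to negative values.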
getGDI() returns the genes' GDI array if available or NULL otherwise.
storeGDI() returns the given COTAN object with updated genes' GDI information.
genesCoexSpace() returns a list with:
"SecondaryMarkers": a named list that, for each secondary marker, gives the list of primary markers that selected it
"GCS": the relevant subset of the COEX matrix
"rankGenes": a data.frame with the rank of each gene according to its p-value
establishGenesClusters() returns a list of:
"g.space": the genes' COEX space data.frame
"plot.eig": the eigenvalues plot
"pca_clusters": the PCA components data.frame
"tree_plot": the tree plot for the genes' COEX space
calculateGenesCE() returns a named array with the cross-entropy of each gene.
calculateGDIGivenCorr() returns a vector with the GDI data for each column of the input.
calculateGDI() returns a data.frame with:
"sum.raw.norm": the sum of the normalized data rows
"GDI": the GDI data
"exp.cells": the percentage of cells expressing the gene
calculatePValue() returns a p-value matrix as dspMatrix.
calculatePDI() returns a Pair Differential Index matrix as dspMatrix.
# Build a COTAN object from the bundled test data and compute the COEX matrix
data("test.dataset")
objCOTAN <- COTAN(raw = test.dataset)
objCOTAN <- proceedToCoex(objCOTAN, cores = 6L, saveObj = FALSE)

# Pick 10 random genes as primary markers and build the genes' COEX space
markers <- getGenes(objCOTAN)[sample(getNumGenes(objCOTAN), 10)]
GCS <- genesCoexSpace(objCOTAN, primaryMarkers = markers,
                      numGenesPerMarker = 15)

# Cluster the genes starting from three named groups of markers
groupMarkers <- list(G1 = c("g-000010", "g-000020", "g-000030"),
                     G2 = c("g-000300"),
                     G3 = c("g-000510", "g-000530", "g-000550",
                            "g-000570", "g-000590"))
resList <- establishGenesClusters(objCOTAN, groupMarkers = groupMarkers,
                                  numGenesPerMarker = 11)
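The GDI and p-value functions documented on this page can be exercised in the same way. The block below is a hedged sketch that uses only the signatures listed in the usage section; it assumes the COTAN package is installed and that the row names of the data.frame returned by calculateGDI() are the gene names, which this page does not state explicitly. The names gdiDF, genesGDI and genesOfInterest are introduced here for illustration.

```r
library(COTAN)

# Same setup as the example above, on a single core
data("test.dataset")
objCOTAN <- COTAN(raw = test.dataset)
objCOTAN <- proceedToCoex(objCOTAN, cores = 1L, saveObj = FALSE)

# Compute the GDI data.frame and keep its "GDI" column as a named array
# (assumption: the row names of gdiDF are the gene names)
gdiDF <- calculateGDI(objCOTAN)
genesGDI <- setNames(gdiDF[["GDI"]], rownames(gdiDF))

# Store the GDI in the object, then read it back
objCOTAN <- storeGDI(objCOTAN, genesGDI = genesGDI)
head(getGDI(objCOTAN))

# p-values restricted to a few genes of interest placed in the columns
genesOfInterest <- getGenes(objCOTAN)[1:5]
pValues <- calculatePValue(objCOTAN, geneSubsetCol = genesOfInterest)
dim(pValues)
```

Restricting geneSubsetCol keeps the p-value matrix small when only a handful of genes matter for the question at hand.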