First install the auxilliary packages for conos, conosPanel:
install.packages('conosPanel', repos='https://kharchenkolab.github.io/drat/', type='source')
Now load the conos library, and the R package conosPanel
for the example data panel
:
library(conos) panel <- conosPanel::panel
Next, use pagoda2 for pre-processing:
library(pagoda2) panel.preprocessed <- lapply(panel, basicP2proc, n.cores=1, min.cells.per.gene=0, n.odgenes=2e3, get.largevis=FALSE, make.geneknn=FALSE)
Now align the datasets:
con <- Conos$new(panel.preprocessed, n.cores=1) con$buildGraph(k=15, k.self=5, space='PCA', ncomps=30)
Next find the clusters, and create an embedding:
con$findCommunities() con$embedGraph(method="UMAP")
Now prepare the metadata (which can be any type of clustering of all the cells):
metadata <- data.frame(Cluster=con$clusters$leiden$groups)
Save data (set exchange_dir
to your path):
## use current directory exchange_dir <- "." hdf5file = "example.h5" saveConosForScanPy(con, output.path=exchange_dir, hdf5_filename=hdf5file, verbose=TRUE)
Users can then access the data saved to the HDF5 file, e.g. to access metadata, run:
library(rhdf5) metadata = h5read(paste0(exchange_dir, "/example.h5"), 'metadata/metadata.df') head(metadata, 4)
All possible fields included in the output HDF5 file are:
raw_count_matrix
: the sparse dgCMatrix
of raw counts raw_count_matrix/data
: the matrix entriesraw_count_matrix/shape
: the matrix dimensionsraw_count_matrix/indices
: 0-based vector of non-zero matrix entriesraw_count_matrix/indptr
: vector of pointers, one for each column (or row), to the initial (zero-based) index of elements in the column (or row)metadata/metadata.df
: the data.frame
of metadata valuesgenes/genes.df
: the data.frame
of genescount_matrix
: the sparse dgCMatrix
of normalized counts, if cm.norm=TRUE
count_matrix/data
: the matrix entriescount_matrix/shape
: the matrix dimensionscount_matrix/indices
: 0-based vector of non-zero matrix entriescount_matrix/indptr
: vector of pointers, one for each column (or row), to the initial (zero-based) index of elements in the column (or row)embedding/embedding.df
: the data.frame
of the conos embedding, if embedding=TRUE
pseudopca/pseudopca.df
: the data.frame
of an emulated PCA by embedding the graph to a space with n.dims
dimensions and save it as a pseudoPCA, if pseudopca=TRUE
pca/pca.df
: the data.frame
of the PCA of all the samples (not batch corrected), if pca=TRUE
graph_connectivities
: the sparse dgCMatrix
of graph connectivites, if alignment.graph=TRUE
graph_connectivities/data
: the matrix entriesgraph_connectivities/shape
: the matrix dimensionsgraph_connectivities/indices
: 0-based vector of non-zero matrix entriesgraph_connectivities/indptr
: vector of pointers, one for each column (or row), to the initial (zero-based) index of elements in the column (or row)graph_distances
: the sparse dgCMatrix
of graph distances, if alignment.graph=TRUE
graph_distances/data
: the matrix entriesgraph_distances/shape
: the matrix dimensionsgraph_distances/indices
: 0-based vector of non-zero matrix entriesgraph_distances/indptr
: vector of pointers, one for each column (or row), to the initial (zero-based) index of elements in the column (or row)In order to read in the dcGMatrix
again, simply use the Matrix
package as follows:
library(rhdf5) library(Matrix) rawcountMat = h5read(paste0(exchange_dir, "/example.h5"), 'raw_count_matrix') raw_count_matrix = sparseMatrix(x = as.numeric(rawcountMat$data), dims = as.numeric(c(rawcountMat$shape[1], rawcountMat$shape[2])), p = as.numeric(rawcountMat$indptr), i = rawcountMat$indices, index1=FALSE)
Note: Please set index1=FALSE
as the index vectors are 0-based. For more details, see the documentation for Matrix::sparseMatrix()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.