run.harmony | R Documentation |
This function allows you to run the 'Harmony' data alignment algorithm on single-cell or cytometry data stored in a data.table.
run.harmony()
dat |
NO default. A data.table with all of the data you wish to align. |
align.cols |
NO default. The columns you wish to align. For cytometry data, these can be the markers themselves or principal components. For single-cell sequencing data, principal components are recommended. |
batch.col |
NO default. The column that denotes the batch or dataset that each cell belongs to. |
append.name |
DEFAULT = '_aligned'. Text that will be appended to the new columns containing aligned data |
do_pca |
DEFAULT = TRUE. Whether to perform PCA on input matrix. |
npcs |
If doing PCA on input matrix, number of PCs to compute. |
theta |
Diversity clustering penalty parameter. Specify for each variable in vars_use. Default theta=2. theta=0 does not encourage any diversity. Larger values of theta result in more diverse clusters. |
lambda |
Ridge regression penalty parameter. Specify for each variable in vars_use. Default lambda=1. Lambda must be strictly positive. Smaller values result in more aggressive correction. |
sigma |
Width of soft kmeans clusters. Default sigma=0.1. Sigma scales the distance from a cell to cluster centroids. Larger values of sigma result in cells being assigned to more clusters. Smaller values of sigma make soft kmeans clustering approach hard clustering. |
nclust |
Number of clusters in the model. nclust=1 is equivalent to simple linear regression. |
tau |
Protection against overclustering small datasets with large ones. tau is the expected number of cells per cluster. |
block.size |
What proportion of cells to update during clustering. Between 0 and 1, default 0.05. Larger values may be faster but less accurate. |
max.iter.harmony |
Maximum number of rounds to run Harmony. One round of Harmony involves one clustering and one correction step. |
max.iter.cluster |
Maximum number of rounds to run clustering at each round of Harmony. |
epsilon.cluster |
Convergence tolerance for clustering round of Harmony. Set to -Inf to never stop early. |
epsilon.harmony |
Convergence tolerance for Harmony. Set to -Inf to never stop early. |
plot_convergence |
Whether to print the convergence plot of the clustering objective function. TRUE to plot, FALSE to suppress. This can be useful for debugging. |
return_object |
(Advanced Usage) Whether to return the Harmony object or only the corrected PCA embeddings. |
verbose |
DEFAULT = FALSE. Whether to print progress messages. TRUE to print, FALSE to suppress. |
reference_values |
(Advanced Usage) Defines reference dataset(s). Cells whose batch variable values match reference_values will not be moved. |
cluster_prior |
(Advanced Usage) Provides user defined clusters for cluster initialization. If the number of provided clusters C is less than K, Harmony will initialize K-C clusters with kmeans. C cannot exceed K. |
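The effect of sigma described above can be illustrated with a small base R sketch. This is a simplified stand-in, not Harmony's actual implementation: the helper `soft_assign` and the distance values are hypothetical, and the sketch ignores the diversity penalty controlled by theta. It only shows how dividing distances by sigma changes how sharply a cell is assigned to its nearest cluster.

```r
# Hypothetical helper: turn distances from one cell to each cluster centroid
# into soft membership weights, scaled by sigma (simplified; no theta penalty).
soft_assign <- function(dists, sigma) {
  # Subtract the minimum distance first for numerical stability
  w <- exp(-(dists - min(dists)) / sigma)
  w / sum(w)
}

# Illustrative distances from one cell to three centroids
dists <- c(0.10, 0.15, 0.60)

wide   <- soft_assign(dists, sigma = 0.1)   # softer: weight shared across clusters
narrow <- soft_assign(dists, sigma = 0.01)  # close to hard clustering

print(round(wide, 3))
print(round(narrow, 3))
```

With the larger sigma the two nearest clusters share most of the weight, while with the smaller sigma almost all of the weight goes to the single nearest cluster.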
Returns a data.table with aligned data added in new columns.
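A minimal usage sketch follows. The toy data, the marker names `CD4` and `CD8`, and the `Batch` column are invented for illustration. The sketch assumes that each new column is named by joining the original column name and `append.name`, which is how the description of `append.name` reads but is not confirmed here. The call to `run.harmony` is wrapped in `if (FALSE)` so it is shown but not run, since it requires the Spectre, harmony, and data.table packages and may need adjusting (for example `do_pca` and `npcs`) for a table this small.

```r
# Toy table: two markers measured in two batches (illustrative values only)
set.seed(42)
toy <- data.frame(
  CD4   = c(rnorm(50, mean = 0), rnorm(50, mean = 1)),
  CD8   = c(rnorm(50, mean = 0), rnorm(50, mean = 1)),
  Batch = rep(c("A", "B"), each = 50)
)

align.cols  <- c("CD4", "CD8")
append.name <- "_aligned"

# Assumption: aligned columns are named <original column><append.name>
expected.cols <- paste0(align.cols, append.name)
print(expected.cols)
# [1] "CD4_aligned" "CD8_aligned"

# Not run: requires the Spectre, harmony, and data.table packages
if (FALSE) {
  dat <- data.table::as.data.table(toy)
  dat <- Spectre::run.harmony(
    dat         = dat,
    align.cols  = align.cols,
    batch.col   = "Batch",
    append.name = append.name
  )
  # Check which of the expected aligned columns were added
  print(intersect(expected.cols, names(dat)))
}
```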
Thomas M Ashhurst, thomas.ashhurst@sydney.edu.au