run.harmony: run.harmony - run Harmony alignment on a data.table
In ImmuneDynamics/Spectre: High-dimensional cytometry and imaging analysis

run.harmony

R Documentation

run.harmony - run Harmony alignment on a data.table

Description

This function runs the 'Harmony' data alignment algorithm on single-cell or cytometry data stored in a data.table.

Usage

run.harmony()

Arguments

`dat`	NO DEFAULT. A data.table with all of the data you wish to align.
`align.cols`	NO DEFAULT. The columns you wish to align. For cytometry data, these can be the markers themselves or principal components. For single-cell sequencing data, principal components are recommended.
`batch.col`	NO DEFAULT. The column that denotes the batch or dataset that each cell belongs to.
`append.name`	DEFAULT = '_aligned'. Text that will be appended to the new columns containing aligned data
`do_pca`	DEFAULT = TRUE. Whether to perform PCA on input matrix.
`npcs`	If doing PCA on input matrix, number of PCs to compute.
`theta`	Diversity clustering penalty parameter. Specify for each variable in vars_use. Default theta=2. theta=0 does not encourage any diversity. Larger values of theta result in more diverse clusters.
`lambda`	Ridge regression penalty parameter. Specify for each variable in vars_use. Default lambda=1. Lambda must be strictly positive. Smaller values result in more aggressive correction.
`sigma`	Width of soft k-means clusters. Default sigma=0.1. Sigma scales the distance from a cell to the cluster centroids. Larger values of sigma result in cells being assigned to more clusters. Smaller values of sigma make soft k-means clustering approach hard clustering.
`nclust`	Number of clusters in the model. nclust=1 is equivalent to simple linear regression.
`tau`	Protection against overclustering small datasets with large ones. tau is the expected number of cells per cluster.
`block.size`	What proportion of cells to update during clustering. Between 0 and 1, default 0.05. Larger values may be faster but less accurate.
`max.iter.harmony`	Maximum number of rounds to run Harmony. One round of Harmony involves one clustering and one correction step.
`max.iter.cluster`	Maximum number of rounds to run clustering at each round of Harmony.
`epsilon.cluster`	Convergence tolerance for clustering round of Harmony. Set to -Inf to never stop early.
`epsilon.harmony`	Convergence tolerance for Harmony. Set to -Inf to never stop early.
`plot_convergence`	Whether to print the convergence plot of the clustering objective function. TRUE to plot, FALSE to suppress. This can be useful for debugging.
`return_object`	(Advanced Usage) Whether to return the Harmony object or only the corrected PCA embeddings.
`verbose`	DEFAULT = FALSE. Whether to print progress messages. TRUE to print, FALSE to suppress.
`reference_values`	(Advanced Usage) Defines reference dataset(s). Cells whose batch variable values match reference_values will not be moved.
`cluster_prior`	(Advanced Usage) Provides user defined clusters for cluster initialization. If the number of provided clusters C is less than K, Harmony will initialize K-C clusters with kmeans. C cannot exceed K.
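
Examples

The arguments above can be illustrated with a minimal sketch on simulated data. The marker names (CD3, CD19, CD45), the batch labels and the size of the artificial batch shift are all hypothetical, chosen only for illustration. The sketch assumes that the Spectre, harmony and data.table packages are installed, and that the function returns the input data with new columns named using `append.name`, as the argument description suggests; the returned column names are printed rather than assumed.

```r
# Toy data: two batches, with batch "B" deliberately shifted by +1
# on every marker. All names and values here are illustrative only.
library(data.table)

set.seed(42)
n <- 200  # cells per batch

dat <- data.table(
  CD3   = c(rnorm(n, mean = 2), rnorm(n, mean = 3)),
  CD19  = c(rnorm(n, mean = 1), rnorm(n, mean = 2)),
  CD45  = c(rnorm(n, mean = 4), rnorm(n, mean = 5)),
  Batch = rep(c("A", "B"), each = n)
)

# Columns to align and the column identifying each cell's batch
align.cols <- c("CD3", "CD19", "CD45")

# Only attempt the alignment if Spectre is available
if (requireNamespace("Spectre", quietly = TRUE)) {
  aligned <- Spectre::run.harmony(
    dat         = dat,
    align.cols  = align.cols,
    batch.col   = "Batch",
    append.name = "_aligned",
    do_pca      = FALSE,   # align the markers directly, without a PCA step
    verbose     = FALSE
  )

  # Inspect the result instead of assuming the new column names
  print(names(aligned))
} else {
  message("Spectre is not installed; skipping the alignment step.")
}
```

Setting `do_pca = FALSE` here is a choice made for the toy data, which has only three columns; for high-dimensional input, the argument descriptions above suggest aligning principal components instead.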