fix_missing_cell_labels: Replace NA cell column data values left after running...
In cole-trapnell-lab/monocle3: Clustering, Differential Expression, and Trajectory Analysis for Single-Cell RNA-Seq

fix_missing_cell_labels

R Documentation

Replace NA cell column data values left after running transfer_cell_labels.

Description

Try to replace NA values left in a query cell_data_set after running transfer_cell_labels.

Usage

fix_missing_cell_labels(
  cds,
  reduction_method = c("UMAP", "PCA", "LSI"),
  from_column_name,
  to_column_name = from_column_name,
  out_notna_models_dir = NULL,
  k = 10,
  nn_control = list(),
  top_frac_threshold = 0.5,
  top_next_ratio_threshold = 1.5,
  verbose = FALSE
)

Arguments

`cds`	the cell_data_set upon which to perform this operation
`reduction_method`	a string specifying the reduced dimension matrix to use for the label transfer. Supported values are "PCA", "LSI", and "UMAP". The default is "UMAP".
`from_column_name`	a string giving the name of the query cds column with NA values to fix.
`to_column_name`	a string giving the name of the query cds column where the fixed column data will be stored. The default is from_column_name.
`out_notna_models_dir`	a string with the name of the transform model directory where you want to save the not-NA transform models, which includes the nearest neighbor index. If NULL, the not-NA models are not saved. The default is NULL.
`k`	an integer giving the number of reference nearest neighbors to find. This value must be large enough to find meaningful column value fractions. See the top_frac_threshold parameter below for additional information. The default is 10.
`nn_control`	an optional list of parameters used to make the nearest neighbors index. See the set_nn_control help for additional details. The default metric is cosine for reduction_method PCA and LSI, and euclidean for reduction_method UMAP.
`top_frac_threshold`	a numeric value. The top fraction of reference values must be greater than top_frac_threshold in order to be transferred to the query. The top fraction is the fraction of the k neighbors with the most frequent value. The default is 0.5.
`top_next_ratio_threshold`	a numeric value giving the minimum value of the ratio of the counts of the most frequent to the second most frequent reference values required for transferring the reference value to the query. The default is 1.5.
`verbose`	a logical value controlling verbose output. The default is FALSE.
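As a rough illustration of why k must be large enough, the sketch below computes how many of the k neighbors must share a value for the top fraction to exceed top_frac_threshold. The helper min_votes is hypothetical and is not part of monocle3; it only restates the "greater than" condition described for top_frac_threshold.

```r
# Hypothetical helper, not part of monocle3: the smallest number of
# neighbors sharing one value such that count / k > top_frac_threshold.
min_votes <- function(k, top_frac_threshold = 0.5) {
  floor(k * top_frac_threshold) + 1
}

votes_k10 <- min_votes(10)  # with the default k = 10
votes_k3 <- min_votes(3)    # with a very small k
print(c(k10 = votes_k10, k3 = votes_k3))
```

With the default k = 10, at least 6 neighbors must agree. With k = 3, only 2 are needed, so a single neighbor can decide the outcome.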

Details

fix_missing_cell_labels uses non-NA cell data values in the query cell_data_set to replace NAs in nearby cells. It partitions the cells into a set with NA and a set with non-NA column data values, and makes a nearest neighbor index using the cells with non-NA values. For each cell with NA, it finds the k nearest non-NA neighbors and tries to choose an acceptable column data value as follows. If more than top_frac_threshold of the neighbors share the same value, it replaces the NA with that value. If not, it checks whether the ratio of the count of the most frequent value to the count of the second most frequent value is at least top_next_ratio_threshold, in which case it copies the most frequent value. Otherwise, it leaves the NA.
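The two-step rule above can be sketched in base R for a single cell. This is a minimal sketch of the rule as described here, not monocle3's internal code; the function name vote_label and its neighbor_labels argument are hypothetical, and treating a single distinct neighbor value as failing the ratio test is an assumption.

```r
# Hypothetical helper, not part of monocle3: applies the rule from the
# Details text to the labels of one cell's k nearest non-NA neighbors.
vote_label <- function(neighbor_labels,
                       top_frac_threshold = 0.5,
                       top_next_ratio_threshold = 1.5) {
  if (length(neighbor_labels) == 0) return(NA_character_)
  # Count each distinct label, most frequent first.
  counts <- sort(table(neighbor_labels), decreasing = TRUE)
  top_label <- names(counts)[1]
  top_count <- as.integer(counts[1])
  # Step 1: accept if the top fraction exceeds top_frac_threshold.
  if (top_count / length(neighbor_labels) > top_frac_threshold) {
    return(top_label)
  }
  # Step 2: accept if the top count is at least top_next_ratio_threshold
  # times the second most frequent count.
  # Assumption: with only one distinct label, this test does not apply.
  if (length(counts) >= 2 &&
      top_count / as.integer(counts[2]) >= top_next_ratio_threshold) {
    return(top_label)
  }
  # Otherwise leave the value as NA.
  NA_character_
}

clear_majority <- vote_label(c(rep("A", 6), rep("B", 4)))               # 0.6 > 0.5
ratio_rescue <- vote_label(c(rep("A", 5), rep("B", 3), rep("C", 2)))    # 5/3 >= 1.5
left_na <- vote_label(c(rep("A", 4), rep("B", 3), rep("C", 3)))         # 4/3 < 1.5
```

The second case shows why the ratio test exists: no value holds a strict majority, yet one value is clearly ahead of the runner-up.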

Value

an updated cell_data_set object

Examples

  ## Not run: 
     # Load the worm L2 example data shipped with monocle3.
     expression_matrix <- readRDS(system.file('extdata',
                                              'worm_l2/worm_l2_expression_matrix.rds',
                                              package='monocle3'))
     cell_metadata <- readRDS(system.file('extdata',
                                          'worm_l2/worm_l2_coldata.rds',
                                          package='monocle3'))
     gene_metadata <- readRDS(system.file('extdata',
                                          'worm_l2/worm_l2_rowdata.rds',
                                          package='monocle3'))

     cds <- new_cell_data_set(expression_data=expression_matrix,
                              cell_metadata=cell_metadata,
                              gene_metadata=gene_metadata)

     # Use a random two thirds of the cells as the reference (cds1).
     ncell <- nrow(colData(cds))
     cell_sample <- sample(seq(ncell), 2 * ncell / 3)
     cell_set <- seq(ncell) %in% cell_sample
     cds1 <- cds[,cell_set]
     cds1 <- preprocess_cds(cds1)
     cds1 <- reduce_dimension(cds1, build_nn_index=TRUE)
     save_transform_models(cds1, 'tm')

     # Project the remaining cells (cds2) with the saved models, transfer
     # labels from the reference, then try to fix the NAs that remain.
     cds2 <- cds[,!cell_set]
     cds2 <- load_transform_models(cds2, 'tm')
     cds2 <- preprocess_transform(cds2, 'PCA')
     cds2 <- reduce_dimension_transform(cds2)
     cds2 <- transfer_cell_labels(cds2, 'UMAP', colData(cds1), 'cao_cell_type', 'transfer_cell_type')
     cds2 <- fix_missing_cell_labels(cds2, 'UMAP', 'transfer_cell_type', 'fixed_cell_type')
  
## End(Not run)
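To check how many NA values were filled, the original and fixed columns can be compared. The sketch below uses a small made-up data frame as a stand-in for the cell column data of the query cds; the values are illustrative only, and the column names follow the example above.

```r
# Toy stand-in for the query cell column data; the values are made up.
cell_data <- data.frame(
  transfer_cell_type = c("Neuron", NA, "Muscle", NA, NA),
  fixed_cell_type    = c("Neuron", "Neuron", "Muscle", NA, "Muscle"),
  stringsAsFactors = FALSE
)

# Count NAs before and after the fix.
n_na_before <- sum(is.na(cell_data$transfer_cell_type))
n_na_after <- sum(is.na(cell_data$fixed_cell_type))
n_filled <- n_na_before - n_na_after

# Tabulate the values assigned to cells that were NA before the fix.
filled <- is.na(cell_data$transfer_cell_type) & !is.na(cell_data$fixed_cell_type)
filled_counts <- table(cell_data$fixed_cell_type[filled])
print(filled_counts)
```

Cells that are still NA afterwards are those whose neighbors did not pass either threshold.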

cole-trapnell-lab/monocle3 documentation built on June 11, 2025, 11:22 p.m.
