run_analysis: Run analysis of a batch correction

View source: R/utils_analysis.R

run_analysis    R Documentation

Run analysis of a batch correction

Description

This function assumes that the batch correction was run with the cyCombine workflow in mind. If cyCombine was not used for the batch correction, some preparation of the results may be necessary. The uncorrected data is expected to be stored as "{data_dir}/{tool}_{data}{variant}{uncorrected_extension}.RDS" and the corrected data as "{data_dir}/{tool}_{data}{variant}{corrected_extension}.RDS". The analysis currently encompasses marker-wise density plots, UMAPs of corrected vs. uncorrected data, and the Earth Mover's Distance.
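As a quick check before calling the function, the naming pattern above can be reproduced in base R. The helper below is a hypothetical sketch, not part of cyCombine: `expected_paths` is an illustrative name, and it ignores `uncorrected_variant` and `use_cycombine_uncor`, whose effect on the file name is not described here.

```r
# Hypothetical helper (not part of cyCombine): builds the two file names
# following the documented pattern
# "{data_dir}/{tool}_{data}{variant}{extension}.RDS".
expected_paths <- function(tool, data, data_dir,
                           uncorrected_extension = "_uncorrected",
                           corrected_extension = "_corrected",
                           variant = NULL) {
  # Treat a missing variant as an empty string
  if (is.null(variant)) variant <- ""
  stem <- paste0(tool, "_", data, variant)
  c(
    uncorrected = file.path(data_dir, paste0(stem, uncorrected_extension, ".RDS")),
    corrected   = file.path(data_dir, paste0(stem, corrected_extension, ".RDS"))
  )
}

paths <- expected_paths("cycombine", "dfci1", "_data")
print(paths)
# Reports, per file, whether it is present on disk
print(file.exists(paths))
```

If either file is reported as missing, the results probably need to be renamed or re-saved to match the pattern before run_analysis can find them.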

Usage

run_analysis(
  tool,
  data,
  data_dir,
  uncorrected_extension = "_uncorrected",
  corrected_extension = "_corrected",
  variant = NULL,
  uncorrected_variant = NULL,
  use_cycombine_uncor = FALSE,
  restart = FALSE,
  md = NULL,
  panel = NULL,
  markers = NULL,
  celltype_col = NULL,
  segment = "",
  binSize = 0.1,
  rlen = 10,
  gridsize = 8,
  seed = 473,
  umap_size = 20000
)

Arguments

tool

The name of the tool used for batch correction

data

The name of the data used

data_dir

The location of the uncorrected and corrected data

uncorrected_extension

The extension used to name the uncorrected data. Default: "_uncorrected"

corrected_extension

The extension used to name the corrected data. Default: "_corrected"

variant

Optional: A parameter to set a variant name of an experiment

uncorrected_variant

Optional: A parameter to set a variant specific to the uncorrected data

use_cycombine_uncor

If TRUE, the uncorrected data made by cyCombine will be used.

restart

If TRUE, the SOM grid will be recalculated even if it has been computed and stored previously

md

Optional: Metadata filename. Currently not used

panel

Optional: If given, it will be used to define markers. Otherwise the function get_markers will be used

markers

Optional: Manually define markers to use in plots and performance metrics

celltype_col

Optional: If the cell types are known, specify which column they are defined in. If NULL, a clustering will be run.

segment

Optional: Run only a specific segment of the analysis. Options include: "emd", "density", "umap"

binSize

The size of bins to use when binning data

rlen

Number of times the data is presented to the SOM network

gridsize

The gridsize to use when clustering. Only used if no celltype_col is given

seed

The seed to use when creating the UMAP

umap_size

Number of cells to include in UMAP
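To give an intuition for how binSize can interact with the Earth Mover's Distance, the sketch below computes a one-dimensional EMD between two samples on histograms with a shared bin width. It is an illustration under stated assumptions, not cyCombine's implementation: `emd_1d` and its arguments are hypothetical names, and the package may bin and normalise the data differently.

```r
# Hypothetical helper (not cyCombine's implementation): 1D Earth Mover's
# Distance between two numeric samples, computed on binned data.
emd_1d <- function(x, y, bin_size = 0.1) {
  # Assign every value to an integer bin index of width bin_size
  ix <- floor(x / bin_size)
  iy <- floor(y / bin_size)
  bins <- seq(min(ix, iy), max(ix, iy))
  # Normalised histograms over the shared set of bins
  px <- tabulate(match(ix, bins), nbins = length(bins)) / length(x)
  py <- tabulate(match(iy, bins), nbins = length(bins)) / length(y)
  # In one dimension, the EMD is the summed absolute difference between
  # the two cumulative distributions, scaled by the bin width
  sum(abs(cumsum(px) - cumsum(py))) * bin_size
}

# Toy expression values for one marker, before and after a shift
uncorrected <- c(0.05, 0.05)
shifted     <- c(0.25, 0.25)
emd_same  <- emd_1d(uncorrected, uncorrected)
emd_shift <- emd_1d(uncorrected, shifted)
print(c(same = emd_same, shifted = emd_shift))
```

In this toy case all mass moves two bins of width 0.1, so the distance is 0.2, while identical samples give 0. A smaller bin size resolves finer shifts at the cost of noisier histograms.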

Examples

## Not run: 
run_analysis(tool = "cycombine",
             data = "dfci1",
             data_dir = "_data")

run_analysis(tool = "cycombine",
             data = "FR-FCM-ZY34",
             data_dir = "_data",
             variant = "_p3",
             panel = "/attachments/MC_panel3.xlsx")

## End(Not run)

biosurf/cyCombine documentation built on May 23, 2024, 4:07 a.m.