anndata_to_ctd: Convert: 'AnnData' ==> 'CellTypeDataset'


anndata_to_ctd    R Documentation

Convert: AnnData ==> CellTypeDataset

Description

Convert: AnnData ==> CellTypeDataset

Usage

anndata_to_ctd(
  obj,
  annotLevels,
  dataset = basename(tempfile()),
  chunk_size = NULL,
  agg_fun = "mean",
  agg_method = c("monocle3", "stats"),
  dropNA = TRUE,
  standardise = TRUE,
  as_sparse = TRUE,
  as_delayedarray = FALSE,
  verbose = TRUE,
  ...
)

Arguments

obj

A single-cell object supported by scKirby. See converters for a table of all supported conversions.

annotLevels

List with arrays of strings containing the cell type names associated with each column in exp.
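
As a rough illustration of the shape this argument takes, the sketch below builds a two-level annotation list in base R. The level names (level1class, level2class) and the cell type labels are hypothetical placeholders, not values required by scKirby.

```r
# Hypothetical example: six cells annotated at two levels of resolution.
cells <- paste0("cell", 1:6)

annotLevels <- list(
  # Coarse annotation: one label per cell (i.e. per column of the matrix).
  level1class = c("Neuron", "Neuron", "Glia", "Glia", "Neuron", "Glia"),
  # Finer annotation of the same cells, in the same order.
  level2class = c("Excitatory", "Inhibitory", "Astrocyte",
                  "Microglia", "Excitatory", "Astrocyte")
)

# Each level should supply exactly one label per cell.
labels_per_level <- lengths(annotLevels)
```

Each element is a character vector whose order matches the cell (column) order of the expression data.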

dataset

Name of the CellTypeData (default: a random name generated with basename(tempfile())).

chunk_size

An integer indicating number of cells to include per chunk. This can be a more memory efficient and scalable way of aggregating on-disk data formats like AnnData, rather than reading in the entire matrix into memory at once (default: NULL).
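
The idea behind chunked aggregation can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not scKirby's actual implementation: the helper chunked_celltype_means and the toy data are hypothetical, and an in-memory matrix stands in for an on-disk AnnData object.

```r
# Toy expression matrix: 4 genes x 6 cells (stand-in for on-disk data).
set.seed(1)
expr <- matrix(rpois(24, lambda = 5), nrow = 4,
               dimnames = list(paste0("gene", 1:4), paste0("cell", 1:6)))
celltypes <- c("A", "B", "A", "B", "A", "B")

# Hypothetical helper: per-cell-type means accumulated chunk by chunk.
chunked_celltype_means <- function(expr, celltypes, chunk_size) {
  groups <- sort(unique(celltypes))
  sums <- matrix(0, nrow = nrow(expr), ncol = length(groups),
                 dimnames = list(rownames(expr), groups))
  counts <- setNames(numeric(length(groups)), groups)
  for (s in seq(1, ncol(expr), by = chunk_size)) {
    idx <- s:min(s + chunk_size - 1, ncol(expr))
    chunk <- expr[, idx, drop = FALSE]  # only this slice is held at once
    for (g in unique(celltypes[idx])) {
      cols <- celltypes[idx] == g
      sums[, g] <- sums[, g] + rowSums(chunk[, cols, drop = FALSE])
      counts[g] <- counts[g] + sum(cols)
    }
  }
  # Divide running sums by running counts to obtain the means.
  sweep(sums, 2, counts, "/")
}

chunked <- chunked_celltype_means(expr, celltypes, chunk_size = 2)

# Reference result computed on the full matrix in one pass.
full <- sapply(sort(unique(celltypes)), function(g)
  rowMeans(expr[, celltypes == g, drop = FALSE]))
```

Because the mean is accumulated from running sums and counts, the chunked result matches the full-matrix result while only one slice of cells is read at a time.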

agg_fun

Aggregation function passed to aggregate_mapped_genes (default: "mean"). Set to NULL to skip the aggregation step.

agg_method

Aggregation method: "monocle3" or "stats".

dropNA

Drop genes assigned to NA in groupings.

standardise

Run standardise_ctd.

as_sparse

Convert gene_df to a sparse matrix. Only works if gene_df is one of the following classes:

  • matrix

  • Matrix

  • data.frame

  • data.table

  • tibble

If gene_df is a sparse matrix to begin with, it will be returned as a sparse matrix (so long as gene_output= "rownames" or "colnames").

as_delayedarray

Convert aggregated matrix to DelayedArray.

verbose

Print messages.

...

Arguments passed on to EWCE::standardise_ctd

ctd

Input CellTypeData.

input_species

Which species the gene names in exp come from. See list_species for all available species.

output_species

Which species' gene names to convert exp to. See list_species for all available species.

sctSpecies_origin

Species that the sct_data originally came from, regardless of its current gene format (e.g. it was previously converted from mouse to human gene orthologs). This is used for computing an appropriate background.

non121_strategy

How to handle genes that don't have 1:1 mappings between input_species:output_species. Options include:

  • "drop_both_species" or "dbs" or 1 :
    Drop genes that have duplicate mappings in either the input_species or output_species
    (DEFAULT).

  • "drop_input_species" or "dis" or 2 :
    Only drop genes that have duplicate mappings in the input_species.

  • "drop_output_species" or "dos" or 3 :
    Only drop genes that have duplicate mappings in the output_species.

  • "keep_both_species" or "kbs" or 4 :
    Keep all genes regardless of whether they have duplicate mappings in either species.

  • "keep_popular" or "kp" or 5 :
    Return only the most "popular" interspecies ortholog mappings. This procedure tends to yield a greater number of returned genes but at the cost of many of them not being true biological 1:1 orthologs.

  • "sum","mean","median","min" or "max" :
    When gene_df is a matrix and gene_output="rownames", these options will aggregate many-to-one gene mappings (input_species-to-output_species) after dropping any duplicate genes in the output_species.
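
The first three strategies can be illustrated on a toy ortholog table in base R. This is a sketch of one plausible reading of the option descriptions above, not the code that the gene-mapping backend actually runs; the gene names and the map data frame are hypothetical.

```r
# Hypothetical ortholog map containing one many-to-one mapping
# (Bbb and Bbb2 both map to BBB) and one one-to-many mapping
# (Ccc maps to both CCC1 and CCC2).
map <- data.frame(
  input_gene  = c("Aaa", "Bbb", "Bbb2", "Ccc",  "Ccc",  "Ddd"),
  output_gene = c("AAA", "BBB", "BBB",  "CCC1", "CCC2", "DDD"),
  stringsAsFactors = FALSE
)

# Flag every row whose gene appears more than once on either side.
dup_in  <- map$input_gene  %in% map$input_gene[duplicated(map$input_gene)]
dup_out <- map$output_gene %in% map$output_gene[duplicated(map$output_gene)]

drop_both   <- map[!dup_in & !dup_out, ]  # "drop_both_species"
drop_input  <- map[!dup_in, ]             # "drop_input_species"
drop_output <- map[!dup_out, ]            # "drop_output_species"
```

In this toy table only Aaa and Ddd are strict 1:1 orthologs, so drop_both keeps two rows, while the two one-sided strategies each keep four.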

method

R package to use for gene mapping:

  • "gprofiler" : Slower but more species and genes.

  • "homologene" : Faster but fewer species and genes.

  • "babelgene" : Faster but fewer species and genes. Also gives consensus scores for each gene mapping based on several different data sources.

force_new_quantiles

By default, quantile computation is skipped if quantiles have already been computed. Set to TRUE to override this and generate new quantiles.

force_standardise

If ctd has already been standardised, whether to rerun standardisation anyway (Default: FALSE).

remove_unlabeled_clusters

Remove any samples that have numeric column names.

numberOfBins

Number of non-zero quantile bins.
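
To make "non-zero quantile bins" concrete, here is a minimal base R sketch, assuming that zeros are kept in their own bin 0 and that only positive values are split into numberOfBins quantile bins. The helper bin_nonzero and the input vector are hypothetical; the binning performed by standardise_ctd may differ in detail.

```r
# Hypothetical helper: assign zeros to bin 0 and split the remaining
# (positive) values into `numberOfBins` quantile-based bins.
bin_nonzero <- function(x, numberOfBins = 4) {
  bins <- rep(0, length(x))
  nz <- x > 0
  # Quantile break points computed on the non-zero values only.
  breaks <- unique(quantile(x[nz], probs = seq(0, 1, by = 1 / numberOfBins)))
  bins[nz] <- as.numeric(cut(x[nz], breaks = breaks, include.lowest = TRUE))
  bins
}

# Toy specificity scores for one cell type across ten genes.
spec <- c(0, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8)
bins <- bin_nonzero(spec, numberOfBins = 4)
```

Genes with a score of zero stay in bin 0, and the remaining genes are spread across bins 1 to numberOfBins.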

keep_annot

Keep the column annotation data if provided.

keep_plots

Keep the dendrograms if provided.

as_DelayedArray

Convert to DelayedArray.

rename_columns

Remove replace_chars from column names.

make_columns_unique

Rename each column with the prefix dataset.species.celltype.

Examples

obj <- example_obj("anndata")
obj2 <- anndata_to_ctd(obj, annotLevels=list(groups=NULL))

bschilder/scKirby documentation built on Oct. 2, 2024, 10:16 p.m.