build_gene_activity_matrix: Calculate initial Cicero gene activity matrix

View source: R/activityScores.R

build_gene_activity_matrixR Documentation

Calculate initial Cicero gene activity matrix

Description

This function calculates the initial Cicero gene activity matrix. After this function, the activity matrix should be normalized with any comparison matrices using the function normalize_gene_activities.

Usage

build_gene_activity_matrix(
  input_cds,
  cicero_cons_info,
  site_weights = NULL,
  dist_thresh = 250000,
  coaccess_cutoff = 0.25
)

Arguments

input_cds

Binary sci-ATAC-seq input CDS. The input CDS must have a column in the fData table called "gene" which is the gene name if the site is a promoter, and NA if the site is distal.

cicero_cons_info

Cicero connections table, generally the output of run_cicero. This table is a data frame with three required columns named "Peak1", "Peak2", and "coaccess". Peak1 and Peak2 contain coordinates for the two compared elements, and coaccess contains their Cicero co-accessibility score.

site_weights

NULL or an individual weight for each site in input_cds.

dist_thresh

The maximum distance in base pairs between pairs of sites to include in the gene activity calculation.

coaccess_cutoff

The minimum Cicero co-accessibility score that should be considered connected.

Value

Unnormalized gene activity matrix.

Examples

  data("cicero_data")
  data("human.hg19.genome")
  sample_genome <- subset(human.hg19.genome, V1 == "chr18")
  sample_genome$V2[1] <- 100000
  input_cds <- make_atac_cds(cicero_data, binarize = TRUE)
  input_cds <- detectGenes(input_cds)
  input_cds <- reduceDimension(input_cds, max_components = 2, num_dim=6,
                               reduction_method = 'tSNE',
                               norm_method = "none")
  tsne_coords <- t(reducedDimA(input_cds))
  row.names(tsne_coords) <- row.names(pData(input_cds))
  cicero_cds <- make_cicero_cds(input_cds,
                                reduced_coordinates = tsne_coords)
  cons <- run_cicero(cicero_cds, sample_genome, sample_num=2)

  data(gene_annotation_sample)
  gene_annotation_sub <- gene_annotation_sample[,c(1:3, 8)]
  names(gene_annotation_sub)[4] <- "gene"
  input_cds <- annotate_cds_by_site(input_cds, gene_annotation_sub)
  num_genes <- pData(input_cds)$num_genes_expressed
  names(num_genes) <- row.names(pData(input_cds))
  unnorm_ga <- build_gene_activity_matrix(input_cds, cons)



cole-trapnell-lab/cicero documentation built on March 13, 2023, 6:02 p.m.