build_gene_activity_matrix: Calculate initial Cicero gene activity matrix


View source: R/activityScores.R

Description

This function calculates the initial, unnormalized Cicero gene activity matrix. After running it, normalize the resulting matrix, together with any matrices it will be compared against, using the function normalize_gene_activities.

Usage

build_gene_activity_matrix(
  input_cds,
  cicero_cons_info,
  site_weights = NULL,
  dist_thresh = 250000,
  coaccess_cutoff = 0.25
)

Arguments

input_cds

Binary sci-ATAC-seq input CDS. Its fData table must include a column named "gene" that holds the gene name for sites that are promoters and NA for sites that are distal.
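As an illustration of the "gene" column requirement, the base R sketch below builds a stand-in for the feature table and separates promoter sites from distal ones. The peak names, gene names, and the `site_info` object are hypothetical; with a real CDS the table would come from `fData(input_cds)`.

```r
# Hypothetical site metadata shaped like the fData table described above:
# one row per site, with a "gene" column.
site_info <- data.frame(
  site_name = c("chr18_10025_10225", "chr18_10603_11103", "chr18_11604_13986"),
  gene = c("GENE_A", NA, "GENE_B"),  # NA marks a distal (non-promoter) site
  stringsAsFactors = FALSE
)
row.names(site_info) <- site_info$site_name

# Checks that could be run before building the activity matrix.
has_gene_column <- "gene" %in% names(site_info)
promoter_sites <- row.names(site_info)[!is.na(site_info$gene)]
distal_sites <- row.names(site_info)[is.na(site_info$gene)]
```

A check like this is a cheap way to confirm that annotation (for example with annotate_cds_by_site, as in the Examples) actually populated the column before the more expensive steps run.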

cicero_cons_info

Cicero connections table, generally the output of run_cicero. This table is a data frame with three required columns named "Peak1", "Peak2", and "coaccess". Peak1 and Peak2 contain coordinates for the two compared elements, and coaccess contains their Cicero co-accessibility score.
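The expected shape of the connections table can be sketched in base R as follows. The peak names and scores are made up for illustration, and `cons_example` is a hypothetical stand-in for real run_cicero output.

```r
# Hypothetical connections table with the three required columns.
cons_example <- data.frame(
  Peak1 = c("chr18_10025_10225", "chr18_10025_10225", "chr18_10603_11103"),
  Peak2 = c("chr18_10603_11103", "chr18_11604_13986", "chr18_11604_13986"),
  coaccess = c(0.85, -0.12, 0.31),  # co-accessibility score for each pair
  stringsAsFactors = FALSE
)

# Verify that the required columns are present before passing the table on.
required_cols <- c("Peak1", "Peak2", "coaccess")
missing_cols <- setdiff(required_cols, names(cons_example))
if (length(missing_cols) > 0) {
  stop("Connections table is missing: ", paste(missing_cols, collapse = ", "))
}
```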

site_weights

NULL or an individual weight for each site in input_cds.

dist_thresh

The maximum distance in base pairs between pairs of sites to include in the gene activity calculation.

coaccess_cutoff

The minimum Cicero co-accessibility score that should be considered connected.
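To show how dist_thresh and coaccess_cutoff interact, the base R sketch below filters a small made-up connections table using the default values. It is not the package's internal implementation: the `peak_midpoint` helper, the "chr_start_end" peak naming, the use of midpoint-to-midpoint distance, and the inclusive comparisons are all assumptions made for illustration.

```r
# Hypothetical connections table (peak names and scores are invented).
cons_example <- data.frame(
  Peak1 = c("chr18_10025_10225", "chr18_10025_10225", "chr18_10603_11103"),
  Peak2 = c("chr18_10603_11103", "chr18_500000_500200", "chr18_11604_13986"),
  coaccess = c(0.85, 0.60, 0.10),
  stringsAsFactors = FALSE
)

# Hypothetical helper: midpoint of a peak named "chr_start_end".
peak_midpoint <- function(peak) {
  parts <- strsplit(peak, "_", fixed = TRUE)
  vapply(parts, function(p) {
    n <- length(p)
    (as.numeric(p[n - 1]) + as.numeric(p[n])) / 2
  }, numeric(1))
}

dist_thresh <- 250000    # maximum distance between paired sites, in base pairs
coaccess_cutoff <- 0.25  # minimum score to treat a pair as connected

# Assumed distance definition: absolute difference between peak midpoints.
pair_dist <- abs(peak_midpoint(cons_example$Peak1) -
                   peak_midpoint(cons_example$Peak2))

# Keep pairs that are both close enough and sufficiently co-accessible.
keep <- cons_example$coaccess >= coaccess_cutoff & pair_dist <= dist_thresh
kept_cons <- cons_example[keep, ]
```

In this toy table only the first pair survives: the second pair is too far apart and the third scores below the cutoff.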

Value

Unnormalized gene activity matrix.

Examples

  # Load the example data and restrict the genome to a small part of chr18
  data("cicero_data")
  data("human.hg19.genome")
  sample_genome <- subset(human.hg19.genome, V1 == "chr18")
  sample_genome$V2[1] <- 100000

  # Build a binarized CDS and compute 2D tSNE coordinates for the cells
  input_cds <- make_atac_cds(cicero_data, binarize = TRUE)
  input_cds <- detectGenes(input_cds)
  input_cds <- reduceDimension(input_cds, max_components = 2, num_dim=6,
                               reduction_method = 'tSNE',
                               norm_method = "none")
  tsne_coords <- t(reducedDimA(input_cds))
  row.names(tsne_coords) <- row.names(pData(input_cds))

  # Aggregate cells and compute the Cicero connections table
  cicero_cds <- make_cicero_cds(input_cds,
                                reduced_coordinates = tsne_coords)
  cons <- run_cicero(cicero_cds, sample_genome, sample_num=2)

  # Add the required "gene" column to fData by annotating promoter sites
  data(gene_annotation_sample)
  gene_annotation_sub <- gene_annotation_sample[,c(1:3, 8)]
  names(gene_annotation_sub)[4] <- "gene"
  input_cds <- annotate_cds_by_site(input_cds, gene_annotation_sub)

  # Per-cell accessible-site counts, kept for the later normalization step
  num_genes <- pData(input_cds)$num_genes_expressed
  names(num_genes) <- row.names(pData(input_cds))

  # Build the unnormalized gene activity matrix
  unnorm_ga <- build_gene_activity_matrix(input_cds, cons)

cicero documentation built on Dec. 10, 2020, 2 a.m.