binding_scores: Calculate binding scores for each gene in a SeRP experiment


View source: R/binding_scores.R

Description

A binding score for a gene is defined as the highest value of the lower bound of the position-wise confidence interval for the ratio sample1/sample2. Enrichment confidence intervals are calculated with binom_ci_profile separately for each experiment and replicate. The values calculated for all replicates are averaged to create a new replicate, avg, which is returned along with the values for the individual replicates.
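The selection rule described above can be illustrated with a toy calculation. This is a minimal sketch in base R, not the package's implementation: the lower CI bounds are made-up numbers rather than output of binom_ci_profile, and it assumes, purely for illustration, that the avg replicate is formed by averaging the position-wise values before the maximum is taken.

```r
# Toy lower CI bounds of the sample1/sample2 ratio at five positions,
# one column per replicate (made-up values, not produced by binom_ci_profile).
lo_ci <- data.frame(
  position = 1:5,
  rep1 = c(0.8, 1.5, 3.2, 2.1, 0.9),
  rep2 = c(0.7, 1.9, 2.8, 3.0, 1.1)
)

# Assumption for this sketch: "avg" is the position-wise mean of the replicates.
lo_ci$avg <- rowMeans(lo_ci[, c("rep1", "rep2")])

# For each replicate, the score is the highest lower bound, reported
# together with the position at which it was observed.
scores <- do.call(rbind, lapply(c("rep1", "rep2", "avg"), function(r) {
  i <- which.max(lo_ci[[r]])
  data.frame(rep = r, score_position = lo_ci$position[i], lo_CI = lo_ci[[r]][i])
}))

print(scores)
```

Using the lower bound rather than the point estimate favors positions where the enrichment is both high and well supported by read counts.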

Usage

binding_scores(
  data,
  sample1,
  sample2,
  bin,
  window_size,
  skip_5prime = 0,
  skip_3prime = 0,
  conf.level = 0.95,
  bpparam = BiocParallel::bpparam()
)

Arguments

data

A serp_data object.

sample1

Name of the first sample (the numerator). If missing, the default sample1 of the data set will be used.

sample2

Name of the second sample (the denominator). If missing, the default sample2 of the data set will be used.

bin

Binning mode (bynuc or byaa). If missing, the default binning level of the data set will be used.

window_size

Neighborhood size, in nucleotides, for the confidence interval calculation. If missing, the default window size of the data set will be used.

skip_5prime

How many nucleotides to skip at the 5' end of the ORF. Useful if you know that the 5' end contains artifacts.

skip_3prime

How many nucleotides to skip at the 3' end of the ORF. Useful if you know that the 3' end contains artifacts.

conf.level

Confidence level of the enrichment confidence intervals. Defaults to 0.95.

bpparam

A BiocParallelParam-class object.
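A call using the arguments above might look like the following sketch. It assumes the package is installed under the name RiboSeqTools and that a serp_data object has been created elsewhere; the wrapper name score_genes and the numeric values are hypothetical and chosen only for illustration. sample1 and sample2 are omitted so that the defaults of the data set apply.

```r
# Hypothetical wrapper; `data` must be a serp_data object created elsewhere.
score_genes <- function(data) {
  RiboSeqTools::binding_scores(
    data,
    bin = "byaa",        # bin by codon instead of by nucleotide
    window_size = 45,    # illustrative neighborhood size, in nucleotides
    skip_5prime = 30,    # illustrative: ignore the first 30 nt of each ORF
    skip_3prime = 0,     # keep the 3' end
    conf.level = 0.95,
    bpparam = BiocParallel::SerialParam()  # run serially, e.g. while debugging
  )
}
```

Passing BiocParallel::SerialParam() disables parallel evaluation, which tends to make errors easier to trace; the default, BiocParallel::bpparam(), uses the registered back-end.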

Value

A tibble with the following columns:

gene

The gene/ORF name.

exp

The experiment name.

rep

The replicate name.

score_position

Position within the gene for which the confidence interval is returned. Corresponds to the position where the highest value of the lower CI bound was observed. If bin == 'byaa' this is measured in codons, otherwise in nucleotides.

lo_CI

Lower confidence bound at the position score_position.

hi_CI

Upper confidence bound at the position score_position.

mean

Estimated enrichment at the position score_position.

total_enrichment

Total enrichment of the gene, calculated using all reads mapped anywhere within the ORF.

sample1_total_counts

Total number of reads mapped to this ORF in sample 1. Note that the actual column name is the value of sample1.

sample2_total_counts

Total number of reads mapped to this ORF in sample 2. Note that the actual column name is the value of sample2.

sample1_total_RPM

sample1_total_counts normalized to the number of mapped reads. Note that the actual column name is the value of sample1.

sample2_total_RPM

sample2_total_counts normalized to the number of mapped reads. Note that the actual column name is the value of sample2.

sample1_avg_read_density

Average read density in sample 1, calculated as sum(sample1)/length. If bin is byaa, density is in reads/codon, otherwise in reads/nucleotide. Note that the actual column name is the value of sample1.

sample2_avg_read_density

Average read density in sample 2, calculated as sum(sample2)/length. If bin is byaa, density is in reads/codon, otherwise in reads/nucleotide. Note that the actual column name is the value of sample2.

rank

Ranking of the gene within the experiment and replicate. Genes are ranked by lo_CI in descending order.
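The ranking and the normalized columns described above can be reproduced on a toy table. This is a minimal sketch in base R with made-up values: the data frame res only mimics a few of the documented columns, and the library size mapped_reads and the ORF lengths are hypothetical.

```r
# Toy stand-in for the returned table (made-up values; only a few columns shown).
res <- data.frame(
  gene  = c("geneA", "geneB", "geneC", "geneA", "geneB", "geneC"),
  exp   = "exp1",
  rep   = rep(c("rep1", "avg"), each = 3),
  lo_CI = c(1.2, 4.5, 2.3, 1.0, 4.0, 2.6)
)

# Rank genes by lo_CI in descending order within each experiment and replicate.
res$rank <- ave(-res$lo_CI, res$exp, res$rep, FUN = rank)

# Keep the averaged replicate and list the strongest candidates first.
top <- res[res$rep == "avg", ]
top <- top[order(top$rank), ]
print(top)

# Normalized read counts for two toy ORFs.
counts       <- c(120, 30)    # reads mapped to each ORF
orf_length   <- c(300, 150)   # ORF length in bins (codons if bin is byaa)
mapped_reads <- 2e6           # hypothetical total number of mapped reads

rpm     <- counts / mapped_reads * 1e6  # reads per million mapped reads
density <- counts / orf_length          # average reads per bin
```

Filtering on rep == "avg" before sorting gives one row per gene and experiment, which is usually the most convenient form for picking candidates.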

See Also

defaults


ilia-kats/RiboSeqTools documentation built on Oct. 5, 2020, 7:41 p.m.