PASCCA: Distance matrix computation


Description

This function computes and returns a distance matrix, using the PASCCA distance measure to compute the distances between the genes (rows) of an APA-related gene expression data set with replicates.

Usage

PASCCA(data, alpha = 0.05, repli, tissues, tiss)

Arguments

data

The APA-related gene expression data. The first column contains poly(A) site or exon names, the second column contains gene names, and the remaining columns contain the samples under different biological conditions, such as different tissues, cell types, or developmental stages. If the samples have repeated measurements, the sample columns must be ordered from the condition with the most replicates to the condition with the fewest.

alpha

The significance level used as the cut-off. The null hypothesis is not rejected when the p-value is above this cut-off. For example, alpha = 0.05 corresponds to a confidence level of 0.95. The default is 0.05.

repli

A numeric vector giving the number of replicates for each biological condition (for example, each tissue, cell type, or developmental stage). The values must be in the same order as the sample columns in data.

tissues

The total number of biological conditions. If the input data consist of root with three biological replicates, seed with three biological replicates, and flower with two biological replicates, tissues is 3, because there are three conditions (root, seed, and flower).

tiss

The number of biological conditions that have the first (largest) replicate count. If the input data consist of root with three biological replicates, seed with three biological replicates, and flower with two biological replicates, tiss is 2, since both root and seed have three biological replicates.
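
As an illustration of how repli, tissues and tiss relate to each other, the following minimal sketch derives all three for the root/seed/flower case described above. The column names (root1, root2, ...) are hypothetical, and the sketch assumes that each sample name is a condition name followed by a replicate number and that the columns are already ordered from most to fewest replicates.

```r
# Hypothetical sample column names: condition name + replicate number
sample_cols <- c("root1", "root2", "root3",
                 "seed1", "seed2", "seed3",
                 "flower1", "flower2")

# Strip the trailing replicate number to recover the condition label
condition <- sub("\\d+$", "", sample_cols)

# Count replicates per condition, keeping the column order of the input
counts <- table(factor(condition, levels = unique(condition)))

repli   <- as.numeric(counts)       # replicates per condition, in input order
tissues <- length(counts)           # total number of conditions
tiss    <- sum(repli == repli[1])   # conditions sharing the first replicate count

# The replicate counts are expected to be in non-increasing order
stopifnot(!is.unsorted(rev(repli)))
```

Here repli is c(3, 3, 2), tissues is 3 and tiss is 2, matching the values given in the argument descriptions.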

Details

The function PASCCA uses shrinkage canonical correlation analysis to quantify the relationship between genes. The input is a matrix or a data frame, for example RNA-seq expression data at the exon level or the position level. The output is a distance matrix.

Value

PASCCA returns a distance matrix containing the pairwise distances between genes.
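
A gene-by-gene distance matrix of this kind can be passed to standard clustering tools in base R. The sketch below is an illustration only: it uses a small made-up symmetric matrix (toy) in place of real PASCCA output, and it assumes the returned object is an ordinary square matrix with gene names as row and column names.

```r
# Toy stand-in for a PASCCA result: a symmetric 4 x 4 matrix with zero diagonal
set.seed(1)
toy <- matrix(runif(16), nrow = 4, ncol = 4)
toy <- (toy + t(toy)) / 2
diag(toy) <- 0
rownames(toy) <- colnames(toy) <- paste0("gene", 1:4)

# Convert to a "dist" object and run average-linkage hierarchical clustering
hc <- hclust(as.dist(toy), method = "average")

# Cut the tree into two clusters; the result is a named integer vector
clusters <- cutree(hc, k = 2)
```

The linkage method and the number of clusters are arbitrary choices for this sketch and are not prescribed by PASCCA.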

Author(s)

Yuqi Long, Wenbin Ye

References

Yao J, Chang C, Salmi ML, et al. Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient. BMC Bioinformatics, 2008, 9(1): 288.

Hong S, Chen X, Jin L, et al. Canonical correlation analysis for RNA-seq co-expression networks. Nucleic Acids Research, 2013, 41(8): e95.

Examples

  ## Example 1 ---------------------------------------------
  ## Load the example data
  data(polyA_example_data2)
  dim(data2)

  ## Preprocess the data
  pre_data <- PAprocess(data2, log = TRUE)
  dim(pre_data)

  ## Extract the condition label of each sample by splitting off
  ## the trailing replicate digit from the column name
  sample_name <- colnames(pre_data)[3:ncol(pre_data)]
  sample_name <- strsplit(sample_name, "\\d$")
  sample_name <- sapply(sample_name, "[[", 1)
  table(sample_name)

  ## Calculate the PASCCA distance matrix
  gene_dist <- PASCCA(pre_data, alpha = 0.05,
                      repli = as.numeric(table(sample_name)),
                      tissues = length(unique(sample_name)),
                      tiss = as.numeric(table(table(sample_name))))
  str(gene_dist)
  gene_dist[1:3, 1:3]

  ## Or, specifying the arguments explicitly
  gene_dist <- PASCCA(pre_data, alpha = 0.05,
                      repli = rep(3, 14), tissues = 14, tiss = 14)

  ## Example 2 ---------------------------------------------
  data(polyA_example_data1)
  pre_data <- PAprocess(data1, log = TRUE)
  gene_dist <- PASCCA(pre_data, alpha = 0.05,
                      repli = c(rep(3, 13), 2), tissues = 14, tiss = 13)

  ## Example 3 ---------------------------------------------
  data(polyA_example_data3)
  pre_data <- PAprocess(data3, log = TRUE)
  gene_dist <- PASCCA(pre_data, alpha = 0.05,
                      repli = c(4, rep(3, 8), rep(2, 5)), tissues = 14, tiss = 1)

BMILAB/PASCCA documentation built on Nov. 20, 2020, 11:32 p.m.