create_data imports the equivalence classes and creates a 'BANDITS_data' object.
salmon_or_kallisto | 
 a character string indicating the input data: 'salmon' or 'kallisto'.  | 
gene_to_transcript | 
 a matrix or data.frame with a list of gene-to-transcript correspondences. The first column contains the gene id, while the second one contains the transcript id.  | 
salmon_path_to_eq_classes | 
 (for salmon input only) a vector of length equal to the number of samples: each element indicates the path to the equivalence classes of the respective sample (computed by salmon).  | 
kallisto_equiv_classes | 
 (for kallisto input only) a vector of length equal to the number of samples: each element indicates the path to the equivalence classes ('.ec' files) of the respective sample (computed by kallisto).  | 
kallisto_equiv_counts | 
 (for kallisto input only) a vector of length equal to the number of samples: each element indicates the path to the counts of the equivalence classes ('.tsv' files) of the respective sample (computed by kallisto).  | 
kallisto_counts | 
 (for kallisto input only) a matrix or data.frame, with 1 column per sample and 1 row per transcript, containing the estimated abundances for each transcript in each sample, computed by kallisto. The matrix must be unfiltered and the order of rows must be unchanged.  | 
eff_len | 
 a vector containing the effective length of transcripts; the vector names indicate the transcript ids.
Ideally, created via   | 
n_cores | 
 the number of cores to parallelize the tasks on. Using at least one core per sample is strongly recommended (this is the default if not specified by the user).  | 
transcripts_to_keep | 
 a vector containing the list of transcripts to keep.
Ideally, created via   | 
max_genes_per_group | 
 an integer number specifying the maximum number of genes that each group can contain. When equivalence classes contain transcripts from distinct genes, these genes are analyzed together. For computational reasons, 'max_genes_per_group' sets a limit to the number of genes that each group can contain.  | 
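The grouping described for 'max_genes_per_group' can be sketched with toy data in base R. This is an illustration only, not the package's internal code: the equivalence classes, ids, and the limit below are hypothetical, and the sketch only flags groups above the limit without showing how create_data treats them.

```r
# Toy illustration only: hypothetical ids and equivalence classes,
# not the internal implementation of create_data.
gene_to_transcript <- data.frame(
  gene_id       = c("g1", "g1", "g2", "g2", "g3"),
  transcript_id = c("t1", "t2", "t3", "t4", "t5"),
  stringsAsFactors = FALSE
)
# each equivalence class lists the transcripts compatible with a set of reads
eq_classes <- list(c("t1", "t2"), c("t2", "t3"), c("t4"), c("t5"))

# lookup from transcript id to gene id
tr2gene <- setNames(gene_to_transcript$gene_id, gene_to_transcript$transcript_id)
genes   <- unique(gene_to_transcript$gene_id)
group   <- setNames(seq_along(genes), genes)  # start with one group per gene

# merge the groups of all genes that share an equivalence class
for (ec in eq_classes) {
  g <- unique(unname(tr2gene[ec]))
  if (length(g) > 1) {
    ids <- group[g]
    group[group %in% ids] <- min(ids)
  }
}

gene_groups <- unname(split(names(group), group))
group_sizes <- lengths(gene_groups)

# hypothetical limit: flag the groups that exceed it
max_genes_per_group <- 1
too_large <- group_sizes > max_genes_per_group
```

Here g1 and g2 end up in one group because the class containing t2 and t3 spans both genes, while g3 stays on its own.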
A BANDITS_data object.
Simone Tiberi simone.tiberi@uzh.ch
eff_len_compute, filter_transcripts, filter_genes, BANDITS_data
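The input shapes described in the arguments can be checked on toy data before calling create_data. The sketch below uses made-up ids and lengths (all values are assumptions for illustration) and only base R; the checks are suggestions, not requirements stated by the package.

```r
# Toy inputs with hypothetical ids and values, for illustration only.
# gene_to_transcript: first column gene id, second column transcript id
gene_to_transcript <- data.frame(
  gene_id       = c("g1", "g1", "g2"),
  transcript_id = c("t1", "t2", "t3"),
  stringsAsFactors = FALSE
)
# eff_len: named vector, names are the transcript ids
eff_len <- c(t1 = 1200.5, t2 = 850.0, t3 = 430.2)
# transcripts_to_keep: vector of transcript ids
transcripts_to_keep <- c("t1", "t3")

# simple consistency checks between the three inputs
has_two_cols <- ncol(gene_to_transcript) >= 2
ids_ok  <- all(names(eff_len) %in% gene_to_transcript[, 2])
keep_ok <- all(transcripts_to_keep %in% names(eff_len))
```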
# specify the directory of the internal data:
data_dir = system.file("extdata", package = "BANDITS")
# load gene_to_transcript matching:
data("gene_tr_id", package = "BANDITS")
# Specify the directory of the transcript level estimated counts.
sample_names = paste0("sample", seq_len(4))
quant_files = file.path(data_dir, "STAR-salmon", sample_names, "quant.sf")
# Load the transcript level estimated counts via tximport:
library(tximport)
txi = tximport(files = quant_files, type = "salmon", txOut = TRUE)
counts = txi$counts
# Optional (recommended): transcript pre-filtering
transcripts_to_keep = filter_transcripts(gene_to_transcript = gene_tr_id,
                                         transcript_counts = counts,
                                         min_transcript_proportion = 0.01,
                                         min_transcript_counts = 10,
                                         min_gene_counts = 20)
# compute the median estimated effective length for each transcript:
eff_len = eff_len_compute(x_eff_len = txi$length)
# specify the path to the equivalence classes:
equiv_classes_files = file.path(data_dir, "STAR-salmon", sample_names, "aux_info", "eq_classes.txt")
# create data from 'salmon' and filter internally lowly abundant transcripts:
input_data = create_data(salmon_or_kallisto = "salmon",
                         gene_to_transcript = gene_tr_id,
                         salmon_path_to_eq_classes = equiv_classes_files,
                         eff_len = eff_len, 
                         n_cores = 2,
                         transcripts_to_keep = transcripts_to_keep)
input_data
# create data from 'kallisto' and filter internally lowly abundant transcripts:
kallisto_equiv_classes = file.path(data_dir, "kallisto", sample_names, "pseudoalignments.ec")
kallisto_equiv_counts  = file.path(data_dir, "kallisto", sample_names, "pseudoalignments.tsv")
input_data_2 = create_data(salmon_or_kallisto = "kallisto",
                          gene_to_transcript = gene_tr_id,
                          kallisto_equiv_classes = kallisto_equiv_classes,
                          kallisto_equiv_counts = kallisto_equiv_counts,
                          kallisto_counts = counts,
                          eff_len = eff_len, n_cores = 2,
                          transcripts_to_keep = transcripts_to_keep)
input_data_2