makeSE | R Documentation |
Creates a NxtSE object from the data (that was collated using collateData). This object is used for downstream differential analysis of IR and alternative splicing events using ASE-methods, data generation for visualization of scatter plots and heatmaps via make_plot_data methods, and coverage visualisation using plotCoverage
makeSE(
collate_path,
colData,
RemoveOverlapping = TRUE,
realize = FALSE,
verbose = TRUE
)
collate_path |
(Required) The output path of collateData pointing to the collated data |
colData |
(Optional) A data frame containing the sample annotation
information. The first column must contain the sample names.
Omit |
RemoveOverlapping |
(default = |
realize |
(default = |
verbose |
(default = |
makeSE
retrieves the data collated by
collateData, and initialises a NxtSE object. It references
the required on-disk assay data using DelayedArrays, thereby utilising
'on-disk' memory to conserve memory usage.
For extremely large datasets, loading the entire data into memory may consume
too much memory. In such cases, make a subset of the NxtSE
object (e.g. subset by samples) before loading the data into memory (RAM)
using realize_NxtSE. Alternatively supply a data frame to the colData
parameter of the makeSE()
function. Only samples listed in the first column
of the colData
data frame will be imported into the NxtSE
object.
It should be noted that downstream applications of SpliceWiz, including ASE-methods, plotCoverage, are much faster if the NxtSE is realized. It is recommended to realize the NxtSE object before extensive usage.
If COV files assigned via collateData have been moved relative to the
collate_path
, the created NxtSE object will not be linked to
any COV files and plotCoverage cannot be used. To reassign these
files, a vector of file paths corresponding to all the COV files of the data
set can be assigned using covfile(se) <- vector_of_cov_files
. See
the example below for details.
If RemoveOverlapping = TRUE
, makeSE
will remove introns that overlap
other introns with higher junction read counts in the dataset. This means
that SpliceWiz will assess a set of non-overlapping introns which belong
to likely major isoforms, ensuring that overlapping
IR events are not 'double-counted'.
NB: Since version 1.3.4, SpliceWiz has improved the algorithm of generating
the set of non-overlapping introns (prior versions appear to generate
sets of introns that still overlap). To use the prior algorithm for
compatibility with prior analysis, set RemoveOverlapping = FALSE
.
A NxtSE object containing the compiled data in
DelayedArrays (or as matrices if realize = TRUE
), pointing to the assay
data contained in the given collate_path
# The following code can be used to reproduce the NxtSE object
# that can be fetched with SpliceWiz_example_NxtSE()
buildRef(
reference_path = file.path(tempdir(), "Reference"),
fasta = chrZ_genome(),
gtf = chrZ_gtf()
)
bams <- SpliceWiz_example_bams()
processBAM(bams$path, bams$sample,
reference_path = file.path(tempdir(), "Reference"),
output_path = file.path(tempdir(), "SpliceWiz_Output")
)
expr <- findSpliceWizOutput(file.path(tempdir(), "SpliceWiz_Output"))
collateData(expr,
reference_path = file.path(tempdir(), "Reference"),
output_path = file.path(tempdir(), "Collated_output")
)
se <- makeSE(collate_path = file.path(tempdir(), "Collated_output"))
# "Realize" NxtSE object to load all H5 assays into memory:
se <- realize_NxtSE(se)
# If COV files have been removed since the last call to collateData()
# reassign them to the NxtSE object, for example:
covfile_path <- system.file("extdata", package = "SpliceWiz")
covfile_df <- findSamples(covfile_path, ".cov")
covfile(se) <- covfile_df$path
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.