expect_equal for comparing numerical values.
TransformCatalog in case R was configured and built in a way that did not support long double.
Updated documentation of ReadCatalog and ReadCatalogInternal as there are no ID96 catalogs in COSMIC v3.2.
Changed the URL of COSMIC mutational signatures page to the redirected URL.
Updated some tests for TransformCatalog in case R was configured and built in a way that did not support long double.
Added the argument strict back to ReadCatalog for backward compatibility; strict is now ignored and deprecated.
Made function StandardChromNameNew more robust: it now selects the column that contains chromosome names by name instead of by column index.
Fixed a bug in function CheckSeqContextInVCF.
Fixed a bug in function PlotCatalog.SBS96Catalog when plotting the X axis after setting par(tck = 0).
Changed PlotCatalog to round the mutation counts for each main type for SBS96, SBS192, DBS78 and ID counts catalogs in case the input is a reconstructed counts catalog.
Updated function AdjustNumberOfCores not to throw a message on MS Windows machines.
Added an additional argument ylabels to PlotCatalog and PlotCatalogToPdf. When ylabels = FALSE, the y axis labels are not plotted. Implemented for SBS96Catalog, DBS78Catalog, IndelCatalog.
Enabled arguments grid, upper, xlabels in PlotCatalog and PlotCatalogToPdf for DBS78Catalog, IndelCatalog.
ReadCatalog to import files with: ReadCatalog function, e.g. ReadCatalog.SBS96Catalog. Now they are in data-raw/obsolete-files/ReadCatalogMethods.R.
ConvCatalogToICAMS to convert SigProfiler/COSMIC-formatted catalog files into ICAMS catalog objects. Now these functions are in data-raw/obsolete-files/ConvCatalogToICAMS.R, and their functionalities are integrated into ReadCatalog.
ReadCatalog to remove rows which have NA in the data table read in. Otherwise the number of rows will not be accurate to infer the correct catalog type.
InferClassOfCatalogForRead to data-raw/obsolete-files/InferClassOfCatalogForRead.R.
CreateOneColDBSMatrix when returning 1-column DBS144 matrix with all values being 0 and the correct row labels.
Added an additional argument tmpdir in function AddRunInformation.
Updated functions CheckAndRemoveDiscardedVariants and MakeDataFrameFromVCF to check for variants that have the same REF and ALT.
Create a new temp directory when generating a zip archive from VCFs to avoid zipping unnecessary files in the output.
Fixed a bug in function AddRunInformation to allow ref.genome to be a Bioconductor package.
Fixed bugs in functions CreateOneColSBSMatrix, CreateOneColDBSMatrix and CreateOneColIDMatrix when the variants in the input VCFs should all be discarded.
Updated function CheckAndFixChrNames to give a warning instead of an error when "23" and "X" or "24" and "Y" appear among the chromosome names in the VCF at the same time. CheckAndFixChrNames will change "23" to "X" or "24" to "Y" internally for downstream processing.
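The renaming described for CheckAndFixChrNames can be pictured with a small base R sketch. The helper name fix_sex_chr_names is hypothetical and is not the ICAMS implementation; it assumes chromosome names without a "chr" prefix and always recodes "23" and "24", warning only when both spellings occur together.

```r
# Hypothetical helper (not the ICAMS implementation): recode numeric sex
# chromosome names and warn when both spellings occur together.
fix_sex_chr_names <- function(chr.names) {
  for (pair in list(c("23", "X"), c("24", "Y"))) {
    if (all(pair %in% chr.names)) {
      warning("\"", pair[1], "\" and \"", pair[2],
              "\" both appear in the chromosome names; changing \"",
              pair[1], "\" to \"", pair[2], "\"")
    }
    # Recode the numeric name so downstream code sees one spelling only.
    chr.names[chr.names == pair[1]] <- pair[2]
  }
  chr.names
}

chroms <- c("1", "2", "23", "X", "24")
fixed <- suppressWarnings(fix_sex_chr_names(chroms))
print(fixed)
```

Emitting a warning rather than an error lets a batch of VCFs continue processing while still flagging the inconsistent input.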
Changed some code in functions AddTranscript, CreateOneColSBSMatrix, CreateOneColDBSMatrix to use functions from package dplyr instead of data.table due to a segfault error.
RemoveRowsWithDuplicatedCHROMAndPOSNew to remove variants that have the same CHROM, POS, REF.
files in function VCFsToZipFile.
Fixed a bug in ReadAndSplitVCFs for merging adjacent SBSs into DBS when variant.caller is mutect.
Fixed a bug in CheckAndRemoveDiscardedVariants for removing wrong DBS variants.
CheckAndRemoveDiscardedVariants to remove wrong DBS variants that have the same base in the same position in REF and ALT (e.g. TA > TT or GT > CT).
name.of.VCF in function MakeDataFrameFromVCF for better error reporting.
Updated function MakeDataFrameFromVCF for better error reporting when reading in files that are actually not VCFs.
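The "wrong DBS" rule mentioned a few entries above for CheckAndRemoveDiscardedVariants (same base at the same position in REF and ALT) can be expressed as a position-wise comparison. This is a minimal sketch with made-up variants; the helper is_wrong_dbs is hypothetical and only covers two-base REF and ALT strings.

```r
# Hypothetical helper: TRUE when REF and ALT of a doublet share a base at the
# same position, so the call is not a genuine doublet base substitution.
is_wrong_dbs <- function(ref, alt) {
  substr(ref, 1, 1) == substr(alt, 1, 1) |
    substr(ref, 2, 2) == substr(alt, 2, 2)
}

# Made-up example variants; the first two are the cases named in the entry.
dbs <- data.frame(REF = c("TA", "GT", "AC", "CC"),
                  ALT = c("TT", "CT", "GT", "TT"),
                  stringsAsFactors = FALSE)

wrong <- is_wrong_dbs(dbs$REF, dbs$ALT)
kept <- dbs[!wrong, ]
discarded <- dbs[wrong, ]
print(kept)
```

TA > TT shares the first base and GT > CT shares the second, so both are set aside, while AC > GT and CC > TT change both bases and are kept.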
Updated function ReadVCFs to automatically change the number of cores to 1 on Windows instead of throwing an error.
CheckAndFixChrNames for returning the correct number of chromosome names.
stop.on.error and code tryCatch in function VCFsToCatalogs for better tracing if the function stops on error.
Added argument stop.on.error to VCFsToCatalogs; if FALSE, return a list with a single element named error.
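The stop.on.error behaviour can be illustrated with a generic tryCatch wrapper. This is only a sketch of the pattern: run_with_error_flag and bad_reader are hypothetical names, and storing the error message as the content of the error element is an assumption, since the entry does not say what that element holds.

```r
# Hypothetical wrapper illustrating the stop.on.error pattern; the real
# VCFsToCatalogs does far more than this.
run_with_error_flag <- function(fun, ..., stop.on.error = TRUE) {
  tryCatch(
    fun(...),
    error = function(err) {
      # Re-signal the original error when the caller wants a hard stop.
      if (stop.on.error) stop(err)
      # Otherwise return a list with a single element named "error".
      list(error = conditionMessage(err))
    }
  )
}

# Made-up reader that always fails, standing in for a VCF parsing step.
bad_reader <- function(path) stop("not a VCF: ", path)

res <- run_with_error_flag(bad_reader, "sample.txt", stop.on.error = FALSE)
print(res$error)
```

A caller can then test for the error element with !is.null(res$error) instead of wrapping every call in its own error handler.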
Added new internal function CheckAndFixChrNamesForTransRanges. The chromosome names in exported data TranscriptRanges don't have "chr". ICAMS now checks the chromosome name format in the input VCF and updates the trans.ranges chromosome names in function AddTranscript if needed.
Added new argument name.of.VCF in functions AnnotateSBSVCF and AnnotateDBSVCF for better error reporting.
Changed the return from ReadCatalog to include a possible attribute "error" and to allow for not calling stop() on error.
For a stranded catalog, as.catalog and ReadCatalog will silently convert region = "genome" to "transcript".
Updated function AddTranscript to check whether the format of VCF chromosome names is consistent with that in the trans.ranges used.
Removed documentation warnings related to \link{BSgenome...}
Some file reorganization.
CreateOneColSBSMatrix for showing message that SBS variant whose reference base in ref.genome does not match the reference base in the VCF file.
Enabled functions PlotCatalog and PlotCatalogToPdf to plot a numeric matrix, numeric data.frame, or a vector denoting the mutation counts.
Added new internal function AdjustNumberOfCores to change the number of cores automatically to 1 if the operating system is Windows.
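A core-count adjustment of this kind can be sketched in a few lines of base R. The function below is a hypothetical stand-in, not the internal AdjustNumberOfCores itself; the os.type parameter is added here only so the behaviour can be demonstrated on any platform.

```r
# Hypothetical stand-in for the internal AdjustNumberOfCores helper.
adjust_number_of_cores <- function(num.of.cores,
                                   os.type = .Platform$OS.type) {
  # parallel::mclapply relies on forking, which Windows does not support,
  # so more than one core cannot be used there.
  if (os.type == "windows" && num.of.cores > 1) {
    return(1)
  }
  num.of.cores
}

on.windows <- adjust_number_of_cores(4, os.type = "windows")
on.unix <- adjust_number_of_cores(4, os.type = "unix")
print(c(windows = on.windows, unix = on.unix))
```

Silently falling back to a single core keeps the same script usable on every operating system.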
Added test processing VCF with unknown variant caller.
Added new internal functions SplitSBSVCF, SplitOneVCF, SplitListOfVCFs, VCFsToZipFileXtra, WriteSBS96CatalogAsTsv, ReadSBS96CatalogFromTsv, GetConsensusVAF.
Added new exported functions ReadAndSplitVCFs, VCFsToCatalogs, VCFsToCatalogsAndPlotToPdf and VCFsToZipFile.
Added new arguments filter.status and get.vaf.function in functions ReadVCF, ReadVCFs, ReadAndSplitVCFs, VCFsToCatalogs, VCFsToCatalogsAndPlotToPdf and VCFsToZipFile.
Added new internal data catalog.row.headers.SBS.96.v1.
Added new argument max.vaf.diff in internal functions SplitOneVCF, SplitListOfVCFs and exported functions ReadAndSplitVCFs, VCFsToCatalogs, VCFsToCatalogsAndPlotToPdf and VCFsToZipFile.
Added new dependency package parallel.
Added new dependency package R.utils for data.table::fread to read gz and bz2 files directly.
Added new argument num.of.cores in internal functions ReadVCFs, SplitListOfVCFs and exported functions ReadAndSplitVCFs, VCFsToCatalogsAndPlotToPdf, VCFsToCatalogs, VCFsToZipFile, VCFsToIDCatalogs, VCFsToSBSCatalogs, VCFsToDBSCatalogs.
Added new argument ... in internal functions ReadVCF, ReadVCFs and exported functions ReadAndSplitVCFs, VCFsToCatalogsAndPlotToPdf, VCFsToCatalogs, VCFsToZipFile.
Added new argument mc.cores in internal function GetConsensusVAF.
MakeDataFrameFromVCF to use data.table::fread instead of read.csv.
MakeDataFrameFromVCF when reading in VCF from URL.
Updated function CreateOneColSBSMatrix to throw a message instead of an error when there are SBS variants whose reference base in ref.genome does not match the reference base in the VCF file.
Updated function MakeVCFDBSdf
to inherit column information from original SBS VCF.
Changed the words in legend for DBS144 plot from "Transcribed", "Untranscribed" to "Transcribed strand" and "Untranscribed strand".
Updated the documentation for exported data all.abundance.
Updated function ReadCatalog.COMPOSITECatalog
not to convert "::" to ".." in the column names.
Updated various functions in ICAMS to generate catalogs with zero mutation counts from empty vcfs.
Added a section "ID classification" in the documentation for exported data catalog.row.order.
New argument suppress.discarded.variants.warnings in exported function AnnotateIDVCF with default value TRUE.
Added information on another paper in AddRunInformation: "Characterization of colibactin-associated mutational signature in an Asian oral squamous cell carcinoma and in other mucosal tumor types", Genome Research 2020, https://doi.org/10.1101/gr.255620.119.
Changed the format of DOIs in DESCRIPTION according to CRAN policy.
Changed back the return value of ReadStrelkaIDVCFs, ReadStrelkaSBSVCFs, ReadMutectVCFs to a list of data frames with no variants discarded.
Combined all the discarded variants from ReadAndSplitMutectVCFs and ReadAndSplitStrelkaSBSVCFs under one element discarded.variants in the return value. An extra column discarded.reason was added to show the details.
Updated internal functions ReadVCF and ReadVCFs not to remove any discarded variants.
No more removal of "chr" in the CHROM column when reading in VCFs.
CheckAndReturnSBSMatrix, CheckAndReturnDBSMatrix, CreateOneColSBSMatrix, CreateOneColDBSMatrix, VCFsToSBSCatalogs, VCFsToDBSCatalogs.
CalculateExpressionLevel for the edge case.
CreateOneColIDMatrix when the ID.class contains non canonical representation of the ID mutation type.
The return value of exported function ReadStrelkaIDVCFs now sometimes contains a new element, discarded.variants. This appears when there are variants that were discarded immediately after reading in the VCFs. At present these are variants that have duplicated chromosome/positions and variants that have illegal chromosome names. This means that the user must check the return value to see if discarded.variants is present and remove it before passing the return value to a function that expects a list of VCFs. Code in ICAMS that takes lists of VCFs already checks for this element and removes it if present.
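For the release this entry describes, the check-and-remove step could look like the following base R sketch. The list is a mock built by hand rather than real ReadStrelkaIDVCFs output, and the internal layout of discarded.variants shown here is an assumption.

```r
# Mock return value shaped like the description above: one data frame per VCF
# plus an optional "discarded.variants" element. The data are made up.
vcfs <- list(
  sample1 = data.frame(CHROM = "1", POS = 100L, REF = "A", ALT = "AT"),
  sample2 = data.frame(CHROM = "2", POS = 200L, REF = "CG", ALT = "C"),
  discarded.variants = list(
    sample1 = data.frame(CHROM = "GL000191.1", POS = 5L, REF = "T", ALT = "TA")
  )
)

# Keep the discarded variants for inspection (NULL if the element is absent),
# then drop the element so that only VCF data frames remain.
discarded <- vcfs[["discarded.variants"]]
list.of.vcfs <- vcfs[names(vcfs) != "discarded.variants"]
print(names(list.of.vcfs))
```

Filtering by name works whether or not the element is present, so the same two lines can be applied unconditionally.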
Added argument return.annotated.vcfs to exported function VCFsToIDCatalogs. The default value for the argument is FALSE to be consistent with other functions.
Argument return.annotated.vcfs in functions VCFsToSBSCatalogs, VCFsToDBSCatalogs, VCFsToIDCatalogs, MutectVCFFilesToCatalog, MutectVCFFilesToCatalogAndPlotToPdf, MutectVCFFilesToZipFile, StrelkaSBSVCFFilesToCatalog, StrelkaSBSVCFFilesToCatalogAndPlotToPdf, StrelkaSBSVCFFilesToZipFile, StrelkaIDVCFFilesToCatalog, StrelkaIDVCFFilesToCatalogAndPlotToPdf and StrelkaIDVCFFilesToZipFile.
Argument suppress.discarded.variants.warnings in functions ReadAndSplitMutectVCFs, ReadAndSplitStrelkaSBSVCFs, VCFsToSBSCatalogs, VCFsToDBSCatalogs, VCFsToIDCatalogs, MutectVCFFilesToCatalog, MutectVCFFilesToCatalogAndPlotToPdf, MutectVCFFilesToZipFile, StrelkaSBSVCFFilesToCatalog, StrelkaSBSVCFFilesToCatalogAndPlotToPdf, StrelkaSBSVCFFilesToZipFile, StrelkaIDVCFFilesToCatalog, StrelkaIDVCFFilesToCatalogAndPlotToPdf and StrelkaIDVCFFilesToZipFile.
Added documentation to exported functions ReadAndSplitStrelkaSBSVCFs, StrelkaSBSVCFFilesToCatalog, StrelkaSBSVCFFilesToCatalogAndPlotToPdf and StrelkaSBSVCFFilesToZipFile.
Added information on the "ID classification" in documentation of functions generating ID catalogs, FindDelMH and FindMaxRepeatDel.
Minor changes to documentation of functions PlotCatalog, PlotCatalogToPdf, StrelkaSBSVCFFilesToZipFile, StrelkaIDVCFFilesToZipFile and MutectVCFFilesToZipFile.
Updated documentation for the return value of functions StrelkaIDVCFFilesToCatalog, StrelkaIDVCFFilesToCatalogAndPlotToPdf, StrelkaIDVCFFilesToZipFile and VCFsToIDCatalogs to make it clearer to the user.
Added new exported data of catalog row order for SBS96, SBS1536 and DBS78 in SigProfiler format to catalog.row.order.sp.
New internal functions ConvertICAMSCatalogToSigProSBS96, ReadVCF, ReadVCFs.
New exported function GetFreebayesVAF for calculating variant allele frequencies from Freebayes VCF.
New test data for Strelka mixed VCF.
Added time zone information to file "run-information.txt" when calling functions MutectVCFFilesToZipFile, StrelkaSBSVCFFilesToZipFile and StrelkaIDVCFFilesToZipFile.
Enabled "counts" -> "counts.signature" catalog transformation when the source catalog has NULL abundance.
Added legend for SBS192 plot and changed the legend text for SBS12 plot.
Added a second element plot.object to the return list from function PlotCatalog for catalog types "SBS192Catalog", "DBS78Catalog", "DBS144Catalog" and "IndelCatalog". The second element is a numeric vector giving the coordinates of the bar midpoints, useful for adding to the graph.
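To see why bar midpoints are useful, consider this base R analogy. It uses graphics::barplot, which returns the x coordinates of the bar midpoints, rather than PlotCatalog itself; the counts are made up and the plot is written to a temporary PDF so that the sketch also runs without a display.

```r
# Base R analogy only: barplot() returns bar midpoints, which is the kind of
# value described for plot.object. The counts are made up.
counts <- c(AC = 12, AT = 5, CC = 9, CG = 3)

pdf(tempfile(fileext = ".pdf"))  # draw to a throwaway file
midpoints <- barplot(counts, ylim = c(0, 15))
# Use the midpoints to place a count label above each bar.
text(x = midpoints, y = counts + 1, labels = counts)
invisible(dev.off())

print(length(midpoints))
```

With the midpoints in hand, any later annotation (labels, brackets, reference lines) can be aligned to the bars without recomputing their positions.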
Made the returns from PlotCatalog and PlotCatalogToPdf invisible.
Improved time performance of GetMutectVAF, CanonicalizeDBS, CanonicalizeQUAD.
if statements in GetCustomKmerCounts, GetStrandedKmerCounts and GetGenomeKmerCounts.
CreateOneColIDMatrix when there is NA ID category.
GetMutectVAF to check if the VCF is indeed a Mutect VCF.
CreateOneColDBSMatrix when the VCF does not have any variant in the transcribed region.
CalculatePValues when there is only a single expression value.
Created an internal function MakeDataFrameFromVCF to read in data lines of a VCF.
New argument name.of.VCF in internal function CheckAndFixChrNames to make the error message more informative.
New argument name.of.VCF in exported function AnnotateIDVCF to make the error message more informative.
ReadStrelkaIDVCF to make the error message more informative.
AnnotateIDVCF to a list. The first element annotated.vcf contains the annotated VCF. If there are rows that are discarded, the function will generate a warning and a second element discarded.variants will be included in the returned list.
flag.mismatches deprecated in exported function AnnotateIDVCF. If there are mismatches to references, the function will automatically discard these rows. Users can refer to the element discarded.variants in the return value for the discarded variants.
SplitStrelkaSBSVCF when there are no non.SBS mutations in the input.
MakeDataFrameFromMutectVCF when a Mutect VCF has no meta-information lines.
CreateOneColSBSMatrix when an annotated SBS VCF has variants on transcribed regions that all fall on transcripts on both strands.
CreateOneColDBSMatrix when an annotated DBS VCF has variants on transcribed regions that all fall on transcripts on both strands.
ReadAndSplitStrelkaSBSVCFs.
MutectVCFFilesToZipFile, StrelkaSBSVCFFilesToZipFile and StrelkaIDVCFFilesToZipFile.
trans.ranges to make it optional.
name.of.VCF in internal functions ReadStrelkaSBSVCF, ReadStrelkaIDVCF and exported function GetStrelkaVAF.
flag.mismatches in functions VCFsToIDCatalogs, MutectVCFFilesToCatalog, MutectVCFFilesToCatalogAndPlotToPdf, MutectVCFFilesToZipFile, StrelkaIDVCFFilesToCatalog, StrelkaIDVCFFilesToCatalogAndPlotToPdf and StrelkaIDVCFFilesToZipFile.
GetStrelkaVAF and GetMutectVAF to a data frame which contains the VAF and read depth information.
PlotCatalogToPdf a list. The first element is a logical value indicating whether the plot is successful. The second element is a list containing the strand bias statistics (only for SBS192Catalog with "counts" catalog.type and non-NULL abundance and argument plot.SBS12 = TRUE).
PlotCatalog and PlotCatalogToPdf:
For class SBS96Catalog:
(New) Allow setting ylim and cex.
(New) For PlotCatalog (not PlotCatalogToPdf), allow plotting of a 96 x 2 catalog, in which case the result is a stacked bar chart.
(New) Plot x axis tick marks if xlabels is not TRUE; set par(tck = 0) to suppress.
For class IndelCatalog:
(New) Allow setting ylim.
GetCustomKmerCounts.
PlotTransBiasGeneExpToPdf so that ymax on the plot will be changed based on plot.type.
flat.abundance from "numeric" to "integer".
TransformCatalog; see documentation for rationale.
TransformCatalog and updated its documentation for parameter target.abundance.
CheckAndFixChrNames and updated the automated tests.
TransformCatalog.
GetMutectVAF and updated the warning message to make it more informative.
cbind to check the attributes of the incoming catalogs and assign attributes accordingly.
TransformCatalog to check the attributes of the catalog to be transformed in the first place.
AnnotateSBSVCF, AnnotateDBSVCF and AnnotateIDVCF.
PlotTransBiasGeneExp and PlotTransBiasGeneExpToPdf.
names.of.VCFs in functions ReadAndSplitMutectVCFs, ReadAndSplitStrelkaSBSVCFs, ReadStrelkaIDVCFs, MutectVCFFilesToCatalog, MutectVCFFilesToCatalogAndPlotToPdf, StrelkaIDVCFFilesToCatalog, StrelkaIDVCFFilesToCatalogAndPlotToPdf, StrelkaSBSVCFFilesToCatalog and StrelkaSBSVCFFilesToCatalogAndPlotToPdf for users to specify the names of samples in the VCF files.
as.catalog.
gene.expression.data.HepG2 and gene.expression.data.MCF10A.
tumor.col.names in functions ReadAndSplitMutectVCFs, MutectVCFFilesToCatalog and MutectVCFFilesToCatalogAndPlotToPdf to specify the column of the VCF that contains sequencing statistics such as sequencing depth; this column is often called "unknown" in Mutect.
MutectVCFFilesToCatalog, MutectVCFFilesToCatalogAndPlotToPdf, StrelkaSBSVCFFilesToCatalog, StrelkaSBSVCFFilesToCatalogAndPlotToPdf, VCFsToSBSCatalogs, VCFsToDBSCatalogs, ReadCatalog informing the user how to change attributes of the generated catalog.
VCFsToIDCatalogs, StrelkaIDVCFFilesToCatalog and StrelkaIDVCFFilesToCatalogAndPlotToPdf a list; 1st element is the spectrum catalog (previously the only return); 2nd element is a list of VCFs with additional annotations.
PlotCatalog a list. The first element is a logical value indicating whether the plot is successful. The second element is a numeric vector giving the coordinates of all the bar midpoints drawn, useful for adding to the graph (only implemented for SBS96Catalog).
output.file argument in MutectVCFFilesToCatalogAndPlotToPdf, StrelkaSBSVCFFilesToCatalogAndPlotToPdf, and StrelkaIDVCFFilesToCatalogAndPlotToPdf so that an indicator of the catalog type plus ".pdf" is simply appended to the base output.file name. Also made this argument optional with sensible default behavior.
trans.ranges.GRCh37, trans.ranges.GRCh38 and trans.ranges.GRCm38.
FindDelMH, cryptic repeats (i.e. un-normalized deletions in a repeat such as GAGG deleted from CCCAGGGAGGGTCCC, which should be normalized to a deletion of AGGG) are now ignored with a warning rather than causing a stop.
FindDelMH, which previously did not flag the cryptic repeat in what is now the second example in the function documentation.
as.catalog supports creation of the catalog from a vector (interpreted as a 1-column matrix) and optionally infers the class from the number of rows in the input.
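The cryptic-repeat case noted for FindDelMH above can be checked by hand with a short base R sketch. The helper delete_at is hypothetical and is not part of ICAMS; it only shows that deleting GAGG and deleting AGGG from the example context leave the same sequence, which is why the two calls describe one event.

```r
# Hypothetical helper: delete `width` bases starting at 1-based position
# `start` from the sequence `seq`.
delete_at <- function(seq, start, width) {
  paste0(substr(seq, 1, start - 1),
         substr(seq, start + width, nchar(seq)))
}

context <- "CCCAGGGAGGGTCCC"

# Un-normalized call: GAGG occupies positions 7-10 of the context.
stopifnot(substr(context, 7, 10) == "GAGG")
unnormalized <- delete_at(context, start = 7, width = 4)

# One normalized spelling of the same event: AGGG at positions 4-7.
stopifnot(substr(context, 4, 7) == "AGGG")
normalized <- delete_at(context, start = 4, width = 4)

print(c(unnormalized = unnormalized, normalized = normalized))
```

Because both deletions yield the same remaining sequence, a caller that reports GAGG has simply not shifted the deletion onto the repeat unit.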