Minor changes:
Cellosaurus: Internal code tweak to improve redownload of outdated cached release file.
New functions:
geneFusions, mutations: New functions that extract sequence annotation information about gene fusions and driver gene mutations.
cellsPerGeneFusion, cellsPerMutation: New functions that call geneFusions or mutations internally, respectively, and return a DFrame containing logical columns per gene fusion or gene mutation.
Minor changes:
tnbc: Added cell line exclusion rules.
Major changes:
Cellosaurus: No longer converting strings to factor. Simply encoding using Rle instead. Removed factorize call in primary generator.
Minor changes:
export: Updated to use generic from AcidGenerics instead of BiocIO. This variant doesn't require the unused format argument, which is preferable.
Major changes:
selectCells from DepMapAnalysis package and added code coverage against expected match failures.
Minor changes:
excludeContaminatedCells and excludeProblematicCells to separate documentation files.
Minor changes:
mapCells function, in particular.
Major changes:
Cellosaurus: Reworked internal code to parse and extract ATCC identifiers, which are commonly used instead of Cellosaurus identifiers to organize cell lines. Also added support for the misspellings column, which handles edge cases where cell line names are misspelled.
Minor changes:
mapCells: Added option to return NA on map failure, instead of throwing an error, by setting strict = FALSE.
mapCells: Added support for mapping by ATCC identifiers.
New functions:
excludeProblematicCells: Exclude (remove) cell lines from the Cellosaurus object that are labeled as "Problematic cell line" in the comments. Note that this function is stricter than excludeContaminatedCells, because contaminated cell lines are a subset of the problematic cell lines on Cellosaurus.
excludeContaminatedCells: Exclude cell lines that are labeled as "Problematic cell line: Contaminated" in the comments.
Major changes:
Cellosaurus: Return now includes OncoTree metadata, which is mapped against the NCI thesaurus disease identifiers.
Minor changes:
Cellosaurus generator now returns isContaminated column, which is useful for distinguishing isProblematic lines, which may simply be misidentified, from cell lines that are truly problematic due to contamination.
cello object.
New functions:
currentCellosaurusVersion: Check the Cellosaurus server for the current release version. Currently returns an integer.
Minor changes:
Cellosaurus: Updated key for samplingSite metadata column, which is now defined as "Derived from site" in the release 46 update.
Minor changes:
mapCells: Reworked our internal matching code.
matchNested function internally.
DFrame instead of DataFrame virtual class.
.processEntry function.
Minor changes:
rbindToDataFrame function instead of data.table rbindlist.
Minor changes:
Cellosaurus: Fixed ncitDiseaseId and ncitDiseaseName mapping issue with accessions containing multiple matches (e.g. "CVCL_0011", "CVCL_0028").
Major changes:
Cellosaurus: Completely reworked main generator function. The package now parses the cellosaurus.txt file internally instead of the previously used cellosaurus.obo file. We ran into OBO parser issues with the current cellosaurus.obo file (release 44). Also, only the cellosaurus.txt file contains additional useful metadata, including secondary accessions and the patient age at sampling. We have attempted to standardize metadata columns in the returned Cellosaurus object to better match the naming conventions currently used on the Cellosaurus website.
export: Updated method to drop nested list columns (SimpleList) from the exported CSV file. Dropped columns currently include: "comments", "crossReferences", "date", "diseases", "hierarchy", "originateFromSameIndividual", "referencesIdentifiers", "strProfileData", "webPages".
mapCells: Updated mapping engine to also support secondary accession identifiers, which is very useful for previously used identifiers that have been redirected but are still present in the DepMap and Sanger CellModelPassports databases. Also reworked approach for handling standardized cell names at the last step, to avoid mapping issues with tricky cell line names, like ICC2 vs. ICC-2. These are non-breaking changes that are tested to map against all supported cell lines on DepMap and Sanger CellModelPassports.
Minor changes:
mapCells: Now supports return of cell line name.
Major changes:
mapCells: Reworked internal matching engine, and added support for manual overrides using the overrides object defined in sysdata.rda. The original mappings are defined in overrides.csv (see data-raw). Mappings are now tested against all cell lines defined in DepMap (22Q4) and Sanger CellModelPassports.
Minor changes:
Cellosaurus: Removed option to override caching manually with cache.
sanitizeCells: Added an additional handling rule for edge case.
Cellosaurus object now gets saved with packageVersion in metadata.
cello object.
Minor changes:
Cellosaurus: Fix for "CVCL_7082" line, which is actually named "NA".
standardizeCells: Fix for handling of all cells in Cellosaurus database.
mapCells: Added some additional name variant rules for better matching.
Major changes:
cellosaurus.obo file internally at r.acidgenomics.com server instead of downloading the latest release version from ftp.expasy.org. This change was made because the Cellosaurus 44 release introduced changes that broke the package.
Minor changes:
depmapId instead of depMapId; sangerModelId instead of sangerId), for better consistency with DepMapAnalysis and CellModelPassports packages.
cache override option to main Cellosaurus generator, which makes updating to the latest version (e.g. 43) more intuitive than having to delete the BiocFileCache directory.
Minor changes:
export: Harden inheritance of S4 methods, to ensure that we class on Cellosaurus, instead of inheriting the default method for DataFrame.
Minor changes:
Cellosaurus class now returns with sex metadata column.
factorize internally, and all applicable vectors are converted to Rle for improved memory efficiency.
export: Added initial experimental method support for export of Cellosaurus metadata, which dynamically drops columns that aren't useful in CSV format.
This is a major update, with breaking changes.
New S4 classes:
Cellosaurus: Now defining this class instead of CellosaurusTable. Data is retrieved using ontologyIndex from Cellosaurus FTP server instead of querying the website directly.
Major changes:
mapCells: Now supports return of multiple identifier key types, including Cellosaurus (default), DepMap, and Sanger (for Cell Model Passports).
Minor changes:
Major changes:
CellosaurusTable: Added support for return of more identifier columns. Improved support for handling of non-human (e.g. mouse) cell lines.
CellosaurusTable to use R 4.2-specific formula call.
DFrame now, due to a breaking change introduced with Bioconductor 3.15, where DataFrame no longer works.
Minor changes:
Minor changes:
rbindToDataFrame approach.
Major changes:
Minor changes:
Minor changes:
mapCells and standardizeCells functions to S4 methods that work on character class. We may define methods for these generics that work on classed objects inside the DepMapAnalysis package.
Initial release.
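Several entries above describe how mapCells matches names: standardized names are only consulted at the last step (to keep names such as ICC2 and ICC-2 apart), and strict = FALSE returns NA on a failed match instead of an error. The base R sketch below illustrates that idea only. It is not the package's implementation; the helpers standardizeName and mapName, the lookup table, and the "ACC_*" accessions are all hypothetical placeholders.

```r
## Hypothetical lookup table. Accessions are placeholders, not real
## Cellosaurus identifiers.
cells <- data.frame(
    accession = c("ACC_1", "ACC_2", "ACC_3"),
    cellLineName = c("ICC2", "ICC-2", "HeLa S3"),
    stringsAsFactors = FALSE
)

## Hypothetical helper: uppercase, then drop non-alphanumeric characters.
standardizeName <- function(x) {
    gsub(pattern = "[^A-Z0-9]", replacement = "", x = toupper(x))
}

## Hypothetical helper: try an exact name match first, and fall back to
## standardized names only for inputs that did not match exactly.
mapName <- function(x, table, strict = TRUE) {
    idx <- match(x = x, table = table[["cellLineName"]])
    miss <- is.na(idx)
    idx[miss] <- match(
        x = standardizeName(x[miss]),
        table = standardizeName(table[["cellLineName"]])
    )
    ## In strict mode, any remaining failure is an error.
    if (isTRUE(strict) && anyNA(idx)) {
        stop("Failed to map: ", toString(x[is.na(idx)]))
    }
    ## Otherwise, unmatched inputs are returned as NA.
    out <- table[["accession"]][idx]
    names(out) <- x
    out
}

mapped <- mapName(
    x = c("ICC-2", "hela-s3", "unknown"),
    table = cells,
    strict = FALSE
)
print(mapped)
```

Because the exact match runs first, "ICC-2" resolves to its own row rather than collapsing onto "ICC2", while "hela-s3" is only resolved by the standardized fallback.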