The CNAGSCG package aims to be a collection of common functions we use in our day-to-day analysis of single-cell data.
You can install the released version of CNAGSCG from GitHub with:
devtools::install_github("Single-Cell-Genomics-Group-CNAG-CRG/SCrafty-package", ref = "main")
And the development version with:
# install.packages("devtools")
devtools::install_github("Single-Cell-Genomics-Group-CNAG-CRG/SCrafty-package", ref = "devel")
If you need to set a GitHub personal access token (PAT) to install the package, you can do so in R with:
## set personal access token:
credentials::set_github_pat("YourPAT")
Keep in mind that, this being a shared package maintained by all of us,
it is important that we stick to consistent best practices so we can all
read each other's code and build on each other's work. When adding new
functions or modifying existing ones, please use pull requests so
someone else can check your code before it is merged into the main branch.
* Best practices on how to write an R package are detailed in Hadley
Wickham’s book.
* Best practices on R function writing can be found in the book
R for Data Science and in the Functions chapter of Advanced R.
* Best practices on how to write R code can be found here and here.
Use the lintr package to make sure the basic best practices are met.
You can run lintr::lint_dir() to lint a specific directory or
lintr::lint_package() to lint all the files in the package.
New functions need to be written in an R script file following the
predefined format shown in the toy function fbind() and saved in the
R/ directory. For ease of use, the R script name should be the same as
the function name. This can be done by running usethis::use_r("fbind"),
which will initialize the R script in the right directory.
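As a rough illustration, an R/fbind.R file might look like the sketch below. It assumes the toy fbind() is the factor-binding example from the R Packages book and that documentation is written as roxygen2 comments; the actual template used in this package may differ.

```r
#' Bind two factors
#'
#' Create a new factor from two existing factors, where the levels of the
#' result are the union of the levels of the inputs.
#'
#' @param a A factor.
#' @param b A factor.
#'
#' @return A factor.
#' @export
#'
#' @examples
#' fbind(factor(c("a", "b")), factor(c("c", "d")))
fbind <- function(a, b) {
  # Convert to character first so the values are combined, not the
  # underlying integer codes of each factor
  factor(c(as.character(a), as.character(b)))
}
```

The roxygen2 header is what devtools::document() reads to generate the help page and the NAMESPACE entry.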
After adding a new function, make sure everything runs smoothly by:
1- running devtools::load_all(). This runs a mock build of the package
and allows you to test that the function runs correctly within the
package environment.
2- running devtools::document() to build the documentation for the
function in man/ and update NAMESPACE. The check will return an error
if the documentation for a function in R/ is not present.
3- running devtools::check(). After making sure the function runs well,
we need to be sure that all the moving parts of the package still work
in the package environment. devtools::check() is the equivalent of
running R CMD check in the terminal.
4- adding a unit test for the function via usethis::use_test("fbind").
This will open an R script named test-fbind.R in tests/testthat. To
check that all the unit tests run properly, you can run devtools::test().
5- lastly, if you add an example in README.Rmd, compiling it using
devtools::build_readme().
Bonus: if your function requires a dependency that is not listed in the
Imports section of the DESCRIPTION file, you can add it by running
usethis::use_package("dplyr").
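For step 4, a unit test file could look roughly like the sketch below. The fbind() definition at the top is a stand-in (assumed to match the R Packages toy example) so the snippet runs on its own; inside the package the function comes from R/fbind.R and should not be redefined in tests/testthat/test-fbind.R.

```r
library(testthat)

# Stand-in definition (assumption): inside the package this lives in R/fbind.R
fbind <- function(a, b) {
  factor(c(as.character(a), as.character(b)))
}

test_that("fbind() binds two factors and merges their levels", {
  a <- factor(c("x", "y"))
  b <- factor(c("y", "z"))
  res <- fbind(a, b)

  # The result is a factor holding the values of both inputs, in order
  expect_s3_class(res, "factor")
  expect_identical(as.character(res), c("x", "y", "y", "z"))
  # Shared levels appear only once
  expect_identical(levels(res), c("x", "y", "z"))
})
```

Running devtools::test() picks up every test-*.R file in tests/testthat, so one file per function keeps failures easy to trace.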
Add examples of functions below:
library(SCrafty)
library(ggplot2)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(Seurat)
#> Attaching SeuratObject
library(GOstats)
#> Loading required package: Biobase
#> Loading required package: BiocGenerics
#> Loading required package: parallel
#>
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:parallel':
#>
#> clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
#> clusterExport, clusterMap, parApply, parCapply, parLapply,
#> parLapplyLB, parRapply, parSapply, parSapplyLB
#> The following objects are masked from 'package:dplyr':
#>
#> combine, intersect, setdiff, union
#> The following objects are masked from 'package:stats':
#>
#> IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#>
#> anyDuplicated, append, as.data.frame, basename, cbind, colnames,
#> dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
#> grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
#> order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
#> rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
#> union, unique, unsplit, which.max, which.min
#> Welcome to Bioconductor
#>
#> Vignettes contain introductory material; view with
#> 'browseVignettes()'. To cite Bioconductor, see
#> 'citation("Biobase")', and for packages 'citation("pkgname")'.
#> Loading required package: Category
#> Loading required package: stats4
#> Loading required package: AnnotationDbi
#> Loading required package: IRanges
#> Loading required package: S4Vectors
#>
#> Attaching package: 'S4Vectors'
#> The following objects are masked from 'package:dplyr':
#>
#> first, rename
#> The following object is masked from 'package:base':
#>
#> expand.grid
#>
#> Attaching package: 'IRanges'
#> The following objects are masked from 'package:dplyr':
#>
#> collapse, desc, slice
#>
#> Attaching package: 'AnnotationDbi'
#> The following object is masked from 'package:dplyr':
#>
#> select
#> Loading required package: Matrix
#>
#> Attaching package: 'Matrix'
#> The following object is masked from 'package:S4Vectors':
#>
#> expand
#> Loading required package: graph
#>
#>
#> Attaching package: 'GOstats'
#> The following object is masked from 'package:AnnotationDbi':
#>
#> makeGOGraph
If you are going to use a Seurat object in function examples, please use
datasets from SeuratData to run the vignette so we can all reproduce
the examples easily.
# devtools::install_github('satijalab/seurat-data')
library(SeuratData)
#> Registered S3 method overwritten by 'cli':
#> method from
#> print.boxx spatstat.geom
#> Warning in if (is.na(desc)) {: the condition has length > 1 and only the first
#> element will be used
#> ── Installed datasets ───────────────────────────────────── SeuratData v0.2.1 ──
#> ✓ pbmc3k 3.1.4 ✓ stxBrain 0.1.1
#> ✓ pbmcsca 3.0.0
#> ────────────────────────────────────── Key ─────────────────────────────────────
#> ✓ Dataset loaded successfully
#> > Dataset built with a newer version of Seurat than installed
#> ❓ Unknown version of Seurat installed
Under development…👩💻 🧑💻
Here is a set of functions to run GO enrichment analysis:
gene_de <- c("MS4A1", "CD79A", "CD79B")
gene_universe <- c("HLA-DRA", "AIF1", "C1QA", "LYZ", "SELENOP", "ADAMDEC1",
    "CD14", "CD68", "IGF1", "LGMN", "AXL", "C5AR1", "CD163", "CD163L1", "CD209",
    "CSF1R", "IL10", "IL10RA", "IL13RA1", "NRP1", "NRP2", "SDC3", "SIGLEC1",
    "TGFBI", "TGFBR1", "EEF1A1", "OAZ1", "RPL15", "RPL19", "RPS19", "RPS29",
    "TPT1", "UBA52", "APOE", "FCGRT", "MKI67", "PCLAF", "RRM2", "CD1C", "CLC",
    "IL4", "TPSAB1", "CPA3", "NRG1", "RETN", "EREG", "ACOD1", "TNIP3", "VCAN",
    "INHBA", "CXCL5", "HTRA1", "SPP1", "HSPA1B", "CXCR2", "PI3", "PROK2",
    "HSPA6", "NCCRP1", "LAD1", "CCL22", "MS4A1", "CD79A", "CD79B")
GO_results <- gene_enrichment_GO(
  gene_de = gene_de,
  gene_universe = gene_universe,
  gene_from = "SYMBOL",
  gene_to = "ENTREZID",
  annotation = "org.Hs.eg.db",
  pvalue_cutoff = 0.05,
  test_direction = "over",
  ontology = "BP")
#>
#> 'select()' returned 1:1 mapping between keys and columns
#> 'select()' returned 1:1 mapping between keys and columns
#> Loading required package: org.Hs.eg.db
knitr::kable(summary(GO_results))
| GOBPID     |    Pvalue | OddsRatio |  ExpCount | Count | Size | Term                                                               |
|:-----------|----------:|----------:|----------:|------:|-----:|:-------------------------------------------------------------------|
| GO:0050853 | 0.0000252 |       Inf | 0.1428571 |     3 |    3 | B cell receptor signaling pathway                                  |
| GO:0050851 | 0.0001007 |       Inf | 0.1904762 |     3 |    4 | antigen receptor-mediated signaling pathway                        |
| GO:0002429 | 0.0005036 |       Inf | 0.2857143 |     3 |    6 | immune response-activating cell surface receptor signaling pathway |
| GO:0002757 | 0.0005036 |       Inf | 0.2857143 |     3 |    6 | immune response-activating signal transduction                     |
| GO:0002764 | 0.0005036 |       Inf | 0.2857143 |     3 |    6 | immune response-regulating signaling pathway                       |
| GO:0002768 | 0.0005036 |       Inf | 0.2857143 |     3 |    6 | immune response-regulating cell surface receptor signaling pathway |
| GO:0030183 | 0.0005036 |       Inf | 0.2857143 |     3 |    6 | B cell differentiation                                             |
| GO:0042113 | 0.0005036 |       Inf | 0.2857143 |     3 |    6 | B cell activation                                                  |
| GO:0002253 | 0.0008814 |       Inf | 0.3333333 |     3 |    7 | activation of immune response                                      |
| GO:0030098 | 0.0008814 |       Inf | 0.3333333 |     3 |    7 | lymphocyte differentiation                                         |
| GO:0002521 | 0.0014102 |       Inf | 0.3809524 |     3 |    8 | leukocyte differentiation                                          |
| GO:0030097 | 0.0030218 |       Inf | 0.4761905 |     3 |   10 | hemopoiesis                                                        |
| GO:0048534 | 0.0041550 |       Inf | 0.5238095 |     3 |   11 | hematopoietic or lymphoid organ development                        |
| GO:0046649 | 0.0055400 |       Inf | 0.5714286 |     3 |   12 | lymphocyte activation                                              |
| GO:0050778 | 0.0055400 |       Inf | 0.5714286 |     3 |   12 | positive regulation of immune response                             |
| GO:0002520 | 0.0072020 |       Inf | 0.6190476 |     3 |   13 | immune system development                                          |
| GO:0042100 | 0.0090151 |        58 | 0.1904762 |     2 |    4 | B cell proliferation                                               |
| GO:0050776 | 0.0114578 |       Inf | 0.7142857 |     3 |   15 | regulation of immune response                                      |
| GO:0002684 | 0.0334920 |       Inf | 1.0000000 |     3 |   21 | positive regulation of immune system process                       |
| GO:0045321 | 0.0387802 |       Inf | 1.0476190 |     3 |   22 | leukocyte activation                                               |
| GO:0032943 | 0.0401904 |        18 | 0.3809524 |     2 |    8 | mononuclear cell proliferation                                     |
| GO:0046651 | 0.0401904 |        18 | 0.3809524 |     2 |    8 | lymphocyte proliferation                                           |
| GO:0002115 | 0.0476190 |       Inf | 0.0476190 |     1 |    1 | store-operated calcium entry                                       |
| GO:0060401 | 0.0476190 |       Inf | 0.0476190 |     1 |    1 | cytosolic calcium ion transport                                    |
| GO:0060402 | 0.0476190 |       Inf | 0.0476190 |     1 |    1 | calcium ion transport into cytosol                                 |
| GO:0070509 | 0.0476190 |       Inf | 0.0476190 |     1 |    1 | calcium ion import                                                 |
| GO:1902656 | 0.0476190 |       Inf | 0.0476190 |     1 |    1 | calcium ion import into cytosol                                    |
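To help read the Pvalue and ExpCount columns, here is a small base R sketch that recomputes two rows of the table. It assumes the test behind the table is a plain one-sided hypergeometric test (matching test_direction = "over"). The universe size of 63 is inferred from ExpCount (3 × 3 / 63 ≈ 0.1428571) and is not reported in the output, and enrichment_p() is a hypothetical helper, not a function of this package.

```r
# Hypothetical helper: P(X >= count) when drawing n_de genes from a universe
# of n_universe genes, of which `size` are annotated to the GO term
enrichment_p <- function(count, size, n_de, n_universe) {
  stats::phyper(count - 1, size, n_universe - size, n_de, lower.tail = FALSE)
}

n_de <- 3         # MS4A1, CD79A, CD79B
n_universe <- 63  # assumption: inferred from ExpCount, not reported directly

# GO:0050853 row: Count = 3, Size = 3
p_bcr <- enrichment_p(count = 3, size = 3, n_de = n_de, n_universe = n_universe)
exp_bcr <- n_de * 3 / n_universe  # expected count under the null

# GO:0042100 row: Count = 2, Size = 4
p_prolif <- enrichment_p(count = 2, size = 4, n_de = n_de, n_universe = n_universe)

# Rounded to 7 decimals for comparison with the Pvalue and ExpCount columns
round(c(p_bcr, p_prolif), 7)
round(exp_bcr, 7)
```

With so few genes in the example, every term containing all three B cell markers reaches an infinite odds ratio, so the Size column is the more informative one to compare across rows.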
Visualization… Under development…👩💻 🧑💻
With ggpreview we can preview a plot in the viewer pane. It is very
handy for finding the right height and width as well as the right font size.
df <- data.frame(x = 1:10, y = 1:10)
tmp_plt <- ggplot2::ggplot(df, ggplot2::aes(x = x, y = y)) +
  ggplot2::geom_point()
CNAGSCG::ggpreview(x = tmp_plt, w = 9, h = 4)