| runSCORPION | R Documentation |
Builds per-group regulatory networks by running scorpion on subsets of cells defined by cellsMetadata and combining the resulting networks into a wide-format data frame where each column corresponds to a network.
runSCORPION(
  gexMatrix,
  tfMotifs,
  ppiNet,
  cellsMetadata,
  groupBy,
  normalizeData = TRUE,
  removeBatchEffect = FALSE,
  batch = NULL,
  minCells = 30,
  computingEngine = "cpu",
  nCores = 1,
  gammaValue = 10,
  nPC = 25,
  assocMethod = "pearson",
  alphaValue = 0.1,
  hammingValue = 0.001,
  nIter = Inf,
  outNet = "regNet",
  zScaling = TRUE,
  showProgress = TRUE,
  randomizationMethod = "None",
  scaleByPresent = FALSE,
  filterExpr = FALSE
)
gexMatrix |
An expression dataset with genes in the rows and barcodes (cells) in the columns. |
tfMotifs |
A motif dataset: a data.frame or matrix with 3 columns. Each row describes a motif associated with a transcription factor (column 1), a gene (column 2), and a score (column 3). |
ppiNet |
A Protein-Protein-Interaction dataset, a data.frame or matrix containing 3 columns. Each row describes a protein-protein interaction between transcription factor 1 (column 1), transcription factor 2 (column 2) and a score (column 3). |
cellsMetadata |
A data.frame with cell-level metadata; must contain the columns specified in groupBy. |
groupBy |
Character vector of one or more column names in cellsMetadata used to define the groups. |
normalizeData |
Boolean to indicate normalization of expression data. Default TRUE performs log normalization. |
removeBatchEffect |
Boolean to indicate batch effect correction. Default FALSE. |
batch |
Factor or vector giving the batch assignment for each cell; required if removeBatchEffect = TRUE. |
minCells |
Minimum number of cells per group required to build a network. Default is 30. |
computingEngine |
Either 'cpu' or 'gpu'. Passed to scorpion. |
nCores |
Number of processors to be used if BLAS or MPI is active. |
gammaValue |
Graining level of data (ratio of the number of single cells to the number of super-cells). Default 10. |
nPC |
Number of principal components to use for kNN network construction. Default 25. |
assocMethod |
Association method. Must be one of 'pearson', 'spearman' or 'pcNet'. Default 'pearson'. |
alphaValue |
Value to be used for update variable in PANDA. Default 0.1. |
hammingValue |
Value at which to terminate the process based on Hamming distance. Default 0.001. |
nIter |
Sets the maximum number of iterations PANDA can run before exiting. Default Inf. |
outNet |
Character specifying which network to extract. Options include "regNet", "coregNet", "coopNet". Default "regNet". |
zScaling |
Boolean to indicate use of Z-Scores in output. FALSE will use [0,1] scale. Default TRUE. |
showProgress |
Boolean to indicate printing of output for algorithm progress. Default TRUE. |
randomizationMethod |
Method by which to randomize gene expression matrix. Default "None". Must be one of "None", "within.gene", "by.gene". |
scaleByPresent |
Boolean to indicate scaling of correlations by percentage of positive samples. Default FALSE. |
filterExpr |
Boolean to indicate whether or not to remove genes with 0 expression across all cells. Default FALSE. |
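The input shapes described above can be illustrated with toy objects. This is a minimal sketch in base R only; every gene, TF, barcode, and variable name below is made up for illustration, and the checks mirror the argument descriptions rather than the package's own validation.

```r
# Toy inputs in the shapes described by the arguments (all names are hypothetical)
set.seed(1)
genes <- c("GENE1", "GENE2", "GENE3")
cells <- paste0("cell", 1:6)

# Expression matrix: genes in rows, barcodes (cells) in columns
gex <- matrix(rpois(length(genes) * length(cells), lambda = 5),
              nrow = length(genes),
              dimnames = list(genes, cells))

# Motif prior: TF (column 1), target gene (column 2), score (column 3)
tf <- data.frame(tf = c("TF1", "TF1", "TF2"),
                 target = c("GENE1", "GENE2", "GENE3"),
                 score = c(1, 1, 1))

# PPI prior: TF 1 (column 1), TF 2 (column 2), score (column 3)
ppi <- data.frame(tf1 = "TF1", tf2 = "TF2", score = 0.9)

# Cell-level metadata; rows are assumed here to follow the column order of gex
meta <- data.frame(donor = rep(c("D1", "D2"), each = 3),
                   region = rep(c("T", "N", "B"), times = 2),
                   row.names = cells)

# Sanity checks mirroring the argument descriptions
stopifnot(ncol(tf) == 3, ncol(ppi) == 3)
stopifnot(nrow(meta) == ncol(gex))
stopifnot(all(c("donor", "region") %in% colnames(meta)))
stopifnot(all(tf$target %in% rownames(gex)))
```

Checks like these are cheap to run before a long inference job, since a metadata table that is out of step with the expression matrix would otherwise assign cells to the wrong groups.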
This function is a wrapper around scorpion that groups cells according to metadata columns, filters out groups with insufficient cells, runs network inference on each remaining group independently, and finally combines all resulting networks into a single wide-format data frame.
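The group-then-filter step can be sketched in base R. This is not the package's internal code: the "--" separator is inferred from the column names in the example output, treating minCells as an inclusive threshold is an assumption, and the variable names groupLabel, groupSize, keptGroups, and cellsByGroup are introduced here for illustration.

```r
# Sketch of the group-then-filter step (not the package's internal code)
meta <- data.frame(
  donor  = rep(c("P31", "P32"), times = c(70, 50)),
  region = c(rep(c("T", "N"), times = c(40, 30)),
             rep(c("T", "N"), times = c(35, 15))),
  stringsAsFactors = FALSE
)
groupBy  <- c("donor", "region")
minCells <- 30

# One label per cell; the "--" separator is inferred from the example output
groupLabel <- do.call(paste, c(meta[groupBy], sep = "--"))

# Count cells per group and keep groups that reach the minimum (assumed inclusive)
groupSize  <- table(groupLabel)
keptGroups <- names(groupSize)[groupSize >= minCells]

# Cell indices (columns of the expression matrix) belonging to each kept group
cellsByGroup <- split(seq_len(nrow(meta)), groupLabel)[keptGroups]

print(groupSize)
print(keptGroups)
```

With these toy counts, the P32--N group has only 15 cells, so it falls below the threshold and no network would be built for it.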
Returns a data.frame in wide format in which rows represent TF-target pairs (the union across all networks) and columns represent network identifiers. Each cell holds the edge weight from the corresponding network.
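One way to picture the wide format is as a full outer join of per-network edge lists on the (tf, target) pair. The sketch below uses toy edge weights and base R merge; it is an illustration under those assumptions, not the package's implementation. Pairs absent from a network come out as NA here, and how the package itself fills such cells is not stated in this page.

```r
# Sketch: combine per-group edge lists into one wide data frame
# (toy edge weights; not the package's internal implementation)
netT <- data.frame(tf = c("TF1", "TF2"), target = c("G1", "G1"),
                   weight = c(0.5, -0.2))
netN <- data.frame(tf = c("TF1", "TF3"), target = c("G1", "G2"),
                   weight = c(0.4, 0.1))
nets <- list(T = netT, N = netN)

# Rename each weight column to its network identifier
renamed <- lapply(names(nets), function(id) {
  net <- nets[[id]]
  names(net)[names(net) == "weight"] <- id
  net
})

# Full outer join on (tf, target) keeps the union of TF-target pairs
wide <- Reduce(function(x, y) merge(x, y, by = c("tf", "target"), all = TRUE),
               renamed)
print(wide)
```

The result has one row per TF-target pair seen in any network and one weight column per network identifier.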
## Not run:
# Load test data
data(scorpionTest)
# Example 1: Group by single column (region)
nets_by_region <- runSCORPION(
  gexMatrix = scorpionTest$gex,
  tfMotifs = scorpionTest$tf,
  ppiNet = scorpionTest$ppi,
  cellsMetadata = scorpionTest$metadata,
  groupBy = "region"
)
# -- SCORPION ----------------------------------------------------------------
# + Normalizing data (log scale)
# i 3 networks requested
# + 3 networks meet the minimum cell requirement (30)
# i Computing networks
# + Networks successfully constructed
# + Networks successfully combined
# head(nets_by_region)
#                           tf target           T          B           N
# 1                       AATF  ACKR1 -0.31433856 -0.3569918 -0.33734920
# 2                       ABL1  ACKR1 -0.32915008 -0.3648895 -0.34437341
# 3                      ACSS2  ACKR1 -0.31418599 -0.3557854 -0.33663144
# 4                       ADNP  ACKR1  0.04105895  0.1109288  0.09910822
# 5                      AEBP2  ACKR1 -0.18964574 -0.2202269 -0.17558140
# 6 AEBP2_EED_EZH2_RBBP4_SUZ12  ACKR1 -0.31024700 -0.3508320 -0.33054519
# Example 2: Group by single column (donor)
nets_by_donor <- runSCORPION(
  gexMatrix = scorpionTest$gex,
  tfMotifs = scorpionTest$tf,
  ppiNet = scorpionTest$ppi,
  cellsMetadata = scorpionTest$metadata,
  groupBy = "donor"
)
# -- SCORPION ----------------------------------------------------------------
# + Normalizing data (log scale)
# i 3 networks requested
# + 3 networks meet the minimum cell requirement (30)
# i Computing networks
# + Networks successfully constructed
# + Networks successfully combined
# head(nets_by_donor)
#                           tf target         P31        P32         P33
# 1                       AATF  ACKR1 -0.34869366 -0.3557884 -0.35010835
# 2                       ABL1  ACKR1 -0.33724323 -0.3575331 -0.32875974
# 3                      ACSS2  ACKR1 -0.34569954 -0.3573108 -0.34980657
# 4                       ADNP  ACKR1  0.09933951  0.1045316  0.06046914
# 5                      AEBP2  ACKR1 -0.25111137 -0.2245655 -0.23157035
# 6 AEBP2_EED_EZH2_RBBP4_SUZ12  ACKR1 -0.34148264 -0.3518686 -0.34398594
# Example 3: Group by two columns (donor and region)
nets_by_donor_region <- runSCORPION(
  gexMatrix = scorpionTest$gex,
  tfMotifs = scorpionTest$tf,
  ppiNet = scorpionTest$ppi,
  cellsMetadata = scorpionTest$metadata,
  groupBy = c("donor", "region")
)
# -- SCORPION ----------------------------------------------------------------
# + Normalizing data (log scale)
# i 9 networks requested
# + 9 networks meet the minimum cell requirement (30)
# i Computing networks
# + Networks successfully constructed
# + Networks successfully combined
# head(nets_by_donor_region)
#                           tf target      P31--T      P31--B     P31--N
# 1                       AATF  ACKR1 -0.32634975 -0.33717677 -0.3442886
# 2                       ABL1  ACKR1 -0.34048759 -0.33890429 -0.3509986
# 3                      ACSS2  ACKR1 -0.32570697 -0.33600811 -0.3436603
# 4                       ADNP  ACKR1  0.07975735  0.05354279  0.1048301
# 5                      AEBP2  ACKR1 -0.21472437 -0.20545660 -0.1815737
# 6 AEBP2_EED_EZH2_RBBP4_SUZ12  ACKR1 -0.31861592 -0.32809314 -0.3375652
# Example 4: Group by three columns (donor, region, and cell_type)
nets_by_donor_region_cell_type <- runSCORPION(
  gexMatrix = scorpionTest$gex,
  tfMotifs = scorpionTest$tf,
  ppiNet = scorpionTest$ppi,
  cellsMetadata = scorpionTest$metadata,
  groupBy = c("donor", "region", "cell_type")
)
# -- SCORPION ----------------------------------------------------------------
# + Normalizing data (log scale)
# i 9 networks requested
# + 9 networks meet the minimum cell requirement (30)
# i Computing networks
# + Networks successfully constructed
# + Networks successfully combined
# head(nets_by_donor_region_cell_type)
#                           tf target P31--T--Epithelial P31--B--Epithelial
# 1                       AATF  ACKR1        -0.32634975        -0.33717677
# 2                       ABL1  ACKR1        -0.34048759        -0.33890429
# 3                      ACSS2  ACKR1        -0.32570697        -0.33600811
# 4                       ADNP  ACKR1         0.07975735         0.05354279
# 5                      AEBP2  ACKR1        -0.21472437        -0.20545660
# 6 AEBP2_EED_EZH2_RBBP4_SUZ12  ACKR1        -0.31861592        -0.32809314
# Example 5: Using GPU computing engine (if available)
nets_gpu <- runSCORPION(
  gexMatrix = scorpionTest$gex,
  tfMotifs = scorpionTest$tf,
  ppiNet = scorpionTest$ppi,
  cellsMetadata = scorpionTest$metadata,
  groupBy = "region",
  computingEngine = "gpu"
)
# -- SCORPION ----------------------------------------------------------------
# + Normalizing data (log scale)
# i 3 networks requested
# + 3 networks meet the minimum cell requirement (30)
# i Computing networks
# + Networks successfully constructed
# + Networks successfully combined
# head(nets_gpu)
#                           tf target           T          B           N
# 1                       AATF  ACKR1 -0.31433821 -0.3569913 -0.33734894
# 2                       ABL1  ACKR1 -0.32915005 -0.3648892 -0.34437302
# 3                      ACSS2  ACKR1 -0.31418574 -0.3557851 -0.33663106
# 4                       ADNP  ACKR1  0.04105883  0.1109285  0.09910798
# 5                      AEBP2  ACKR1 -0.18964562 -0.2202267 -0.17558131
# 6 AEBP2_EED_EZH2_RBBP4_SUZ12  ACKR1 -0.31024694 -0.3508317 -0.33054504
# Example 6: Removing batch effect using donor as batch
nets_batch_corrected <- runSCORPION(
  gexMatrix = scorpionTest$gex,
  tfMotifs = scorpionTest$tf,
  ppiNet = scorpionTest$ppi,
  cellsMetadata = scorpionTest$metadata,
  groupBy = "region",
  removeBatchEffect = TRUE,
  batch = scorpionTest$metadata$donor
)
# -- SCORPION ----------------------------------------------------------------
# + Normalizing data (log scale)
# + Correcting for batch effects
# i 3 networks requested
# + 3 networks meet the minimum cell requirement (30)
# i Computing networks
# + Networks successfully constructed
# + Networks successfully combined
# head(nets_batch_corrected)
#                           tf target          T           B           N
# 1                       AATF  ACKR1 -0.3337298 -0.34885471 -0.13011777
# 2                       ABL1  ACKR1 -0.3408020 -0.35409813 -0.17694266
# 3                      ACSS2  ACKR1 -0.3325270 -0.35115311 -0.12661518
# 4                       ADNP  ACKR1  0.1117504  0.08691481  0.01608898
# 5                      AEBP2  ACKR1 -0.2334648 -0.22113011  0.12519312
# 6 AEBP2_EED_EZH2_RBBP4_SUZ12  ACKR1 -0.3274770 -0.34475499 -0.12449908
## End(Not run)
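Once a wide data frame is returned, the network columns can be compared directly. The sketch below rebuilds by hand the six rows printed in Example 1, so it runs without the package, and ranks edges by the difference between the T and N networks. The column name diffTN and the object name ranked are introduced here for illustration only.

```r
# Six rows copied from the Example 1 output above
nets <- data.frame(
  tf = c("AATF", "ABL1", "ACSS2", "ADNP", "AEBP2",
         "AEBP2_EED_EZH2_RBBP4_SUZ12"),
  target = "ACKR1",
  T = c(-0.31433856, -0.32915008, -0.31418599,
        0.04105895, -0.18964574, -0.31024700),
  B = c(-0.3569918, -0.3648895, -0.3557854,
        0.1109288, -0.2202269, -0.3508320),
  N = c(-0.33734920, -0.34437341, -0.33663144,
        0.09910822, -0.17558140, -0.33054519)
)

# Edge-weight difference between two networks (hypothetical column name)
nets$diffTN <- nets$T - nets$N

# Rank edges by the absolute size of the difference
ranked <- nets[order(abs(nets$diffTN), decreasing = TRUE), ]
print(ranked[, c("tf", "target", "diffTN")])

# Pairwise Pearson correlation between the network columns
print(cor(nets[, c("T", "B", "N")]))
```

Six rows are far too few to draw conclusions from; on a full result the same two steps give a quick view of which edges differ most between groups and how similar the networks are overall.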