scPOEM: Main Function.


scPOEM R Documentation

Main Function.

Description

This function takes paired single-cell ATAC-seq (scATAC-seq) and RNA-seq (scRNA-seq) data to embed peaks and genes into a shared low-dimensional space. It integrates regulatory relationships from peak-peak interactions (via Cicero), peak-gene interactions (via Lasso, random forest, and XGBoost), and gene-gene interactions (via principal component regression). Additionally, it supports gene-gene network reconstruction using epsilon-NN projections and compares networks across conditions through manifold alignment (scTenifoldNet).

Usage

scPOEM(
  mode = c("single", "compare"),
  input_data,
  dirpath = tempdir(),
  count_device = 1,
  nComp = 5,
  seed = NULL,
  numwalks = 5,
  walklength = 3,
  epochs = 100,
  neg_sample = 5,
  batch_size = 32,
  weighted = TRUE,
  exclude_pos = FALSE,
  d = 100,
  rebuild_GGN = TRUE,
  rebuild_PPN = TRUE,
  rebuild_PGN_Lasso = TRUE,
  rebuild_PGN_RF = TRUE,
  rebuild_PGN_XGB = TRUE,
  relearn_pg_embedding = TRUE,
  save_file = TRUE,
  pg_method = c("Lasso", "RF", "XGBoost"),
  python_env = "scPOEM_env"
)

Arguments

mode

The mode indicating whether to analyze data from a single condition or to compare two conditions.

input_data

A list of input data.

If mode = "single", input_data must be a list containing the following seven objects:

  • X: The scATAC-seq data, sparse matrix.

  • Y: The scRNA-seq data, sparse matrix.

  • peak_data: A data.frame containing peak information.

  • gene_data: A data.frame containing gene information (must contain a column "gene_name").

  • cell_data: A data.frame containing cell metadata.

  • neibor_peak: The peak IDs within a certain range of each gene; must have the columns c("gene_name", "start_use", "end_use"). The ID numbers in "start_use" and "end_use" start from 0.

  • genome: The genome length for the species.

If mode = "compare", input_data must be a named list of two elements, with names corresponding to two state names (e.g., "S1" and "S2"). Each element must itself be a list containing the same seven components as described above for mode = "single".
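The structure described above can be checked before calling scPOEM. The following is a minimal sketch, not part of the package: check_single_input and check_input_data are hypothetical helpers, and the toy list uses dense placeholder matrices and made-up values purely to exercise the checks (the documented X and Y are sparse matrices).

```r
# Hypothetical helpers (not part of scPOEM): validate the documented layout
# of input_data before running the main function.
required_parts <- c("X", "Y", "peak_data", "gene_data",
                    "cell_data", "neibor_peak", "genome")

check_single_input <- function(x) {
  missing_parts <- setdiff(required_parts, names(x))
  if (length(missing_parts) > 0) {
    stop("missing components: ", paste(missing_parts, collapse = ", "))
  }
  if (!"gene_name" %in% colnames(x[["gene_data"]])) {
    stop("gene_data must contain a column 'gene_name'")
  }
  needed_cols <- c("gene_name", "start_use", "end_use")
  if (!all(needed_cols %in% colnames(x[["neibor_peak"]]))) {
    stop("neibor_peak must contain columns: ",
         paste(needed_cols, collapse = ", "))
  }
  invisible(TRUE)
}

check_input_data <- function(input_data, mode = c("single", "compare")) {
  mode <- match.arg(mode)
  if (mode == "single") {
    return(check_single_input(input_data))
  }
  # Compare mode: a named list of two states, each a single-mode list.
  state_names <- names(input_data)
  if (length(input_data) != 2 || is.null(state_names) ||
      any(state_names == "")) {
    stop("compare mode needs a named list of two states")
  }
  for (state in state_names) {
    check_single_input(input_data[[state]])
  }
  invisible(TRUE)
}

# Toy placeholder data; values and the 'peak_name'/'cell' columns are made up.
toy <- list(
  X = matrix(0, 2, 3),
  Y = matrix(0, 2, 2),
  peak_data = data.frame(peak_name = c("p1", "p2", "p3")),
  gene_data = data.frame(gene_name = c("g1", "g2")),
  cell_data = data.frame(cell = c("c1", "c2")),
  neibor_peak = data.frame(gene_name = c("g1", "g2"),
                           start_use = c(0, 1),
                           end_use = c(1, 2)),
  genome = 1e6
)

check_input_data(toy, mode = "single")
check_input_data(list(S1 = toy, S2 = toy), mode = "compare")
```

Running such a check first surfaces a missing component or column as a readable error rather than a failure partway through network construction.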

dirpath

The path of the folder from which files are read and to which files are written.

count_device

The number of CPUs used to train models.

nComp

The number of principal components (PCs) used for regression when constructing the gene-gene network (GGN).

seed

An integer specifying the random seed to ensure reproducible results.

numwalks

Number of random walks per node. Default is 5.

walklength

Length (depth) of each random walk. Default is 3.

epochs

Number of training epochs. Default is 100.

neg_sample

Number of negative samples per positive sample. Default is 5.

batch_size

Batch size for training. Default is 32.

weighted

Whether the sampling network is weighted. Default is TRUE.

exclude_pos

Whether to exclude positive samples from negative sampling. Default is FALSE.

d

The dimension of latent space. Default is 100.

rebuild_GGN

Logical. Whether to rebuild the gene-gene network from scratch. If FALSE, the function will attempt to read from GGN.mtx under dirpath/test in single mode or dirpath/state_name/test in compare mode.

rebuild_PPN

Logical. Whether to rebuild the peak-peak network from scratch. If FALSE, the function will attempt to read from PPN.mtx under dirpath/test in single mode or dirpath/state_name/test in compare mode.

rebuild_PGN_Lasso

Logical. Whether to rebuild the peak-gene network via Lasso from scratch. If FALSE, the function will attempt to read from PGN_Lasso.mtx under dirpath/test in single mode or dirpath/state_name/test in compare mode.

rebuild_PGN_RF

Logical. Whether to rebuild the peak-gene network via random forest from scratch. If FALSE, the function will attempt to read from PGN_RF.mtx under dirpath/test in single mode or dirpath/state_name/test in compare mode.

rebuild_PGN_XGB

Logical. Whether to rebuild the peak-gene network via XGBoost from scratch. If FALSE, the function will attempt to read from PGN_XGB.mtx under dirpath/test in single mode or dirpath/state_name/test in compare mode.

relearn_pg_embedding

Logical. Whether to relearn the low-dimensional representations for peaks and genes from scratch. If FALSE, the function will attempt to read from node_embeddings.mtx, node_used_peak.csv, and node_used_gene.csv under dirpath/embedding in single mode or dirpath/state_name/embedding in compare mode.
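The rebuild and relearn flags all rely on the same directory layout. As a rough sketch, the expected cache locations can be computed as below; cached_path is a hypothetical helper, not a scPOEM function, and "S1" is only an example state name.

```r
# Hypothetical helper (not part of scPOEM): builds the documented cache path.
# Single mode:  dirpath/<subdir>/<file>
# Compare mode: dirpath/<state_name>/<subdir>/<file>
cached_path <- function(dirpath, file, subdir = "test", state_name = NULL) {
  if (is.null(state_name)) {
    file.path(dirpath, subdir, file)
  } else {
    file.path(dirpath, state_name, subdir, file)
  }
}

dirpath <- tempdir()

# Where the gene-gene network would be looked up in single mode.
ggn_file <- cached_path(dirpath, "GGN.mtx")

# Where the embeddings of example state "S1" would be looked up in compare mode.
emb_file <- cached_path(dirpath, "node_embeddings.mtx",
                        subdir = "embedding", state_name = "S1")

# Only skip the rebuild when the cached file is actually present.
rebuild_GGN <- !file.exists(ggn_file)
```

Deriving a flag such as rebuild_GGN from file.exists() is one way to avoid setting it to FALSE when no cached file is available.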

save_file

Logical, whether to save the output to a file.

pg_method

The vector of methods used to construct the peak-gene network. Default is c("Lasso", "RF", "XGBoost").

python_env

Name or path of the Python environment to be used.

Value

The scPOEM result. Its structure depends on mode.

Single Mode

Returns a list containing the following elements:

E

Low-dimensional representations of peaks and genes.

peak_node

Peak IDs that are associated with other peaks or genes.

gene_node

Gene IDs that are associated with other peaks or genes.

Compare Mode

Returns a list containing the following elements:

state1 name

The single-mode result for the first condition.

state2 name

The single-mode result for the second condition.

compare

A summary list containing:

E_g2

Low-dimensional embedding representations of genes under the two conditions.

common_genes

Genes shared between both conditions and used in the analysis.

diffRegulation

A list of differential regulatory information for each gene.
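To make the nesting of the compare-mode result concrete, here is a small sketch built from placeholder objects. Only the element names come from the Value section; the state names "S1" and "S2" follow the example given for input_data, and all values and their types are made up for illustration.

```r
# Placeholder objects only: real contents come from scPOEM().
mock_single <- function() {
  list(E = matrix(0, nrow = 3, ncol = 2),  # placeholder embedding
       peak_node = c("p1", "p2"),          # placeholder peak IDs
       gene_node = c("g1"))                # placeholder gene IDs
}

mock_result <- list(
  S1 = mock_single(),
  S2 = mock_single(),
  compare = list(E_g2 = matrix(0, nrow = 2, ncol = 2),
                 common_genes = c("g1"),
                 diffRegulation = list())
)

# Every element other than "compare" is a single-mode result for one state.
state_names <- setdiff(names(mock_result), "compare")

# Collect the per-state embeddings and the genes shared by both conditions.
per_state_E <- lapply(mock_result[state_names], function(res) res$E)
shared_genes <- mock_result$compare$common_genes
```

The same access pattern should apply to a real compare-mode result, with the actual state names taken from names(input_data).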

Examples


library(scPOEM)
library(monocle)
dirpath <- "./example_data"
# An example of analysing a single dataset.
# Load the single-mode example data.
data(example_data_single)
single_result <- scPOEM(mode = "single",
                        input_data=example_data_single,
                        dirpath=file.path(dirpath, "single"),
                        save_file=FALSE)

# An example of analysing and comparing datasets from two conditions.
# Load the compare-mode example data.
data(example_data_compare)
compare_result <- scPOEM(mode = "compare",
                         input_data=example_data_compare,
                         dirpath=file.path(dirpath, "compare"),
                         save_file=FALSE)



scPOEM documentation built on Aug. 28, 2025, 9:09 a.m.
