SCREEN: Batch Run
In HailinWei98/SCREEN

SCREEN

R Documentation

Batch Run

Description

Run the whole analysis and collect all results with a single function call.

Usage

SCREEN(
  sg_dir,
  mtx_dir,
  fragments,
  cal.FRiP = TRUE,
  species = "Hs",
  version = "v75",
  data_type = "RNA",
  Mixscape = TRUE,
  prefix = "./",
  label = "",
  gene_type = "Symbol",
  protein_coding = TRUE,
  frac = 0.01,
  cal.mt = TRUE,
  nFeature = c(200, 5000),
  nCount = 1000,
  FRiP = 0.1,
  mt = 10,
  blank_NTC = FALSE,
  lambda = 0.01,
  permutation = NULL,
  p_val_cut = 0.05,
  score_cut = 0.5,
  cicero_p_val_cut = 0.05,
  cicero_score_cut = 0,
  ylimit = "auto",
  project = "perturb",
  NTC = "NTC",
  replicate = 1,
  select_gene = NULL,
  selected = NULL,
  gene_annotations = NULL,
  pro_annotations = NULL,
  pro_up = 3000,
  pro_down = 0,
  overlap_cut = 0,
  p_adj_cut = 0.05,
  logFC_cut = 1,
  min.pct = 0.2,
  upstream = 2e+06,
  downstream = 2e+06,
  test.use = "wilcox",
  track_size = c(1, 0.3, 0.2, 0.3),
  include_axis_track = TRUE,
  connection_color = "#7F7CAF",
  connection_color_legend = TRUE,
  connection_width = 2,
  connection_ymax = NULL,
  gene_model_color = "#81D2C7",
  alpha_by_coaccess = FALSE,
  gene_model_shape = c("smallArrow", "box")
)
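
As a hedged sketch, a call that sets only a handful of these arguments might look like the following. The file names are hypothetical placeholders, the installed package name and the exported `SCREEN::SCREEN` are assumed from the repository name, and `fragments` is left out on the assumption that it is only needed for perturb-ATAC input. The call is guarded so that nothing runs unless the package and both input files are present.

```r
# Hypothetical input paths; replace them with real files.
sg_path  <- "sg_lib.txt"          # 3-column sgRNA table: cell, barcode, gene
mtx_path <- "perturb_seurat.rds"  # saved SeuratObject

# A few arguments set explicitly; everything else keeps the defaults
# listed in the Usage signature.
screen_args <- list(
  sg_dir    = sg_path,
  mtx_dir   = mtx_path,
  data_type = "RNA",
  species   = "Hs",
  nFeature  = c(200, 5000),
  nCount    = 1000,
  mt        = 10,
  NTC       = "NTC"
)

# Only attempt the run when the package (assumed to be installed under the
# name "SCREEN") and both input files are available.
can_run <- requireNamespace("SCREEN", quietly = TRUE) &&
  file.exists(sg_path) && file.exists(mtx_path)

if (can_run) {
  results <- do.call(SCREEN::SCREEN, screen_args)
}
```

Collecting the arguments in a list first makes it easy to reuse the same settings across several runs; passing them directly to `SCREEN()` works the same way.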

Arguments

`sg_dir`	Data frame, or path to a txt file, containing 3 columns: cell, barcode, gene. If the sgRNA information is stored in a matrix-like format, or the input data frame only has the sgRNA frequency of each cell, use `sgRNAassign` to assign an sgRNA to each cell.
`mtx_dir`	SeuratObject, or path to an rds file of a SeuratObject, with cells in columns and features in rows.
`fragments`	Path to the fragments file used to calculate FRiP for perturb-ATAC input.
`cal.FRiP`	Logical, calculate FRiP or not. Default is `TRUE`.
`species`	Only "Hs" and "Mm" are supported; for any other species, `percent.mt` will be calculated as for "Mm". Default is "Hs".
`version`	Version of the reference genome (Ensembl), used for perturb-ATAC input and perturb-enhancer input. Default is "v75".
`data_type`	Type of input data, can be one of c("RNA", "ATAC"). Default is "RNA".
`Mixscape`	Logical, run `IntegratedMixscape` or not. Default is `TRUE`.
`prefix`	Path to save all the results. Default is current directory.
`label`	The label of the output file.
`gene_type`	Type of gene name, selected from one of c("Symbol", "Ensembl"). Default is "Symbol".
`protein_coding`	Logical, only use protein coding gene or not. This parameter is only used for calculating gene activity for perturb-ATAC input. Default is `TRUE`.
`frac`	A parameter for filtering lowly expressed genes or low-accessibility peaks. Only genes or peaks that have expression or counts in at least this fraction of cells are kept. Default is 0.01.
`cal.mt`	Logical, calculate percentage of mitochondrial gene expression of each cell or not. Default is `TRUE`.
`nFeature`	Lower and upper limits on the number of detected features in each cell, in the format c(200, 5000). Default is c(200, 5000).
`nCount`	Minimum number of counts in each cell. Default is 1000.
`FRiP`	Minimal FRiP of each cell. Default is 0.1.
`mt`	Maximal percentage of mitochondrial gene expression of each cell. Default is 10.
`blank_NTC`	Logical, use blank control as negative control or not. Default is `FALSE`.
`lambda`	Parameter used in ridge regression of `improved_scmageck_lr`. Default is 0.01.
`permutation`	Number of permutations in `improved_scmageck_lr`. Default is `NULL`, documented as corresponding to 10000 permutations.
`p_val_cut`	P-value cutoff of `improved_scmageck_lr` results. Default is 0.05.
`score_cut`	Score cutoff of `improved_scmageck_lr` results. Default is 0.5.
`cicero_p_val_cut`	P-value cutoff of `improved_scmageck_lr` results used for `ciceroPlot`. Default is 0.05.
`ylimit`	Y-axis limits of `DE_gene_plot`, in the format c(minimum, maximum, interval), for example c(-600, 600, 200). Default is "auto", which means that the function determines `ylimit` automatically.
`project`	Title of `DE_gene_plot`. Default is "perturb".
`NTC`	The name of the genes serving as negative controls. Default is "NTC".
`replicate`	A vector of replicate information, with one entry per cell in the same order as the cells. Default is 1, which means no replicates.
`select_gene`	The list of genes for regression in `scMAGeCK` step. By default, all genes in the table are subject to regression.
`selected`	Enhancer regions to visualize for perturb-enhancer, or perturbations to choose for perturb-ATAC, in the `cicero` step. By default, all enhancers or all perturbations will be chosen.
`gene_annotations`	Gene annotations stored in data frame format, including c("chromosome", "start", "end", "strand", "transcript") as colnames, used for the `ciceroPlot` step. By default, gene annotations are from `ensembldb`.
`pro_annotations`	Gene annotations stored in data frame format, including c("chromosome", "start", "end", "strand", "transcript") as colnames. By default, gene annotations are from `ensembldb`.
`pro_up`	The number of nucleotides upstream of the transcription start site that should be included in the promoter region, only used for perturb-ATAC data. Default is 3000.
`pro_down`	The number of nucleotides downstream of the transcription start site that should be included in the promoter region, only used for perturb-ATAC data. Default is 0.
`overlap_cut`	Maximum overlap nucleotides between peaks and promoters, only used for perturb-ATAC data. Default is 0.
`p_adj_cut`	Parameter only used for finding DA peaks. Maximum adjusted p-value calculated by `FindMarkers`. Default is 0.05.
`logFC_cut`	Parameter only used for finding DA peaks. Minimum log fold change calculated by `FindMarkers`. Default is 1.
`min.pct`	Parameter only used for finding DA peaks. Only test genes that are detected in a minimum fraction of min.pct cells in either of the NTC or perturbations. Meant to speed up the function by not testing genes that are very infrequently expressed. Default is 0.2.
`upstream`	The number of nucleotides upstream of the start site of selected region in `ciceroPlot` step. Default is 2000000.
`downstream`	The number of nucleotides downstream of the start site of selected region in `ciceroPlot` step. Default is 2000000.
`test.use`	Parameter only used for finding DA peaks. Default is "wilcox". For more details, see `FindMarkers`.
`track_size`	Size of each axis in the `ciceroPlot` result. Default is c(1, 0.3, 0.2, 0.3). If `include_axis_track = FALSE`, `track_size` should be a vector with 3 elements.
`include_axis_track`	Logical, should a genomic axis be plotted? Default is `TRUE`.
`connection_color`	Color for connection lines. A single color, the name of a column containing color values, or the name of a column containing a character or factor to base connection colors on.
`connection_color_legend`	Logical, should connection color legend be shown?
`connection_width`	Width of connection lines.
`connection_ymax`	Connection y-axis height. If NULL, chosen automatically.
`gene_model_color`	Color for gene annotations.
`alpha_by_coaccess`	Logical, should the transparency of connection lines be scaled based on co-accessibility score?
`cicero_score_cut`	Score cutoff of `improved_scmageck_lr` results used for `ciceroPlot`. Default is 0.
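
Examples

The sketch below uses base R only and illustrates two of the arguments described above: the three-column table expected by `sg_dir`, and how the `nFeature`, `nCount` and `mt` thresholds could combine into a per-cell filter. The cell barcodes, sgRNA names, tab delimiter, header row, Seurat-style metadata column names and inclusive bounds are all assumptions made for illustration; the filter is not taken from the package's own code.

```r
# --- 1. A toy sgRNA table with the columns named in `sg_dir` ---------------
sg_lib <- data.frame(
  cell    = c("AAACCTGA-1", "AAACGGGT-1", "AAAGATGC-1"),  # made-up barcodes
  barcode = c("sgNTC_1", "sgGENE1_1", "sgGENE1_2"),       # made-up sgRNA names
  gene    = c("NTC", "GENE1", "GENE1"),
  stringsAsFactors = FALSE
)

# Write it as a tab-delimited txt file (delimiter and header are assumptions).
sg_file <- file.path(tempdir(), "sg_lib.txt")
write.table(sg_lib, sg_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back to confirm the layout survived the round trip.
sg_check <- read.table(sg_file, header = TRUE, sep = "\t",
                       stringsAsFactors = FALSE)

# --- 2. Toy per-cell QC metrics and the default thresholds -----------------
meta <- data.frame(
  nFeature_RNA = c(150, 1200, 6000, 2500),
  nCount_RNA   = c(400, 3000, 20000, 900),
  percent.mt   = c(2, 5, 3, 12),
  row.names    = c("c1", "c2", "c3", "c4")
)

nFeature <- c(200, 5000)  # lower and upper limits on detected features
nCount   <- 1000          # minimum counts per cell
mt       <- 10            # maximum mitochondrial percentage

# Keep cells that pass every threshold (inclusive bounds are an assumption).
keep <- meta$nFeature_RNA >= nFeature[1] &
  meta$nFeature_RNA <= nFeature[2] &
  meta$nCount_RNA >= nCount &
  meta$percent.mt <= mt

kept_cells <- rownames(meta)[keep]
```

In this toy data only `c2` passes: `c1` has too few features and counts, `c3` exceeds the upper feature limit, and `c4` falls below the count minimum and exceeds the mitochondrial maximum.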