G4iMGrinder: Detect and analyze potential G-quadruplexes, i-Motifs, and...

In EfresBR/G4iMGrinder: G4iMGrinder: G4 Quadruplex, i-Motif and higher-order structures in DNA and RNA sequences search and analysis tool.

View source: R/G4iM.Grinder.Funs.R

G4iMGrinder

R Documentation

Detect and analyze potential G-quadruplexes, i-Motifs, and higher-order structures in DNA or RNA sequences

Description

G4iM Grinder is a flexible search engine and characterization tool designed to detect and analyze sequences capable of forming G-quadruplexes (Potential Quadruplex Sequences, PQSs), i-Motifs (Potential i-Motif Sequences, PiMS), or other higher-order quadruplex-like structures in DNA or RNA. It provides multiple “methods” to search for these motifs, allowing extensive configurability. Users can tailor the search to match specific criteria and search within their results for known quadruplex-forming or non-quadruplex-forming sequences. The results include both raw findings and frequency-weighted summaries.

Usage

G4iMGrinder(
  Name,
  Sequence,
  DNA = TRUE,
  Complementary = TRUE,
  RunComposition = "G",
  BulgeSize = 1,
  MaxIL = 3,
  MaxRunSize = 5,
  MinRunSize = 3,
  MinNRuns = 4,
  MaxNRuns = 0,
  MaxPQSSize = 33,
  MinPQSSize = 15,
  MaxLoopSize = 10,
  MinLoopSize = 0,
  LoopSeq = c("G", "T", "A", "C"),
  Method2 = TRUE,
  Method3 = TRUE,
  G4hunter = TRUE,
  cGcC = FALSE,
  PQSfinder = TRUE,
  Bt = 14,
  Pb = 17,
  Fm = 3,
  Em = 1,
  Ts = 4,
  Et = -19,
  Is = -16,
  Ei = 1,
  Ls = 1,
  ET = 1,
  WeightParameters = c(0.5, 0.5, 0),
  FreqWeight = 0,
  KnownQuadruplex = TRUE,
  KnownNOTQuadruplex = FALSE,
  RunFormula = FALSE,
  NCores = 1,
  Verborrea = TRUE
)

Arguments

`Name`	`character`. Name of the DNA or RNA sequence under analysis.
`Sequence`	`character`. The nucleotide sequence to be examined. Must be composed of valid DNA or RNA bases.
`DNA`	`logical`. Indicates whether `Sequence` is DNA (`TRUE`) or RNA (`FALSE`). Defaults to `TRUE`.
`Complementary`	`logical`. If `TRUE`, the complementary strand is generated and analyzed in parallel. Defaults to `TRUE`.
`RunComposition`	`character`. Nucleotide(s) used to define the “runs.” Typically `"G"` for G-quadruplex search or `"C"` for i-Motif search. Defaults to `"G"`.
`BulgeSize`	`integer`. Number of allowed non-`RunComposition` nucleotides within a run (used by M1). Defaults to `1`.
`MaxIL`	`integer`. Total number of additional nucleotides allowed between runs (used by M2). Defaults to `3`.
`MaxRunSize`	`integer`. Maximum length of a run (used by M2). Defaults to `5`.
`MinRunSize`	`integer`. Minimum length of a run (used by M1). Defaults to `3`.
`MinNRuns`	`integer`. Minimum number of runs required to form a structure (used by M2 and M3). Defaults to `4`.
`MaxNRuns`	`integer`. Maximum number of runs that compose a structure (used by M2). Defaults to `0`, which disables the upper limit on run count.
`MaxPQSSize`	`integer`. Maximum total length of a putative quadruplex structure (used by M2). Defaults to `33`.
`MinPQSSize`	`integer`. Minimum total length of a putative quadruplex structure (used by M2 and M3). Defaults to `15`.
`MaxLoopSize`	`integer`. Maximum number of nucleotides allowed in each loop (used by M2 and M3). Defaults to `10`.
`MinLoopSize`	`integer`. Minimum number of nucleotides allowed in each loop (used by M2 and M3). Defaults to `0`.
`LoopSeq`	`character` vector. Defines the nucleotide(s) or pattern(s) to measure or highlight within detected structures. Defaults to `c("G", "T", "A", "C")`.
`Method2`	`logical`. If `TRUE`, enables Method 2 (M2), which searches for size-defined structures and computes frequency (M2A and M2B). Defaults to `TRUE`.
`Method3`	`logical`. If `TRUE`, enables Method 3 (M3), which searches for size-unrestricted structures and computes frequency (M3A and M3B). Defaults to `TRUE`.
`G4hunter`	`logical`. If `TRUE`, applies the G4Hunter scoring system. Defaults to `TRUE`.
`cGcC`	`logical`. If `TRUE`, applies the cGcC scoring system (valid for RNA). Defaults to `FALSE`.
`PQSfinder`	`logical`. If `TRUE`, applies an adaptation of the PQSfinder scoring system. Defaults to `TRUE`.
`Bt`	`integer`. Tetrad stacking bonus for PQSfinder calculations. Defaults to `14`.
`Pb`	`integer`. Inter-Loop penalization constant for PQSfinder calculations. Defaults to `17`.
`Fm`	`integer`. Loop length penalization constant for PQSfinder calculations. Defaults to `3`.
`Em`	`integer`. Loop length exponential constant for PQSfinder calculations. Defaults to `1`.
`Ts`	`integer`. Tetrad supplement constant for PQSfinder calculations. Defaults to `4`.
`Et`	`integer`. Inter-Loop supplement constant for PQSfinder calculations. Defaults to `-19`.
`Is`	`integer`. Loop supplement constant for PQSfinder calculations. Defaults to `-16`.
`Ei`	`integer`. Tetrad exponential constant for PQSfinder calculations. Defaults to `1`.
`Ls`	`integer`. Inter-Loop exponential constant for PQSfinder calculations. Defaults to `1`.
`ET`	`integer`. Total formula exponential constant for PQSfinder calculations. Defaults to `1`.
`WeightParameters`	`numeric` vector of length 3. Weights for combining `G4hunter`, `PQSfinder`, and `cGcC` scores (in that order). Defaults to `c(0.5, 0.5, 0)`, producing an average of the first two.
`FreqWeight`	`numeric`. Weight factor for incorporating structure frequency in the final score (relevant to M2B and M3B). Defaults to `0`.
`KnownQuadruplex`	`logical`. If `TRUE`, matches results against known sequences that have been shown to form G-quadruplex or i-Motif in vitro. Defaults to `TRUE`.
`KnownNOTQuadruplex`	`logical`. If `TRUE`, matches results against known sequences shown not to form quadruplexes. Defaults to `FALSE`.
`RunFormula`	`logical`. If `TRUE`, calculates and reports a symbolic formula for each detected PQS. Defaults to `FALSE`.
`NCores`	`integer`. Number of cores to use for parallel processing. Defaults to `1`.
`Verborrea`	`logical`. If `TRUE`, prints verbose messages about progress. Defaults to `TRUE`.
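
To make the run-related arguments more concrete, the following base-R sketch locates perfect runs on a toy sequence and on its reverse complement. It is only an illustration and not the package's internal search code: `reverse_complement()` and `find_runs()` are hypothetical helpers, bulges (`BulgeSize`) are ignored, and treating the "complementary strand" as the reverse complement is an assumption.

```r
# Base-R illustration only; this is NOT the package's internal search code.

# Reverse complement of a DNA string (assumed here to be what "complementary
# strand" refers to; the package may handle orientation differently).
reverse_complement <- function(dna) {
  complemented <- chartr("ACGT", "TGCA", toupper(dna))
  paste(rev(strsplit(complemented, "")[[1]]), collapse = "")
}

# Locate perfect runs of one nucleotide with a length between min_size and
# max_size (hypothetical helper mirroring MinRunSize / MaxRunSize; no bulges).
find_runs <- function(sequence, base = "G", min_size = 3, max_size = 5) {
  pattern <- sprintf("%s{%d,%d}", base, min_size, max_size)
  hits <- gregexpr(pattern, toupper(sequence))[[1]]
  if (hits[1] == -1) {
    return(data.frame(Start = integer(0), Length = integer(0)))
  }
  data.frame(Start = as.integer(hits),
             Length = as.integer(attr(hits, "match.length")))
}

toy <- "TTGGGAGGGTCCCTAGGGGTTGGGAA"
runs_plus  <- find_runs(toy)                      # G-runs on the given strand
runs_minus <- find_runs(reverse_complement(toy))  # G-runs on the other strand
```

Passing `base = "C"` to the helper gives the analogous view for an i-Motif style search, in the same spirit as setting `RunComposition = "C"`.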

Value

A list containing:

`Configuration`	A `data.frame` of the parameters used in the run.
`FunTime`	A `data.frame` with timing information for each step.
`PQSM2a`	A `data.frame` of the M2A (size-defined) results, if `Method2 = TRUE`.
`PQSM2b`	A `data.frame` of the M2B (frequency-weighted) results, if `Method2 = TRUE`.
`PQSM3a`	A `data.frame` of the M3A (unrestricted-size) results, if `Method3 = TRUE`.
`PQSM3b`	A `data.frame` of the M3B (frequency-weighted) results, if `Method3 = TRUE`.
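
Because each result table is an ordinary `data.frame`, standard subsetting and ordering apply. The sketch below uses an invented stand-in for `PQSM2a` (the rows and scores are made up for illustration and the object `mock_m2a` is hypothetical) to show filtering by `Score` and counting repeated sequences. The counting step only mimics the idea behind `Freq`; it is not how the package derives `PQSM2b`.

```r
# Hypothetical stand-in for a PQSM2a table, using column names from this page.
# The rows are invented purely to demonstrate the data.frame operations.
mock_m2a <- data.frame(
  Start    = c(10L, 250L, 900L, 1500L),
  Finish   = c(30L, 270L, 921L, 1520L),
  Sequence = c("GGGTTAGGGTTAGGGTTAGGG", "GGGTTAGGGTTAGGGTTAGGG",
               "GGGAGGGTGGGCAGGGGAGGGC", "GGGTTAGGGTTAGGGTTAGGG"),
  Strand   = c("+", "+", "-", "+"),
  Score    = c(52, 52, 61, 52),
  stringsAsFactors = FALSE
)

# Keep high-scoring candidates and sort them, best first.
top_hits <- mock_m2a[mock_m2a$Score >= 55, ]
top_hits <- top_hits[order(-top_hits$Score), ]

# Count how often each distinct sequence occurs. This only mimics the idea
# behind the Freq column; it is not how the package builds PQSM2b.
freq_table <- as.data.frame(table(Sequence = mock_m2a$Sequence),
                            responseName = "Freq",
                            stringsAsFactors = FALSE)
freq_table <- freq_table[order(-freq_table$Freq), ]
```

With a real run, the same operations would be applied to `result$PQSM2a` or `result$PQSM3a` in place of `mock_m2a`.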

Column Meanings

Start: Integer. Start position in Sequence (for M2A/M3A).
Finish: Integer. End position in Sequence (for M2A/M3A).
Freq: Integer. Frequency of occurrence (for M2B/M3B).
Runs: Integer. Number of runs (e.g., G-runs in G-quadruplex).
IL: Integer. Number of bulges or irregularities.
mRun: Numeric. Average run size.
Sequence: Character. The identified motif sequence.
Length: Integer. Total length of the identified structure.
Strand: Character. Indicates “+” (original) or “–” (complementary) strand, if Complementary = TRUE.
G4Hunter: Numeric. Score assigned by the G4Hunter algorithm (if G4hunter = TRUE).
pqsfinder: Numeric. Score from the PQSfinder adaptation (if PQSfinder = TRUE).
cGcC: Numeric. Score from the cGcC algorithm (if cGcC = TRUE).
Score: Numeric. Combined overall score, integrating all selected scoring methods plus frequency weighting.
Conf.Quad.Seqs: Character. Known quadruplex-forming sequences detected, with counts. DNA hits have “*” after the count; RNA hits have “^”.
Conf.NOT.Quad.Seqs: Character. Known non-quadruplex sequences detected, with counts. DNA hits have “*” after the count; RNA hits have “^”.
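
For intuition about the scoring columns, here is a minimal base-R sketch of the published G4Hunter rule: each G in a run scores the run length capped at 4, each C scores the negative equivalent, other bases score 0, and the sequence score is the mean. The package's own implementation may differ in detail. `combine_scores()` is a hypothetical helper showing one plausible reading of `WeightParameters`; it assumes the individual scores already share a scale and omits any rescaling or frequency weighting the package may apply.

```r
# Sketch of the published G4Hunter scoring rule (not the package's own code).
g4hunter_score <- function(sequence) {
  bases <- strsplit(toupper(sequence), "")[[1]]
  runs <- rle(bases)
  # Each base in a G run scores the run length capped at 4; C runs score the
  # negative equivalent; every other base scores 0.
  capped <- pmin(runs$lengths, 4)
  per_run <- ifelse(runs$values == "G", capped,
                    ifelse(runs$values == "C", -capped, 0))
  mean(rep(per_run, times = runs$lengths))
}

# Hypothetical weighted combination in the order given by WeightParameters
# (G4hunter, PQSfinder, cGcC). Assumes the inputs already share a scale.
combine_scores <- function(scores, weights = c(0.5, 0.5, 0)) {
  sum(scores * weights)
}

g_rich <- g4hunter_score("GGGTTAGGGTTAGGGTTAGGG")  # positive: G-rich strand
c_rich <- g4hunter_score("CCCTAACCCTAACCCTAACCC")  # negative: C-rich strand
combined <- combine_scores(c(40, 60, 0))           # toy inputs, not real scores
```

The sign convention is why a C-rich candidate (for example from an i-Motif search) receives a negative G4Hunter value of the same magnitude as its G-rich complement.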

Note

M1 stands for Method 1; M2 stands for Method 2; M3 stands for Method 3.

Author(s)

Efres Belmonte-Reche

References

Belmonte-Reche, E. and Morales, J. C. (2019). G4-iM Grinder: when size and frequency matter. G-Quadruplex, i-Motif and higher order structure search and analysis tool. NAR Genomics and Bioinformatics, 2(1), lqz005. DOI: 10.1093/nargab/lqz005

https://academic.oup.com/nargab/article/2/1/lqz005/5576141

Examples

  library(G4iMGrinder)

  # Example: retrieve a DNA sequence and run basic G4 search
  if (!require("seqinr")) {
    install.packages("seqinr")
    library(seqinr)
  }

  Name <- "LmajorESTs"
  Sequence <- paste0(
    read.fasta(
      file = url("http://tritrypdb.org/common/downloads/release-36/Lmajor/fasta/TriTrypDB-36_Lmajor_ESTs.fasta"),
      as.string = TRUE, legacy.mode = TRUE, seqonly = TRUE,
      strip.desc = TRUE, seqtype = "DNA"
    ),
    collapse = ""
  )

  # G-quadruplex search on DNA
  resultDNA <- G4iMGrinder(Name = Name, Sequence = Sequence)

  # G-quadruplex search on RNA (with cGcC scoring)
  resultRNA <- G4iMGrinder(Name = Name, Sequence = Sequence, DNA = FALSE, cGcC = TRUE)

  # i-Motif search in DNA
  resultIMotif <- G4iMGrinder(Name = Name, Sequence = Sequence, RunComposition = "C")

  # Customized search with bulge allowance and larger loop sizes.
  # Note: allowing more bulges or shorter G-runs (e.g., GG) significantly increases computation time.
  resultCustom <- G4iMGrinder(
    Name = Name,
    Sequence = Sequence,
    BulgeSize = 2,
    MaxLoopSize = 20,
    MaxIL = 10
  )

  # Viewing results
  View(resultDNA$PQSM2a)  # M2A results
  View(resultDNA$PQSM2b)  # M2B results (with frequency weighting)
  View(resultDNA$PQSM3a)  # M3A results (unrestricted-size search)
  View(resultDNA$PQSM3b)  # M3B results (unrestricted-size with frequency weighting)

EfresBR/G4iMGrinder documentation built on June 12, 2025, 3:52 a.m.
