View source: R/G4iM.Grinder.Funs.R
GiG.Seq.Analysis | R Documentation |
GiG.Seq.Analysis examines a DNA or RNA sequence (and optionally its complementary strand) to identify and count runs of guanines (G) and cytosines (C). It detects both perfect runs (e.g., GGGG) and imperfect runs (with bulges) of sizes ranging from 2 (e.g., GG; a G-run of size 2) to 4 (e.g., GGGG), in a non-overlapping way. The function can return raw counts or densities (per a specified length, e.g., per 100,000 nucleotides), which allows direct comparison across sequences of different lengths. G-runs and C-runs are searched independently.
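To illustrate why the complementary strand matters, note that a C-run on the given strand corresponds to a G-run on the opposite strand. The sketch below builds a reverse complement in base R. The helper `reverse_complement()` is hypothetical and not part of G4iMGrinder; how the package handles the complementary strand internally is not described here.

```r
# Hypothetical helper (an assumption for illustration, not a G4iMGrinder function).
# Builds the reverse complement of a DNA or RNA string with base R only.
reverse_complement <- function(sequence, dna = TRUE) {
  from <- if (dna) "ACGT" else "ACGU"
  to   <- if (dna) "TGCA" else "UGCA"
  # chartr() swaps each base for its partner; other symbols (e.g., N) are left as-is
  comp <- chartr(from, to, toupper(sequence))
  # Reverse the character order so the result reads 5' to 3'
  paste(rev(strsplit(comp, "", fixed = TRUE)[[1]]), collapse = "")
}

reverse_complement("AGGGTCC")  # "GGACCCT": the CC run becomes GG, the GGG run becomes CCC
```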
GiG.Seq.Analysis(
  Name,
  Sequence,
  DNA = TRUE,
  Complementary = TRUE,
  Density = 1e+05,
  byDensity = TRUE
)
Name |
|
Sequence |
|
DNA |
|
Complementary |
|
Density |
|
byDensity |
|
By default, this function specifically looks for runs of “G” and “C,” counting perfect runs (e.g., G2, G3, G4) and imperfect runs. Runs are analyzed in a sequential, non-overlapping manner. For example, a run of “GGGG” is counted only as G4, not as G4 plus any subset runs like G2 or G3.
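The non-overlapping counting rule for perfect runs can be sketched with base R's `rle()`, which splits a sequence into maximal runs so that each run is counted exactly once at its full length. This is a minimal illustration under stated assumptions, not the package's implementation: `count_perfect_runs()` is a hypothetical helper, it ignores imperfect (bulged) runs, and it simply skips runs longer than 4, whose treatment by the package is not stated here.

```r
# Hypothetical helper (an assumption for illustration, not a G4iMGrinder function).
# Tabulates maximal perfect runs of one letter by exact length.
count_perfect_runs <- function(sequence, letter = "G", sizes = 2:4) {
  chars <- strsplit(toupper(sequence), "", fixed = TRUE)[[1]]
  runs <- rle(chars)                                # maximal runs of identical characters
  run_lengths <- runs$lengths[runs$values == letter]
  counts <- vapply(sizes, function(k) sum(run_lengths == k), integer(1))
  names(counts) <- paste0(letter, sizes)
  counts
}

# GGGG is counted once as G4, not additionally as G2 or G3
count_perfect_runs("AGGGGTGGAGGGCC")
# G2 G3 G4
#  1  1  1
```

Calling the same helper with `letter = "C"` counts C-runs independently of G-runs.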
A one-row data.frame summarizing the run analysis:
Identifier for the analyzed sequence, matching the Name argument.
Logical: TRUE if the sequence was treated as DNA; FALSE if RNA.
Total length of Sequence.
TRUE if the complementary strand was analyzed as well, otherwise FALSE.
Percentages of each nucleotide type within Sequence. (U and T are combined under UT%seq if DNA=FALSE.)
Counts or densities of perfect (e.g., G2, G3) and perfect+imperfect (e.g., G2X, G3X) G-runs of length 2 to 4, according to the Method 1 approach in G4-iM Grinder.
Counts or densities of perfect (e.g., C2, C3) and perfect+imperfect (e.g., C2X, C3X) C-runs of length 2 to 4, according to the same Method 1 approach.
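The relationship between raw counts and densities can be sketched as a simple rescaling to a common length. The helper `run_density()` below is hypothetical and only illustrates the idea of "runs per `per` nucleotides"; the exact normalization used by the package (for example, whether the length is adjusted when the complementary strand is analyzed) is an assumption not confirmed here.

```r
# Hypothetical helper (an assumption for illustration, not a G4iMGrinder function).
# Rescales a raw run count to a density per `per` nucleotides.
run_density <- function(count, seq_length, per = 1e5) {
  count * per / seq_length
}

# 12 runs found in a 10,000-nt sequence, expressed per 100,000 nt
run_density(12, 10000)  # 120
```

Expressing results this way lets a 10,000-nt sequence and a 1,000,000-nt sequence be compared on the same scale.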
Efres Belmonte-Reche
Belmonte-Reche, E. and Morales, J. C. (2019). G4-iM Grinder: when size and frequency matter. G-Quadruplex, i-Motif and higher order structure search and analysis tool. NAR Genomics and Bioinformatics, 2. DOI: 10.1093/nargab/lqz005
https://academic.oup.com/nargab/article/2/1/lqz005/5576141
G4iMGrinder for broader G4/i-Motif detection and scoring.
# Creating a random nucleotide sequence of length 10,000
Seq <- paste0(
  sample(
    c("G", "C", "T", "A", "N"),
    10000,
    prob = c(1, 1, 0.6, 0.6, 0.01),
    replace = TRUE
  ),
  collapse = ""
)

# Running the analysis with default parameters
Rs <- GiG.Seq.Analysis(
  Name = "RandomSeq",
  Sequence = Seq,
  DNA = TRUE,
  Complementary = TRUE,
  byDensity = TRUE
)

# Analyzing a second sequence and storing results in the same data frame
Seq2 <- paste0(
  sample(
    c("G", "C", "T", "A", "N"),
    10000,
    prob = c(1, 1, 0.6, 0.6, 0.01),
    replace = TRUE
  ),
  collapse = ""
)
Rs[2, ] <- GiG.Seq.Analysis(
  Name = "RandomSeq2",
  Sequence = Seq2,
  DNA = TRUE,
  Complementary = TRUE,
  Density = 1e5,
  byDensity = TRUE
)

print(Rs)