GiG.Seq.Analysis: Genomic sequence nucleotidic run analyser.


View source: R/G4iM.Grinder.Funs.R

Description

Analyses the nucleotide composition and organisation of a genomic sequence, specifically the G-runs and C-runs it contains.

Usage

GiG.Seq.Analysis(Name, Sequence, DNA = TRUE, Complementary = TRUE, Nucleotides = c("G", "C"), BulgeSize = 1, Density = 1e+05, byDensity = TRUE)

Arguments

Name

character, name of the DNA or RNA sequence to analyse.

Sequence

character, the DNA or RNA sequence to analyse, given as a string of nucleotides.

DNA

logical, controls if the sequence is DNA or RNA. The factory-fresh default is TRUE assuming the sequence is DNA.

Complementary

logical, controls if the Complementary strand should be created and analyzed. The factory-fresh default is TRUE.
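
As an illustration of what creating the complementary strand involves, here is a minimal base R sketch. The helper `reverse_complement` is hypothetical and is not part of G4-iM Grinder; whether the package builds the strand this way (for instance, whether it also reverses it) is an assumption.

```r
# Hypothetical helper, not part of G4-iM Grinder: reverse complement of a DNA string.
reverse_complement <- function(sequence) {
  # Swap each base for its Watson-Crick partner; N and any other letter stay unchanged.
  complemented <- chartr("ACGT", "TGCA", toupper(sequence))
  # Reverse the character order so the new strand reads 5' to 3'.
  paste(rev(strsplit(complemented, "")[[1]]), collapse = "")
}

reverse_complement("GGGATN")
```

A G-run on one strand pairs with a C-run on the other, which is why G and C are usually analysed together when both strands are considered.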

Nucleotides

character or vector of characters, nucleotide(s) that compose the runs to analyse. The factory-fresh default is c("G", "C") to analyse both G and C-runs. Any other nucleotide (or letter) can be input but will be ignored.

BulgeSize

integer, number of acceptable non-run nucleotides to exist within runs. The factory-fresh default is 1.
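
To make the bulge allowance concrete, the sketch below checks whether a candidate string would count as a run under one reading of this argument: the run starts and ends with the run nucleotide, and contains at most `bulge_size` other letters in total. The helper `is_run` and its parameters are hypothetical, and this reading is an assumption rather than the package's documented algorithm.

```r
# Hypothetical helper, not the package's algorithm: decides whether a candidate
# string qualifies as a run of `nucleotide` under a given bulge allowance.
is_run <- function(candidate, nucleotide = "G", bulge_size = 1,
                   min_length = 2, max_length = 5) {
  chars <- strsplit(toupper(candidate), "")[[1]]
  n <- length(chars)
  if (n == 0) return(FALSE)
  run_count   <- sum(chars == nucleotide)  # run length, excluding bulges
  bulge_count <- n - run_count             # non-run letters inside the candidate
  chars[1] == nucleotide && chars[n] == nucleotide &&
    bulge_count <= bulge_size &&
    run_count >= min_length && run_count <= max_length
}

is_run("GGG")                   # perfect run
is_run("GGAG")                  # imperfect run with one bulge
is_run("GGAG", bulge_size = 0)  # not a run when no bulge is allowed
is_run("GATG")                  # two non-G letters exceed bulge_size = 1
```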

Density

integer, constant to calculate density results. The factory-fresh default is 100000, returning results found per 100000 nucleotides. Only pertinent if byDensity = TRUE.

byDensity

logical, controls whether the results are returned as a density, calculated as Result.Density = (Density*Results)/(total genomic length). If set to FALSE, counts are returned instead. Densities allow genome size-independent comparisons. The factory-fresh default is TRUE.
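
The density formula can be worked through with a small base R sketch. The helper `to_density` is hypothetical and only restates the formula above; it is not a function of the package.

```r
# Hypothetical helper restating Result.Density = (Density * Results) / (total genomic length).
to_density <- function(count, total_length, density = 1e5) {
  (density * count) / total_length
}

# 42 runs found in a 10,000-nucleotide sequence
to_density(42, 10000)  # 420 runs per 100,000 nucleotides
```

Because the result is normalised by length, the same value is obtained for 84 runs in a 20,000-nucleotide sequence, which is what makes sequences of different sizes comparable.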

Value

GiG.Seq.Analysis returns a one-row data.frame summarising the genomic sequence.

Column Meanings

Name: name of summary, given by the Name input.

DNA: genome type, given by DNA input. If DNA == TRUE, the genome is DNA, else it is assumed to be RNA.

Length: Length of the genomic sequence.

Complementary: strandedness, given by the Complementary input. If Complementary == TRUE, the genome is treated as double stranded, and both strands have been analysed.

G%seq: percentage of the genomic sequence which is G.

C%seq: percentage of the genomic sequence which is C.

A%seq: percentage of the genomic sequence which is A.

UT%seq: percentage of the genomic sequence which is U or T.

N%seq: percentage of the genomic sequence which is N.

G2: Number of **perfect** G-runs identified with lengths between 2 and 5 using method 1 of G4-iM Grinder's algorithm (GG, GGG, GGGG and GGGGG). Returned as counts or density.

G3: Number of **perfect** G-runs identified with lengths between 3 and 5 using method 1 of G4-iM Grinder's algorithm (GGG, GGGG and GGGGG). Returned as counts or density.

G2X: Number of **perfect and imperfect** G-runs identified with lengths between 2 and 5 (excluding the bulges) using method 1 of G4-iM Grinder's algorithm. Returned as counts or density.

G3X: Number of **perfect and imperfect** G-runs identified with lengths between 3 and 5 (excluding the bulges) using method 1 of G4-iM Grinder's algorithm. Returned as counts or density.

C2: Number of **perfect** C-runs identified with lengths between 2 and 5 using method 1 of G4-iM Grinder's algorithm (CC, CCC, CCCC and CCCCC). Returned as counts or density.

C3: Number of **perfect** C-runs identified with lengths between 3 and 5 using method 1 of G4-iM Grinder's algorithm (CCC, CCCC and CCCCC). Returned as counts or density.

C2X: Number of **perfect and imperfect** C-runs identified with lengths between 2 and 5 (excluding the bulges) using method 1 of G4-iM Grinder's algorithm. Returned as counts or density.

C3X: Number of **perfect and imperfect** C-runs identified with lengths between 3 and 5 (excluding the bulges) using method 1 of G4-iM Grinder's algorithm. Returned as counts or density.
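
To give a feel for the composition and perfect-run columns, here is a rough base R sketch on a toy sequence. The helpers `nucleotide_percent` and `count_perfect_runs` are hypothetical, and the greedy, non-overlapping regular expression is a stand-in assumption: it is not method 1 of G4-iM Grinder, and it may treat stretches longer than 5 nucleotides differently from the package.

```r
# Hypothetical helpers; a simplified stand-in, not the package's implementation.
nucleotide_percent <- function(sequence, letter) {
  chars <- strsplit(toupper(sequence), "")[[1]]
  100 * sum(chars %in% letter) / length(chars)
}

count_perfect_runs <- function(sequence, nucleotide = "G",
                               min_length = 2, max_length = 5) {
  # Greedy, non-overlapping matches of e.g. "G{2,5}"
  pattern <- sprintf("%s{%d,%d}", nucleotide, min_length, max_length)
  hits <- gregexpr(pattern, toupper(sequence))[[1]]
  if (hits[1] == -1) 0L else length(hits)
}

toy <- "GGATGGGCCT"
nucleotide_percent(toy, "G")          # 50, compare with G%seq
nucleotide_percent(toy, c("U", "T"))  # 20, compare with UT%seq
count_perfect_runs(toy, "G", 2, 5)    # 2 (GG and GGG), compare with G2
count_perfect_runs(toy, "G", 3, 5)    # 1 (GGG), compare with G3
count_perfect_runs(toy, "C", 2, 5)    # 1 (CC), compare with C2
```

These values are single-strand counts; with Complementary = TRUE or byDensity = TRUE the package's figures would differ accordingly.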

Author(s)

Efres Belmonte-Reche

Examples

### Creating random nucleotidic sequence
Seq <- paste0(sample(c("G", "C", "T", "A", "N"), 10000, prob = c(1,1,0.6,0.6,0.01), replace = TRUE), collapse = "")

### Analysing sequence
Rs <- GiG.Seq.Analysis(Name = "RandomSeq", Sequence = Seq, DNA = TRUE, Complementary = TRUE, Nucleotides = c("G", "C"), BulgeSize = 1, Density = 1e+05, byDensity = TRUE)

### Adding another analysis
Seq <- paste0(sample(c("G", "C", "T", "A", "N"), 10000, prob = c(1,1,0.6,0.6,0.01), replace = TRUE), collapse = "")
Rs[2,] <- GiG.Seq.Analysis(Name = "RandomSeq2", Sequence = Seq, DNA = TRUE, Complementary = TRUE, Nucleotides = c("G", "C"), BulgeSize = 1, Density = 1e+05, byDensity = TRUE)

EfresBR/G4iMGrinder documentation built on June 11, 2021, 2:57 a.m.