map_probes_sequence: Map 450k/EPIC probes to a user-defined sequence


View source: R/map_probes_sequence.R

Description

Map 450k/EPIC probes to a user-defined sequence

Usage

map_probes_sequence(
  sequence,
  next_base,
  prev_base,
  array = c("450k", "EPIC"),
  max_width = 50,
  min_width = 15,
  step_size = 5,
  allow_mismatch = FALSE,
  max_mismatch = 1,
  allow_indel = FALSE,
  min_distance = 6,
  use_Y = TRUE,
  methylation_status = "methylated",
  verbose = TRUE,
  cores = 1
)
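
The defaults in the signature above (min_width = 15, max_width = 50, step_size = 5) suggest which probe lengths are tried. The sketch below assumes the lengths are generated like `seq(min_width, max_width, by = step_size)` and that each length is taken from the 3'-end of the probe; the `probe` string is an invented placeholder, and this is not necessarily how the package implements it.

```r
# Assumed behaviour: one mapping pass per length, from min_width to max_width
min_width <- 15
max_width <- 50
step_size <- 5
widths <- seq(min_width, max_width, by = step_size)
widths

# Hypothetical 50-base probe sequence (placeholder, not a real array probe)
probe <- paste(rep("ACGTA", 10), collapse = "")

# The last `w` bases, i.e. the 3'-end of the probe, for each tested length
three_prime_ends <- vapply(
  widths,
  function(w) substr(probe, nchar(probe) - w + 1, nchar(probe)),
  character(1)
)
nchar(three_prime_ends)
```

A smaller step_size tests more lengths, which presumably increases run time.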

Arguments

sequence

DNA sequence (string)

next_base

Base following the end of the sequence (necessary to know for bisulfite conversion)

prev_base

Base preceding the start of the sequence (necessary to know for bisulfite conversion)

array

Array used (450k or EPIC)

max_width

Maximum probe length to map (starting from the 3'-end of the probe).

min_width

Minimum probe length to map (starting from the 3'-end of the probe).

step_size

Map probe lengths from min_width to max_width in these steps.

allow_mismatch

Allow mismatches when matching? (TRUE/FALSE)

max_mismatch

Maximum number of allowed mismatches

allow_indel

Allow an INDEL in matching? (TRUE/FALSE)

min_distance

Minimum distance (in bp) from the 3'-end of the probe at which mismatches/indels are allowed

use_Y

Use the IUPAC code Y (C or T) to represent Cs in CpG sites? (TRUE/FALSE)

methylation_status

Whether CpG sites are assumed to be "methylated" or "unmethylated" (ignored if use_Y = TRUE)

verbose

Should function be verbose? (TRUE/FALSE)

cores

Number of cores to use (default = 1).
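
To illustrate why next_base is needed and how use_Y and methylation_status interact, here is a minimal forward-strand sketch of in-silico bisulfite conversion. The helper `bisulfite_convert_sketch` is hypothetical and is not the package's implementation; it assumes that non-CpG cytosines are read as T after conversion, and that CpG cytosines become Y when use_Y is TRUE, or C/T depending on methylation_status otherwise.

```r
# Hypothetical helper: forward-strand bisulfite conversion of a DNA string.
# Illustrative only; not taken from the package source.
bisulfite_convert_sketch <- function(sequence, next_base,
                                     use_Y = TRUE,
                                     methylation_status = "methylated") {
  bases <- strsplit(toupper(sequence), "")[[1]]
  # Base following each position; the last position relies on next_base
  following <- c(bases[-1], toupper(next_base))
  is_c <- bases == "C"
  is_cpg <- is_c & following == "G"

  # Non-CpG cytosines are assumed unmethylated and read as T
  bases[is_c & !is_cpg] <- "T"
  # CpG cytosines: Y if use_Y, otherwise fixed by methylation_status
  if (use_Y) {
    bases[is_cpg] <- "Y"
  } else if (methylation_status == "unmethylated") {
    bases[is_cpg] <- "T"
  } # "methylated": CpG cytosines stay C
  paste(bases, collapse = "")
}

# The final C is only a CpG site if the base after the sequence is G
bisulfite_convert_sketch("GGCCCC", next_base = "G")
bisulfite_convert_sketch("GGCCCC", next_base = "A")
bisulfite_convert_sketch("GGCCCC", next_base = "G", use_Y = FALSE)
```

By the same reasoning, prev_base plays this role for the reverse strand, where the base preceding the start of the sequence determines whether the first position is part of a CpG site.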

Value

A data frame with one row for each match and the following columns:

Probe

Probe ID

start, end, strand

start position, end position and strand of the match

width

width (in basepairs) of the match

sbe_site

base preceding the match (the single-base extension site)

mismatch_pos

position of the mismatch (bp from the 3'-end of the probe), NA if exact match

indel_pos

position of the INDEL (bp from the 3'-end of the probe), NA if exact match

width_incl_indel

width of match including INDEL, NA if exact match

sequence_bs

bisulfite-converted sequence

Type2

Type: II, I_Methylated or I_Unmethylated

channel

predicted color channel for type I probes
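
Because exact matches carry NA in both mismatch_pos and indel_pos, the result can be filtered with base R. The sketch below uses a toy data frame with a subset of the columns listed above; every value in it (including the probe names) is an invented placeholder, not output from the function.

```r
# Toy stand-in for the returned data frame; all values are invented placeholders
matches <- data.frame(
  Probe = c("probe_A", "probe_A", "probe_B", "probe_C"),
  width = c(15, 20, 15, 25),
  mismatch_pos = c(NA, NA, 9, NA),
  indel_pos = c(NA, NA, NA, NA),
  Type2 = c("II", "II", "I_Methylated", "I_Unmethylated"),
  stringsAsFactors = FALSE
)

# Exact matches have NA in both mismatch_pos and indel_pos
exact <- matches[is.na(matches$mismatch_pos) & is.na(matches$indel_pos), ]

# Longest exact match per probe
longest <- aggregate(width ~ Probe, data = exact, FUN = max)
longest
```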

Examples

library(DNAmCrosshyb)

# Map probes to the C9orf72 hexanucleotide repeat (10 copies of GGCCCC)
repeat_sequence <- paste(rep("GGCCCC", 10), collapse = "")
matches_c9 <- map_probes_sequence(
  sequence = repeat_sequence, next_base = "G", prev_base = "C",
  array = "450k", min_width = 10, max_width = 25, step_size = 1,
  allow_mismatch = TRUE, allow_indel = FALSE, min_distance = 6,
  use_Y = FALSE, methylation_status = "methylated"
)
# Show the first matches as a plain data frame (no %>% pipe needed)
head(data.frame(matches_c9))

pjhop/DNAmCrosshyb documentation built on June 23, 2021, 1:04 p.m.