df2pwm: Create a PWM from a dataframe with sequences

View source: R/pwm_utils.R

df2pwm R Documentation

Create a PWM from a dataframe with sequences

Description

Generate a positional weight matrix (PWM) from a set of sequences stored in a data.frame. Defaults are provided where possible; for example, the alphabet is guessed when it is not supplied.

Usage

df2pwm(
  data,
  ID_col,
  alphabet,
  bg_prob_letter,
  pseudocount_letter,
  long_format = FALSE
)

Arguments

data

A data.frame with at least two columns: one named Sequence, and an identifier column with any name of your choice, which is specified with ID_col.

ID_col

The name of the column in data that identifies each entry of the Sequence column.

alphabet

A character vector containing the alphabet letters present in Sequence. Guessed by default.

bg_prob_letter

Explain well in details.

pseudocount_letter

Explain well in details.

long_format

Logical. If TRUE reshape the PWM into a tidy long data.frame format. Default FALSE.

Details

Here I'll explain more things, ideally with formulas if LaTeX syntax was supported.
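
The formula used by df2pwm is not stated here. As a point of reference only, a common log-odds definition of a PWM is sketched below. It is an assumption, not something confirmed by the source, that pseudocount_letter and bg_prob_letter play the roles of s_k and b_k; the symbols n_{k,j}, N, s_k and b_k are introduced purely for illustration.

```latex
% Hypothetical notation (not taken from the package):
%   n_{k,j} : count of letter k at position j across all sequences
%   N       : number of sequences
%   s_k     : pseudocount added for letter k
%   b_k     : background probability of letter k
M_{k,j} = \log_2 \frac{\left( n_{k,j} + s_k \right) / \left( N + \sum_{k'} s_{k'} \right)}{b_k}
```

Under this definition, with N = 10 sequences, a four-letter alphabet, s_k = 0.25 and b_k = 0.25 for every letter, a letter seen 4 times at a position would score log2((4.25 / 11) / 0.25), which is approximately 0.63.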

Value

A data.frame containing the PWM, or a tidy long-format data.frame if long_format = TRUE.

Examples

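data <- data.frame(Species = c('sp1', 'sp2'), Sequence = c('acgt', 'acga'))  # illustrative toy input
df2pwm(data, ID_col = 'Species', alphabet = c('a', 'c', 'g', 't'), long_format = TRUE)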

Ni-Ar/niar documentation built on Feb. 3, 2025, 9:25 a.m.