karlinMonteCarlo: Monte Carlo - Karlin [p-value]

karlinMonteCarloR Documentation

Monte Carlo - Karlin [p-value]

Description

Estimates p-value, for integer scores, based on a Monte Carlo estimation of Gumble parameters from simulations of smaller sequences with same distribution. Appropriate for great sequences with length > 10^3, for i.i.d and markovian sequence models.

Usage

karlinMonteCarlo(
  local_score,
  sequence_length,
  simulated_sequence_length,
  FUN,
  ...,
  numSim = 1000,
  plot = TRUE
)

Arguments

local_score

local score observed in a segment.

sequence_length

length of the sequence

simulated_sequence_length

length of simulated sequences produced by FUN

FUN

function to simulate similar sequences with.

...

parameters for FUN

numSim

number of sequences to create for estimation

plot

boolean value if to display plots for cumulated function and density

Details

The length of the simulated sequences is an argument specific to the function provided for simulation. Thus, it has to be provided also in the parameter simulated_sequence_length in the arguments of the "Monte Carlo - Karlin" function. It is a crucial detail as it influences precision and computation time of the result. Note that to get an appropriate estimation, the given average score must be non-positive.

Value

Floating value corresponding to the probability to obtain a local score with a value greater or equal to the parameter local_score

Examples

new = sample(-7:6, replace = TRUE, size = 1000) 
#MonteCarlo taking random sample from the input sequence itself

karlinMonteCarlo(local_score = 66, sequence_length = 1000,  
               FUN = function(x, simulated_sequence_length) {return(sample(x = x, 
               size = simulated_sequence_length, replace = TRUE))}, 
               x=new, simulated_sequence_length = 1000,  numSim = 1000)

               

localScore documentation built on Nov. 3, 2023, 1:08 a.m.