pValue_CherryFreqsChange_i: Compute p-Values for Methylation Frequency Changes in...

View source: R/summaryStatistics.R

pValue_CherryFreqsChange_i    R Documentation

Compute p-Values for Methylation Frequency Changes in Cherries

Description

Calculates p-values for changes in methylation frequency between pairs of cherry tips in a phylogenetic tree. A cherry is a pair of leaf nodes (also called tips or terminal nodes) in a phylogenetic tree that share a direct common ancestor.
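
To make the definition of a cherry concrete, the base R sketch below finds the cherries of a small tree from a hand-written edge matrix. It assumes the edge layout used by ape phylo objects (tips numbered 1 to n, one parent/child row per branch). The variable names, and the tip-to-tip distance taken as the sum of the two branch lengths, are illustrative assumptions and not the package's internal code.

```r
# Tree ((d:1,e:1):2,a:2); written by hand as an edge matrix.
# Assumed layout (as in ape 'phylo' objects): tips are 1..n_tips,
# node 4 is the root and node 5 is the parent of d and e.
tip_labels  <- c("d", "e", "a")
n_tips      <- length(tip_labels)
edge        <- rbind(c(4, 5), c(5, 1), c(5, 2), c(4, 3))  # parent, child
edge_length <- c(2, 1, 1, 2)                              # one per edge row

# Group child nodes by their parent node.
children <- split(edge[, 2], edge[, 1])

# A cherry is an internal node whose two children are both tips.
is_cherry <- vapply(children,
                    function(ch) length(ch) == 2 && all(ch <= n_tips),
                    logical(1))

cherries <- do.call(rbind, lapply(names(children)[is_cherry], function(p) {
  ch <- children[[p]]
  data.frame(
    first_tip  = tip_labels[ch[1]],
    second_tip = tip_labels[ch[2]],
    # Tip-to-tip distance: sum of the two branches below the shared parent.
    dist       = sum(edge_length[edge[, 2] %in% ch])
  )
}))
print(cherries)
```

In this tree only d and e form a cherry; a attaches directly to the root and has no sibling tip.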

Usage

pValue_CherryFreqsChange_i(
  data,
  categorized_data = FALSE,
  index_islands,
  tree,
  input_control = TRUE
)

Arguments

data

A list containing the methylation states at the tree tips for each genomic structure (e.g., island/non-island). The data should be structured as data[[tip]][[structure]], where each structure has the same number of sites across tips. The input data must be prefiltered so that CpG sites are represented consistently across tips. Each element contains the methylation states at the sites of a given tip and structure, represented as 0, 0.5 or 1 (for unmethylated, partially methylated and methylated). If the methylation states are not represented as 0, 0.5 and 1, they are categorized as 0 when the value is at most 0.2, as 0.5 when the value is between 0.2 and 0.8, and as 1 when the value is over 0.8. For customized categorization thresholds, use categorize_siteMethSt.
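
The default thresholds described above can be sketched in a few lines of R. The helper name categorize_default is hypothetical and is not part of the package; it is not the categorize_siteMethSt function. Assigning a value of exactly 0.8 to the 0.5 category is an assumption based on the wording "over 0.8".

```r
# Hypothetical helper mirroring the default thresholds described above.
# It is NOT the package's categorize_siteMethSt function.
categorize_default <- function(x) {
  ifelse(x <= 0.2, 0,            # at most 0.2          -> unmethylated
         ifelse(x <= 0.8, 0.5,   # above 0.2, up to 0.8 -> partially methylated
                1))              # over 0.8             -> methylated
}

# Illustrative raw methylation values for five sites.
raw <- c(0.05, 0.2, 0.35, 0.8, 0.93)

# 0.05 and 0.2 map to 0; 0.35 and 0.8 map to 0.5; 0.93 maps to 1.
categorize_default(raw)
```

Data that already contains only 0, 0.5 and 1 passes through such a step unchanged, which is why categorized_data = TRUE can skip it.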

categorized_data

Logical; defaults to FALSE. Set to TRUE to skip the redundant categorization step when the methylation states are already represented as 0, 0.5 and 1.

index_islands

A numeric vector specifying the indices of islands to analyze.

tree

A rooted binary tree with at least 2 tips, given either in Newick format (character string) or as an ape phylo object.

input_control

Logical; if TRUE, validates input.

Details

The function calls chisq.test with simulate.p.value = TRUE, so each p-value is computed by Monte Carlo simulation. This keeps the p-values reliable even when the expected frequencies do not meet the assumptions of the chi-squared test (i.e., expected counts of at least 5 in each category).
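
As a rough illustration of the test described above, the sketch below runs chisq.test with simulate.p.value = TRUE on the state counts of two tips at a single island. How the package builds its contingency table is not documented here, so the table layout, the dropping of empty state columns and the choice of B = 2000 replicates (the chisq.test default) are assumptions of this example.

```r
set.seed(1)  # Monte Carlo p-values vary between runs without a seed

# Categorized states at one island for the two tips of a cherry
# (illustrative values, 10 sites each).
tip1 <- c(rep(1, 9), 0)
tip2 <- c(rep(0, 9), 0.5)

# Count sites per state; fixed factor levels keep the columns aligned.
states <- c(0, 0.5, 1)
counts <- rbind(
  tip1 = table(factor(tip1, levels = states)),
  tip2 = table(factor(tip2, levels = states))
)

# Assumption: states absent from both tips are dropped, since a column
# of zeros gives zero expected counts and an undefined statistic.
counts <- counts[, colSums(counts) > 0, drop = FALSE]

# Expected counts here are below 5 in some cells, so the p-value is
# simulated instead of taken from the chi-squared distribution.
res <- chisq.test(counts, simulate.p.value = TRUE, B = 2000)
res$p.value
```

With B replicates the smallest p-value that can be reported is 1 / (B + 1), so increasing B gives finer resolution at the cost of run time.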

Value

A data frame containing tip pair information (first tip name, second tip name, first tip index, second tip index, distance) and one column per island with the p-values from the chi-squared tests.

Examples

# Example with hypothetical tree and data structure

tree <- "((d:1,e:1):2,a:2);"
data <- list(
  #Tip 1
  list(c(rep(1,9), rep(0,1)), 
       c(rep(0,9), 1), 
       c(rep(0,9), rep(0.5,1))), 
  #Tip 2
  list(c(rep(0,9), rep(0.5,1)), 
       c(rep(0.5,9), 1), 
       c(rep(1,9), rep(0,1))), 
  #Tip 3
  list(c(rep(1,9), rep(0.5,1)), 
       c(rep(0.5,9), 1), 
       c(rep(0,9), rep(0.5,1)))) 

index_islands <- c(1,3)

pValue_CherryFreqsChange_i(data, categorized_data = TRUE, index_islands = index_islands, tree = tree)


MethEvolSIM documentation built on April 12, 2025, 1:30 a.m.