get_siteFChange_cherry: Compute Site Frequency of Methylation Changes per Cherry

View source: R/summaryStatistics.R

get_siteFChange_cherry    R Documentation

Compute Site Frequency of Methylation Changes per Cherry

Description

This function calculates the total frequency of methylation differences (full and half changes combined) for each genomic structure in each cherry of a phylogenetic tree. A cherry is a pair of leaf nodes (also called tips or terminal nodes) that share a direct common ancestor: two leaves form a cherry if both are connected to the same internal node and no other leaves are connected to that node.
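To make the cherry definition concrete, the following minimal sketch finds the cherries of a four-tip tree in base R. The edge matrix is written by hand in the layout used by ape's phylo objects (tips numbered 1 to n, internal nodes numbered after them). All variable names are illustrative, and this is not the package's internal implementation.

```r
# Tree ((a:1.5,b:1.5):2,(c:2,d:2):1.5); written by hand as a parent/child
# edge matrix. Tips are 1..4, node 5 is the root, 6 and 7 are internal nodes.
tip_labels  <- c("a", "b", "c", "d")
edge        <- rbind(c(5, 6), c(6, 1), c(6, 2), c(5, 7), c(7, 3), c(7, 4))
edge_length <- c(2, 1.5, 1.5, 1.5, 2, 2)
n_tips      <- length(tip_labels)

# Group child nodes by their parent node.
children <- split(edge[, 2], edge[, 1])

# A cherry is an internal node whose only children are exactly two tips.
is_cherry <- vapply(children,
                    function(ch) length(ch) == 2 && all(ch <= n_tips),
                    logical(1))

# Tip names of each cherry, joined with a hyphen.
cherry_names <- vapply(children[is_cherry],
                       function(ch) paste(tip_labels[ch], collapse = "-"),
                       character(1))

# Distance between the two tips: sum of their two terminal branch lengths.
cherry_dist <- vapply(children[is_cherry],
                      function(ch) sum(edge_length[edge[, 2] %in% ch]),
                      numeric(1))

print(data.frame(tip_names = unname(cherry_names),
                 dist      = unname(cherry_dist)))
```

For this tree the sketch reports two cherries, a-b with distance 3 and c-d with distance 4.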

Usage

get_siteFChange_cherry(tree, data, categorized_data = FALSE)

Arguments

tree

A phylogenetic tree in Newick format or a phylo object from the ape package. The function checks that the tree has a valid structure and at least two tips.

data

A list containing methylation states at the tree tips for each genomic structure (e.g., island/non-island), structured as data[[tip]][[structure]]. Each structure must have the same number of sites across tips, and the input must be prefiltered so that the same CpG sites are represented at every tip. Each element holds the methylation states of the sites in a given tip and structure, coded as 0, 0.5 or 1 (unmethylated, partially methylated and methylated, respectively). Values not coded as 0, 0.5 or 1 are categorized as 0 when equal to or below 0.2, as 0.5 when between 0.2 and 0.8, and as 1 when above 0.8. For customized categorization thresholds, use categorize_siteMethSt.
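The default thresholds described for this argument can be sketched in base R as follows. The helper name categorize_default is hypothetical and not part of MethEvolSIM, and the treatment of a value exactly equal to 0.8 (mapped to 0.5 here) is an assumption inferred from the wording "over 0.8".

```r
# Hypothetical helper, not a MethEvolSIM function: applies the default
# thresholds stated in the argument description to a numeric vector.
categorize_default <- function(x) {
  out <- rep(0.5, length(x))  # default: partially methylated
  out[x <= 0.2] <- 0          # equal to or below 0.2: unmethylated
  out[x > 0.8]  <- 1          # above 0.8: methylated
  out
}

raw_states  <- c(0.05, 0.2, 0.5, 0.8, 0.95)
categorized <- categorize_default(raw_states)
print(categorized)  # 0, 0, 0.5, 0.5, 1
```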

categorized_data

Logical; defaults to FALSE. Set to TRUE to skip the redundant categorization step when the methylation states are already represented as 0, 0.5 and 1.
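Before setting categorized_data = TRUE, it may be worth confirming that the input really contains only the three coded states and that each structure has the same number of sites at every tip. The checks below are a minimal base-R sketch with illustrative variable names; they are not the validation the package itself performs.

```r
# Example input: data[[tip]][[structure]], 4 tips x 3 structures.
data <- list(
  list(rep(1, 10), rep(0, 5), rep(1, 8)),
  list(rep(1, 10), rep(0.5, 5), rep(0, 8)),
  list(rep(1, 10), rep(0.5, 5), rep(0, 8)),
  list(c(rep(0, 5), rep(0.5, 5)), c(0, 0, 1, 1, 1), c(0.5, 1, rep(0, 6)))
)

# Are all values already coded as 0, 0.5 or 1?
already_categorized <- all(unlist(data) %in% c(0, 0.5, 1))

# Does every tip have the same number of structures?
same_n_structures <- length(unique(lengths(data))) == 1

# Site counts as a matrix: rows = structures, columns = tips.
site_counts <- do.call(cbind, lapply(data, lengths))

# Does each structure have the same number of sites at every tip?
consistent_sites <- all(apply(site_counts, 1,
                              function(r) length(unique(r)) == 1))

print(c(already_categorized = already_categorized,
        same_n_structures   = same_n_structures,
        consistent_sites    = consistent_sites))
```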

Details

The function first verifies that tree and data have valid structures and the minimum number of tips. It then extracts per-cherry methylation differences using freqSites_cherryMethDiff, handling potential errors. Finally, it aggregates the full and half methylation differences for each genomic structure at each cherry.
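The aggregation step can be illustrated for a single cherry with a small base-R sketch. It assumes that "frequency" means the proportion of sites in a structure showing a change, with a half change being an absolute difference of 0.5 and a full change an absolute difference of 1. The helper total_change_freq is hypothetical, and the actual output of freqSites_cherryMethDiff may be organized differently.

```r
# Hypothetical helper (not a MethEvolSIM function): proportion of sites with
# a half change (|difference| = 0.5) plus the proportion with a full change
# (|difference| = 1) between the two tips of a cherry.
total_change_freq <- function(x, y) {
  d    <- abs(x - y)
  half <- mean(d == 0.5)
  full <- mean(d == 1)
  half + full
}

# Methylation states of two tips forming a cherry, one vector per structure.
tip_c <- list(rep(1, 10), rep(0.5, 5), rep(0, 8))
tip_d <- list(c(rep(0, 5), rep(0.5, 5)), c(0, 0, 1, 1, 1), c(0.5, 1, rep(0, 6)))

# One total frequency per structure.
freqs <- mapply(total_change_freq, tip_c, tip_d)
print(freqs)  # 1, 1 and 0.25 for the three structures
```

Under this assumption, the total is the share of sites whose state differs between the two tips, regardless of whether the difference is a half or a full change.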

Value

A data frame with one row per cherry, containing the following columns:

tip_names

A character string representing the names of the two tips in the cherry, concatenated with a hyphen.

tip_indices

A character string representing the indices of the two tips in the cherry, concatenated with a hyphen.

dist

A numeric value representing the distance between the two cherry tips, i.e. the sum of the branch lengths connecting them.

One column for each structure, named with the structure number

A numeric value representing the total frequency of methylation changes (both full and half) for the given structure.

Examples

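# Example data setup: data[[tip]][[structure]] with 4 tips and 3 structures
# (10, 5 and 8 sites per structure)
data <- list(
  list(rep(1, 10), rep(0, 5), rep(1, 8)),
  list(rep(1, 10), rep(0.5, 5), rep(0, 8)),
  list(rep(1, 10), rep(0.5, 5), rep(0, 8)),
  list(c(rep(0, 5), rep(0.5, 5)), c(0, 0, 1, 1, 1), c(0.5, 1, rep(0, 6)))
)

# Newick tree with two cherries: (a, b) and (c, d)
tree <- "((a:1.5,b:1.5):2,(c:2,d:2):1.5);"

# All states are already coded as 0, 0.5 or 1, so categorization is skipped
get_siteFChange_cherry(tree, data, categorized_data = TRUE)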


MethEvolSIM documentation built on April 12, 2025, 1:30 a.m.