segment: Segment the log2 ratios for a set of individuals

segment R Documentation

Description

This function is a wrapper around the function Rob_seg.std of the package robseg. It segments all the log2 ratio profiles in the data.frame provided as input. The wrapper exposes only a narrow subset of the parameters available in Rob_seg.std. Notably, the loss function used here is the one that is robust to outliers, because the cost function is bounded at a certain threshold. More information can be found on the robseg package's GitHub page and in the research paper describing the method (see the "Source" section).

Usage

segment(data_matrix, info_indices, chrom_col, pos_col, threshold_param,
  lambda_param = 2, verbose = TRUE)

Arguments

data_matrix

a data.frame containing optional metadata columns and one or more columns of log2 ratios, each named after the individual it corresponds to.

info_indices

an integer vector. The indices of the metadata columns, which are excluded from segmentation. Typically, this will be 1:3, as the first three columns are the chromosome and the lower and upper bounds of the genomic bins.
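
As an illustration of how info_indices separates metadata from log2 ratio columns, the sketch below builds a small data.frame and lists the columns that would be left for segmentation. The column names (chrom, start, end, ind1, ind2) and the simulated values are hypothetical and only serve the example; this is not the package's own code.

```r
# Hypothetical toy input: three metadata columns followed by two individuals.
set.seed(1)
toy <- data.frame(
  chrom = rep(c("chr1", "chr2"), each = 5),
  start = rep(seq(1, 4001, by = 1000), times = 2),
  end   = rep(seq(1000, 5000, by = 1000), times = 2),
  ind1  = rnorm(10),
  ind2  = rnorm(10)
)

# Metadata columns, as they would be passed to info_indices.
info_indices <- 1:3

# Every column not listed in info_indices holds log2 ratios for one individual.
sample_cols <- setdiff(seq_len(ncol(toy)), info_indices)
names(toy)[sample_cols]
```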

chrom_col

a character vector of length one. The name of the column containing the names of the chromosomes.

pos_col

a character vector of length one. The name of a column giving the genomic position of the bin (either lower or upper). This argument is only used to check that the bins are in increasing order in the data.frame, as this is needed for segmentation. It therefore does not matter whether this position is the lower or upper breakpoint of the bin.
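
The ordering requirement can be verified before calling the function. The following is a minimal sketch of such a check, assuming that "increasing order" means non-decreasing positions within each chromosome. The helper bins_are_ordered and the example data are hypothetical and do not reproduce the check performed inside the package.

```r
# Hypothetical helper, not part of the package: returns TRUE if the positions
# are non-decreasing within every chromosome.
bins_are_ordered <- function(data, chrom_col, pos_col) {
  unsorted <- tapply(data[[pos_col]], data[[chrom_col]], is.unsorted)
  !any(unsorted)
}

bins <- data.frame(
  chrom = c("chr1", "chr1", "chr1", "chr2", "chr2"),
  start = c(1, 1001, 2001, 1, 1001)
)
bins_are_ordered(bins, "chrom", "start")      # TRUE

# Swapping the first two rows breaks the order on chr1.
shuffled <- bins[c(2, 1, 3, 4, 5), ]
bins_are_ordered(shuffled, "chrom", "start")  # FALSE
```

Passing strictly = TRUE to is.unsorted would additionally reject repeated positions.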

threshold_param

a single numeric value. The multiplier that is applied to the standard deviation estimate to get the threshold that bounds the increase of the cost function. Higher values make the segmentation more sensitive to outliers, whereas lower values make it more robust to outliers.
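
The effect of the threshold can be illustrated with a small sketch. It assumes that the bounded cost takes the biweight form described in the paper cited in the "Source" section, that is, a squared residual capped at the square of the threshold. The function bounded_cost and its argument sd_est are hypothetical names used for illustration, not robseg's actual implementation.

```r
# Sketch of a bounded (biweight-style) cost; not robseg's actual code.
bounded_cost <- function(residual, threshold_param, sd_est = 1) {
  k <- threshold_param * sd_est  # threshold on the scale of the residuals
  pmin(residual^2, k^2)          # quadratic below k, constant above it
}

residuals <- c(0.5, 2, 10)

# With a higher multiplier, the outlying residual (10) still costs much more
# than the others, so it has more influence on the segmentation.
bounded_cost(residuals, threshold_param = 3)  # 0.25 4.00 9.00

# With a lower multiplier, the cost saturates early and the outlier
# contributes no more than a moderate residual.
bounded_cost(residuals, threshold_param = 1)  # 0.25 1.00 1.00
```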

lambda_param

a single numeric value. The penalty value used by the segmentation. Defaults to 2.

verbose

a single logical value (TRUE or FALSE). Whether the progress of the segmentation should be printed. Defaults to TRUE.

Value

To be completed.

Source

Link to robseg's GitHub page: https://github.com/guillemr/robust-fpop

Original description of the segmentation approach: Fearnhead, P., & Rigaill, G. (2018). Changepoint Detection in the Presence of Outliers. Journal of the American Statistical Association. DOI: 10.1080/01621459.2017.1385466

Examples

NULL
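
The package does not ship an example for this function. The following is a hedged sketch of a call based only on the usage and arguments documented above. The toy data, the column names and the value chosen for threshold_param are hypothetical, and the sketch assumes that segment is exported by delgbs and that both delgbs and robseg are installed. The return value is not described here, so the result is only inspected with str.

```r
# Hypothetical toy input; column names and values are illustrative only.
set.seed(42)
n_bins <- 200
toy <- data.frame(
  chrom = rep(c("chr1", "chr2"), each = n_bins),
  start = rep(seq(1, by = 1000, length.out = n_bins), times = 2),
  end   = rep(seq(1000, by = 1000, length.out = n_bins), times = 2),
  # ind1 carries a simulated shift in the second half of chr2.
  ind1  = c(rnorm(n_bins), rnorm(n_bins / 2), rnorm(n_bins / 2, mean = -1)),
  ind2  = rnorm(2 * n_bins)
)

# Run only if both packages are installed; assumes segment() is exported.
if (requireNamespace("delgbs", quietly = TRUE) &&
    requireNamespace("robseg", quietly = TRUE)) {
  res <- delgbs::segment(
    data_matrix     = toy,
    info_indices    = 1:3,
    chrom_col       = "chrom",
    pos_col         = "start",
    threshold_param = 3,  # illustrative value, not a recommendation
    lambda_param    = 2,  # documented default
    verbose         = FALSE
  )
  str(res)  # inspect the result; its structure is not documented above
}
```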

malemay/delgbs documentation built on Feb. 1, 2024, 8:38 a.m.