segtoFreq | R Documentation |
This function calculates the frequency of deletions and duplications.
segtoFreq(
data,
cnv_column_idx = 6,
cohort_name = "unspecified cohort",
assembly = "hg38",
bin_size = 1e+06,
overlap = 1000,
soft_expansion = 0.1
)
data |
Segment data containing CNV states. The first four columns should represent sample ID, chromosome, start position, and end position, respectively. The fifth column can contain the number of markers or other relevant information. The column representing CNV states (with a column index of 6 or higher) should either contain "DUP" for duplications and "DEL" for deletions, or level-specific CNV states such as "EFO:0030072", "EFO:0030071", "EFO:0020073", and "EFO:0030068", which correspond to high-level duplication, low-level duplication, high-level deletion, and low-level deletion, respectively. |
cnv_column_idx |
Index of the column specifying the CNV state. Default is 6, based on the "pgxseg" format used in Progenetix.
If the input segment data follows the general |
cohort_name |
A string specifying the cohort name. Default is "unspecified cohort". |
assembly |
A string specifying the genome assembly version for CNV frequency calculation. Allowed options are "hg19" or "hg38". Default is "hg38". |
bin_size |
Size of genomic bins used to split the genome, in base pairs (bp). Default is 1,000,000. |
overlap |
Numeric value defining the minimum overlap, in base pairs (bp), between a bin and a segment for the segment to count as a CNV in that bin. Default is 1,000. |
soft_expansion |
Fraction of |
The binned CNV frequency, stored in "pgxfreq" format.
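To make the roles of bin_size and overlap more concrete, the sketch below bins a single chromosome and counts, per bin, the share of samples with a duplication or deletion. It is only an illustration under stated assumptions, not the pgxRpi implementation: the helper bin_cnv_frequency, its chrom_length argument, and all column names are hypothetical, overlap is assumed to act as a minimum-overlap threshold, and soft_expansion and level-specific CNV states are ignored.

```r
# Hypothetical helper, NOT part of pgxRpi: a base-R sketch of binned CNV
# frequencies for one chromosome. Column names are assumptions.
bin_cnv_frequency <- function(seg, chrom_length, bin_size = 1e6, overlap = 1000) {
  # Split the chromosome into consecutive bins of bin_size bp
  bin_start <- seq(0, chrom_length - 1, by = bin_size)
  bin_end <- pmin(bin_start + bin_size, chrom_length)
  n_samples <- length(unique(seg$sample_id))

  # Number of distinct samples whose segments of a given state
  # overlap each bin by at least `overlap` bp
  count_hits <- function(state) {
    sub <- seg[seg$cnv_state == state, ]
    vapply(seq_along(bin_start), function(i) {
      ov <- pmin(sub$end, bin_end[i]) - pmax(sub$start, bin_start[i])
      length(unique(sub$sample_id[ov >= overlap]))
    }, integer(1))
  }

  data.frame(
    start = bin_start,
    end = bin_end,
    gain_frequency = 100 * count_hits("DUP") / n_samples,
    loss_frequency = 100 * count_hits("DEL") / n_samples
  )
}

# Toy input: two samples on a 3 Mb chromosome
toy_seg <- data.frame(
  sample_id = c("s1", "s1", "s2"),
  chrom = c("1", "1", "1"),
  start = c(0, 2500000, 500000),
  end = c(1500000, 3000000, 1000500),
  markers = c(10, 5, 8),
  cnv_state = c("DUP", "DEL", "DUP"),
  stringsAsFactors = FALSE
)

toy_freq <- bin_cnv_frequency(toy_seg, chrom_length = 3e6)
print(toy_freq)
```

In this toy input, the duplication in sample s2 reaches only 500 bp into the second bin, which is below the 1,000 bp threshold, so it is counted in the first bin only.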
## load necessary data (this step can be skipped in real implementation)
data("hg38_cytoband")
## get pgxseg data
seg <- read.table(system.file("extdata", "example.pgxseg", package = "pgxRpi"), header = TRUE, sep = "\t")
## calculate frequency data
freq <- segtoFreq(seg)
## visualize
pgxFreqplot(freq)
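When no pgxseg file is at hand, a data frame following the column layout described for the data argument can be assembled by hand. The sketch below uses base R only; the column names and values are made up for illustration (the description above fixes only the column order), and the final check is a hypothetical sanity test, not something segtoFreq requires.

```r
# Illustrative segment data: columns 1-4 are sample ID, chromosome, start, end;
# column 5 holds the number of markers; column 6 holds the CNV state.
# All names and values here are assumptions made for this example.
seg_manual <- data.frame(
  sample_id = c("sampleA", "sampleA", "sampleB"),
  chromosome = c("1", "2", "1"),
  start = c(1000000, 5000000, 1200000),
  end = c(4000000, 9000000, 3500000),
  markers = c(120, 80, 95),
  cnv_state = c("DUP", "DEL", "DUP"),
  stringsAsFactors = FALSE
)

# CNV states listed in the description of the data argument
allowed_states <- c("DUP", "DEL",
                    "EFO:0030072", "EFO:0030071",
                    "EFO:0020073", "EFO:0030068")

# Hypothetical sanity check before handing the data to segtoFreq
cnv_column_idx <- 6
input_ok <- ncol(seg_manual) >= cnv_column_idx &&
  all(seg_manual[[cnv_column_idx]] %in% allowed_states) &&
  all(seg_manual$start < seg_manual$end)
print(input_ok)
```

A data frame of this shape could then be passed as data, with cnv_column_idx pointing at the column that holds the CNV states.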