generate.data: Generate Simulated Dataset


View source: R/utils.R

Description

Generate simulated data based on real data and the results of a previous analysis.

Usage

generate.data(data, group, pos.range = c(1, 10), 
    num.seq = 100, gap = 35, split.gap = 1000, min.len = 2)
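
As a minimal sketch of inputs in the shape described under Arguments, the following builds a toy data.frame and a logical group vector. The column names, probe count, spacing and positive ranges are made up for illustration, and it is an assumption that group holds one entry per probe; none of this comes from the package. The call to generate.data is left commented out because it requires tileHMM to be installed and its behaviour on such a small toy input has not been checked here.

```r
# Toy inputs in the documented layout (all values are hypothetical).
set.seed(42)
n.probes <- 200

toy <- data.frame(
  chromosome = rep("chr1", n.probes),                    # column 1: chromosome
  position   = seq(1, by = 35, length.out = n.probes),   # column 2: position
  sample1    = rnorm(n.probes),                          # columns 3+: measurements
  sample2    = rnorm(n.probes),
  stringsAsFactors = FALSE
)

# Logical vector marking 'positive' probes (assumed: one entry per probe).
group <- rep(FALSE, n.probes)
group[c(21:40, 101:130)] <- TRUE

# Not run: requires the tileHMM package.
# sim <- generate.data(toy, group, pos.range = c(1, 10), num.seq = 100)
```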

Arguments

data

A data.frame with information about genomic coordinates of probes (chromosome and position) in the first two columns. Subsequent columns contain probe measurements of individual samples.

group

Information that can be used to assign probes to one of two classes. Either a logical vector or the name of a GFF file. In the latter case, all probes in annotated regions are considered to be ‘positive’.

pos.range

Indicates how many positive regions should be generated for each observation sequence. The actual number for each sequence is sampled uniformly from the indicated range of values.

num.seq

Number of observation sequences to generate.

gap

Gap between probes. Used to generate artificial probe coordinates.

split.gap

Gap between sequences.

min.len

Minimum number of probes per region.
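
The following base R sketch illustrates one way to read pos.range, gap and split.gap. It is not the tileHMM implementation. The helper make.coords, the fixed sequence lengths, and the interpretation of split.gap as the distance from the last probe of one sequence to the first probe of the next are all assumptions made for illustration.

```r
# Illustrative sketch only; names and interpretation are assumptions.
set.seed(1)

pos.range <- c(1, 10)
num.seq   <- 100
gap       <- 35
split.gap <- 1000

# Number of positive regions per sequence, sampled uniformly from the
# integers pos.range[1], ..., pos.range[2].
n.pos <- pos.range[1] +
  sample.int(diff(pos.range) + 1, num.seq, replace = TRUE) - 1

# Hypothetical number of probes in each of three sequences.
seq.len <- c(50, 50, 50)

# Artificial coordinates: probes are `gap` apart within a sequence and
# consecutive sequences are separated by `split.gap` (assumed reading).
make.coords <- function(seq.len, gap, split.gap) {
  pos <- vector("list", length(seq.len))
  start <- 1
  for (i in seq_along(seq.len)) {
    pos[[i]] <- start + gap * (seq_len(seq.len[i]) - 1)
    start <- max(pos[[i]]) + split.gap
  }
  pos
}

coords <- make.coords(seq.len, gap, split.gap)
```

With these settings, probes within a sequence are 35 units apart, while the jump between sequences is 1000 units, so a sequence boundary can be recognised from the coordinates alone.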

Value

A list with components

observation

A data.frame with the same format as data.

regions

A list of state sequences.

Author(s)

Peter Humburg


humburg/tileHMM documentation built on May 17, 2019, 9:13 p.m.