
MiMMAl

Using two-component Gaussian mixture modelling to estimate the major allele distribution in genotyping data.

Installation

Using R >= 3.3.2, MiMMAl depends on the following packages:

Once those packages have been installed, MiMMAl can be downloaded and installed in the following way.

install.packages("MiMMAl", repos = NULL, type="source")

Running

To run MiMMAl, produce a tab-separated text file with four columns: chromosome (chr), position (pos), raw BAF value (BAF) and the mean or median mirrored BAF of the segment to which the locus belongs (BAFseg). Include heterozygous SNPs only.

The minimum requirement for running MiMMAl (runMiMMAl) is to supply two arguments: samplename and inputfile. The output file is named by joining samplename and .BAFphased.txt, and it is written to the current working directory. The inputfile argument is the path and name of the input text file.
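As a rough illustration of the input format and the minimal call, the sketch below writes a tiny made-up input file and checks it for the four expected columns. The helper `check_mimmal_input` is hypothetical and not part of MiMMAl. The sketch also assumes the file carries a header row with the column names chr, pos, BAF and BAFseg. The `runMiMMAl` call is left commented out, and its argument names are taken from the description above.

```r
# Hypothetical helper (not part of MiMMAl): checks that a tab-separated
# file has the four columns described above and that BAF lies in [0, 1].
check_mimmal_input <- function(path) {
  required <- c("chr", "pos", "BAF", "BAFseg")
  dat <- read.table(path, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
  missing_cols <- setdiff(required, colnames(dat))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  if (any(dat$BAF < 0 | dat$BAF > 1, na.rm = TRUE)) {
    stop("BAF values must lie between 0 and 1")
  }
  invisible(dat)
}

# Tiny made-up example, for illustration only
example <- data.frame(
  chr = c(1, 1, 1),
  pos = c(1000, 2000, 3000),
  BAF = c(0.38, 0.61, 0.42),
  BAFseg = 0.60
)
input_path <- tempfile(fileext = ".txt")
write.table(example, input_path, sep = "\t", quote = FALSE, row.names = FALSE)

checked <- check_mimmal_input(input_path)

# With MiMMAl installed, the minimal call would then look something like:
# library(MiMMAl)
# runMiMMAl(samplename = "example", inputfile = input_path)
```

A real input file needs far more loci than this three-row example. The example is only meant to show the expected layout.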

By default, MiMMAl writes plots of the fits it produces to the current working directory. These include the results of the initial expectation-maximisation search, which establishes a range for sd, and the global and local searches of parameter space over the means and sd. Plotting can be disabled in the options for MiMMAl.

Testing

Next-generation-sequencing-style input data can be simulated using the following lines of code in R.

n = 100000

coverage = rpois(n, lambda = 120)

majcov = rbinom(n, size = coverage, prob = 0.6)

majoraf = majcov / coverage

baf = ifelse(runif(n) > 0.5, 1 - majoraf, majoraf)
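To turn simulated values like these into an input file, one possible approach is sketched below. It repeats the simulation with a smaller n so that it runs on its own. It treats all loci as a single segment on chromosome 1 and takes the mirrored BAF to be abs(BAF - 0.5) + 0.5. The single segment, the mirroring formula, the header row and the file name test_input.txt are all assumptions made for illustration.

```r
set.seed(1)  # only so the sketch is reproducible

# Simulate BAF values as above, with a smaller n for speed
n <- 1000
coverage <- rpois(n, lambda = 120)
majcov <- rbinom(n, size = coverage, prob = 0.6)
majoraf <- majcov / coverage
baf <- ifelse(runif(n) > 0.5, 1 - majoraf, majoraf)

# Assumption: the mirrored BAF folds every value around 0.5
mirrored <- abs(baf - 0.5) + 0.5

# Assumption: all loci belong to one segment on chromosome 1,
# so BAFseg is the mean mirrored BAF over all loci
test_data <- data.frame(
  chr = 1,
  pos = seq_len(n),
  BAF = baf,
  BAFseg = mean(mirrored)
)

# Write the tab-separated input file (here into a temporary directory)
test_file <- file.path(tempdir(), "test_input.txt")
write.table(test_data, test_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

The resulting file could then be passed to runMiMMAl as inputfile, together with a samplename of your choice.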

Additional parameters

There are some additional parameters that can be set in runMiMMAl as required:



georgecresswell/MiMMAl documentation built on Oct. 25, 2020, 2:40 p.m.