read.big.fasta: Reading large FASTA alignments


read.big.fasta    R Documentation

Reading large FASTA alignments

Description

This function reads FASTA alignments that are too large to fit into memory by splitting them into chunks.

Usage


read.big.fasta(filename,populations=FALSE,outgroup=FALSE,window=2000,
               SNP.DATA=FALSE,include.unknown=FALSE,
               parallized=FALSE,FAST=FALSE,big.data=TRUE)

Arguments

filename

the path to the FASTA alignment file

outgroup

vector of outgroup sequences

populations

list of populations

window

chunk size: number of columns/nucleotide sites

SNP.DATA

should be switched to TRUE if you use SNP data in alignment format

include.unknown

include unknown positions in the biallelic.matrix

parallized

use parallel computation to speed up reading (works only on UNIX systems)

FAST

fast computation; see readData()

big.data

use the ff-package

Details

The algorithm reads the data for each individual and stores the information
on disk. The data can be analyzed as regions of the defined window size, or the
regions can be concatenated within the PopGenome framework via the function
concatenate.regions. Use this function only when the FASTA file does not fit
into RAM; otherwise, use the function readData.
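
As a rough illustration of how an alignment breaks down into regions of the defined window size, the following base R sketch computes the first and last column of each chunk. The helper chunk_bounds is hypothetical and not part of PopGenome; whether read.big.fasta places its region boundaries exactly this way is an assumption.

```r
# Hypothetical helper (not part of PopGenome): compute the first and last
# alignment column of each chunk for an alignment with n.sites columns.
chunk_bounds <- function(n.sites, window = 2000) {
  starts <- seq(1, n.sites, by = window)
  # The last chunk is truncated at the end of the alignment.
  ends <- pmin(starts + window - 1, n.sites)
  data.frame(start = starts, end = ends)
}

# An alignment of 5000 sites with the default window of 2000 columns
bounds <- chunk_bounds(n.sites = 5000, window = 2000)
print(bounds)
```

With these inputs the sketch yields three chunks, the last one shorter than the window. A smaller window lowers the amount of data held in memory per chunk at the cost of producing more regions.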

Value

The function creates an object of class "GENOME".

---------------------------------------------------------
The following slots will be filled in the "GENOME" object
---------------------------------------------------------

     Slot                Description
1.   n.sites             total number of sites
2.   n.biallelic.sites   number of biallelic sites
3.   region.names        names of regions
4.   region.data         some detailed information about the data
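
To make the first two slots concrete, the base R sketch below counts total and biallelic sites in a tiny made-up alignment. It treats a site as biallelic when its column holds exactly two distinct nucleotides. The toy matrix and variable names are assumptions for illustration, and PopGenome's handling of gaps and unknown positions (see include.unknown) is not reproduced here.

```r
# Toy alignment: rows are individuals, columns are nucleotide sites.
# The data are invented for illustration only.
aln <- rbind(
  ind1 = c("A", "C", "G", "T", "A"),
  ind2 = c("A", "C", "G", "A", "A"),
  ind3 = c("A", "T", "G", "C", "A")
)

# Number of distinct nucleotides observed at each site
n.alleles <- apply(aln, 2, function(site) length(unique(site)))

n.sites <- ncol(aln)                      # total number of sites
n.biallelic.sites <- sum(n.alleles == 2)  # sites with exactly two states

print(c(n.sites = n.sites, n.biallelic.sites = n.biallelic.sites))
```

In this toy alignment only the second column is biallelic; the fourth column carries three states and is therefore not counted.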

Examples


# GENOME.class <- read.big.fasta("Alignment.fas", big.data=TRUE)
# GENOME.class
# GENOME.class@region.names
# CON <- concatenate.regions(GENOME.class)
# CON@region.data@biallelic.sites
# GENOME.class.slide <- sliding.window.transform(GENOME.class,100,100)
# GENOME.class <- neutrality.stats(GENOME.class,FAST=TRUE)
# show the result:
# get.sum.data(GENOME.class)
# GENOME.class@region.data


pievos101/PopGenome documentation built on Feb. 24, 2023, 7:11 a.m.