read.big.fasta: Reading large FASTA alignments
In PopGenome: An Efficient Swiss Army Knife for Population Genomic Analyses

This function reads FASTA alignments that are too large to fit into memory by splitting them into chunks.

read.big.fasta(filename,populations=FALSE,outgroup=FALSE,window=2000,
               SNP.DATA=FALSE,include.unknown=FALSE,
               parallized=FALSE,FAST=FALSE,big.data=TRUE)

`filename`	the base path of the FASTA alignment
`populations`	list of populations
`outgroup`	vector of outgroup sequences
`window`	chunk size: number of columns (nucleotide sites) per chunk
`SNP.DATA`	set to TRUE if the alignment contains SNP data in alignment format
`include.unknown`	include unknown positions in the biallelic.matrix
`parallized`	use parallel computation to speed up reading; works only on UNIX systems
`FAST`	fast computation; see readData()
`big.data`	use the ff package

The algorithm reads the data for each individual and stores the information
on disk. The data can then be analyzed as regions of the defined window size,
or the regions can be concatenated within the PopGenome framework via the
function concatenate.regions. Use this function only when the FASTA file does
not fit into RAM; otherwise, use the function readData.
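
To make the role of `window` concrete, here is a minimal base-R sketch of how a chunk size partitions the columns of an alignment into regions. The helper `chunk_bounds` is hypothetical and not part of PopGenome, and the handling of the final, shorter chunk is an assumption rather than a description of what read.big.fasta does internally.

```r
# Hypothetical helper (not part of PopGenome): compute the first and last
# alignment column of each chunk for a given window size.
chunk_bounds <- function(n_sites, window = 2000) {
  stopifnot(n_sites >= 1, window >= 1)
  starts <- seq(1, n_sites, by = window)
  # Assumption: the last chunk is truncated at the end of the alignment.
  ends <- pmin(starts + window - 1, n_sites)
  data.frame(chunk = seq_along(starts), start = starts, end = ends)
}

# An alignment of 4500 sites read with the default window of 2000
bounds <- chunk_bounds(n_sites = 4500, window = 2000)
print(bounds)
```

Under these assumptions, 4500 sites with a window of 2000 give three regions covering columns 1 to 2000, 2001 to 4000, and 4001 to 4500.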

The function creates an object of class "GENOME".

———————————————————
The following slots will be filled in the "GENOME" object
———————————————————

	Slot	Description
1.	`n.sites`	total number of sites
2.	`n.biallelic.sites`	number of biallelic sites
3.	`region.names`	names of regions
4.	`region.data`	some detailed information about the data

# GENOME.class <- read.big.fasta("Alignment.fas", big.data=TRUE)
# GENOME.class
# GENOME.class@region.names
# CON <- concatenate.regions(GENOME.class)
# CON@region.data@biallelic.sites
# GENOME.class.slide <- sliding.window.transform(GENOME.class,100,100)
# GENOME.class <- neutrality.stats(GENOME.class,FAST=TRUE)
# show the result:
# get.sum.data(GENOME.class)
# GENOME.class@region.data