readData: Read alignments and calculate summary data
In pievos101/PopGenome: An Efficient Swiss Army Knife for Population Genomic Analyses

readData

R Documentation

Read alignments and calculate summary data

Description

This function reads alignment or SNP data in one of several formats and calculates summary data.

Usage


readData(path,populations=FALSE,outgroup=FALSE,include.unknown=FALSE,
         gffpath=FALSE,format="fasta",parallized=FALSE,
         progress_bar_switch=TRUE, FAST=FALSE,big.data=FALSE,
         SNP.DATA=FALSE
        )

## S4 method for signature 'GENOME'
get.sum.data(object)

Arguments

`object`	object of class `"GENOME"`
`path`	the base path (folder) containing the alignments
`outgroup`	vector of outgroup sequences. default: `FALSE`
`include.unknown`	whether positions with unknown nucleotides should be considered. default: `FALSE`
`populations`	list of populations. default: `FALSE`
`gffpath`	the base path (folder) containing the corresponding GFF files. default: `FALSE`
`format`	data format. default: `"fasta"`. See Details.
`parallized`	whether to use parallel processing to accelerate the reading process. default: `FALSE`. See Details.
`progress_bar_switch`	whether to show a progress bar. default: `TRUE`
`FAST`	fast computation. default: `FALSE`. See Details.
`big.data`	whether to use the ff package. default: `FALSE`. See Details.
`SNP.DATA`	important for reference positions; should be `TRUE` if you use SNP data in alignment format. default: `FALSE`

Details

All data (alignments or SNP files) have to be stored in one folder. The path to this folder is the
input of this function. GFF files, if used, also have to be stored in a folder (see gffpath).
If no GFF folder is specified, the alignment is expected to be in the correct reading frame
(starting at a first codon position); otherwise synonymous and non-synonymous positions are
not identified correctly.

Note:
The GFF files must have EXACTLY the same names as the files storing the nucleotide data,
ignoring extensions such as .fas or .gff, to ensure correct matching.
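The naming rule in the note can be checked before calling readData. The following is a minimal sketch in base R; `check_matching_names` is a hypothetical helper, not part of PopGenome, and it assumes that "same name" means the file name with its last extension removed.

```r
# Hypothetical helper (not part of PopGenome): list base names that occur in
# one folder but not in the other, after stripping the file extension.
check_matching_names <- function(alignment_dir, gff_dir) {
  strip_ext <- function(dir) tools::file_path_sans_ext(list.files(dir))
  aln <- strip_ext(alignment_dir)
  gff <- strip_ext(gff_dir)
  list(
    alignments_without_gff = setdiff(aln, gff),
    gff_without_alignment  = setdiff(gff, aln)
  )
}

# Demo on freshly created temporary folders with empty placeholder files.
base_dir <- tempfile("popgenome_demo_")
aln_dir  <- file.path(base_dir, "Alignments")
gff_dir  <- file.path(base_dir, "Alignments_GFF")
dir.create(aln_dir, recursive = TRUE)
dir.create(gff_dir, recursive = TRUE)
invisible(file.create(file.path(aln_dir, c("gene1.fas", "gene2.fas"))))
invisible(file.create(file.path(gff_dir, c("gene1.gff", "gene3.gff"))))

mismatches <- check_matching_names(aln_dir, gff_dir)
print(mismatches)
```

Both elements of the returned list should be empty before the two folders are passed as path and gffpath.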

format:
"fasta", "nexus", "phylip",
"MAF", "MEGA",
"HapMap", "VCF",
"RData"

Valid nucleotides are T, t, U, u, G, g, A, a, C, c, N, n, -
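To screen sequences against the list of valid nucleotides before reading them, a small base R check along these lines may help. `find_invalid_symbols` is a hypothetical helper introduced for illustration only; how readData itself treats other symbols is not specified here.

```r
# Symbols listed as valid in this help page.
valid_symbols <- c("T", "t", "U", "u", "G", "g", "A", "a",
                   "C", "c", "N", "n", "-")

# Hypothetical helper (not part of PopGenome): return the distinct symbols
# of a sequence string that are not in the valid set.
find_invalid_symbols <- function(sequence) {
  symbols <- strsplit(sequence, "")[[1]]
  unique(symbols[!symbols %in% valid_symbols])
}

ok_result  <- find_invalid_symbols("ACGTN-acgu")  # character(0)
bad_result <- find_invalid_symbols("ACGRYT")      # "R" "Y"
print(ok_result)
print(bad_result)
```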

parallized:
- will speed up the reading process if you use a very large number of alignments

FAST:
- will not classify synonymous/non-synonymous SNPs directly
- fast computation (via compiled C code) of biallelic matrix, biallelic sites, transversions/transitions
and biallelic substitutions
- can be switched to TRUE in case of SNP data without loss of information

big.data:
- uses the ff package
- the ff mechanism is used for the biallelic.matrix and the GFF/GTF information
- is automatically activated for readVCF and readSNP
- Note: set this to TRUE if you read large chunks of data
and later want to concatenate them in the PopGenome framework
(for example: sliding windows over the whole dataset).

SNP.DATA:
- should be switched to TRUE if you use SNP-data in alignment format.
- the corresponding SNP positions can be set via set.ref.positions

Value

The function creates an object of class "GENOME".

The following slots will be filled in the "GENOME" object:

	Slot	Description
1.	`n.sites`	total number of sites
2.	`n.biallelic.sites`	number of biallelic sites
3.	`n.gaps`	number of sites with gaps
4.	`n.unknowns`	number of sites with unknown nucleotides
5.	`n.valid.sites`	number of valid sites
6.	`n.polyallelic.sites`	number of sites with >2 nucleotides
7.	`trans.transv.ratio`	transition/transversion ratio of biallelic sites
8.	`region.names`	names of regions
9.	`region.data`	some detailed information about the data read

Examples


# GENOME.class <- readData(".../Alignments", FAST=TRUE)
# GENOME.class <- readData("VCF", format="VCF")
# Note: "Alignments" and "VCF" are folders, not files!
# GENOME.class@region.names
# GENOME.class <- readData(".../Alignments", big.data=TRUE)
# object.size(GENOME.class)
# GENOME.class <- readData(".../Alignments", gffpath=".../Alignments_GFF")
# GENOME.class
# show the result:
# get.sum.data(GENOME.class)
# GENOME.class@region.data

pievos101/PopGenome documentation built on May 23, 2024, 7:31 p.m.
