This function reads alignments/SNP data in several formats and calculates some summary data.
Argument | Description |
object | object of class |
path | the base path (folder) of the alignments |
outgroup | vector of outgroup sequences |
include.unknown | whether positions with unknown nucleotides should be considered |
populations | list of populations. default: |
gffpath | the base path (folder) of the corresponding GFF files. default: |
format | data format. See Details. |
parallized | parallel processing to accelerate the reading process. See Details. |
progress_bar_switch | progress_bar |
FAST | fast computation. See Details. |
big.data | use the ff package |
SNP.DATA | important for reference positions; should be TRUE if you use SNP data in alignment format |
All data (alignments or SNP files) must be stored in one folder, and that folder is the input to this
function. If no GFF files are specified (these must also be stored in one folder), the alignments are
expected to be in the correct reading frame (starting at a first codon position).
Otherwise synonymous and non-synonymous positions are not identified correctly.
Note:
The GFF files must have EXACTLY the same names (ignoring extensions such as .fas or .gff)
as the files storing the nucleotide data, to ensure correct matching.
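The naming rule above can be checked before calling readData. The following is a minimal base-R sketch, not part of PopGenome: the helper name check_matching_names and the toy folder and file names are hypothetical, and it assumes that only the last extension of each file name is ignored when matching.

```r
# Hypothetical helper (not a PopGenome function): list files whose base name,
# with the last extension stripped, has no partner in the other folder.
check_matching_names <- function(path, gffpath) {
  seq_names <- tools::file_path_sans_ext(list.files(path))
  gff_names <- tools::file_path_sans_ext(list.files(gffpath))
  list(
    missing_gff = setdiff(seq_names, gff_names),  # sequence files without a GFF
    missing_seq = setdiff(gff_names, seq_names)   # GFF files without sequence data
  )
}

# Toy folders in a temporary directory, created for illustration only.
root    <- tempfile("readData_demo_")
aln_dir <- file.path(root, "Alignments")
gff_dir <- file.path(root, "Alignments_GFF")
dir.create(aln_dir, recursive = TRUE)
dir.create(gff_dir, recursive = TRUE)
file.create(file.path(aln_dir, c("gene1.fas", "gene2.fas")))
file.create(file.path(gff_dir, c("gene1.gff", "gene3.gff")))

result <- check_matching_names(aln_dir, gff_dir)
print(result)
```

Both elements being empty suggests that every sequence file has a GFF file of the same name; in the toy folders, gene2 and gene3 are reported as unmatched.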
format:
"fasta", "nexus", "phylip", "MAF", "MEGA", "HapMap", "VCF", "RData"
Valid nucleotides are T,t,U,u,G,g,A,a,C,c,N,n,-
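As a rough pre-check, a sequence can be screened against the symbols listed above. This is a base-R sketch under assumptions: the helper invalid_symbols is hypothetical, and how readData itself treats other symbols is not stated here.

```r
# Symbols listed as valid in this help page.
valid_symbols <- c("T", "t", "U", "u", "G", "g", "A", "a", "C", "c", "N", "n", "-")

# Hypothetical helper (not a PopGenome function): return the distinct
# characters of a sequence string that are not in the valid set.
invalid_symbols <- function(sequence) {
  chars <- strsplit(sequence, "", fixed = TRUE)[[1]]
  unique(chars[!chars %in% valid_symbols])
}

ok  <- invalid_symbols("ATGCN-atgcnUu")  # only listed symbols
bad <- invalid_symbols("ATGRYC?A")       # contains R, Y and ?
print(ok)
print(bad)
```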
parallized:
- will speed up calculations if you use a very large number of alignments
FAST:
- will not classify synonymous/non-synonymous SNPs directly
- fast computation (via compiled C code) of the biallelic matrix, biallelic sites, transversions/transitions
and biallelic substitutions
- can be switched to TRUE for SNP data without loss of information
big.data:
- uses the ff package
- the ff mechanism is used for the biallelic.matrix and GFF/GTF information
- is automatically activated for readVCF or readSNP
- Note: set this to TRUE if you read big chunks of data
and later want to concatenate them in the PopGenome framework
(for example, sliding windows over the whole dataset)
SNP.DATA:
- should be switched to TRUE if you use SNP data in alignment format
- the corresponding SNP positions can be set via set.ref.positions
The function creates an object of class "GENOME"
The following slots will be filled in the "GENOME" object:
No. | Slot | Description |
1. | n.sites | total number of sites |
2. | n.biallelic.sites | number of biallelic sites |
3. | n.gaps | number of sites with gaps |
4. | n.unknowns | number of sites with unknown nucleotides |
5. | n.valid.sites | number of valid sites |
6. | n.polyallelic.sites | number of sites with >2 nucleotides |
7. | trans.transv.ratio | transition/transversion ratio of biallelic sites |
8. | region.names | names of regions |
9. | region.data | some detailed information about the data read |
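To give a feel for what these counts describe, the sketch below computes similar quantities for a toy alignment in base R. It is an illustration only, not PopGenome's implementation: the alignment is invented, and the definitions used here (for example, excluding sites with gaps or N from the biallelic and polyallelic counts) are assumptions. n.valid.sites is left out because its exact definition is not given above.

```r
# Toy alignment: rows are sequences, columns are sites (invented data).
aln <- rbind(
  seq1 = c("A", "C", "G", "T", "A", "-", "N", "A"),
  seq2 = c("A", "T", "G", "C", "G", "A", "A", "C"),
  seq3 = c("A", "C", "G", "A", "A", "A", "A", "A")
)

has_gap     <- apply(aln, 2, function(col) any(col == "-"))
has_unknown <- apply(aln, 2, function(col) any(col == "N"))
clean       <- !has_gap & !has_unknown            # assumption: skip gap/N sites
n_alleles   <- apply(aln, 2, function(col) length(unique(col)))
biallelic   <- clean & n_alleles == 2
polyallelic <- clean & n_alleles > 2

# A transition is a purine/purine (A,G) or pyrimidine/pyrimidine (C,T) change.
is_transition <- function(col) {
  alleles <- unique(col)
  setequal(alleles, c("A", "G")) || setequal(alleles, c("C", "T"))
}
ts_flags <- vapply(which(biallelic),
                   function(j) is_transition(aln[, j]), logical(1))

toy_summary <- list(
  n.sites             = ncol(aln),
  n.biallelic.sites   = sum(biallelic),
  n.gaps              = sum(has_gap),
  n.unknowns          = sum(has_unknown),
  n.polyallelic.sites = sum(polyallelic),
  trans.transv.ratio  = sum(ts_flags) / sum(!ts_flags)
)
str(toy_summary)
```

In this toy case two of the three biallelic sites are transitions and one is a transversion, so the ratio is 2.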
# GENOME.class <- readData("...\Alignments", FAST=TRUE)
# GENOME.class <- readData("VCF", format="VCF")
# Note, "Alignments" and "VCF" are folders !
# GENOME.class@region.names
# GENOME.class <- readData("...\Alignments", big.data=TRUE)
# object.size(GENOME.class)
# GENOME.class <- readData("...\Alignments",gffpath="...\Alignments_GFF")
# GENOME.class
# show the result:
# get.sum.data(GENOME.class)
# GENOME.class@region.data