| readData | R Documentation | 
This function reads alignments/SNP data in several formats and calculates some summary data.
readData(path,populations=FALSE,outgroup=FALSE,include.unknown=FALSE,
         gffpath=FALSE,format="fasta",parallized=FALSE,
         progress_bar_switch=TRUE, FAST=FALSE,big.data=FALSE,
         SNP.DATA=FALSE
        )
## S4 method for signature 'GENOME'
get.sum.data(object)
| object | object of class "GENOME" | 
| path | the basepath (folder) of the alignments | 
| outgroup | vector of outgroup sequences | 
| include.unknown | if positions with unknown nucleotides should be considered. | 
| populations | list of populations. default: FALSE | 
| gffpath | the basepath (folder) of the corresponding GFF-files. default: FALSE | 
| format | data format of the input files. default: "fasta". See details! | 
| parallized | parallel processing to accelerate the reading process. See details! | 
| progress_bar_switch | whether a progress bar is shown. default: TRUE | 
| FAST | fast computation. See details! | 
| big.data | use the ff-package | 
| SNP.DATA | important for reference positions; should be TRUE if you use SNP-data in alignment format | 
All data (alignments or SNP files) must be stored in one folder. That folder is the input of this function. 
If no GFF files (which must also be stored in a folder) are specified, each alignment is expected to be in 
the correct reading frame (starting at a first codon position). 
Otherwise, synonymous and non-synonymous positions are not identified correctly. 
 
Note: 
 
The GFF-files must have EXACTLY the same names (without any extensions like .fas or .gff) 
as the files storing the nucleotide data to ensure correct matching. 
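 
The naming rule above can be checked before calling readData. The following is a minimal base-R sketch, not part of PopGenome: the helper check_gff_names is hypothetical, the folders and file names are made up, and it assumes that names are compared after stripping the final extension, which is only one reading of the note.

```r
# Hypothetical helper (not part of PopGenome): compare base names, with the
# final extension removed, between the alignment folder and the GFF folder.
check_gff_names <- function(path, gffpath) {
  aln <- tools::file_path_sans_ext(list.files(path))
  gff <- tools::file_path_sans_ext(list.files(gffpath))
  list(missing_gff = setdiff(aln, gff),        # alignments without a GFF file
       missing_alignment = setdiff(gff, aln))  # GFF files without an alignment
}

# Toy folders with empty files, created in a temporary location for the demo.
aln_dir <- tempfile("Alignments_")
gff_dir <- tempfile("Alignments_GFF_")
dir.create(aln_dir)
dir.create(gff_dir)
invisible(file.create(file.path(aln_dir, c("gene1.fas", "gene2.fas"))))
invisible(file.create(file.path(gff_dir, "gene1.gff")))

name_check <- check_gff_names(aln_dir, gff_dir)
print(name_check)
```

In this toy setup, gene2 is reported under missing_gff because no GFF file with that base name exists.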
 
format: 
"fasta", "nexus", "phylip", 
"MAF", "MEGA", 
"HapMap", "VCF", 
"RData" 
 
Valid nucleotides are T,t,U,u,G,g,A,a,C,c,N,n,- 
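 
To find characters outside this set before reading the data, a small base-R check along the following lines may help. It is a sketch only: invalid_nucleotides is a hypothetical helper, not a PopGenome function, and how readData itself treats other characters is not described here.

```r
# Characters listed as valid in this documentation.
valid_chars <- c("T", "t", "U", "u", "G", "g", "A", "a", "C", "c", "N", "n", "-")

# Hypothetical helper (not part of PopGenome): return the distinct characters
# of a sequence string that are not in the valid set.
invalid_nucleotides <- function(sequence, valid = valid_chars) {
  chars <- strsplit(sequence, "", fixed = TRUE)[[1]]
  unique(chars[!chars %in% valid])
}

clean_result <- invalid_nucleotides("ACGTN-acgtu")  # only valid characters
dirty_result <- invalid_nucleotides("ACGTRY?")      # contains R, Y and ?
print(clean_result)
print(dirty_result)
```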
 
parallized: 
 
- will speed up calculations if you use a very large number of alignments 
 
FAST:   
 
- does not classify synonymous/non-synonymous SNPs directly 
- fast computation (via compiled C code) of the biallelic matrix, biallelic sites, transitions/transversions 
and biallelic substitutions 
- can be switched to TRUE for SNP data without loss of information 
 
big.data: 
 
- uses the ff-package 
 
- the ff mechanism is used for biallelic.matrix and GFF/GTF information 
 
- is automatically activated for readVCF or readSNP 
- Note! You should set this to TRUE if you read big chunks of data 
and want to concatenate them later in the PopGenome framework 
(for example: sliding windows of the whole dataset).
 
SNP.DATA: 
 
- should be switched to TRUE if you use SNP-data in alignment format. 
 
- the corresponding SNP positions can be set via set.ref.positions
The function creates an object of class "GENOME" 
 
——————————————————— 
The following slots will be filled in the "GENOME" object 
——————————————————— 
| | Slot | Description |
| 1. | n.sites | total number of sites | 
| 2. | n.biallelic.sites | number of biallelic sites | 
| 3. | n.gaps | number of sites with gaps | 
| 4. | n.unknowns | number of sites with unknown nucleotides | 
| 5. | n.valid.sites | number of valid sites | 
| 6. | n.polyallelic.sites | number of sites with >2 nucleotides | 
| 7. | trans.transv.ratio | transition/transversion ratio of biallelic sites | 
| 8. | region.names | names of regions | 
| 9. | region.data | some detailed information about the data read | 
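 
To illustrate what the count slots describe, the following base-R sketch derives similar numbers from a toy alignment matrix. It is not PopGenome's implementation: the data are made up, and the sketch assumes that a site is valid when it contains neither a gap nor an unknown nucleotide, and that biallelic and polyallelic sites are counted among valid sites only. PopGenome's exact definitions may differ.

```r
# Toy alignment (made-up data): rows are sequences, columns are sites.
aln <- rbind(
  seq1 = c("A", "C", "G", "T", "A", "-", "N", "A"),
  seq2 = c("A", "T", "G", "A", "C", "C", "C", "A"),
  seq3 = c("G", "C", "G", "T", "G", "C", "C", "A")
)

# Per-site flags. Assumption: valid means no gap and no unknown nucleotide.
has_gap     <- apply(aln, 2, function(site) any(site == "-"))
has_unknown <- apply(aln, 2, function(site) any(site == "N"))
valid       <- !has_gap & !has_unknown
n_alleles   <- apply(aln, 2, function(site) length(unique(site)))
biallelic   <- valid & n_alleles == 2
polyallelic <- valid & n_alleles > 2

# A biallelic site is a transition if its two alleles are A/G or C/T;
# every other allele pair is a transversion.
is_transition <- function(site) {
  paste(sort(unique(site)), collapse = "") %in% c("AG", "CT")
}
transitions <- vapply(which(biallelic),
                      function(j) is_transition(aln[, j]), logical(1))

toy_summary <- c(
  n.sites             = ncol(aln),
  n.biallelic.sites   = sum(biallelic),
  n.gaps              = sum(has_gap),
  n.unknowns          = sum(has_unknown),
  n.valid.sites       = sum(valid),
  n.polyallelic.sites = sum(polyallelic),
  trans.transv.ratio  = sum(transitions) / sum(!transitions)
)
print(toy_summary)
```

The names of toy_summary mirror the slot names in the table above for readability only; the vector is unrelated to the "GENOME" class.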
# Note: "Alignments" and "VCF" are folders! Forward slashes are used in the
# paths because a single backslash is not valid inside an R string.
# GENOME.class <- readData(".../Alignments", FAST=TRUE)
# GENOME.class <- readData("VCF", format="VCF")
# GENOME.class@region.names
# GENOME.class <- readData(".../Alignments", big.data=TRUE)
# object.size(GENOME.class)
# GENOME.class <- readData(".../Alignments", gffpath=".../Alignments_GFF")
# GENOME.class
# show the result:
# get.sum.data(GENOME.class)
# GENOME.class@region.data