load_1KG: Load pre-formatted 1000 Genomes Project exon data


View source: R/SNVdata_methods.R

Description

Load pre-formatted 1000 Genomes Project exon data

Usage

load_1KG(chrom, pathway_df = NULL)

Arguments

chrom

Numeric. The chromosome number(s) of the 1000 Genomes Project exon data to load, given as a numeric vector (for example, 21:22).

pathway_df

Data frame. (Optional) A data frame that contains the positions for each exon in a pathway of interest. This data frame must contain the variables chrom, exonStart, and exonEnd. See Details.
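As a minimal sketch of the expected layout, the following builds a small pathway_df with the three required variables. The coordinates are hypothetical values chosen for illustration and do not correspond to real exon annotations; the final call is commented out because it assumes the SimRVSequences package is installed and loaded.

```r
# Hypothetical exon coordinates, for illustration only (not real annotations).
pathway_df <- data.frame(
  chrom     = c(21, 21, 22),
  exonStart = c(1000, 5000, 2000),
  exonEnd   = c(1500, 5600, 2400)
)

# Check that the required variables are present and that every exon
# starts at or before the position where it ends.
required <- c("chrom", "exonStart", "exonEnd")
stopifnot(all(required %in% names(pathway_df)))
stopifnot(all(pathway_df$exonStart <= pathway_df$exonEnd))

# With SimRVSequences loaded, the data frame would be passed as:
# exdata <- load_1KG(chrom = 21:22, pathway_df = pathway_df)
```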

Details

The load_1KG function loads pre-formatted, exon-only SNV data from any of the 22 human autosomes. The original data were obtained from:

http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000_genomes_project/release/.

The data were reduced to remove related individuals. To accomplish this, we randomly sampled one relative from each set of related individuals, which resulted in the removal of 22 individuals. Additional information regarding the formatting of the 1000 Genomes Project data may be found at https://github.com/simrvprojects/1000-Genomes-Exon-Data/ in the PDF file entitled "Documentation for Creating Exon Data_090319.pdf".
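The sampling step can be sketched in base R as follows. This is an illustration only, not the procedure used to prepare the data: it assumes that the sampled relative is the one retained from each set, and the data frame, its column names (id, family), and its values are all hypothetical.

```r
# Hypothetical sample table: individuals sharing a 'family' label are related.
samples <- data.frame(
  id     = c("S1", "S2", "S3", "S4", "S5"),
  family = c("F1", "F1", "F2", "F3", "F3"),
  stringsAsFactors = FALSE
)

set.seed(1)  # make the random draw reproducible

# Draw one individual at random from each set of relatives.
# Indexing with sample.int() avoids the sample() pitfall for length-one input.
keep_ids <- vapply(
  split(samples$id, samples$family),
  function(ids) ids[sample.int(length(ids), 1)],
  character(1)
)

# Assumption: the sampled individual is kept and the other relatives are dropped.
unrelated <- samples[samples$id %in% keep_ids, ]
```

After this step, unrelated holds exactly one row per set of relatives.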

We expect that pathway_df does not contain any overlapping segments. Users may combine overlapping exons into a single observation with the combine_exons function.
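Before calling load_1KG, a data frame can be screened for overlapping segments with a short base R check such as the one below. The helper name has_overlap is hypothetical and not part of the package, and the sketch assumes that segments sharing an endpoint count as overlapping.

```r
# Hypothetical helper (not part of SimRVSequences): returns TRUE if any two
# exons on the same chromosome overlap.
has_overlap <- function(df) {
  # Sort by chromosome, then by start position.
  df <- df[order(df$chrom, df$exonStart), ]
  n <- nrow(df)
  if (n < 2) return(FALSE)
  # After sorting, an overlap exists if and only if some exon starts at or
  # before the end of the exon immediately preceding it on the same chromosome.
  same_chrom <- df$chrom[-1] == df$chrom[-n]
  starts_within_previous <- df$exonStart[-1] <= df$exonEnd[-n]
  any(same_chrom & starts_within_previous)
}

# Hypothetical coordinates, for illustration only.
separate <- data.frame(chrom = c(1, 1, 2),
                       exonStart = c(100, 300, 100),
                       exonEnd   = c(200, 400, 250))
overlapping <- data.frame(chrom = c(1, 1),
                          exonStart = c(100, 150),
                          exonEnd   = c(200, 300))

has_overlap(separate)     # FALSE
has_overlap(overlapping)  # TRUE
```

If the check returns TRUE, the overlapping exons can be merged with combine_exons before the data frame is supplied as pathway_df.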

Value

An object of class SNVdata containing the imported exon data.

References

1000 Genomes Project (2010). A Map of Human Genome Variation from Population-Scale Sequencing. Nature; 467:1061-1073.

See Also

combine_exons

Examples

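# Load the exon data for chromosomes 21 and 22
exdata = load_1KG(21:22)

# Confirm which chromosomes are present in the imported data
unique(exdata$Mutations$chrom)

# Inspect the Mutations, Haplotypes, and Samples components of the SNVdata object
head(exdata$Mutations)
exdata$Haplotypes[1:20, 1:10]
head(exdata$Samples)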

simrvprojects/SimRVSequences documentation built on March 12, 2020, 1:33 a.m.