View source: R/HLP_basic_SNP_annotation.R
basic_SNP_annotation
Adds annotation data to SNP IDs using biomaRt and/or a
manufacturer's annotation file while keeping the order of the input SNPs.
basic_SNP_annotation(
  data,
  max.SNPs.per.biomaRt.call = 10000,
  data.SNP.columnName = "SNP",
  snpmaRt = useMart("ENSEMBL_MART_SNP", host = "feb2014.archive.ensembl.org",
    dataset = "hsapiens_snp"),
  biomaRt.SNP.columnName = "refsnp_id",
  biomaRt.filter = "snp_filter",
  biomaRt.attributes.groupColumns = c("refsnp_id", "chr_name", "chrom_start"),
  biomaRt.attributes.summarized = c("ensembl_gene_stable_id", "ensembl_type"),
  annotationFile = NULL,
  lines2skip.start = 0,
  lines2skip.end = -1,
  annofile.SNP.columnName = "Name",
  annofile.columns = NULL
)
data | Data frame containing SNP IDs.
max.SNPs.per.biomaRt.call | Numeric. Number of SNP IDs to be queried in biomaRt at once.
data.SNP.columnName | Character with the column name of the SNP IDs in data.
snpmaRt | biomaRt object to be used for annotation. If NULL, biomaRt annotation is skipped.
biomaRt.SNP.columnName | Character with the attribute name for SNP IDs of the biomaRt object.
biomaRt.filter | Character with the filter name to be used in the biomaRt query.
biomaRt.attributes.groupColumns | Character vector with attribute names to be queried in biomaRt.
biomaRt.attributes.summarized | Character vector with further attribute names, which will be summarized according to the attributes in biomaRt.attributes.groupColumns.
annotationFile | Data frame, or character with the path to a file, containing annotation data from the assay manufacturer. If NULL, annotation is skipped.
lines2skip.start | Numeric with the number of rows to skip when loading annotationFile.
lines2skip.end | Numeric with the number of rows to read when loading annotationFile.
annofile.SNP.columnName | Character with the column name of the SNP IDs in annotationFile.
annofile.columns | Optional character vector with column names of annotationFile.
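The following sketch illustrates how lines2skip.start and lines2skip.end could behave when a manufacturer file has header lines before the table and trailer lines after it. It assumes, without confirmation from the source code, that the two arguments correspond to the skip and nrows arguments of base R's read.csv (the default of -1 matches nrows = -1, meaning "read all rows"). The file contents and column names are made up for illustration.

```r
# Hypothetical manufacturer-style file: two header lines, the table,
# and one trailer line that should not be read.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "[Heading]",
  "Descriptor,example",
  "Name,Chr,MapInfo",
  "rs1,1,100",
  "rs2,2,200",
  "rs3,3,300",
  "[Controls]"
), tmp)

# Assumption: lines2skip.start plays the role of `skip`,
# lines2skip.end plays the role of `nrows`.
anno <- read.csv(tmp, skip = 2, nrows = 3, stringsAsFactors = FALSE)
unlink(tmp)

print(anno)
```

With skip = 2 the column header line is the first line read, and nrows = 3 stops before the trailer line.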
This function uses the SNP ID column of a given data frame as input for adding annotation data.
All annotation data are added in additional columns, and the order of the input SNP IDs is not
changed. Since biomaRt queries of large datasets (e.g. from SNP arrays) are prone to service
malfunction, basic_SNP_annotation divides the data into chunks of a feasible size given in
max.SNPs.per.biomaRt.call.
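The chunking step can be sketched in base R as follows. This is an illustration of the idea only, not the function's actual implementation; the SNP IDs and the small chunk size are made up, and the biomaRt call is shown as a comment because it needs a live Ensembl connection.

```r
# Hypothetical SNP IDs; real input would be data[[data.SNP.columnName]].
snp_ids <- paste0("rs", 1:25)
max_per_call <- 10  # stands in for max.SNPs.per.biomaRt.call

# Assign each ID a chunk number (1,1,...,2,2,...) and split accordingly.
chunk_index <- ceiling(seq_along(snp_ids) / max_per_call)
chunks <- split(snp_ids, chunk_index)

# Each chunk would then be queried separately, for example (not run):
# results <- lapply(chunks, function(chunk) {
#   biomaRt::getBM(attributes = c("refsnp_id", "chr_name", "chrom_start"),
#                  filters = "snp_filter", values = chunk, mart = snpmaRt)
# })
# bm <- do.call(rbind, results)

print(lengths(chunks))
```

Because split() keeps the elements of each chunk in their original order, concatenating the chunks reproduces the input vector.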
Before biomaRt data are merged with the input data, data columns containing multiple entries per SNP
(given in biomaRt.attributes.summarized) are collapsed into a single entry, separated by semicolons.
Data columns given in biomaRt.attributes.groupColumns are considered grouping variables.
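A minimal sketch of this collapsing step, using base R's aggregate() on a made-up biomaRt-style result. The exact separator and de-duplication behaviour of basic_SNP_annotation are assumptions here (";" without a space, duplicates removed); the gene IDs are placeholders.

```r
# Hypothetical biomaRt result: rs1 maps to two genes, rs2 to one.
bm <- data.frame(
  refsnp_id = c("rs1", "rs1", "rs2"),
  chr_name = c("1", "1", "2"),
  chrom_start = c(100L, 100L, 200L),
  ensembl_gene_stable_id = c("ENSG_A", "ENSG_B", "ENSG_C"),
  stringsAsFactors = FALSE
)

group_cols <- c("refsnp_id", "chr_name", "chrom_start")  # grouping variables
summarized_cols <- "ensembl_gene_stable_id"              # columns to collapse

# One row per combination of grouping columns; multiple values are
# pasted together with a semicolon (assumed separator).
collapsed <- aggregate(
  bm[summarized_cols],
  by = bm[group_cols],
  FUN = function(x) paste(unique(x), collapse = ";")
)

print(collapsed)
```

Note that aggregate() drops rows with NA in a grouping column, so a real implementation might need to handle SNPs without a chromosome position separately.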
If an annotationFile is specified, all included data are merged into the input data frame.
The annotationFile may be supplied directly as a data frame or as a character string containing a
file path. In the latter case, the file is loaded automatically.
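One way to merge annotation columns without disturbing the input order is to look up row positions with match() instead of calling merge(), which may reorder rows. The sketch below shows that approach on made-up data; it is not taken from the function's source, and only the "Annofile_" prefix comes from the documented return value.

```r
# Hypothetical input and annotation data; "rs9" has no annotation.
input <- data.frame(SNP = c("rs3", "rs1", "rs9", "rs2"),
                    stringsAsFactors = FALSE)
anno <- data.frame(Name = c("rs1", "rs2", "rs3"),
                   Chr = c("1", "2", "3"),
                   stringsAsFactors = FALSE)

# Row of `anno` for each input SNP (NA when the SNP is not annotated).
idx <- match(input$SNP, anno$Name)

# Take the annotation columns in input order and prefix their names.
anno_cols <- setdiff(names(anno), "Name")
added <- anno[idx, anno_cols, drop = FALSE]
names(added) <- paste0("Annofile_", anno_cols)
rownames(added) <- NULL

annotated <- cbind(input, added)
print(annotated)
```

SNPs missing from the annotation data keep their position and receive NA in the added columns.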
Returns the input data frame annotated with biomaRt and/or manufacturer data in additional columns (prefixed with "SNPMart_" or "Annofile_", respectively). The order of entries within the data frame remains unchanged.
Author: Frank Ruehle