View source: R/combine-vcf-with-pgs.R
combine.vcf.with.pgs | R Documentation |
Match PGS SNPs to corresponding VCF information by genomic coordinates or rsID using a merge operation.
combine.vcf.with.pgs(vcf.data, pgs.weight.data)
vcf.data | A data.frame containing VCF data. Required columns:
pgs.weight.data | A data.frame containing PGS data. Required columns:
A list containing a data.frame of merged VCF and PGS data and a data.frame of PGS SNPs missing from the VCF.
A primary merge is first performed on chromosome and base pair coordinates. For SNPs that could not be matched in the first merge, a second merge is attempted by rsID, if available. This second pass can account for short INDELs, whose coordinates can differ between the PGS and VCF data. The merge is a left outer join: all PGS SNPs are kept as rows even if they are missing from the VCF, and all VCF SNPs that are not a component of the PGS are dropped. If no PGS SNPs are present in the VCF, the function terminates with an error.
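The two-stage left join described above can be sketched in base R on toy data. This is an illustration only, not the package's implementation: the column names (CHROM, POS, ID, REF, beta), the helper column vcf.row, and the toy values are all assumptions made for the example.

```r
# Toy PGS weights and VCF records (all column names and values are assumed)
pgs <- data.frame(
    CHROM = c('1', '1', '2'),
    POS = c(100, 200, 300),
    ID = c('rs1', 'rs2', 'rs3'),
    beta = c(0.1, 0.2, 0.3),
    stringsAsFactors = FALSE
    );
vcf <- data.frame(
    CHROM = c('1', '1', '3'),
    POS = c(100, 201, 500),
    ID = c('rs1', 'rs2', 'rs9'),
    REF = c('A', 'AT', 'G'),
    stringsAsFactors = FALSE
    );

# Hypothetical marker column: it is NA after a merge when no VCF row matched
vcf$vcf.row <- seq_len(nrow(vcf));

# Stage 1: left outer join on coordinates (all.x = TRUE keeps every PGS row)
stage1 <- merge(
    x = pgs,
    y = vcf[, c('CHROM', 'POS', 'REF', 'vcf.row')],
    by = c('CHROM', 'POS'),
    all.x = TRUE
    );
is.missing <- is.na(stage1$vcf.row);

# Stage 2: retry the unmatched PGS rows with a left outer join on rsID
retry <- stage1[is.missing, names(pgs)];
stage2 <- merge(
    x = retry,
    y = vcf[, c('ID', 'REF', 'vcf.row')],
    by = 'ID',
    all.x = TRUE
    );

# Stack both stages; PGS SNPs still lacking a VCF row are reported as missing
merged <- rbind(stage1[!is.missing, ], stage2[, names(stage1)]);
missing.snps <- merged[is.na(merged$vcf.row), names(pgs)];

# Mirror the documented behaviour: fail when nothing from the PGS was found
if (all(is.na(merged$vcf.row))) {
    stop('No PGS SNPs are present in the VCF');
    }
```

In this toy case rs1 matches by coordinates, rs2 is recovered by rsID despite the one-base position offset, and rs3 ends up in missing.snps while still being kept as a row of merged.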
# Example VCF
vcf.path <- system.file(
    'extdata',
    'HG001_GIAB.vcf.gz',
    package = 'ApplyPolygenicScore',
    mustWork = TRUE
    );
vcf.import <- import.vcf(vcf.path);

# Example pgs weight file
pgs.weight.path <- system.file(
    'extdata',
    'PGS000662_hmPOS_GRCh38.txt.gz',
    package = 'ApplyPolygenicScore',
    mustWork = TRUE
    );
pgs.import <- import.pgs.weight.file(pgs.weight.path);

merge.data <- combine.vcf.with.pgs(
    vcf.data = vcf.import$dat,
    pgs.weight.data = pgs.import$pgs.weight.data
    );