hlaPredict | R Documentation |
To predict HLA type based on a HIBAG model (in parallel).
hlaPredict(object, snp, cl=FALSE,
type=c("response+dosage", "response", "prob", "response+prob"),
vote=c("prob", "majority"), allele.check=TRUE,
match.type=c("Position", "Pos+Allele", "RefSNP+Position", "RefSNP"),
same.strand=FALSE, verbose=TRUE, verbose.match=TRUE)
## S3 method for class 'hlaAttrBagClass'
predict(object, snp, cl=FALSE,
type=c("response+dosage", "response", "prob", "response+prob"),
vote=c("prob", "majority"), allele.check=TRUE,
match.type=c("Position", "Pos+Allele", "RefSNP+Position", "RefSNP"),
same.strand=FALSE, verbose=TRUE, verbose.match=TRUE, ...)
object |
a model of |
snp |
a genotypic object of |
cl |
|
type |
"response+dosage": return the best-guess types and dosages for each allele (by default); "response": return the best-guess types with its posterior probability; "prob": return a matrix for all posterior probabilities; "response+prob": return the best-guess, dosages and all posterior probabilities |
vote |
|
allele.check |
if |
match.type |
|
same.strand |
|
verbose |
if TRUE, show information |
verbose.match |
if TRUE, show missing SNP proportions for different
|
... |
unused |
If more than 50% of SNP predictors are missing, a warning will be given.
When match.type="RefSNP+Position"
, the matching of SNPs requires
both SNP IDs and positions. A lower missing fraction maybe gained by
matching SNP IDs or positions only. Call
hlaPredict(..., match.type="RefSNP")
or
hlaPredict(..., match.type="Position")
for this purpose.
It could be safe to assume that the SNPs with the same positions on the same
genome reference (e.g., hg19) are the same variant albeit the different SNP
IDs. Any concern about SNP mismatching should be emailed to the genotyping
platform provider.
Return a hlaAlleleClass
object with posterior probabilities
of predicted HLA types, or a matrix of pairwise possible HLA types with all
posterior probabilities. If type = "response+prob"
, return a
hlaAlleleClass
object with a matrix of postprob
for the
probabilities of all pairs of alleles.
If a probability matrix is returned, colnames
is sample.id
and rownames
is an unordered pair of HLA alleles.
Xiuwen Zheng
hlaAttrBagging
, hlaAllele
,
hlaCompareAllele
, hlaParallelAttrBagging
,
hlaSetKernelTarget
, hlaAlleleToVCF
# make a "hlaAlleleClass" object
hla.id <- "A"
hla <- hlaAllele(HLA_Type_Table$sample.id,
H1 = HLA_Type_Table[, paste(hla.id, ".1", sep="")],
H2 = HLA_Type_Table[, paste(hla.id, ".2", sep="")],
locus=hla.id, assembly="hg19")
# divide HLA types randomly
set.seed(100)
hlatab <- hlaSplitAllele(hla, train.prop=0.5)
names(hlatab)
# "training" "validation"
summary(hlatab$training)
summary(hlatab$validation)
# SNP predictors within the flanking region on each side
region <- 500 # kb
snpid <- hlaFlankingSNP(HapMap_CEU_Geno$snp.id, HapMap_CEU_Geno$snp.position,
hla.id, region*1000, assembly="hg19")
length(snpid) # 275
# training and validation genotypes
train.geno <- hlaGenoSubset(HapMap_CEU_Geno,
snp.sel=match(snpid, HapMap_CEU_Geno$snp.id),
samp.sel=match(hlatab$training$value$sample.id,
HapMap_CEU_Geno$sample.id))
test.geno <- hlaGenoSubset(HapMap_CEU_Geno,
samp.sel=match(hlatab$validation$value$sample.id,
HapMap_CEU_Geno$sample.id))
# train a HIBAG model
set.seed(100)
model <- hlaAttrBagging(hlatab$training, train.geno, nclassifier=4,
verbose.detail=TRUE)
summary(model)
# validation
pred <- hlaPredict(model, test.geno, type="response+dosage")
pred
head(pred$value)
pred$dosage[, 1:4] # a dosage matrix
# compare
(comp <- hlaCompareAllele(hlatab$validation, pred, allele.limit=model,
call.threshold=0))
(comp <- hlaCompareAllele(hlatab$validation, pred, allele.limit=model,
call.threshold=0.5))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.