HIBAG-package: HLA Genotype Imputation with Attribute Bagging

Description Details Author(s) References Examples

Description

To impute HLA types from unphased SNP data using an attribute bagging method.

Details

Package: HIBAG
Type: R/Bioconductor Package
License: GPL version 3
Kernel Version: v1.4

HIBAG is a state of the art software package for imputing HLA types using SNP data, and it uses the R statistical programming language. HIBAG is highly accurate, computationally tractable, and can be used by researchers with published parameter estimates instead of requiring access to large training sample datasets. It combines the concepts of attribute bagging, an ensemble classifier method, with haplotype inference for SNPs and HLA types. Attribute bagging is a technique which improves the accuracy and stability of classifier ensembles using bootstrap aggregating and random variable selection.

Features:
1) HIBAG can be used by researchers with published parameter estimates (http://www.biostat.washington.edu/~bsweir/HIBAG/) instead of requiring access to large training sample datasets.
2) A typical HIBAG parameter file contains only haplotype frequencies at different SNP subsets rather than individual training genotypes.
3) SNPs within the xMHC region (chromosome 6) are used for imputation.
4) HIBAG employs unphased genotypes of unrelated individuals as a training set.
5) HIBAG supports parallel computing with R.

Author(s)

Xiuwen Zheng [aut, cre, cph] zhengx@u.washington.edu, Bruce S. Weir [ctb, ths] bsweir@u.washington.edu

References

Zheng X, Shen J, Cox C, Wakefield J, Ehm M, Nelson M, Weir BS; HIBAG – HLA Genotype Imputation with Attribute Bagging. The Pharmacogenomics Journal. doi: 10.1038/tpj.2013.18. https://www.nature.com/articles/tpj201318

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
# HLA_Type_Table data
head(HLA_Type_Table)
dim(HLA_Type_Table)  # 60 13

# HapMap_CEU_Geno data
summary(HapMap_CEU_Geno)


######################################################################

# make a "hlaAlleleClass" object
hla.id <- "A"
hla <- hlaAllele(HLA_Type_Table$sample.id,
    H1 = HLA_Type_Table[, paste(hla.id, ".1", sep="")],
    H2 = HLA_Type_Table[, paste(hla.id, ".2", sep="")],
    locus=hla.id, assembly="hg19")

# divide HLA types randomly
set.seed(100)
hlatab <- hlaSplitAllele(hla, train.prop=0.5)
names(hlatab)
# "training"   "validation"
summary(hlatab$training)
summary(hlatab$validation)

# SNP predictors within the flanking region on each side
region <- 500   # kb
snpid <- hlaFlankingSNP(HapMap_CEU_Geno$snp.id, HapMap_CEU_Geno$snp.position,
    hla.id, region*1000, assembly="hg19")
length(snpid)  # 275

# training and validation genotypes
train.geno <- hlaGenoSubset(HapMap_CEU_Geno,
    snp.sel=match(snpid, HapMap_CEU_Geno$snp.id),
    samp.sel=match(hlatab$training$value$sample.id,
    HapMap_CEU_Geno$sample.id))
test.geno <- hlaGenoSubset(HapMap_CEU_Geno,
    samp.sel=match(hlatab$validation$value$sample.id,
    HapMap_CEU_Geno$sample.id))

# train a HIBAG model
set.seed(100)
# please use "nclassifier=100" when you use HIBAG for real data
model <- hlaAttrBagging(hlatab$training, train.geno, nclassifier=4,
    verbose.detail=TRUE)
summary(model)

# validation
pred <- hlaPredict(model, test.geno)
summary(pred)

# compare
(comp <- hlaCompareAllele(hlatab$validation, pred, allele.limit=model,
    call.threshold=0))
(comp <- hlaCompareAllele(hlatab$validation, pred, allele.limit=model,
    call.threshold=0.5))


# save the parameter file
mobj <- hlaModelToObj(model)
save(mobj, file="HIBAG_model.RData")
save(test.geno, file="testgeno.RData")
save(hlatab, file="HLASplit.RData")

# Clear Workspace
hlaClose(model)  # release all resources of model
rm(list = ls())


######################################################################

# NOW, load a HIBAG model from the parameter file
mobj <- get(load("HIBAG_model.RData"))
model <- hlaModelFromObj(mobj)

# validation
test.geno <- get(load("testgeno.RData"))
hlatab <- get(load("HLASplit.RData"))

pred <- hlaPredict(model, test.geno)
# compare
(comp <- hlaCompareAllele(hlatab$validation, pred, allele.limit=model,
    call.threshold=0.5))


#########################################################################
# import a PLINK BED file
#
bed.fn <- system.file("extdata", "HapMap_CEU.bed", package="HIBAG")
fam.fn <- system.file("extdata", "HapMap_CEU.fam", package="HIBAG")
bim.fn <- system.file("extdata", "HapMap_CEU.bim", package="HIBAG")
hapmap.ceu <- hlaBED2Geno(bed.fn, fam.fn, bim.fn, assembly="hg19")


#########################################################################
# predict
#
pred <- hlaPredict(model, hapmap.ceu, type="response")
head(pred$value)
#   sample.id allele1 allele2      prob
# 1   NA10859   01:01   03:01 0.9999992
# 2   NA11882   01:01   29:02 1.0000000
# ...


# delete the temporary files
unlink(c("HIBAG_model.RData", "testgeno.RData", "HLASplit.RData"), force=TRUE)

Example output

HIBAG (HLA Genotype Imputation with Attribute Bagging)
Kernel Version: v1.5 (64-bit, AVX2)
  sample.id   A.1   A.2   B.1   B.2   C.1   C.2 DQA1.1 DQA1.2 DQB1.1 DQB1.2
1   NA11882 01:01 29:02 15:01 44:03 06:02 16:01  01:02  03:01  03:02  06:02
2   NA11881 03:01 26:01 07:02 07:02 07:02 07:02  01:02  01:02  06:02  06:02
3   NA11993 26:01 29:02 44:03  <NA> 16:01 16:01  01:01  01:02  05:01  06:02
4   NA11992 01:01 02:01 08:01 35:01 04:01 07:01  01:01  05:01  02:01  05:01
5   NA11995 01:01 01:01 08:01 57:01 06:02 07:01  01:02  01:03  06:02  06:03
6   NA11994 01:01 11:01 07:02 51:01 07:02 15:02  03:01  03:01  03:02  03:02
  DRB1.1 DRB1.2
1  04:01  15:01
2  15:01  15:01
3  01:01  15:01
4  01:01  03:01
5  13:01  15:01
6  04:02  04:04
[1] 60 13
SNP genotypes: 
    60 samples X 1564 SNPs
    SNPs range from 25769023bp to 33421576bp on hg19
Missing rate per SNP:
    min: 0, max: 0.0666667, mean: 0.0651215, median: 0.0666667, sd: 0.00968287
Missing rate per sample:
    min: 0, max: 0.969949, mean: 0.0651215, median: 0.000639386, sd: 0.243737
Minor allele frequency:
    min: 0, max: 0.5, mean: 0.227582, median: 0.205357, sd: 0.1389
Allelic information:
A/G C/T G/T A/C 
655 632 141 136 
[1] "training"   "validation"
Gene: A
Range: [29910247bp, 29913661bp] on hg19
# of samples: 34
# of unique HLA alleles: 14
# of unique HLA genotypes: 23
Gene: A
Range: [29910247bp, 29913661bp] on hg19
# of samples: 26
# of unique HLA alleles: 12
# of unique HLA genotypes: 14
[1] 275
Exclude 11 monomorphic SNPs
Build a HIBAG model with 4 individual classifiers:
    # of SNPs randomly sampled as candidates for each selection: 17
    # of SNPs: 264
    # of samples: 34
    # of unique HLA alleles: 14
CPU flags: 64-bit, AVX2
# of threads: 1
[-] 2021-01-12 21:26:47
=== building individual classifier 1, out-of-bag (11/32.4%) ===
     1, SNP: 211, Loss: 196.4, OOB Acc: 54.55%, # of Haplo: 13
     2, SNP: 66, Loss: 173.548, OOB Acc: 63.64%, # of Haplo: 13
     3, SNP: 177, Loss: 136.352, OOB Acc: 68.18%, # of Haplo: 13
     4, SNP: 108, Loss: 95.8359, OOB Acc: 72.73%, # of Haplo: 13
     5, SNP: 127, Loss: 67.3216, OOB Acc: 77.27%, # of Haplo: 13
     6, SNP: 95, Loss: 47.5888, OOB Acc: 77.27%, # of Haplo: 13
     7, SNP: 33, Loss: 37.2631, OOB Acc: 77.27%, # of Haplo: 16
     8, SNP: 6, Loss: 29.7419, OOB Acc: 77.27%, # of Haplo: 18
     9, SNP: 208, Loss: 25.6913, OOB Acc: 77.27%, # of Haplo: 19
    10, SNP: 225, Loss: 25.3087, OOB Acc: 77.27%, # of Haplo: 21
    11, SNP: 11, Loss: 24.8356, OOB Acc: 77.27%, # of Haplo: 23
    12, SNP: 151, Loss: 19.4134, OOB Acc: 77.27%, # of Haplo: 23
    13, SNP: 199, Loss: 17.011, OOB Acc: 77.27%, # of Haplo: 23
[1] 2021-01-12 21:26:47, OOB Acc: 77.27%, # of SNPs: 13, # of Haplo: 23
=== building individual classifier 2, out-of-bag (13/38.2%) ===
     1, SNP: 160, Loss: 221.236, OOB Acc: 76.92%, # of Haplo: 17
     2, SNP: 145, Loss: 173.538, OOB Acc: 80.77%, # of Haplo: 23
     3, SNP: 177, Loss: 128.58, OOB Acc: 84.62%, # of Haplo: 31
     4, SNP: 111, Loss: 79.6877, OOB Acc: 84.62%, # of Haplo: 31
     5, SNP: 207, Loss: 52.5557, OOB Acc: 88.46%, # of Haplo: 32
     6, SNP: 245, Loss: 41.8731, OOB Acc: 88.46%, # of Haplo: 34
     7, SNP: 230, Loss: 31.7937, OOB Acc: 88.46%, # of Haplo: 38
     8, SNP: 151, Loss: 20.4566, OOB Acc: 88.46%, # of Haplo: 36
     9, SNP: 14, Loss: 19.5805, OOB Acc: 88.46%, # of Haplo: 42
    10, SNP: 132, Loss: 19.5101, OOB Acc: 88.46%, # of Haplo: 42
    11, SNP: 221, Loss: 19.485, OOB Acc: 88.46%, # of Haplo: 44
    12, SNP: 251, Loss: 18.5695, OOB Acc: 88.46%, # of Haplo: 48
[2] 2021-01-12 21:26:47, OOB Acc: 88.46%, # of SNPs: 12, # of Haplo: 48
=== building individual classifier 3, out-of-bag (14/41.2%) ===
     1, SNP: 191, Loss: 193.067, OOB Acc: 57.14%, # of Haplo: 11
     2, SNP: 264, Loss: 150.427, OOB Acc: 64.29%, # of Haplo: 12
     3, SNP: 132, Loss: 93.4067, OOB Acc: 67.86%, # of Haplo: 12
     4, SNP: 128, Loss: 39.8353, OOB Acc: 71.43%, # of Haplo: 12
     5, SNP: 160, Loss: 28.2998, OOB Acc: 75.00%, # of Haplo: 12
     6, SNP: 144, Loss: 13.635, OOB Acc: 75.00%, # of Haplo: 12
     7, SNP: 111, Loss: 6.04609, OOB Acc: 75.00%, # of Haplo: 12
     8, SNP: 40, Loss: 6.04583, OOB Acc: 82.14%, # of Haplo: 14
     9, SNP: 141, Loss: 6.04583, OOB Acc: 85.71%, # of Haplo: 14
    10, SNP: 73, Loss: 2.9038, OOB Acc: 85.71%, # of Haplo: 14
    11, SNP: 199, Loss: 2.20025, OOB Acc: 85.71%, # of Haplo: 14
[3] 2021-01-12 21:26:47, OOB Acc: 85.71%, # of SNPs: 11, # of Haplo: 14
=== building individual classifier 4, out-of-bag (10/29.4%) ===
     1, SNP: 147, Loss: 158.631, OOB Acc: 50.00%, # of Haplo: 12
     2, SNP: 152, Loss: 140.375, OOB Acc: 55.00%, # of Haplo: 13
     3, SNP: 78, Loss: 115.887, OOB Acc: 60.00%, # of Haplo: 16
     4, SNP: 115, Loss: 77.8082, OOB Acc: 60.00%, # of Haplo: 18
     5, SNP: 148, Loss: 62.6831, OOB Acc: 65.00%, # of Haplo: 18
     6, SNP: 13, Loss: 46.5657, OOB Acc: 75.00%, # of Haplo: 20
     7, SNP: 109, Loss: 31.0312, OOB Acc: 75.00%, # of Haplo: 20
     8, SNP: 176, Loss: 22.5073, OOB Acc: 75.00%, # of Haplo: 21
     9, SNP: 145, Loss: 20.9122, OOB Acc: 75.00%, # of Haplo: 21
    10, SNP: 128, Loss: 20.6728, OOB Acc: 75.00%, # of Haplo: 21
    11, SNP: 73, Loss: 14.6217, OOB Acc: 75.00%, # of Haplo: 22
    12, SNP: 151, Loss: 10.2879, OOB Acc: 75.00%, # of Haplo: 23
    13, SNP: 199, Loss: 8.74645, OOB Acc: 75.00%, # of Haplo: 23
[4] 2021-01-12 21:26:47, OOB Acc: 75.00%, # of SNPs: 13, # of Haplo: 23
Calculating matching proportion:
        Min.     0.1% Qu.       1% Qu.      1st Qu.       Median      3rd Qu. 
0.0002162725 0.0002198443 0.0002519909 0.0043752063 0.0092453043 0.0267068470 
        Max.         Mean           SD 
0.5261716555 0.0476729895 0.1180875414 
Accuracy with training data: 97.06%
Out-of-bag accuracy: 81.61%
Gene: A
Training dataset: 34 samples X 264 SNPs
    # of HLA alleles: 14
    # of individual classifiers: 4
    total # of SNPs used: 38
    avg. # of SNPs in an individual classifier: 12.25
        (sd: 0.96, min: 11, max: 13, median: 12.50)
    avg. # of haplotypes in an individual classifier: 27.00
        (sd: 14.63, min: 14, max: 48, median: 23.00)
    avg. out-of-bag accuracy: 81.61%
        (sd: 6.49%, min: 75.00%, max: 88.46%, median: 81.49%)
Matching proportion:
        Min.     0.1% Qu.       1% Qu.      1st Qu.       Median      3rd Qu. 
0.0002162725 0.0002198443 0.0002519909 0.0043752063 0.0092453043 0.0267068470 
        Max.         Mean           SD 
0.5261716555 0.0476729895 0.1180875414 
Genome assembly: hg19
HIBAG model:
    4 individual classifiers
    264 SNPs
    14 unique HLA alleles
Prediction:
    based on the averaged posterior probabilities
Model assembly: hg19, SNP assembly: hg19
No allelic strand orders are switched.
# of samples: 26
CPU flags: 64-bit, AVX2
# of threads: 1
Predicting (2021-01-12 21:26:48)	0%
Predicting (2021-01-12 21:26:48)	100%
Gene: A
Range: [29910247bp, 29913661bp] on hg19
# of samples: 26
# of unique HLA alleles: 11
# of unique HLA genotypes: 14
Posterior probability:
  [0,0.25) [0.25,0.5) [0.5,0.75)   [0.75,1] 
  1 (3.8%)  4 (15.4%)  4 (15.4%) 17 (65.4%) 
Matching proportion of SNP haplotype:
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.000000 0.002097 0.007711 0.034403 0.028032 0.526172 
$overall
  total.num.ind crt.num.ind crt.num.haplo   acc.ind acc.haplo call.threshold
1            26          23            49 0.8846154 0.9423077              0
  n.call call.rate
1     26         1

$confusion
       True
Predict 01:01 02:01 02:06 03:01 11:01 23:01 24:02 24:03 25:01 26:01 29:02 31:01
  01:01    12     1     0     0     0     0     0     0     0     0     0     0
  02:01     0    21     0     0     0     0     0     0     0     0     0     0
  02:06     0     0     0     0     0     0     0     0     0     0     0     0
  03:01     0     0     0     5     0     0     0     0     0     0     0     0
  11:01     0     0     0     0     2     0     0     0     0     0     0     0
  23:01     0     0     0     0     0     1     0     0     0     0     0     0
  24:02     0     0     0     0     0     1     3     0     0     0     0     0
  24:03     0     0     0     0     0     0     0     0     0     0     0     0
  25:01     0     0     0     0     0     0     0     0     1     0     0     0
  26:01     0     0     0     0     0     0     0     0     0     0     0     0
  29:02     0     0     0     0     0     0     0     0     0     0     1     0
  31:01     0     0     0     0     0     0     0     0     0     1     0     1
  32:01     0     0     0     0     0     0     0     0     0     0     0     0
  68:01     0     0     0     0     0     0     0     0     0     0     0     0
  ...       0     0     0     0     0     0     0     0     0     0     0     0
       True
Predict 32:01 68:01
  01:01     0     0
  02:01     0     0
  02:06     0     0
  03:01     0     0
  11:01     0     0
  23:01     0     0
  24:02     0     0
  24:03     0     0
  25:01     0     0
  26:01     0     0
  29:02     0     0
  31:01     0     0
  32:01     1     0
  68:01     0     1
  ...       0     0

$detail
   allele train.num train.freq valid.num valid.freq call.rate  accuracy
1   01:01        13 0.19117647        12 0.23076923         1 0.9807692
2   02:01        21 0.30882353        22 0.42307692         1 0.9807692
3   02:06         1 0.01470588         0 0.00000000         0       NaN
4   03:01         4 0.05882353         5 0.09615385         1 1.0000000
5   11:01         3 0.04411765         2 0.03846154         1 1.0000000
6   23:01         1 0.01470588         2 0.03846154         1 0.9807692
7   24:02         8 0.11764706         3 0.05769231         1 0.9807692
8   24:03         1 0.01470588         0 0.00000000         0       NaN
9   25:01         4 0.05882353         1 0.01923077         1 1.0000000
10  26:01         2 0.02941176         1 0.01923077         1 0.9807692
11  29:02         3 0.04411765         1 0.01923077         1 1.0000000
12  31:01         2 0.02941176         1 0.01923077         1 0.9807692
13  32:01         3 0.04411765         1 0.01923077         1 1.0000000
14  68:01         2 0.02941176         1 0.01923077         1 1.0000000
   sensitivity specificity       ppv       npv miscall miscall.prop
1    1.0000000   0.9750000 0.9230769 1.0000000    <NA>          NaN
2    0.9545455   1.0000000 1.0000000 0.9677419   01:01            1
3          NaN         NaN       NaN       NaN    <NA>          NaN
4    1.0000000   1.0000000 1.0000000 1.0000000    <NA>          NaN
5    1.0000000   1.0000000 1.0000000 1.0000000    <NA>          NaN
6    0.5000000   1.0000000 1.0000000 0.9803922   24:02            1
7    1.0000000   0.9795918 0.7500000 1.0000000    <NA>          NaN
8          NaN         NaN       NaN       NaN    <NA>          NaN
9    1.0000000   1.0000000 1.0000000 1.0000000    <NA>          NaN
10   0.0000000   1.0000000       NaN 0.9807692   31:01            1
11   1.0000000   1.0000000 1.0000000 1.0000000    <NA>          NaN
12   1.0000000   0.9803922 0.5000000 1.0000000    <NA>          NaN
13   1.0000000   1.0000000 1.0000000 1.0000000    <NA>          NaN
14   1.0000000   1.0000000 1.0000000 1.0000000    <NA>          NaN

$overall
  total.num.ind crt.num.ind crt.num.haplo acc.ind acc.haplo call.threshold
1            26          21            42       1         1            0.5
  n.call call.rate
1     21 0.8076923

$confusion
       True
Predict 01:01 02:01 02:06 03:01 11:01 23:01 24:02 24:03 25:01 26:01 29:02 31:01
  01:01    12     0     0     0     0     0     0     0     0     0     0     0
  02:01     0    18     0     0     0     0     0     0     0     0     0     0
  02:06     0     0     0     0     0     0     0     0     0     0     0     0
  03:01     0     0     0     4     0     0     0     0     0     0     0     0
  11:01     0     0     0     0     2     0     0     0     0     0     0     0
  23:01     0     0     0     0     0     0     0     0     0     0     0     0
  24:02     0     0     0     0     0     0     2     0     0     0     0     0
  24:03     0     0     0     0     0     0     0     0     0     0     0     0
  25:01     0     0     0     0     0     0     0     0     0     0     0     0
  26:01     0     0     0     0     0     0     0     0     0     0     0     0
  29:02     0     0     0     0     0     0     0     0     0     0     1     0
  31:01     0     0     0     0     0     0     0     0     0     0     0     1
  32:01     0     0     0     0     0     0     0     0     0     0     0     0
  68:01     0     0     0     0     0     0     0     0     0     0     0     0
  ...       0     0     0     0     0     0     0     0     0     0     0     0
       True
Predict 32:01 68:01
  01:01     0     0
  02:01     0     0
  02:06     0     0
  03:01     0     0
  11:01     0     0
  23:01     0     0
  24:02     0     0
  24:03     0     0
  25:01     0     0
  26:01     0     0
  29:02     0     0
  31:01     0     0
  32:01     1     0
  68:01     0     1
  ...       0     0

$detail
   allele train.num train.freq valid.num valid.freq call.rate accuracy
1   01:01        13 0.19117647        12 0.23076923 1.0000000        1
2   02:01        21 0.30882353        22 0.42307692 0.8181818        1
3   02:06         1 0.01470588         0 0.00000000 0.0000000      NaN
4   03:01         4 0.05882353         5 0.09615385 0.8000000        1
5   11:01         3 0.04411765         2 0.03846154 1.0000000        1
6   23:01         1 0.01470588         2 0.03846154 0.0000000      NaN
7   24:02         8 0.11764706         3 0.05769231 0.6666667        1
8   24:03         1 0.01470588         0 0.00000000 0.0000000      NaN
9   25:01         4 0.05882353         1 0.01923077 0.0000000      NaN
10  26:01         2 0.02941176         1 0.01923077 0.0000000      NaN
11  29:02         3 0.04411765         1 0.01923077 1.0000000        1
12  31:01         2 0.02941176         1 0.01923077 1.0000000        1
13  32:01         3 0.04411765         1 0.01923077 1.0000000        1
14  68:01         2 0.02941176         1 0.01923077 1.0000000        1
   sensitivity specificity ppv npv miscall miscall.prop
1            1           1   1   1    <NA>          NaN
2            1           1   1   1    <NA>          NaN
3          NaN         NaN NaN NaN    <NA>          NaN
4            1           1   1   1    <NA>          NaN
5            1           1   1   1    <NA>          NaN
6          NaN         NaN NaN NaN    <NA>          NaN
7            1           1   1   1    <NA>          NaN
8          NaN         NaN NaN NaN    <NA>          NaN
9          NaN         NaN NaN NaN    <NA>          NaN
10         NaN         NaN NaN NaN    <NA>          NaN
11           1           1   1   1    <NA>          NaN
12           1           1   1   1    <NA>          NaN
13           1           1   1   1    <NA>          NaN
14           1           1   1   1    <NA>          NaN

HIBAG model:
    4 individual classifiers
    264 SNPs
    14 unique HLA alleles
Prediction:
    based on the averaged posterior probabilities
Model assembly: hg19, SNP assembly: hg19
No allelic strand orders are switched.
# of samples: 26
CPU flags: 64-bit, AVX2
# of threads: 1
Predicting (2021-01-12 21:26:48)	0%
Predicting (2021-01-12 21:26:48)	100%
$overall
  total.num.ind crt.num.ind crt.num.haplo acc.ind acc.haplo call.threshold
1            26          21            42       1         1            0.5
  n.call call.rate
1     21 0.8076923

$confusion
       True
Predict 01:01 02:01 02:06 03:01 11:01 23:01 24:02 24:03 25:01 26:01 29:02 31:01
  01:01    12     0     0     0     0     0     0     0     0     0     0     0
  02:01     0    18     0     0     0     0     0     0     0     0     0     0
  02:06     0     0     0     0     0     0     0     0     0     0     0     0
  03:01     0     0     0     4     0     0     0     0     0     0     0     0
  11:01     0     0     0     0     2     0     0     0     0     0     0     0
  23:01     0     0     0     0     0     0     0     0     0     0     0     0
  24:02     0     0     0     0     0     0     2     0     0     0     0     0
  24:03     0     0     0     0     0     0     0     0     0     0     0     0
  25:01     0     0     0     0     0     0     0     0     0     0     0     0
  26:01     0     0     0     0     0     0     0     0     0     0     0     0
  29:02     0     0     0     0     0     0     0     0     0     0     1     0
  31:01     0     0     0     0     0     0     0     0     0     0     0     1
  32:01     0     0     0     0     0     0     0     0     0     0     0     0
  68:01     0     0     0     0     0     0     0     0     0     0     0     0
  ...       0     0     0     0     0     0     0     0     0     0     0     0
       True
Predict 32:01 68:01
  01:01     0     0
  02:01     0     0
  02:06     0     0
  03:01     0     0
  11:01     0     0
  23:01     0     0
  24:02     0     0
  24:03     0     0
  25:01     0     0
  26:01     0     0
  29:02     0     0
  31:01     0     0
  32:01     1     0
  68:01     0     1
  ...       0     0

$detail
   allele train.num train.freq valid.num valid.freq call.rate accuracy
1   01:01        13 0.19117647        12 0.23076923 1.0000000        1
2   02:01        21 0.30882353        22 0.42307692 0.8181818        1
3   02:06         1 0.01470588         0 0.00000000 0.0000000      NaN
4   03:01         4 0.05882353         5 0.09615385 0.8000000        1
5   11:01         3 0.04411765         2 0.03846154 1.0000000        1
6   23:01         1 0.01470588         2 0.03846154 0.0000000      NaN
7   24:02         8 0.11764706         3 0.05769231 0.6666667        1
8   24:03         1 0.01470588         0 0.00000000 0.0000000      NaN
9   25:01         4 0.05882353         1 0.01923077 0.0000000      NaN
10  26:01         2 0.02941176         1 0.01923077 0.0000000      NaN
11  29:02         3 0.04411765         1 0.01923077 1.0000000        1
12  31:01         2 0.02941176         1 0.01923077 1.0000000        1
13  32:01         3 0.04411765         1 0.01923077 1.0000000        1
14  68:01         2 0.02941176         1 0.01923077 1.0000000        1
   sensitivity specificity ppv npv miscall miscall.prop
1            1           1   1   1    <NA>          NaN
2            1           1   1   1    <NA>          NaN
3          NaN         NaN NaN NaN    <NA>          NaN
4            1           1   1   1    <NA>          NaN
5            1           1   1   1    <NA>          NaN
6          NaN         NaN NaN NaN    <NA>          NaN
7            1           1   1   1    <NA>          NaN
8          NaN         NaN NaN NaN    <NA>          NaN
9          NaN         NaN NaN NaN    <NA>          NaN
10         NaN         NaN NaN NaN    <NA>          NaN
11           1           1   1   1    <NA>          NaN
12           1           1   1   1    <NA>          NaN
13           1           1   1   1    <NA>          NaN
14           1           1   1   1    <NA>          NaN

Open "/usr/lib/R/site-library/HIBAG/extdata/HapMap_CEU.bed" in the individual-major mode.
Open "/usr/lib/R/site-library/HIBAG/extdata/HapMap_CEU.fam".
Open "/usr/lib/R/site-library/HIBAG/extdata/HapMap_CEU.bim".
Import 3932 SNPs within the xMHC region on chromosome 6.
HIBAG model:
    4 individual classifiers
    264 SNPs
    14 unique HLA alleles
Prediction:
    based on the averaged posterior probabilities
Model assembly: hg19, SNP assembly: hg19
No allelic strand orders are switched.
# of samples: 90
CPU flags: 64-bit, AVX2
# of threads: 1
Predicting (2021-01-12 21:26:48)	0%
Predicting (2021-01-12 21:26:48)	100%
  sample.id allele1 allele2      prob     matching
1   NA10859   01:01   03:01 0.9374958 0.0050823317
2   NA11882   01:01   29:02 0.9999996 0.0112456881
3   NA11881   03:01   31:01 0.2962896 0.0002162734
4   NA10860   02:01   02:01 0.3478309 0.4392334299
5   NA11993   25:01   29:02 0.7499603 0.0003245102
6   NA11992   01:01   02:01 0.9920706 0.0466853516

HIBAG documentation built on March 24, 2021, 6 p.m.