hlaCheckSNPs: Check the SNP predictors in a HIBAG model


View source: R/DataUtilities.R

Description

Check the SNP predictors in a HIBAG model by calculating the overlap between the SNPs used by the model and the supplied SNP genotypes.

Usage

hlaCheckSNPs(model, object,
    match.type=c("RefSNP+Position", "RefSNP", "Position"), verbose=TRUE)

Arguments

model

an object of hlaAttrBagClass, or an object of hlaAttrBagObj

object

a genotype object of hlaSNPGenoClass, or a character vector like c("rs2523442", "rs9257863", ...)

match.type

"RefSNP+Position" (the default) – match on both RefSNP IDs and positions; "RefSNP" – match on RefSNP IDs only; "Position" – match on positions only

verbose

if TRUE, show information
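
The three match.type options can give different overlap counts for the same data. The following is a minimal base R sketch of that idea, not HIBAG's implementation: the two SNP tables, their IDs and positions, and the helper make_key() are all hypothetical and exist only for illustration.

```r
# Hypothetical SNP tables (IDs and positions are made up for illustration)
model_snps <- data.frame(
    snp.id   = c("rs1", "rs2", "rs3", "rs4"),
    position = c(100L, 200L, 300L, 400L),
    stringsAsFactors = FALSE)
target_snps <- data.frame(
    snp.id   = c("rs1", "rs2", "rsX", "rs4"),
    position = c(100L, 250L, 300L, 400L),
    stringsAsFactors = FALSE)

# Hypothetical helper: build the key compared under each match type
make_key <- function(snps,
    match.type = c("RefSNP+Position", "RefSNP", "Position"))
{
    match.type <- match.arg(match.type)
    switch(match.type,
        "RefSNP+Position" = paste(snps$snp.id, snps$position, sep = "-"),
        "RefSNP"          = snps$snp.id,
        "Position"        = as.character(snps$position))
}

# Count how many model SNPs are found in the target under each match type
n_matched <- sapply(c("RefSNP+Position", "RefSNP", "Position"),
    function(mt) sum(make_key(model_snps, mt) %in% make_key(target_snps, mt)))
print(n_matched)
```

In this toy data, rs2 has a different position in the two tables and position 300 carries a different ID, so matching on both fields finds fewer SNPs than matching on either field alone.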

Value

Returns a data.frame with one row per individual classifier and the following columns:

NumOfValidSNP

the number of SNP predictors in an individual classifier that are also found in object (i.e., non-missing)

NumOfSNP

the number of SNP predictors in an individual classifier

fraction

NumOfValidSNP / NumOfSNP
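
As a rough illustration of how these three columns relate, here is a small base R sketch. It does not call HIBAG; the list classifier_snps and the vector available_snps are hypothetical stand-ins for the SNPs used by each classifier and the SNPs present in the data being checked.

```r
# Hypothetical inputs: SNP IDs used by each classifier, and the SNP IDs
# available in the genotype data to be checked
classifier_snps <- list(
    c("rs1", "rs2", "rs3", "rs4"),
    c("rs2", "rs5"))
available_snps <- c("rs1", "rs2", "rs4")

# Number of SNP predictors per classifier
NumOfSNP <- vapply(classifier_snps, length, integer(1))

# Number of those predictors that are present in the available data
NumOfValidSNP <- vapply(classifier_snps,
    function(ids) sum(ids %in% available_snps), integer(1))

res <- data.frame(
    NumOfValidSNP = NumOfValidSNP,
    NumOfSNP = NumOfSNP,
    fraction = NumOfValidSNP / NumOfSNP)
print(res)

# Missing fraction per classifier, i.e. 1 - fraction
print(summary(1 - res$fraction))
```

A fraction of 1 means every SNP predictor of that classifier is available; lower values indicate that some predictors are missing from the checked data.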

Author(s)

Xiuwen Zheng

See Also

hlaAttrBagging, predict.hlaAttrBagClass

Examples

# make a "hlaAlleleClass" object
hla.id <- "DQB1"
hla <- hlaAllele(HLA_Type_Table$sample.id,
    H1 = HLA_Type_Table[, paste(hla.id, ".1", sep="")],
    H2 = HLA_Type_Table[, paste(hla.id, ".2", sep="")],
    locus=hla.id, assembly="hg19")

# training genotypes
region <- 100   # kb
snpid <- hlaFlankingSNP(HapMap_CEU_Geno$snp.id, HapMap_CEU_Geno$snp.position,
    hla.id, region*1000, assembly="hg19")
train.geno <- hlaGenoSubset(HapMap_CEU_Geno,
    snp.sel = match(snpid, HapMap_CEU_Geno$snp.id))

# train a HIBAG model
set.seed(1000)
model <- hlaAttrBagging(hla, train.geno, nclassifier=2)
print(model)


# check the SNP predictors of the model against the training genotypes
hlaCheckSNPs(model, train.geno)

# close the HIBAG model explicitly
hlaClose(model)

Example output

HIBAG (HLA Genotype Imputation with Attribute Bagging)
Kernel Version: v1.3
Supported by Streaming SIMD Extensions (SSE2) [64-bit]
Remove 1 monomorphic SNP
Build a HIBAG model with 2 individual classifiers:
# of SNPs randomly sampled as candidates for each selection: 9
# of SNPs: 77, # of samples: 60
# of unique HLA alleles: 12
Wed Jan  1 15:47:20 2020,   1 individual classifier, out-of-bag acc: 98.00%, # of SNPs: 13, # of haplo: 20
Wed Jan  1 15:47:20 2020,   2 individual classifier, out-of-bag acc: 90.91%, # of SNPs: 15, # of haplo: 21
Gene: DQB1
Training dataset: 60 samples X 77 SNPs
	# of HLA alleles: 12
	# of individual classifiers: 2
	total # of SNPs used: 20
	average # of SNPs in an individual classifier: 14.00, sd: 1.41, min: 13, max: 15
	average # of haplotypes in an individual classifier: 20.50, sd: 0.71, min: 20, max: 21
	average out-of-bag accuracy: 94.45%, sd: 5.01%, min: 90.91%, max: 98.00%
Genome assembly: hg19
The HIBAG model:
	There are 77 SNP predictors in total.
	There are 2 individual classifiers.
Summarize the missing fractions of SNP predictors per classifier:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      0       0       0       0       0       0 

HIBAG documentation built on March 24, 2021, 6 p.m.