createFullHaplotype | R Documentation |
The createFullHaplotype function infers haplotypes based on an anchor gene.
createFullHaplotype(
  clip_db,
  toHap_col = c("v_call", "d_call"),
  hapBy_col = "j_call",
  hapBy = "IGHJ6",
  toHap_GERM = NULL,
  relative_freq_priors = TRUE,
  kThreshDel = 3,
  rmPseudo = TRUE,
  deleted_genes = c(),
  nonReliable_Vgenes = c(),
  min_minor_fraction = 0.3,
  single_gene = TRUE,
  chain = c("IGH", "IGK", "IGL", "TRB")
)
clip_db |
a data.frame in AIRR format. See details. |
toHap_col |
a vector of column names for which a haplotype should be inferred. Default is v_call and d_call |
hapBy_col |
column name of the anchor gene. Default is j_call |
hapBy |
a string of the anchor gene name. Default is IGHJ6. |
toHap_GERM |
a vector of named nucleotide germline sequences matching the allele calls in the toHap_col columns of clip_db. |
relative_freq_priors |
if TRUE, the priors for Bayesian inference are estimated from the relative frequencies in clip_db. Else, priors are set to |
kThreshDel |
the minimum lK (log10 of the Bayes factor) to call a deletion. Default is 3. |
rmPseudo |
if TRUE, non-functional genes and pseudogenes are removed. Default is TRUE. |
deleted_genes |
double chromosome deletion summary table. A |
nonReliable_Vgenes |
a list of known non reliable gene assignments. A |
min_minor_fraction |
the minimum minor allele fraction required for a gene to be used as an anchor gene. Default is 0.3. |
single_gene |
whether to consider only genes from single assignments. If TRUE, calls in which genes appear together with other genes are discarded. If FALSE, the calls are separated and counted for every gene that appears. Default is TRUE. |
chain |
the IG/TR chain: IGH, IGK, IGL, or TRB. Default is IGH. |
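The effect of single_gene can be illustrated with a small base R sketch. This is not the package's internal code: the calls below are hypothetical, and the sketch assumes that multiple assignments are comma-separated and that the gene name is the part of an IMGT allele name before the "*".

```r
# Hypothetical V allele calls; commas separate multiple assignments (assumption).
calls <- c("IGHV1-2*02",
           "IGHV1-69*01,IGHV1-69D*01",
           "IGHV4-34*01")

# Gene name = text before "*" (assumes IMGT-style GENE*ALLELE names).
genes_in_call <- lapply(strsplit(calls, ",", fixed = TRUE),
                        function(x) unique(sub("\\*.*$", "", x)))

# Analogue of single_gene = TRUE: keep only calls that name exactly one gene.
single <- calls[lengths(genes_in_call) == 1]

# Analogue of single_gene = FALSE: split the calls and count every gene seen.
gene_counts <- table(unlist(genes_in_call))

print(single)
print(gene_counts)
```

With the TRUE-like rule, the ambiguous IGHV1-69/IGHV1-69D call is dropped entirely; with the FALSE-like rule, it contributes one count to each of the two genes.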
The function accepts a data.frame in AIRR format (https://changeo.readthedocs.io/en/stable/standard.html) containing the following columns:
'subject': the subject name
'v_call': V allele call(s) (in IMGT format)
'd_call': D allele call(s) (in IMGT format, only for heavy chains)
'j_call': J allele call(s) (in IMGT format)
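As a rough illustration of the expected input, the sketch below builds a toy data.frame with the four required columns, checks that none is missing, and computes the allele fractions of the anchor gene. All values are hypothetical, and the fraction calculation only approximates the idea behind min_minor_fraction; how the function computes it internally is an assumption.

```r
# Toy AIRR-style table with hypothetical values; real data has many more rows.
clip_db <- data.frame(
  subject = c("S1", "S1", "S1"),
  v_call  = c("IGHV1-2*02", "IGHV3-23*01", "IGHV1-2*04"),
  d_call  = c("IGHD3-10*01", "IGHD2-2*01", "IGHD3-10*01"),
  j_call  = c("IGHJ6*02", "IGHJ6*03", "IGHJ6*02"),
  stringsAsFactors = FALSE
)

# Check that the columns listed above are present.
required <- c("subject", "v_call", "d_call", "j_call")
missing_cols <- setdiff(required, names(clip_db))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}

# Relative usage of each anchor (IGHJ6) allele among the anchor-gene rows.
anchor_rows <- grepl("^IGHJ6\\*", clip_db$j_call)
anchor_fraction <- prop.table(table(clip_db$j_call[anchor_rows]))

# The minor allele fraction can then be compared with min_minor_fraction (0.3).
minor_fraction <- min(anchor_fraction)
print(minor_fraction >= 0.3)
```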
A data.frame, in which each row is the haplotype inference summary of a gene from the column selected in toHap_col.
The output contains the following columns:
subject: the subject name.
gene: the gene name.
Anchor gene allele 1: the haplotype inference for chromosome one. The column name is the anchor gene with the first allele.
Anchor gene allele 2: the haplotype inference for chromosome two. The column name is the anchor gene with the second allele.
alleles: allele calls for the gene.
proirs_row: priors based on relative allele usage of the anchor gene.
proirs_col: priors based on relative allele usage of the inferred gene.
counts1: the appearance count on each chromosome of the first allele from alleles; the counts are separated by a comma.
k1: the Bayes factor value for the inference of the first allele (from alleles).
counts2: the appearance count on each chromosome of the second allele from alleles; the counts are separated by a comma.
k2: the Bayes factor value for the inference of the second allele (from alleles).
counts3: the appearance count on each chromosome of the third allele from alleles; the counts are separated by a comma.
k3: the Bayes factor value for the inference of the third allele (from alleles).
counts4: the appearance count on each chromosome of the fourth allele from alleles; the counts are separated by a comma.
k4: the Bayes factor value for the inference of the fourth allele (from alleles).
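Because the counts columns store two comma-separated numbers per allele, a small helper can turn them into a numeric table for inspection. The sketch below uses a hypothetical output row with invented values; it assumes that alleles is also a comma-separated string and that the first count refers to the chromosome carrying the first anchor allele. The helper parse_counts is not part of the package.

```r
# Hypothetical output row; all values are invented for illustration.
haplo_row <- data.frame(
  gene    = "IGHV1-2",
  alleles = "02,04",   # assumed to be comma-separated
  counts1 = "25,1",
  counts2 = "0,31",
  stringsAsFactors = FALSE
)

# Hypothetical helper: split a "n1,n2" string into an integer vector.
parse_counts <- function(x) as.integer(strsplit(x, ",", fixed = TRUE)[[1]])

# One row per allele, one column per chromosome (anchor allele).
counts <- rbind(parse_counts(haplo_row$counts1),
                parse_counts(haplo_row$counts2))
dimnames(counts) <- list(
  allele     = strsplit(haplo_row$alleles, ",", fixed = TRUE)[[1]],
  chromosome = c("anchor_allele_1", "anchor_allele_2")
)

print(counts)
```

In this toy row, allele 02 is seen almost exclusively with the first anchor allele and allele 04 with the second, which is the pattern a confident haplotype assignment would be expected to show.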
# Load example data and germlines
data(samples_db, HVGERM, HDGERM)

# Select a single individual
clip_db = samples_db[samples_db$subject == 'I5', ]

# Infer the haplotype
haplo_db = createFullHaplotype(clip_db, toHap_col = c('v_call', 'd_call'),
                               hapBy_col = 'j_call', hapBy = 'IGHJ6',
                               toHap_GERM = c(HVGERM, HDGERM))