PolyPhenDbColumns: PolyPhenDb Columns

PolyPhenDbColumns    R Documentation

PolyPhenDb Columns

Description

Description of the PolyPhen SQLite database columns

Column descriptions

These column names are returned when columns() is called on a PolyPhenDb object.

  • rsid : SNP identifier (dbSNP rsID where available), taken from the mapped SNPID column
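As a minimal sketch of how these names are typically used, the code below lists the columns of a PolyPhenDb object and retrieves a few of them with select(). It assumes the Bioconductor annotation package PolyPhen.Hsapiens.dbSNP131 is installed; that package is not mentioned on this page and is used here only as an example of a PolyPhenDb object. The call is guarded so the script still runs when the package is absent.

```r
# Column names are taken from the list on this page.
wanted <- c("AA1", "AA2", "PREDICTION")

# Assumption: PolyPhen.Hsapiens.dbSNP131 provides a PolyPhenDb object.
# Guarded so the script still runs where the annotation package is absent.
if (requireNamespace("PolyPhen.Hsapiens.dbSNP131", quietly = TRUE)) {
  suppressPackageStartupMessages(library(PolyPhen.Hsapiens.dbSNP131))
  db <- PolyPhen.Hsapiens.dbSNP131

  # All column names available in the database
  print(columns(db))

  # A few rsid keys, used here only to have something to query
  rsids <- head(keys(db), 3)

  # Retrieve the chosen columns for those keys
  print(select(db, keys = rsids, columns = wanted))
}
```

Any of the column names described on this page can be substituted into wanted.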

Original query :

  • OSNPID : original SNP identifier from user input

  • OSNPACC : original protein identifier from user input

  • OPOS : original substitution position in the protein sequence from user input

  • OAA1 : original wild type (reference) aa residue from user input

  • OAA2 : original mutant (alternate) aa residue from user input

Mapped query :

  • SNPID : SNP identifier mapped to dbSNP rsID if available, otherwise same as OSNPID. This value is used as the rsid column

  • ACC : UniProtKB protein accession if the protein is known, otherwise same as the original protein identifier (OSNPACC)

  • POS : substitution position mapped to the UniProtKB protein sequence if known, otherwise same as OPOS

  • AA1 : wild type aa residue

  • AA2 : mutant aa residue

  • NT1 : wild type allele nucleotide

  • NT2 : mutant allele nucleotide

PolyPhen-2 prediction :

  • PREDICTION : qualitative ternary classification ("benign", "possibly damaging" or "probably damaging") based on false positive rate (FPR) thresholds

PolyPhen-1 prediction :

  • BASEDON : prediction basis

  • EFFECT : predicted substitution effect on the protein structure or function

PolyPhen-2 classifiers :

  • PPH2CLASS : binary classifier outcome ("damaging" or "neutral")

  • PPH2PROB : probability of the variation being damaging

  • PPH2FPR : false positive rate at the PPH2PROB level

  • PPH2TPR : true positive rate at the PPH2PROB level

  • PPH2FDR : false discovery rate at the PPH2PROB level
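For readers unfamiliar with the three rates reported alongside PPH2PROB, the sketch below computes them from a confusion matrix using the standard textbook definitions. The counts are hypothetical and chosen only for illustration; how PolyPhen-2 itself estimates these rates at a given probability level is not described on this page.

```r
# Hypothetical confusion-matrix counts, chosen only for illustration
tp <- 80  # damaging variants called damaging
fn <- 20  # damaging variants called neutral
fp <- 10  # neutral variants called damaging
tn <- 90  # neutral variants called neutral

tpr <- tp / (tp + fn)  # true positive rate
fpr <- fp / (fp + tn)  # false positive rate
fdr <- fp / (fp + tp)  # false discovery rate

print(round(c(TPR = tpr, FPR = fpr, FDR = fdr), 3))
```

The false positive rate is taken over all truly neutral variants, whereas the false discovery rate is taken over all variants called damaging, so the two can differ substantially.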

UniProtKB-SwissProt derived protein sequence annotations :

  • SITE : substitution SITE annotation

  • REGION : substitution REGION annotation

  • PHAT : PHAT matrix element for substitution in the TRANSMEM region

Multiple sequence alignment scores :

  • DSCORE : difference of PSIC scores for the two aa variants (SCORE1 - SCORE2)

  • SCORE1 : PSIC score for the wild type aa residue (AA1)

  • SCORE2 : PSIC score for the mutant aa residue (AA2)

  • NOBS : number of residues observed at the substitution position in the multiple alignment (excluding gaps)
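The relationship between the three score columns can be illustrated with a small worked example. The data frame below uses made-up values, not real PolyPhen output, and simply recomputes DSCORE from SCORE1 and SCORE2 as defined above.

```r
# Toy rows with made-up PSIC scores; not real PolyPhen output
scores <- data.frame(
  SCORE1 = c(1.50, 0.20, 2.10),
  SCORE2 = c(-0.75, 0.10, 2.10)
)

# DSCORE is the wild type score minus the mutant score
scores$DSCORE <- scores$SCORE1 - scores$SCORE2

print(scores)
```

A DSCORE of 0, as in the last row, means the wild type and mutant residues received the same PSIC score at that position.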

Protein 3D structure features :

  • NSTRUCT : initial number of BLAST hits to similar proteins with 3D structures in PDB

  • NFILT : number of 3D BLAST hits after identity threshold filtering

  • PDBID : protein structure identifier from PDB

  • PDBPOS : position of substitution in PDB protein sequence

  • PDBCH : PDB polypeptide chain identifier

  • IDENT : sequence identity between query and aligned PDB sequences

  • LENGTH : PDB sequence alignment length

  • NORMACC : normalized accessible surface

  • SECSTR : DSSP secondary structure assignment

  • MAPREG : region of the phi-psi (Ramachandran) map derived from the residue dihedral angles

  • DVOL : change in residue side chain volume

  • DPROP : change in solvent accessible surface propensity resulting from the substitution

  • BFACT : normalized B-factor (temperature factor) for the residue

  • HBONDS : number of sidechain-sidechain and sidechain-mainchain hydrogen bonds formed by the residue

  • AVENHET : average number of contacts with heteroatoms per residue

  • MINDHET : closest contact with heteroatom

  • AVENINT : average number of contacts with other chains per residue

  • MINDINT : closest contact with other chain

  • AVENSIT : average number of contacts with critical sites per residue

  • MINDSIT : closest contact with a critical site

Nucleotide sequence features (CpG/codon/exon junction) :

  • TRANSV : whether substitution is a transversion

  • CODPOS : position of the substitution within the codon

  • CPG : whether or not the substitution changes CpG context

  • MINDJNC : substitution distance from exon/intron junction
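To make the TRANSV definition concrete, the sketch below classifies a nucleotide substitution as a transversion from its two alleles, such as the values in NT1 and NT2. The helper is_transversion() is hypothetical and not part of VariantAnnotation, and how TRANSV is encoded in the database is not stated on this page, so this shows the concept rather than a way to reproduce the column.

```r
# Hypothetical helper, not part of VariantAnnotation
is_transversion <- function(nt1, nt2) {
  purines <- c("A", "G")
  nt1 <- toupper(nt1)
  nt2 <- toupper(nt2)
  # Only unambiguous DNA bases are handled in this sketch
  stopifnot(all(c(nt1, nt2) %in% c("A", "C", "G", "T")))
  # A transversion swaps a purine for a pyrimidine or the reverse
  (nt1 %in% purines) != (nt2 %in% purines)
}

# A>C is a transversion; A>G and C>T are transitions
print(is_transversion(c("A", "A", "C"), c("C", "G", "T")))
```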

Pfam protein family :

  • PFAMHIT : Pfam identifier of the query protein

Substitution scores :

  • IDPMAX : maximum congruency of the mutant aa residue to all sequences in multiple alignment

  • IDPSNP : maximum congruency of the mutant aa residue to the sequence in alignment with the mutant residue

  • IDQMIN : query sequence identity with the closest homologue deviating from the wild type aa residue

Comments :

  • COMMENTS : Optional user comments

Author(s)

Valerie Obenchain

See Also

?PolyPhenDb


Bioconductor/VariantAnnotation documentation built on March 28, 2024, 10 a.m.