hsvdrg

Overview

hsvdrg is an R package for antiviral drug resistance genotyping of Herpes Simplex Virus 1 (HSV-1) sequencing data. Accepted inputs are FASTA files (whole genomes or fragments), which are mapped to RefSeq NC_001806.2. NGS variant data assembled to NC_001806.2 are accepted in VCF (version 4.0 or later) and VarScan2 tab formats.
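As a rough illustration of how the three accepted input types differ, the sketch below guesses a file's format from its first non-empty line. `guess_input_format()` is a hypothetical helper written for this README, not part of hsvdrg, and the package's own detection logic may work differently.

```r
# Hypothetical helper (not part of hsvdrg): guess the input format
# from the first non-empty line of a file.
guess_input_format <- function(path) {
  lines <- readLines(path, n = 50, warn = FALSE)
  lines <- lines[nzchar(trimws(lines))]
  if (length(lines) == 0) stop("file is empty: ", path)
  first <- lines[1]
  if (startsWith(first, ">")) {
    "fasta"                      # FASTA records begin with a ">" header
  } else if (startsWith(first, "##fileformat=VCF")) {
    "vcf"                        # VCF files open with a ##fileformat line
  } else if (grepl("\t", first, fixed = TRUE)) {
    "tab"                        # assumed: any other tab-delimited table
  } else {
    "unknown"
  }
}

# usage with a throwaway FASTA file
tmp <- tempfile(fileext = ".fasta")
writeLines(c(">example", "ACGTACGT"), tmp)
guess_input_format(tmp)
```

A check like this can be handy before calling `call_resistance()` on a batch of files of mixed origin.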

Database

All data are extracted from https://academic.oup.com/jac/article/71/1/6/2363653, and the database contains the relationships between:

Web service

A user-friendly Shiny application is bundled with this package. The same application is available online at http://cmv-resistance.ucl.ac.uk/hsvdrg/, where the terms of use can be found.

Installation

You can install the current version from GitHub with:

# install.packages("devtools")
devtools::install_github("ojcharles/hsvdrg")

FASTA file handling depends on MAFFT and SNP-sites, both preferably installed via conda. snp-sites >= 2.3 has been tested.

conda config --add channels bioconda
conda install snp-sites
conda install mafft
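To confirm from within R that both command-line tools are visible, a minimal check with base R's `Sys.which()` could look like the following. It assumes the executables are named `mafft` and `snp-sites`, as installed by the conda packages above, and that R was started from an environment where conda's `bin` directory is on the `PATH`.

```r
# Look up the external tools needed for FASTA input on the PATH.
tools <- c("mafft", "snp-sites")
paths <- Sys.which(tools)

# Sys.which() returns an empty string for tools it cannot find.
missing_tools <- names(paths)[!nzchar(paths)]

if (length(missing_tools) > 0) {
  message("Not found on PATH: ", paste(missing_tools, collapse = ", "))
} else {
  message("All FASTA dependencies found.")
}
```

VCF and VarScan2 tab inputs do not need these tools, so a missing entry here only matters for FASTA files.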

Usage

vcf data

library("hsvdrg")

## call resistant variants
my_sample = system.file("testdata", "F716L.vcf", package = "hsvdrg")

data = call_resistance(infile = my_sample, all_mutations = FALSE)

data[ , c("change", "freq", "Aciclovir", "Pencyclovir", "Foscarnet")]
#>       change freq Aciclovir Pencyclovir Foscarnet
#> 1 UL30_F716L 100%       1.8                   2.9


## call all variants
mutations_all = call_resistance(infile = my_sample, all_mutations = TRUE)

# to view all mutations in a resistance gene (here UL30), filter on GENEID
mutations_res = mutations_all[mutations_all$GENEID %in% c("UL30"),]

head(mutations_res[,c(1,8,21,32:40)])
#>        change freq   CONSEQUENCE Aciclovir Cidofovir Foscarnet Brivudin
#> 1 UL30_A1235A 100%    synonymous      <NA>      <NA>      <NA>     <NA>
#> 2    UL30_F2F 100%    synonymous      <NA>      <NA>      <NA>     <NA>
#> 3  UL30_F716L 100% nonsynonymous       1.8       2.9       2.9         
#> 4 UL30_G1006G 100%    synonymous      <NA>      <NA>      <NA>     <NA>
#>   Pencyclovir ref_link ref_doi            test_method tm_class
#> 1        <NA>     <NA>      NA                   <NA>     <NA>
#> 2        <NA>     <NA>      NA                   <NA>     <NA>
#> 3                   84      NA plaque reduction assay         
#> 4        <NA>     <NA>      NA                   <NA>     <NA>


## call nonsynonymous variants
# are there any nonsynonymous variants (DNA variants that change the amino acid) in resistance genes?
mutations_res_nonsyn = mutations_res[mutations_res$CONSEQUENCE == "nonsynonymous",]


# here the only nonsynonymous mutation in UL30 is F716L, which has resistance data recorded.
head(mutations_res_nonsyn[,c(1,8,21,32:40)])
#>       change freq   CONSEQUENCE Aciclovir Cidofovir Foscarnet Brivudin
#> 3 UL30_F716L 100% nonsynonymous       1.8       2.9       2.9         
#>   Pencyclovir ref_link ref_doi            test_method tm_class
#> 3                   84      NA plaque reduction assay
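The `freq` column is printed above as a percentage string such as `100%`. To filter on variant frequency (for example, to pick out minority variants), it can be converted to a number first. The sketch below uses a small mock data frame rather than real package output. The second row and the 50% cut-off are purely illustrative, and it assumes `freq` is stored as character text in the form shown.

```r
# Mock data mimicking two columns of call_resistance() output.
# The 12.5% row is invented for illustration only.
calls <- data.frame(
  change = c("UL30_F716L", "UL30_G1006G"),
  freq   = c("100%", "12.5%"),
  stringsAsFactors = FALSE
)

# Strip the percent sign and convert to numeric.
calls$freq_num <- as.numeric(sub("%", "", calls$freq, fixed = TRUE))

# Keep variants below an arbitrary 50% threshold.
minority <- calls[calls$freq_num < 50, ]
minority$change
```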

fasta sequences

# hsvdrg accepts sequence fragments or whole genomes, one FASTA sequence at a time.

# load example data
my_sequence = system.file("testdata", "F716L.fasta", package = "hsvdrg")

dat = call_resistance(infile = my_sequence, all_mutations = TRUE)
#> [1] "fasta found"

head(dat)
#>        change    seqnames start   end width strand         id freq RefCount
#> 1 UL30_A1235A NC_001806.2 66511 66511     1      + single run 100%        0
#> 2    UL30_F2F NC_001806.2 62812 62812     1      + single run 100%        0
#> 3  UL30_F716L NC_001806.2 64952 64952     1      + single run 100%        0
#> 4 UL30_G1006G NC_001806.2 65824 65824     1      + single run 100%        0
#> 5  UL31_M290T NC_001806.2 66511 66511     1      - single run 100%        0
#>   VarCount VarAllele varAllele CDSLOC.start CDSLOC.end CDSLOC.width PROTEINLOC
#> 1        1         G         G         3705       3705            1       1235
#> 2        1         C         C            6          6            1          2
#> 3        1         C         C         2146       2146            1        716
#> 4        1         C         C         3018       3018            1       1006
#> 5        1         G         C          869        869            1        290
#>   QUERYID TXID  CDSID GENEID   CONSEQUENCE REFCODON VARCODON REFAA VARAA
#> 1       4   15 18, 62   UL30    synonymous      GCA      GCG     A     A
#> 2       1   15     18   UL30    synonymous      TTT      TTC     F     F
#> 3       2   15     18   UL30 nonsynonymous      TTC      CTC     F     L
#> 4       3   15     18   UL30    synonymous      GGA      GGC     G     G
#> 5       4   62 18, 62   UL31 nonsynonymous      ATG      ACG     M     T
#>   aachange mutation_id virus genotype gene aa_change Aciclovir Cidofovir
#> 1   A1235A          NA  <NA>       NA <NA>      <NA>      <NA>      <NA>
#> 2      F2F          NA  <NA>       NA <NA>      <NA>      <NA>      <NA>
#> 3    F716L         473  HSV1       NA UL30     F716L       1.8       2.9
#> 4   G1006G          NA  <NA>       NA <NA>      <NA>      <NA>      <NA>
#> 5    M290T          NA  <NA>       NA <NA>      <NA>      <NA>      <NA>
#>   Foscarnet Brivudin Pencyclovir ref_link ref_doi            test_method
#> 1      <NA>     <NA>        <NA>     <NA>      NA                   <NA>
#> 2      <NA>     <NA>        <NA>     <NA>      NA                   <NA>
#> 3       2.9                            84      NA plaque reduction assay
#> 4      <NA>     <NA>        <NA>     <NA>      NA                   <NA>
#> 5      <NA>     <NA>        <NA>     <NA>      NA                   <NA>
#>   tm_class co_gene co_aa created_date created_by                    note status
#> 1     <NA>      NA    NA         <NA>       <NA>                    <NA>   <NA>
#> 2     <NA>      NA    NA         <NA>       <NA>                    <NA>   <NA>
#> 3               NA    NA   10/03/2022  OJCharles cdv 2.9. but sensetive?      A
#> 4     <NA>      NA    NA         <NA>       <NA>                    <NA>   <NA>
#> 5     <NA>      NA    NA         <NA>       <NA>                    <NA>   <NA>
#>    X
#> 1 NA
#> 2 NA
#> 3 NA
#> 4 NA
#> 5 NA
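With `all_mutations = TRUE` the result can be wide, so a quick cross-tabulation by gene and consequence is often easier to read. The sketch below rebuilds the `GENEID` and `CONSEQUENCE` columns from the five rows printed above as a mock data frame, so it runs without the package installed. With real output, the same `table()` call should work on `dat` directly, assuming those column names are present.

```r
# Mock data copied from the GENEID and CONSEQUENCE columns shown above.
calls <- data.frame(
  change      = c("UL30_A1235A", "UL30_F2F", "UL30_F716L",
                  "UL30_G1006G", "UL31_M290T"),
  GENEID      = c("UL30", "UL30", "UL30", "UL30", "UL31"),
  CONSEQUENCE = c("synonymous", "synonymous", "nonsynonymous",
                  "synonymous", "nonsynonymous"),
  stringsAsFactors = FALSE
)

# Count variants per gene and consequence type.
counts <- table(gene = calls$GENEID, consequence = calls$CONSEQUENCE)
print(counts)
```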

other features

## view the full database
db = hsvdrg_data()
head(db$aa_change)
#> [1] "C6G"  "H7P"  "A12P" "A12T" "D14V" "D14Y"


## run the shiny application
# runShinyHSV()
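The `aa_change` strings shown above follow the usual reference residue, position, variant residue pattern (for example `A12P`). A minimal sketch for splitting them into columns with base R regular expressions is given below. It runs on the six values printed above rather than on the live database, and it assumes simple single-residue substitutions: entries in any other form (if the database contains them) are skipped.

```r
# The six aa_change values printed by head(db$aa_change) above.
aa <- c("C6G", "H7P", "A12P", "A12T", "D14V", "D14Y")

# One reference residue, a position, one variant residue.
pattern <- "^([A-Z])([0-9]+)([A-Z])$"
ok <- grepl(pattern, aa)

parsed <- data.frame(
  aa_change = aa[ok],
  ref = sub(pattern, "\\1", aa[ok]),
  pos = as.integer(sub(pattern, "\\2", aa[ok])),
  var = sub(pattern, "\\3", aa[ok]),
  stringsAsFactors = FALSE
)

# All recorded substitutions at position 12.
parsed[parsed$pos == 12, ]
```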

Getting help

If you encounter a clear bug, please file an issue with a minimal reproducible example on the GitHub Issues page. For questions and other discussions, feel free to contact the maintainer, Oscar Charles.



ojcharles/hsvdrg documentation built on Jan. 19, 2021, 2:02 p.m.