In c1au6i0/crispR: find protospacers

crispR

Take-home exercises (v1[2]) for the Bioinformatics Software Engineer position at Vertex Pharmaceuticals.

I wrote the functions directly in an R package. This facilitates the installation of all the dependencies...

Installation

# install.packages("devtools")
devtools::install_github("c1au6i0/crispR")

Part1 Answers

a) the code for the function

You can access the code of the function find_proto here.

b) the code to call the function with the example variables (and others, if desired)

library(crispR)
find_proto(d_seq = "TGATCTACTAGAGACTACTAACGGGGATACATAG",
           l = 2,
           PAM = "NGG")

...or using the DNA sequence of the dopamine transporter (DAT, a dataset included in the package):

library(crispR)
print(DAT)

library(crispR)
find_proto(d_seq = DAT, 
           l = 20, 
           PAM = "NGG")

c) the time complexity for the function (in big-O notation)

I do not use an explicit loop, but the function still iterates over the sequence and examines each nucleotide, because it relies on grep-style regular-expression matching (via stringr).

Time complexity: O(n), where n is the length of the sequence.
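To make the linear-scan argument concrete, here is a minimal base R sketch of a PAM search. It is not the package's find_proto: the name find_proto_sketch, the reading of "l" as the protospacer length, and the restriction to the (+) strand are all assumptions made for illustration. With a fixed-length PAM, each position of the sequence is tested a bounded number of times, which is consistent with O(n).

```r
# Hypothetical helper, NOT the package's find_proto(); base R only.
# Assumes `l` is the protospacer length and searches the (+) strand only.
find_proto_sketch <- function(d_seq, l, pam = "NGG") {
  d_seq <- toupper(d_seq)
  # Translate the wildcard N into a character class.
  pam_regex <- gsub("N", "[ACGT]", toupper(pam), fixed = TRUE)
  # A zero-width lookahead also reports overlapping PAM sites.
  hits <- gregexpr(paste0("(?=", pam_regex, ")"), d_seq, perl = TRUE)[[1]]
  pam_start <- as.integer(hits)
  # Keep only PAM sites that have l full nucleotides upstream
  # (this also drops the -1 that gregexpr returns when nothing matches).
  pam_start <- pam_start[pam_start > l]
  data.frame(
    start = pam_start - l,          # 1-indexed, closed interval
    end = pam_start - 1L,
    protospacer = substring(d_seq, pam_start - l, pam_start - 1L),
    stringsAsFactors = FALSE
  )
}

res <- find_proto_sketch("TGATCTACTAGAGACTACTAACGGGGATACATAG", l = 2, pam = "NGG")
print(res)
```

The lookahead is a design choice: a plain match would consume each PAM and skip overlapping sites such as those inside a run of Gs.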

Part2 Answers

a) The code for the function

You can access the code of the function find_FASTA here.

b) The source of the FASTA file used for the reference genome in the example problem

I downloaded the Reference Genome Sequence GRCh38 from here.

c) How many candidate guide (protospacer) sequences were identified in the example problem

A total of 54 protospacers were identified on the (+) strand. Please note that the arguments "start", "end" and "l" are 1-indexed and that intervals are fully closed.
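As a small illustration of what 1-indexed, fully closed intervals imply, the sketch below uses a toy sequence (not the reference genome). The helper bed_to_closed is hypothetical and only shows how such coordinates would relate to 0-based, half-open ones, which some genomic formats use.

```r
chrom_seq <- "ACGTACGT"  # toy sequence, for illustration only
start <- 3
end <- 5

# substr() is itself 1-indexed and includes both endpoints.
region <- substr(chrom_seq, start, end)

# A closed interval [start, end] covers end - start + 1 nucleotides.
region_length <- end - start + 1

# Hypothetical helper: 0-based half-open [s, e) to 1-based closed [s + 1, e].
bed_to_closed <- function(bed_start, bed_end) {
  c(start = bed_start + 1, end = bed_end)
}

print(region)
print(region_length)
print(bed_to_closed(2, 5))
```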

d) The list of candidate guide (protospacer) sequences in a tab-delimited file...

A tab-delimited file can be downloaded here.
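For readers who want to produce a similar file themselves, a tab-delimited table can be written with base R's write.table. This is only a sketch: the column names and rows are toy placeholders, not the package's actual output format or results.

```r
# Toy rows for illustration only; these are not the package's results.
guides <- data.frame(
  start = c(20, 21),
  end = c(21, 22),
  protospacer = c("AA", "AC"),
  stringsAsFactors = FALSE
)

out_file <- tempfile(fileext = ".tsv")

# sep = "\t" gives tab-delimited output; quote and row.names are turned off
# so the file contains only the header and the data columns.
write.table(guides, file = out_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the file back to check the round trip.
read_back <- read.delim(out_file, stringsAsFactors = FALSE)
print(read_back)
```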

Dependencies

All the dependencies are listed in the DESCRIPTION file on my GitHub account here.

Time needed to write the code

A quick-and-dirty version can probably be written in 1 hour or less. I polished the code and wrote the documentation too, so in total it took me a few hours... but I also spent quite some time thinking about the reverse-complement strand!
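On the reverse-complement strand: one possible approach, sketched here as an assumption and not as what the package does, is to reverse-complement the sequence, search it like the (+) strand, and map the coordinates back. Both helper names below are hypothetical.

```r
# Hypothetical helper: reverse complement of a DNA string.
# Characters other than A, C, G, T (such as N) are left unchanged.
rev_comp <- function(d_seq) {
  complemented <- chartr("ACGT", "TGCA", toupper(d_seq))
  paste(rev(strsplit(complemented, "")[[1]]), collapse = "")
}

# Hypothetical helper: map a 1-indexed closed interval found on the
# reverse complement back to (+) strand coordinates, for a sequence of length n.
to_plus_coords <- function(start, end, n) {
  c(start = n - end + 1, end = n - start + 1)
}

print(rev_comp("AACG"))
print(to_plus_coords(1, 2, n = 4))
```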

c1au6i0/crispR documentation built on Feb. 27, 2020, 12:42 a.m.
