crispR

Takehome exercises (v1[2]) for Bioinformatics Software Engineer position at Vertex Pharmaceuticals.

I wrote the functions directly in a R package. This facilitates the installation of all the dependencies...

Installation

# install.packages("devtools")
devtools::install_github("c1au6i0/crispR")

Part1 Answers

a) the code for the function

You can access the code of the function find_proto here.

b) the code to call the function with the example variables (and others, if desired)

library(crispR)
find_proto(d_seq = "TGATCTACTAGAGACTACTAACGGGGATACATAG",
           l = 2,
           PAM = "NGG")

..or using DNA of the Dopamine Transporter (DAT internal data):

library(crispR)
print(DAT)
library(crispR)
find_proto(d_seq = DAT, 
           l = 20, 
           PAM = "NGG")

c) the time complexity for the function (in big-O notation)

I am not explicitly using any loop, but my function is in any case iterating and looking at each nucleotide of the sequence by using grep (stringr and regular expressions).

time Complexity: O(n)

Part2 Answers.

a) The code for the function

You can access the code of the function find_FASTA here.

b) The source of the FASTA file used for the reference genome in the example problem

I downloaded the Reference Genome Sequence GRCh38 from here.

c) How many candidate guide (protospacer) sequences were identified in the example problem

A total of 54 protospacers were identified on strand (+). Please note the arguments "start", "end" and "l" are 1-indexed and intervals are fully closed.

d) The list of candidate guide (protospacer) sequences in a tab-delimited file...

A tab-delimited file can be downloaded here.

Dependencies

All the dependencies are listed in the Description file in my github account here.

Time needed to right the code

A quick and dirty version can be written probably in 1 hour or less. I polished the code, wrote the documentation too, and in total it took me few hours... but I also spent quite some time thinking about the reverse complementary strand!



c1au6i0/crispR documentation built on Feb. 27, 2020, 12:42 a.m.