tileseqMave is a bioinformatics pipeline for processing sequencing data from MAVE experiments based on the TileSeq protocol. For an introduction and more background on TileSeq and MAVEs, see Weile et al., 2017 and Weile & Roth, 2018. The pipeline allows for the calculation of fitness scores from sequencing reads and provides a suite of quality control visualizations.
tileseqMave constitutes the second half of the TileSeq pipeline. The first half is implemented in tileseq_mutcount and provides alignment and variant calling of TileSeq sequencing reads.
tileseqMave is compatible with Unix-based operating systems, requires R 3.6.3 or greater, and has a number of R package dependencies, which will be installed automatically. It also uses the output of tileseq_mutcount, which should be installed separately on an HPC platform.
In addition, some of the modular components within tileseqMave rely on optional dependencies: to condense QC reports into single PDFs, ghostscript needs to be installed; and to conveniently draw heatmaps of the variant effect maps for the score sets calculated by tileseqMave, the mavevis package needs to be installed.
For the following instructions, you will need a basic familiarity with the Unix command line. To install the tileseqMave package, use the R package remotes. Open an R session and execute:
install.packages("remotes")
remotes::install_github("jweile/tileseqMave")
After the installation completes successfully, you can close the R session again. If the installation fails for some reason, look up the exact error message and consult the "Troubleshooting" section below.
Important: To ensure that the tileseqMave suite can be executed easily on the command line, its main executable needs to be available via your $PATH variable. The easiest way to achieve this is to symlink it into your preferred bin/ directory after having installed the package as shown above. On UNIX systems, pick a directory that's already on your $PATH, for example ~/bin/ or ~/.local/bin/. For example, if you want to use the latter, you can create your symlink as follows:
#determine the repository location
REPO=$(Rscript -e 'cat(system.file(package="tileseqMave"),"\n",sep="")')
#symlink the tsm script into your bin directory
ln -s $REPO/scripts/tsm ~/.local/bin/tsm
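To verify that the script is now accessible, open a new shell session and check that the command resolves; this assumes that ~/.local/bin/ is indeed listed on your $PATH:
#check that the tsm script is found on the PATH
which tsm
#print the usage of the csv2json module as a quick smoke test
tsm csv2json --help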
Below are the solutions to a few problems we previously encountered:
If you're attempting to install software from GitHub from a busy location with a shared IP address, such as a university HPC server, it is possible that other users have already exhausted the maximum allowed number of unauthenticated downloads from GitHub for the day. To get around this, you can create a GitHub account and use your account credentials for the download. After creating your account, follow these instructions to create a 'personal access token' (PAT). Then, go to your home directory and create a file called .Renviron (if it doesn't already exist). Add the following line to the file, with your PAT after the equals sign:
GITHUB_PAT=your_token_here
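For example, one way to append this line directly from the shell (replace your_token_here with your actual PAT):
#append the GitHub token to your .Renviron file
echo 'GITHUB_PAT=your_token_here' >> ~/.Renviron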
Packages in R must be compiled using the same C compiler that was used to compile R itself when it was installed. If R was installed by someone else, e.g. your system administrator, they may have used a different C compiler to do so. Make sure you use the same one.
For users of the GALEN cluster at the Roth Lab: R was installed by the administrator using the default gcc 8.4.0, but by default, the modules system provides a newer version (gcc 9.0). Thus, before trying to install any R packages, you must first disable that module using
module unload gnu9
If you don't use C compilation for any other purposes, I suggest you configure your .bashrc to automatically do this upon login, by adding the following lines to it:
#disable R-incompatible C-compiler
case "$HOSTNAME" in
galen) module unload gnu9;;
esac
If your previously installed R packages suddenly appear to be missing: most likely, you or the local system administrator upgraded your local R installation. If this was a major upgrade, it comes with its own local package repository and you'll have to re-install your packages, including tileseqMave. After re-installing, you'll also have to update your script symlinks to point to the new repo (see the next issue below).
If R was recently upgraded to a newer version on your system and tileseqMave was re-installed into the corresponding new repository, the symlinks in your bin directory will still point to the old repo. You'll have to delete your old symlinks and create new ones (as shown above).
Within the greater TileSeq workflow, tileseqMave constitutes the last link in the chain. A flowchart of the entire workflow is shown in Figure 1 below. The workflow begins with Illumina sequencing and FASTQ conversion. The FASTQ files are then fed to tileseq_mutcount, which performs the steps highlighted in orange: reads are aligned to the target sequence template and a Bayesian method is used to call the most likely true variants. This brings us to tileseqMave, which performs the steps highlighted in blue. Using the variants called in the previous step, tileseqMave translates variant calls into protein-level amino acid changes, tallies them and calculates enrichment between assay conditions, while modeling the magnitude of the underlying measurement error. Finally, it also provides a suite of quality control outputs. The entire workflow is controlled by a standardized parameter sheet, which both tileseq_mutcount and tileseqMave use extensively.
Figure 1: Overview of the TileSeq workflow. The orange section is covered by tileseq_mutcount, while the blue section is covered by tileseqMave.
The functions performed by tileseqMave are organized into a set of discrete modules, which are all accessed via a common umbrella command: tsm. The general syntax for running any module is tsm <moduleName> [parameters]; for example, tsm csv2json parameterSheet.csv -o parameters.json. The parameters for each module are explained in detail in the module's respective section below. Overall, the following modules are available:
- csv2json: Validates the parameter sheet CSV file and converts it into the JSON format used by all downstream modules.
- joinCounts: Collates the output of tileseq_mutcount. The data is combined into a single table, tabulating the read counts and relative frequencies for each unique variant in each condition and replicate. It also calculates the protein-level consequences of each variant and produces marginal counts and frequencies.
- runLibraryQC: Generates a set of diagnostic plots assessing the quality of the variant library.
- calcEnrichment: Calculates the enrichment of variants between assay conditions, while modeling the underlying measurement error.
- runSelectionQC: Generates a set of diagnostic plots assessing the quality of the selection assay.
- scaleScores: Scales the enrichment values into final scores, based on pivot points defined in the parameter sheet.

In addition to these primary modules, there are a number of secondary helper tools:

- condenseQC: Combines individual QC plots into a single printer-friendly PDF document (requires ghostscript).
- mavevisLocal: Wraps mavevis at the command line level and can automatically visualize the output of the scaleScores module as variant effect maps.

tileseqMave and tileseq_mutcount use a common central input document called the parameter sheet, which is designed to capture all relevant aspects of the TileSeq experiment from which the sequencing data was generated. The parameter sheet is provided as a comma-separated-values (CSV) file, following a strict template. The easiest way to create such a file is to use a spreadsheet editor, such as Google Sheets, LibreOffice or Excel. An example spreadsheet can be found here. From there, it can be exported to CSV format and processed using the csv2json module, which will validate the contained data and convert it into a JSON document used by all downstream modules.
The parameter sheet is organized into numbered sections; among them, the following deserve special attention:

Condition relationships: Conditions can be tied to one another via two possible relationships: is_selection_for and is_wt_control_for. For example, if we have defined three conditions in section 4 (sel, non and ctrl), then sel could be the selection condition for non, and ctrl could be the WT control for both of them. These relationships are defined using a fixed table with three columns.

Filter parameters: Some filter thresholds are expressed relative to expectation (e.g. if set to 10, then a disagreement 10x higher than expectation triggers the filter). These filters are applied by calcEnrichment.

Scaling pivots: Since this section is used by the scaleScores module, its exact contents cannot reasonably be chosen until after inspection of the selectionQC results. It is therefore not necessary to include it until this step is performed. Its purpose is to manually define the pivot points for the score scaling. The pivot points represent the best guess for the expected behaviour of null-like and WT-like variants.

A template for the parameter sheet can be found here. You can use any spreadsheet software to fill it out according to the instructions above; then export it into CSV format.
Internally, both tileseq_mutcount and all modules of tileseqMave use a JSON-formatted version of the sheet. To convert the parameter sheet from CSV to JSON format, you can use the csv2json module. This will also perform a validation of the parameter sheet, ensuring that all requirements have been met. The module is used as follows:
usage: tsm csv2json [--] [--help] [--srOverride] [--silent]
[--outfile OUTFILE] [--logfile LOGFILE] infile
Checks TileSeq parameter CSV file for validity and converts it to JSON
format.
positional arguments:
infile input file
flags:
-h, --help show this help message and exit
-s, --srOverride Manual override to allow singleton replicates. USE
WITH EXTREME CAUTION!
--silent Turn off message printing to stdout
optional arguments:
-o, --outfile output file. Defaults to parameters.json in the
same directory.
-l, --logfile log file. Defaults to csv2json.log in the same
directory.
As you can see, the module offers a flag called --srOverride, which enables a compatibility mode with a limited feature set, allowing the processing of data sets with only a single replicate. Using this flag will trigger a prominent warning message about the potential problems and side-effects of this mode. To proceed with analyzing such data sets, the flag must also be set during all subsequent analysis steps.
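For example, a typical invocation from within your project directory might look like this (the sheet name is only an example):
#validate the parameter sheet and convert it to JSON
tsm csv2json parameterSheet.csv -o parameters.json -l csv2json.log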
This step is performed by tileseq_mutcount. Check out its GitHub page for detailed instructions. Executing the variant caller requires an HPC environment. In the Roth Lab, we use the BC2 or DC clusters at the Donnelly Centre, or the GALEN cluster at the Lunenfeld-Tanenbaum Research Institute.
As a first step, you will want to create a project directory in which the output will be collected. Place the parameter sheet CSV and JSON files for this project inside this folder, to ensure all input and output is organized together. Ideally, you will want to name the JSON file parameters.json, as this is the assumed default name and will allow you to execute most of the downstream modules without having to specify the parameter file as an argument. Then, run the script following the instructions on the tileseq_mutcount repository page, using the project directory as the output folder.
Among the output of tileseq_mutcount, you will find a subdirectory with a name following the pattern <tag>_<timestamp>_mut_count/, where <timestamp> indicates the date and time of execution, for example: mthfr_2020-02-20-11-15-49_mut_count/. This *_mut_count subdirectory will serve as the input for the next step. It contains individual count files for each sequencing sample. You may therefore want to move it to the top level of your project directory.
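For example, assuming the variant caller wrote its results to a separate output folder, the move could look like this (all names are hypothetical):
#move the count folder to the top level of the project directory
mv mutcount_output/mthfr_2020-02-20-11-15-49_mut_count/ ./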
In this step, we will collate the counts from the individual sequencing samples, organize them by variant and condition/replicate, and translate the called variants to the protein level. This step is performed by the joinCounts module. By default, it will look for the parameter sheet JSON file and the *_mut_count directory in your current working directory and automatically proceed from there. However, the command has a number of arguments that allow for more flexibility:
usage: tsm joinCounts [--] [--help] [--srOverride] [--covOverride]
[--silent] [--workspace WORKSPACE] [--input INPUT]
[--output OUTPUT] [--parameters PARAMETERS] [--logfile LOGFILE]
[--cores CORES]
Reads the output of fastq2Count to construct allCounts.csv and
marginalCounts.csv
flags:
-h, --help show this help message and exit
-s, --srOverride Manual override to allow singleton replicates. USE
WITH EXTREME CAUTION!
--covOverride Manual override to ignore positional coverage. For
backwards-compatibility with data prior to v0.7
--silent Turn off message printing to stdout
optional arguments:
-w, --workspace workspace data directory. Defaults to current
working directory
-i, --input input directory containing the count data. Defaults
to subdirectory with latest timestamp ending in
_mut_count
-o, --output output directory. Defaults to same as input
directory
-p, --parameters parameter file. Defaults to parameters.json in the
workspace directory.
-l, --logfile log file. Defaults to joinCounts.log in the
workspace directory
-c, --cores number of CPU cores to use in parallel for
multi-threading [default: 6]
Notably, joinCounts also has the --srOverride flag, which allows for limited functionality on single-replicate datasets. The -w, -i, -o, -p, and -l options allow you to specify the locations of the workspace, input, output, parameter sheet and log files, in case you are not using the default directory structure or naming schemes for your project.
The -c parameter allows you to specify the number of CPU cores on your hardware that you are willing to use. This can help speed up the variant translation process; however, it also multiplies the consumption of RAM, as each parallel segment will be run in a separate process fork. For example, a process that takes 3 hours and 500MB of RAM on a single CPU would take ~30min and 3GB of RAM when using 6 CPUs. We therefore recommend that you choose this trade-off carefully.
The --covOverride flag provides backwards compatibility with data generated prior to tileseq_mutcount version 0.7. Those earlier versions did not record position-specific sequencing coverage information in the tileseq_mutcount output. Using this flag ignores these records and instead assumes even coverage across each tile.
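As an example, the following call points joinCounts at an explicit workspace directory and increases the number of parallel cores (the path is hypothetical):
#run joinCounts on a given project folder using 12 CPU cores
tsm joinCounts -w ~/projects/mthfr/ -c 12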
After completion, joinCounts will have added a set of new files to the mut_count directory:

- The sample_depths.csv file contains a table listing the total read depth for each sequencing sample, as well as other summarizing information about the samples from the parameter sheet.
- The positionalDepths.csv file contains a table listing the position-specific effective depth at each nucleotide position in each sample. The effective depth indicates how many of the basecalls at that position were confidently decidable and thus useful for estimating variant frequency.
- The all_counts.csv file contains a large table of all variant combinations found in any tile, as well as the read count and frequency of each of these combinations across all samples, organized by condition, replicate and time point. Inspecting the file using less will reveal that it also contains a header listing the project name, pipeline version and parameter sheet location.
- The marginal_counts.csv file contains a table of all unique amino acid changes, as well as their marginal counts and marginal frequencies. The term "marginal" indicates that these counts and frequencies deliberately ignore any co-occurrences between variants. As a result, they do not add up to 1; instead, they add up to the average number of mutations per read (also called λ below). Like the previous file, marginal_counts.csv also starts with a header with key information about the project.

After joinCounts has completed, a library QC analysis can be performed using runLibraryQC. This will generate a number of diagnostic plots for each non-select condition in the data set. Like the previous module, runLibraryQC will by default look for the parameter sheet JSON file and the *_mut_count directory in your current working directory and automatically proceed from there, but a number of arguments allow for more flexibility:
usage: tsm runLibraryQC [--] [--help] [--srOverride] [--allConditions]
[--silent] [--workspace WORKSPACE] [--input INPUT]
[--output OUTPUT] [--parameters PARAMETERS] [--logfile LOGFILE]
[--cores CORES] [--wmThreshold WMTHRESHOLD]
Performs a library QC analysis on the output of joinCounts, yielding
informative plots.
flags:
-h, --help show this help message and exit
-s, --srOverride Manual override to allow singleton replicates.
USE WITH EXTREME CAUTION!
-a, --allConditions Run on all conditions instead of just nonselect.
--silent Turn off message printing to stdout
optional arguments:
-w, --workspace workspace data directory. Defaults to current
working directory
-i, --input input directory containing the count data.
Defaults to subdirectory with latest timestamp
ending in _mut_count
-o, --output output directory. Defaults to name of input
directory with _QC tag
-p, --parameters parameter file. Defaults to parameters.json in
the data directory.
-l, --logfile log file. Defaults to libraryQC.log in the same
directory
-c, --cores number of CPU cores to use in parallel for
multi-threading [default: 6]
--wmThreshold Define the marginal frequency threshold for
well-measuredness. [default: 5e-5]
The --srOverride, -w, -i, -o, -p, -l, and -c options operate as before (see above), but you may notice a few additional arguments. As mentioned above, this QC function is intended to be run on nonselect conditions only. The -a flag overrides this default behaviour and causes the program to generate library QC plots not only for the nonselect conditions, but for all conditions in the data, including any selection conditions or wildtype controls. However, it should be noted that this prevents normalization against the WT control, as most other conditions do not have a defined relationship with the control in the parameter sheet. Finally, the --wmThreshold option allows for modifying the marginal frequency threshold at which variants are considered "well-measured". This only affects the appearance of a few output plots, which mark this threshold.
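For example, to run the QC on all conditions with a stricter well-measuredness threshold (the value is purely illustrative):
#library QC across all conditions with a custom marginal frequency threshold
tsm runLibraryQC -a --wmThreshold 1e-4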
This module produces a number of diagnostic plots, which will be saved as PDF files in a new folder called <timestamp>_QC (where <timestamp> is the time of the original variant caller execution). The helper tool condenseQC can be used to combine all the individual plots into a single printer-friendly PDF document.
Among the plots is an assessment of sequencing error rates based on ΦX reads, in which the leftmost (dark-gray) bar shows the intended error rate according to PHRED specifications for comparison.

For simple library QC runs that are only interested in assessing the quality of a PopCode library, the analysis ends here. However, if the experiment you are analyzing is a full MAVE assay, you will want to proceed with the enrichment function next.
After joinCounts has completed (and optionally runLibraryQC), the enrichment function calcEnrichment can be called. This module analyzes the enrichment of variants as a result of selection in the underlying assay, and therefore requires that at least one selection condition was declared in the parameter sheet.
Like the previous modules, calcEnrichment will by default look for the parameter sheet JSON file and the *_mut_count directory in your current working directory and automatically proceed from there, but a number of arguments allow for more flexibility:
usage: tsm calcEnrichment [--] [--help] [--srOverride] [--bnOverride]
[--bcOverride] [--wtFilter] [--optimistic] [--useQuorum]
[--silent] [--workspace WORKSPACE] [--input INPUT]
[--output OUTPUT] [--parameters PARAMETERS] [--logfile LOGFILE]
[--cores CORES] [--bootstrap BOOTSTRAP]
Calculates logPhi enrichments based on the output of joinCounts
flags:
-h, --help show this help message and exit
-s, --srOverride Manual override to allow singleton replicates. USE
WITH EXTREME CAUTION!
-b, --bnOverride Manual override to disable bottleneck filtering.
--bcOverride Manual override to disable bias correction.
--wtFilter Enable WT filter. Removes variants with excess WT
ctrl counts.
--optimistic Uses bayesian regularization instead of maximizing
between prior and empiric SD
-u, --useQuorum Use majority vote to exclude frequency outliers
--silent Turn off message printing to stdout
optional arguments:
-w, --workspace workspace data directory. Defaults to current
working directory
-i, --input input directory containing the count data. Defaults
to subdirectory with latest timestamp ending in
_mut_count
-o, --output output directory. Defaults to name of input
directory with _scores tag
-p, --parameters parameter file. Defaults to parameters.json in the
data directory.
-l, --logfile log file. Defaults to 'enrichment.log' in the same
directory
-c, --cores number of CPU cores to use in parallel for
multi-threading [default: 6]
--bootstrap Number of bootstrap samples to use. NA uses
heuristics instead. [default: 10000]
The --srOverride, -w, -i, -o, -p, -l, and -c options operate as before (see above), but you may notice a few additional arguments.
The --bnOverride flag allows disabling the bottleneck filters. This is not recommended and generally intended only for testing purposes. Instead, you may wish to adjust the filter parameters in the parameter sheet.
The --optimistic argument adjusts the error regularization method to use a more optimistic model. Rather than using the maximum error predicted by either the model or the empirical replicate disagreement, it uses a Bayesian weighted average of the two (based on the pseudo-replicates assigned in the parameter sheet).
The --bootstrap argument sets the number of samples to use for bootstrapping. Greater values increase runtime, but may result in higher precision of error estimates (subject to diminishing returns). To disable bootstrapping, this can be set to NA or 0, in which case analytical error propagation is used instead.
The --bcOverride flag disables the bias correction procedure. Instead of regressing the logPhi enrichments of synonymous and nonsense variants against the underlying marginal frequencies to calculate the BCE (bias-corrected enrichment) outputs, the module will use simple averages of the synonymous and nonsense logPhis. The resulting pseudo-BCE values thus more closely resemble a preliminary scaling operation.
The --wtFilter flag enables the WT-control filter, which removes variants whose WT control marginal frequencies are excessively high (i.e. above the quantile defined in the parameter sheet).
The --useQuorum flag enables a quorum voting method, in which outlier replicates for a given variant are discarded if enough replicates exist.
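For example, a run that enables the WT filter and quorum voting while reducing the number of bootstrap samples might look like this (the values are only illustrative):
#enrichment with WT-control filter, quorum voting and fewer bootstrap samples
tsm calcEnrichment --wtFilter --useQuorum --bootstrap 1000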
The results of the enrichment function will be written to a new directory called <timestamp>_scores (where <timestamp> is again the time of the original variant caller execution). For each selection condition, it will contain a CSV-formatted file called enrichment.csv: a table containing the intermediate results of the internal calculations for all variants, regardless of filter status. A column called "filter" indicates whether the variant was filtered out and, if so, for what reason. The table is preceded by a header that recapitulates key settings from the parameter sheet, as well as the project name, date and software version.
At the end of the enrichment step, either a selection QC step can be performed to gain insight into the quality of the assay, or alternatively, one may proceed directly to the final score scaling step.
After the enrichment module has been executed, a selection QC can be performed using runSelectionQC. Similarly to the library QC function, this will generate a number of diagnostic plots as PDF files. Like the previous modules, it will by default look for the parameter sheet JSON file and the *_mut_count and *_scores directories in your current working directory and automatically proceed from there, but a number of arguments allow for more flexibility:
usage: tsm runSelectionQC [--] [--help] [--srOverride] [--silent]
[--workspace WORKSPACE] [--counts COUNTS] [--scores
SCORES] [--output OUTPUT] [--parameters PARAMETERS] [--logfile
LOGFILE]
Performs a selection QC analysis on the output of runScoring,
yielding informative plots.
flags:
-h, --help show this help message and exit
--srOverride Manual override to allow singleton replicates. USE
WITH EXTREME CAUTION!
--silent Turn off message printing to stdout
optional arguments:
-w, --workspace workspace data directory. Defaults to current
working directory
-c, --counts input directory containing the count data. Defaults
to subdirectory with latest timestamp ending in
_mut_count
-s, --scores input directory containing the score data. Defaults
to subdirectory with latest timestamp ending in
_scores
-o, --output output directory. Defaults to name of input
directory with _scores tag
-p, --parameters parameter file. Defaults to parameters.json in the
data directory.
-l, --logfile log file. Defaults to selectionQC.log in the same
directory
The --srOverride, -w, -o, -p, and -l options operate as before (see above). However, you will notice that instead of the generic input argument of previous modules, this module distinguishes between two different inputs: -c for the *_mut_count folder and -s for the *_scores folder.
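For example, if your project deviates from the default naming scheme, the two inputs can be specified explicitly (the directory names are hypothetical):
#selection QC with explicitly named count and score directories
tsm runSelectionQC -c mthfr_2020-02-20-11-15-49_mut_count/ -s 2020-02-20-11-15-49_scores/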
The runSelectionQC module will write its output as PDF files into the same _QC directory that was previously used by the library QC module. Using the helper tool condenseQC, the individual files can be condensed into a single printer-friendly PDF document containing the diagnostic plots.
After the enrichment process has been completed, it is recommended that you run the selection QC as shown above and inspect the BCE distributions. These will yield insights into where to place the pivot points for the scaling process. The aim is to find BCE values for each tile (and each selection condition, if there is more than one) that represent the expected behavior of a null-like and a WT-like variant. Often, these are well-approximated by the medians of the nonsense and synonymous distributions, respectively, but other considerations can influence this decision. Once a decision has been made, you can enter the pivot points into the parameter sheet (in the scaling pivots section) and re-validate it using the csv2json module. Then, the scaleScores module can be used to enact the scaling based on these pivots.
Like the previous modules, scaleScores will by default look for the parameter sheet JSON file and the *_mut_count and *_scores directories in your current working directory and automatically proceed from there, but a number of arguments allow for more flexibility:
usage: tsm scaleScores [--] [--help] [--srOverride] [--bnOverride]
[--autoPivot] [--silent] [--workspace WORKSPACE]
[--scores SCORES] [--output OUTPUT] [--parameters PARAMETERS]
[--logfile LOGFILE]
Performs scaling and scoring on the output of calcEnrichment.
flags:
-h, --help show this help message and exit
--srOverride Manual override to allow singleton replicates. USE
WITH EXTREME CAUTION!
-b, --bnOverride Manual override to disable bottleneck filtering.
-a, --autoPivot Automatically determine scale pivots without requiring
definitions in parameter sheet
--silent Turn off message printing to stdout
optional arguments:
-w, --workspace workspace data directory. Defaults to current
working directory
-s, --scores input directory containing the score data. Defaults
to subdirectory with latest timestamp ending in
_scores
-o, --output output directory. Defaults to name of input
directory with _scores tag
-p, --parameters parameter file. Defaults to parameters.json in the
data directory.
-l, --logfile log file. Defaults to selectionQC.log in the same
directory
The --srOverride, --bnOverride, -w, -s, -o, -p, and -l options all operate the same as in the calcEnrichment module (see above). In addition, the --autoPivot flag is available. It allows you to bypass the requirement for setting pivot points in the parameter sheet and instead automatically uses the medians of the synonymous and nonsense distributions. Use with caution!
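For example, after entering the pivot points into the parameter sheet and re-validating it via csv2json, the scaling can simply be run from the project directory with default settings; alternatively, automatic pivots can be requested:
#scale scores using the pivots defined in the parameter sheet
tsm scaleScores
#or: derive the pivots automatically (use with caution!)
tsm scaleScores --autoPivot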
The output of scaleScores will be written to the _scores subfolder (like the output of calcEnrichment before). For each selection condition, it will generate:

- simple.csv: A simplified result file with the scaled scores in MaveDB-compatible format, containing only the variants that passed all filters, with their final scores and associated errors.
- simple_aa.csv: The scaled scores at the amino-acid-change level, obtained via confidence-weighted averaging of equivalent codon changes. This file is also in MaveDB format.
- simple_aa_floored.csv: Same as above, but with scores floored to 0 before averaging.

All of the above files are preceded by headers that list crucial parameters and a link to the parameter sheet, as well as the project name and the software version.
mavevis is an optional tool for drawing variant effect maps as heatmaps. If you have it installed, tileseqMave provides a useful wrapper script to incorporate it into your workflows. However, installing mavevis first requires a few additional software tools, namely DSSP, FreeSASA, and ClustalOmega. For an example of how to install these on an Ubuntu system with root access, see the mavevis GitHub page. If you don't have Ubuntu with root access, you'll have to install these tools manually.
Only after these tools have been installed can you install mavevis in R via remotes::install_github("VariantEffect/mavevis").
Having completed the installation, you can use the wrapper script as follows:
usage: tsm mavevisLocal [--] [--help] [--squish] [--srOverride] [--opts
OPTS] [--workspace WORKSPACE] [--input INPUT] [--output OUTPUT]
[--parameters PARAMETERS] [--pdb PDB]
Runs mavevis locally on a MaveDB-formatted file to produce a
genophenogram.
flags:
-h, --help show this help message and exit
-s, --squish Compress x-axis to fit map onto single screen.
--srOverride Manual override to allow singleton replicates. USE
WITH EXTREME CAUTION!
optional arguments:
-w, --workspace workspace data directory. Defaults to current
working directory
-i, --input either input file or input directory containing the
count data. Defaults to subdirectory with latest
timestamp ending in _scores
-o, --output either output pdf file or output directory. Default
adjusts automatically depending on input type
-p, --parameters parameter file. Defaults to parameters.json in the
data directory.
--pdb PDB structures. Semicolon-separated list of
#-separated pairings between PDB IDs and chain IDs.
The output of mavevisLocal will be written to the _scores subfolder (like the output of calcEnrichment and scaleScores before). By default, it will generate a PDF file with a heatmap for each MaveDB-formatted file in the folder.
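For example, to render a map compressed to a single screen with a structure track attached (the PDB entry and chain below are purely hypothetical):
#visualize the score files, squished to one screen, with chain A of PDB entry 1ABC
tsm mavevisLocal --squish --pdb "1ABC#A"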