The QTL Viewer utilizes R and several different libraries in order to calculate the data for various types of QTL projects. The following sections will explain each element in detail.
Please note that some data element must be pre-computed.
The following elements should be contained within the R environment. These can be in one or multiple RData and/or Rds files.
Element | Description
--- | ---
ensembl_release
| the numerical version of Ensembl
genoprobs
| the genotype probabilities
K
| the kinship matric
map
| list of one element per chromosome, with the genomic position of each marker
markers
| marker names and positions
The following element is a special element. A good practice is to keep a one to one matching between dataset and Rds file.
Element | Description
--- | ---
dataset.*
| where * should be a very short, unique and informative name. This element will contain most of the data and will be detailed in the section below.
Exact case of element and variable names is very important.
Other meta data can be included in the RData file as long as there are no conflicting names.
R data type:
numeric
This specifies the genome release version for the genomic marker positions and for annotations attached to molecular phenotypes IF any, i.e. mRNA. Please see the documentation at Ensembl for build and release information.
R data type:
list, calc_genoprobs
This is the genotype probabilities and must be supplied by the user. This is a list with one element per chromosome of N * K * Mj arrays, where:
rownames(genoprobs)
are the same value of the sample id column in the samples element
colnames(genoprobs)
are strains, for the founder strains they are symbols A,B,C,D,E,F,G,H
dimnames(genoprobs[[j]])
are marker names on chromosome j
May be produced by qtl2convert::probs_doqtl_to_qtl2. Please see the documentation of R/qtl2geno.
R data type:
list
A list of kinship matrices, with one element per chromosome of N * N matrices, where:
rownames(K)
are the same value of the sample id column in the samples element
colnames(K)
are the same value of the sample id column in the samples element
May be produced by qtl2geno::calc_kinship(genoprobs, type=”loco”). Please see the documentation of R/qtl2geno.
R data type:
list
This is a list with one element per chromosome of named numeric vector. Elements of the vector are positions along the chromosome in Mb units. Element names are marker names and must match the dimnames of genoprobs.
Users can download maps for MUGA platforms or for 69k pseudomarker grid.
May be produced by qtl2convert::map_df_to_list. Please see the documentation of R/qtl2geno.
R data type:
tibble
Marker information containing the following information:
The environment must contain at least one object of this type, multiple are allowed. The * should be a very short, unique and informative name. It is for internal use only and will not appear in the QTL Viewer interface.
The main purpose of the dataset.* element is to store multiple datasets per RData file with informative information regarding the data.
The dataset.* element is a list that should contain the following named elements:
Element | Description
--- | ---
annot.**datatype**
| annotations, where datatype is one of mrna, protein, phos or phenotype
annot_samples
| annotation data for the samples
covar_info
| (optional) information describing the covariates
data
| either a matrix containing data or a list containing several kinds of data
datatype
| one of mrna, protein, phos or phenotype
display_name
| name of the dataset, for QTL Viewer display purposes
lod_peaks
| (optional) a list of LOD peaks over a certain threshold
R data type:
tibble
The annot_datatype element will have different data and column names depending on whether this is a mrna, protein, phos or phenotype dataset.
For mrna, the following fields are required:
Field | Description
--- | ---
gene_id
| character string, Ensembl gene id
symbol
| character string, Symbol of the gene
chr
| character string, chromosome
start
| numeric, position in Mbp
end
| numeric, position in Mbp
For protein, all mrna fields PLUS the following field:
Field | Description
--- | ---
protein_id
| character string, Ensembl protein id
For phos, all protein fields PLUS the following field:
Field | Description
--- | ---
phos_id
| character string, Phosphopetide ID
For phenotype, the following fields are required:
Field | Description
--- | ---
data_name
| character string, phenotype id
short_name
| character string, short descriptive name
category
| character string, category if any
description
| character string, phenotype description
is_id
| logical, should only be 1 TRUE
is_pheno
| logical, is this an actual phenotype
is_numeric
| logical, is this a numeric field
omit
| logical, T to omit, F to include
use_covar
| character string, colon seperated covar values
Also for phenotype, the following fields are legacy fields (not required):
Field | Description
--- | ---
units
| character string, measureing units
R_name
| character string, name used by R
R_category
| character string, category used by R
is_date
| logical, does this contain a date
is_factor
| logical, is this a factor
factor_levels
| character string, “:” separated values
is_covar
| logical, is this a covariate
is_derived
| logical, is this phenotype derived
Extra information in the tibble will be ignored by the QTL Viewer.
R data type:
tibble
Annotations for the samples in this dataset. The unique identifying column is sample_id. We use a regular expression to determine the unique sample_id column. Examples that work are mouse.id, mouse_id, sample.id, sample_id. There should be a unique value for sample_id in every row.
For the purpose of doing certain scans, there will need to be other columns that match the information stored in the covar_info element.
R data type:
tibble
This element controls how we scan and interact with the RData object. The following columns must be present:
Field | Description
--- | ---
sample_column
| name of the column in the annot_samples element
display_name
| QTL Viewer uses this to display a nice name
interactive
| TRUE for an interactive covariate, must also set lod_peaks if TRUE. If FALSE, lod_peaks value should be NA. This controls whether or not interactive scans are performed for a particular covariate.
primary
| which covariate to display preselected in the Effect Plot
lod_peaks
| named tibble in the lod_peaks element
R data type:
matrix or list
This element is either a matrix or a list.
If it is a matrix, there is one and only set of data for this dataset.
If it is a list, each named element in the list should be a matrix with the following controlled vocabulary for the names:
Each matrix will contain numerical data with samples (rows) by annotations (columns).
R data type:
character
This will be used to identify the type of dataset. This is a controlled vocabulary consisting of the following values:
Based upon the value of this element, the QTL Viewer will treat the data as accordingly.
R data type:
character
This will be used to display the name of the dataset to the user in the QTL Viewer. This will be used in a dropdown menu to switch among the datasets.
R data type:
list
This is a list with each value in the list being either additive (the default) or one of the interactive covariates (if set in covar_info). The additive values should always be present.
The covar_info element should have values with interactive set to TRUE and lod.peaks set to the name of the element in this list.
Depending on the value of datatype (mrna, protein, phos, phenotype), the annotation column identifier will match to the appropriate column in the annot_ datatype element.
The following shows the required fields in each tibble.
If datatype is mrna, the following fields are required:
Field | Description
--- | ---
gene_id
| the Ensembl gene identifier in the annot_mrna element
marker_id
| the marker identifier in the markers element
lod
| the lod score
If datatype is protein, the following fields are required:
Field | Description
--- | ---
protein_id
| the Ensembl protein id in the annot_protein element
marker_id
| the marker identifier in the markers element
lod
| the lod score
If datatype is phos, the following fields are required:
Field | Description
--- | ---
phos_id
| the Phosphopeptide identifer
marker_id
| the marker identifier in the markers element
lod
| the lod score
If datatype is phenotype, the following fields are required:
Field | Description
--- | ---
data_name
| the unique identifier in the annot_phenotype element
marker_id
| the marker identifier in the markers element
lod
| the lod score
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.