localize returns genomic coordinates (chromosome, strand, starting position, ending position) of a set of probes in a given genome. It relies on the external Blast-Like Alignment Tool (BLAT) to perform fuzzy matching on both strands, and provides various filters suited to CGH probes.
blatInstall needs to be executed once after the R package installation in order to use localize.
blat |
Single character value, path to the BLAT executable file to use for localization. |
cygwin |
Single character value, path to the cygwin1.dll file that might be needed to run BLAT on Windows. |
probeFile |
Single character value, path to a multi-fasta file describing the probes to compute the bias for. FASTA comments are used as probe names, and should be unique. |
chromFiles |
Character vector, paths to chromosome sequences (a single fasta file for each chromosome). |
chromPattern |
Single character value, a regular expression to be used for chromosome name extraction from |
blatArgs |
Character vector, arguments to be passed to BLAT ("name=value" or "-flag"). See the BLAT documentation in 'References' for further details. |
rawOutput |
Single logical value, whether to return the merged BLAT output or the processed one (see 'Value'). Note that raw output is not filtered. |
noMulti |
Single logical value, whether to filter out probes located in multiple genomic positions or not. Ignored if |
noOverlap |
Single logical value, whether to filter out overlapping probes or not (when two overlapping probes are detected, both are discarded). Ignored if |
noPartial |
Single logical value, whether to filter out partial matches or not (they will still be used by other filters, to disable them completely consider using different BLAT arguments). Ignored if |
verbose |
Single numeric value, the level of verbosity (0, 1 or 2). |
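As an illustration of chromPattern, the sketch below extracts chromosome names from file names with a regular expression. The file names, the pattern, and the way the pattern is applied (to the base name of each file, keeping the first capture group) are all assumptions made for the example, not the documented behavior of localize.

```r
# Hypothetical file names; only the naming scheme matters here.
chromFiles <- c("genome/chr1.fa", "genome/chr2.fa", "genome/chrX.fa")

# Assumed pattern: capture whatever sits between "chr" and ".fa".
chromPattern <- "^chr([0-9XY]+)\\.fa$"

# Assumption: the pattern is applied to the file base name and the
# first capture group is kept as the chromosome name.
chromNames <- sub(chromPattern, "\\1", basename(chromFiles))
print(chromNames)

# sub() leaves a string unchanged when the pattern does not match,
# so unchanged names reveal a pattern that needs fixing.
stopifnot(!any(chromNames == basename(chromFiles)))
```

Checking the extracted names before running a long alignment is a cheap way to catch a pattern that does not fit the file naming scheme.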
If rawOutput, localize returns the tabular section of the merged psLayout 3 file returned by BLAT (see the BLAT documentation in 'References' for further details).
Otherwise, it returns a data.frame with a row for each probe that was found and not filtered, ordered by chrom, start, then name:
name |
Character, the probe names, as defined by comments in |
chrom |
Character, the chromosomal location of the probe, as defined by the |
strand |
Character, "+" for a forward match, "-" for a reverse complement match. |
start |
Integer, the lower position of the probe in the chromosome. See 'Coordinate system'. |
end |
Integer, the upper position of the probe in the chromosome. See 'Coordinate system'. |
insertions |
Integer, number of nucleotides inserted in the probe when referring to the chromosome sequence. |
deletions |
Integer, number of nucleotides deleted in the probe when referring to the chromosome sequence. |
mismatches |
Integer, number of mismatching nucleotides between probe and chromosome sequence. |
freeEnds |
Integer, number of nucleotides at probe extremities ignored in the alignment. |
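To make the processed output and the noMulti and noOverlap filters more concrete, here is a minimal sketch on a made-up table that uses some of the columns described above. It does not call localize; the data, the helper variable names, and the choice to evaluate both filters on the full table are assumptions for illustration only.

```r
# Toy result with some of the documented columns (values are made up).
hits <- data.frame(
  name   = c("p3", "p1", "p2", "p2", "p4"),
  chrom  = c("2", "1", "1", "3", "1"),
  strand = c("+", "+", "-", "+", "+"),
  start  = c(500L, 100L, 300L, 900L, 140L),
  end    = c(559L, 159L, 359L, 959L, 199L),
  stringsAsFactors = FALSE
)

# Order by chrom, start, then name.
hits <- hits[order(hits$chrom, hits$start, hits$name), ]

# noMulti-like flag: probes found at more than one genomic position.
multi <- hits$name %in% hits$name[duplicated(hits$name)]

# noOverlap-like flag: two closed intervals on the same chromosome
# overlap when each one starts before the other ends.
n <- nrow(hits)
sameChrom <- outer(hits$chrom, hits$chrom, "==")
crossing  <- outer(hits$start, hits$end, "<=") & outer(hits$end, hits$start, ">=")
overlap   <- rowSums(sameChrom & crossing & !diag(n)) > 0

# Both members of an overlapping pair are discarded.
kept <- hits[!multi & !overlap, ]
print(kept)
```

In this toy table p2 is dropped for matching two positions, p1 and p4 are dropped for overlapping each other, and only p3 remains.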
When rawOutput is FALSE, coordinates begin at 1, both boundaries are included in the sequence, and length can be computed as end - start + 1 (Biostrings behavior).
When rawOutput is TRUE, refer to the BLAT specifications (see 'References').
In both cases, backward matches (strand = "-") are expressed in forward coordinates (start < end) (BLAT behavior).
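The difference between the two conventions can be sketched as follows, assuming the usual reading of the PSL format, in which target positions are zero-based and the end position is exclusive. The variable names tStart and tEnd follow the PSL column names; the values are made up, and the conversion shown is an illustration rather than the code used by localize.

```r
# Hypothetical PSL target coordinates for one 60-nucleotide match.
tStart <- 99L   # zero-based position of the first aligned base
tEnd   <- 159L  # zero-based position just past the last aligned base

# One-based, closed equivalent (both boundaries included).
start <- tStart + 1L
end   <- tEnd

# Both conventions describe the same number of nucleotides.
len <- end - start + 1L
stopifnot(len == tEnd - tStart)
print(c(start = start, end = end, length = len))
```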
BLAT relies on a single executable file, so installation is straightforward.
Download the executable file or compile it for your computer architecture, then use the blatInstall function to copy it to the proper package folder for later use. Precompiled executables for various systems can be found on the author's website (see 'References'), as part of the BlatSuite (only 'blat.exe' or 'blat' is needed).
Running BLAT on Windows requires Cygwin. You can install Cygwin entirely on your system (see 'References'), or download the "cygwin1.dll" file and provide it to blatInstall, as it is the only Cygwin component needed. DLL files are a common vector for computer viruses, so make sure you trust the website you download this file from. You can reasonably safely (no guarantee!) download it from the mirrors of the official website (see 'References'); they generally keep compressed archives in /release/cygwin, in which you can find the DLL (in /usr/bin).
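Before running blatInstall, it can help to confirm that a BLAT executable is actually reachable. The sketch below only uses base R to look for a file named "blat" on the PATH; that BLAT is on the PATH is an assumption, and the blatInstall call is left commented out because it requires the package to be loaded and its exact behavior is not shown here.

```r
# Look for a BLAT executable on the PATH.
# Sys.which() returns an empty string when nothing is found.
blatPath <- unname(Sys.which("blat"))

if (nzchar(blatPath)) {
  # file.access(mode = 1) returns 0 when the file can be executed.
  runnable <- file.access(blatPath, mode = 1) == 0
  message("Found BLAT at ", blatPath, " (executable: ", runnable, ")")
  # Then, once, with the package loaded (argument name as documented above):
  # blatInstall(blat = blatPath)
} else {
  message("No 'blat' on the PATH; download or compile it first, ",
          "then pass its full path to blatInstall.")
}
```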
Sylvain Mareschal
BLAT is open-source software, freely available for academic, nonprofit and personal use. See the FAQ for further details. FAQ, specifications, source code and executables
Cygwin is free and open-source software under the GNU General Public License. Official website