This function reads SNP genotype data and creates an object of class
"snp.matrix" or "X.snp.matrix".
Input data are assumed to be arranged as one line per
SNP-call (without any headers). This function can read gzipped files.
read.snps.long.old(file, chip.id, snp.id, codes, female,
                   conf = 1, threshold = 0.9, drop=FALSE,
                   sorted=FALSE, progress=interactive())
file: Name of file containing the input data. Input files
  which have been compressed by gzip can be read.
chip.id: Array of type ...
snp.id: Array of type ...
codes: For autosomal SNPs, an array of length 3 giving the codes
  for the three genotypes, in the order homozygous(AA), heterozygous(AB),
  homozygous(BB). For X SNPs, an additional two codes for the male
  genotypes (AY and BY) must be supplied. All other codes will be treated
  as "no call". The default codes are ...
female: If the data to be read refer to SNPs on the X chromosome, this
  argument must be supplied and should indicate whether each row of
  data refers to a female ...
conf: Confidence score. See details.
drop: If ...
threshold: Acceptance threshold for confidence score.
sorted: Is input file already sorted into the correct order (see details)?
progress: If ...
Data are assumed to be input with one line per call, in free
format:
  <chip-id> <snp-id> <code for genotype call> [<confidence>] ...
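To make the layout concrete, the following base R sketch writes a small file in this format and reads it back with read.table. The chip names, SNP names, genotype codes and confidence values are invented for illustration, and "free format" is assumed here to mean white-space separated fields. The sketch only shows the file layout; it does not call read.snps.long.old.

```r
# Hypothetical long-format input: <chip-id> <snp-id> <code> <confidence>
lines <- c(
  "chip1 rs001 AA 0.98",
  "chip1 rs002 AB 0.95",
  "chip2 rs001 BB 0.99",
  "chip2 rs002 AB 0.42"
)
path <- tempfile(fileext = ".txt")
writeLines(lines, path)

# One call per line, no header line; fields assumed to be white-space separated
calls <- read.table(path, header = FALSE, stringsAsFactors = FALSE,
                    col.names = c("chip", "snp", "code", "confidence"))
print(calls)
```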
Currently, any fields following the first three (or four) are
ignored. If the argument sorted is TRUE, the file is assumed to be
sorted with snp-id as primary key and chip-id as secondary key using
the current locale. The rows and columns of the returned matrix will
also be ordered in this manner. If sorted is set to FALSE, then an
algorithm which avoids this assumption is used. The rows and columns of
the returned matrix will then be in the same order as the input chip.id
and snp.id vectors. Calls in which both id fields match elements in the
chip.id and snp.id arguments are read in, after (optionally) checking
that the level of confidence achieves a given threshold.
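The matching and ordering described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the data frame, the ids and the character result matrix are all hypothetical, and order() is used only because it also collates character keys according to the current locale.

```r
# Hypothetical calls, deliberately not in sorted order
calls <- data.frame(
  chip = c("chip2", "chip1", "chip2", "chip1", "chip3"),
  snp  = c("rs002", "rs002", "rs001", "rs001", "rs001"),
  code = c("AB", "AA", "BB", "AB", "AA"),
  stringsAsFactors = FALSE
)
chip.id <- c("chip2", "chip1")   # requested chips (rows of the result)
snp.id  <- c("rs002", "rs001")   # requested SNPs (columns of the result)

# Keep only calls whose two id fields both match the requested ids
keep <- calls$chip %in% chip.id & calls$snp %in% snp.id
calls <- calls[keep, ]

# Order assumed when sorted=TRUE: snp-id is the primary key, chip-id the secondary key
sorted.calls <- calls[order(calls$snp, calls$chip), ]

# Layout when sorted=FALSE: rows and columns follow chip.id and snp.id as given
result <- matrix(NA_character_, nrow = length(chip.id), ncol = length(snp.id),
                 dimnames = list(chip.id, snp.id))
result[cbind(calls$chip, calls$snp)] <- calls$code
print(result)
```

Sorting a copy of the input this way is one way to check whether a file already satisfies the ordering that sorted=TRUE assumes.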
Confidence level checking is controlled by the conf argument. conf=0
indicates that no confidence score is present and no checking is done.
conf>0 indicates that calls with scores above threshold are accepted,
while conf<0 indicates that only calls with scores below threshold
should be accepted.
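The acceptance rule can be written out as a small base R function. This is a sketch of the rule as described here, not the package's own code: accept.call is a hypothetical helper, and strict inequalities are an assumption, since the text does not say how a score exactly equal to threshold is treated.

```r
# Hypothetical helper illustrating the conf/threshold rule
accept.call <- function(score, conf = 1, threshold = 0.9) {
  if (conf == 0) {
    # No confidence score present: no checking, every call is accepted
    rep(TRUE, length(score))
  } else if (conf > 0) {
    # Accept calls with scores above the threshold
    score > threshold
  } else {
    # Accept only calls with scores below the threshold
    score < threshold
  }
}

scores <- c(0.98, 0.42, 0.90)
accept.call(scores, conf = 1)                    # TRUE FALSE FALSE
accept.call(scores, conf = -1, threshold = 0.5)  # FALSE TRUE FALSE
accept.call(scores, conf = 0)                    # TRUE TRUE TRUE
```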
The routine is case-sensitive and it is important that the
<chip-id> and <snp-id> match the cases of chip.id and snp.id exactly.
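A quick base R check can reveal requested ids that differ from the ids in the file only by case. The vectors below are hypothetical, and the check is only a diagnostic run before reading; it is not part of read.snps.long.old.

```r
# Hypothetical ids: the file uses lower-case "rs", one requested id uses "RS"
file.snps <- c("rs001", "rs002", "rs003")
snp.id    <- c("RS001", "rs002")

# Exact, case-sensitive matching finds only one of the two requested SNPs
exact <- snp.id %in% file.snps

# Requested ids that would match only if case were ignored
case.only <- (!exact) & (tolower(snp.id) %in% tolower(file.snps))
snp.id[case.only]   # "RS001"
```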
An object of class snp.matrix.
If more than one instance of any combination of chip.id element and
snp.id element passes the confidence threshold, the call to be used is
decided by the following rules:
1. Any call trumps "no-call"
2. In the event of call conflict, "no-call" is returned
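The two rules can be illustrated for a single chip/SNP combination with a short base R function. resolve.calls is a hypothetical helper, and representing "no-call" as NA is an assumption made for the example; it is not how the package stores genotypes. Repeated identical calls are treated as agreement rather than conflict, which is also an assumption.

```r
# Hypothetical helper: resolve repeated calls for one chip/SNP combination
# NA stands for "no-call" in this sketch
resolve.calls <- function(codes) {
  called <- unique(codes[!is.na(codes)])
  if (length(called) == 1) {
    called            # rule 1: a single distinct call trumps any "no-call"
  } else {
    NA_character_     # no call at all, or rule 2: conflicting calls
  }
}

resolve.calls(c("AA", NA, "AA"))   # "AA"
resolve.calls(c("AA", "AB"))       # NA
resolve.calls(c(NA, NA))           # NA
```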
Use of sorted=TRUE is usually discouraged since the alternative
algorithm is safer and, usually, not appreciably slower. However, if
the input file is to be read multiple times and there is a reasonably
close correspondence between cells of the matrix to be returned and
lines of the input file, the sorted option can be faster.
This function has been replaced by the more flexible function
read.snps.long.
David Clayton david.clayton@cimr.cam.ac.uk and Hin-Tak Leung
http://www-gene.cimr.cam.ac.uk/clayton
snp.matrix-class, X.snp.matrix-class