read.wtccc.signals: read normalized signals in the WTCCC signal file format

View source: R/wtccc.signals.R

Description

read.wtccc.signals takes a file and a list of SNP IDs (either Affymetrix ProbeSet IDs or rs numbers), and extracts the matching entries into a form suitable for plotting and further analysis.

Usage

read.wtccc.signals(file, snp.list)

Arguments

file

The file containing the signals. There is no need to gunzip it first.

snp.list

A list of SNP IDs. Because some Affymetrix SNPs do not have rs numbers, both rs numbers and Affymetrix ProbeSet IDs are accepted.

Details

Do not specify both the rs number and the Affymetrix ProbeSet ID of the same SNP in the input; one of them is enough.

The signal file is formatted as follows. The first 5 columns are the Affymetrix ProbeSet ID, rs number, chromosome position, AlleleA and AlleleB. The rest of the header contains the sample IDs with "_A" and "_B" appended.

  AFFYID         RSID       pos   AlleleA AlleleB 12999A2_A 12999A2_B ...
  SNP_A-4295769  rs915677   14433758  C     T     0.318183  0.002809 
  SNP_A-1781681  rs9617528  14441016  A     G     1.540461  0.468571 
  SNP_A-1928576  rs11705026 14490036  G     T     0.179653  2.261650

The routine matches the input list against the first and second columns.

Some early signal files have the first "AFFYID" missing; this routine can cope with that as well.
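As a rough illustration of the layout and matching described above, the following base-R sketch parses a few lines in that format and reshapes one SNP's row into a sample-by-allele matrix. It is not the package's own implementation: `parse_signals_sketch` is a hypothetical helper, the second sample (`SAMPLE2`) and its values are made up for illustration, and the sketch does not handle the early files with the missing "AFFYID".

```r
# Toy data in the signal file layout. The 12999A2 values come from the
# sample rows above; SAMPLE2 and its values are invented for illustration.
lines <- c(
  "AFFYID RSID pos AlleleA AlleleB 12999A2_A 12999A2_B SAMPLE2_A SAMPLE2_B",
  "SNP_A-4295769 rs915677 14433758 C T 0.318183 0.002809 0.300000 0.010000",
  "SNP_A-1781681 rs9617528 14441016 A G 1.540461 0.468571 1.500000 0.450000"
)

# Hypothetical helper, not part of the package.
parse_signals_sketch <- function(lines, snp.list) {
  fields <- strsplit(trimws(lines), "[[:space:]]+")
  header <- fields[[1]]
  rows <- fields[-1]
  # Fields 6 onward are "<sample>_A", "<sample>_B" pairs; recover the sample IDs.
  samples <- unique(sub("_[AB]$", "", header[-(1:5)]))
  result <- lapply(snp.list, function(id) {
    # Match the requested ID against the first (AFFYID) and second (RSID) column.
    hit <- Filter(function(r) id %in% r[1:2], rows)
    if (length(hit) == 0) return(NULL)  # SNP not found
    signals <- as.numeric(hit[[1]][-(1:5)])
    matrix(signals, ncol = 2, byrow = TRUE,
           dimnames = list(samples, c("A", "B")))
  })
  names(result) <- snp.list
  result
}

# "rs0000000" is a made-up ID that is absent from the toy data.
res <- parse_signals_sketch(lines, c("rs9617528", "rs0000000"))
```

Here `res$rs9617528` is a matrix with one row per sample, while the entry for the absent ID is NULL, mirroring the return value described under Value.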

Value

The routine returns a list of named matrices, one for each input SNP (NULL if the SNP is not found); the row names are sample IDs and columns are "A", "B" signals.
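A minimal sketch of how such a return value might be handled, using a hand-built stand-in list rather than a real call (so that it runs without the package or a signal file). The three rows are taken from the output shown in the Examples section; `rs0000000` is a made-up ID standing in for a SNP that was not found.

```r
# Stand-in for the list returned by read.wtccc.signals().
answer <- list(
  "SNP_A-4284341" = matrix(
    c(1.446261, 0.831480,
      1.500956, 0.551987,
      1.283652, 0.722847),
    ncol = 2, byrow = TRUE,
    dimnames = list(c("12999A2", "12999A3", "12999A4"), c("A", "B"))
  ),
  "rs0000000" = NULL  # hypothetical SNP that was not found
)

# Identify which requested SNPs were found before using them.
found <- !vapply(answer, is.null, logical(1))
missing_snps <- names(answer)[!found]

# Each found entry is a numeric matrix: rows are samples, columns "A" and "B".
sig <- answer[["SNP_A-4284341"]]
n_samples <- nrow(sig)

# Scatter plot of the two allele signals (only when run interactively).
if (interactive()) {
  plot(sig[, "A"], sig[, "B"],
       xlab = "A signal", ylab = "B signal", main = "SNP_A-4284341")
}
```

Checking for NULL entries first avoids errors when indexing columns of a SNP that was absent from the file.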

Note

TODO: There is a built-in limit to the input line buffer (65535) which should be sufficient for 2000 samples and 30 characters each. May want to seek backwards, re-read and dynamically expand if the buffer is too small.
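The estimate works out as 2000 samples times 30 characters, or 60000 characters, which leaves some headroom under 65535. As a hedged sketch, the longest line of a given file could be measured beforehand in base R. `longest_line` is a hypothetical helper, not part of the package, and exactly how the buffer limit counts line terminators is an assumption here, so treat a result close to the limit with caution.

```r
# Hypothetical helper: report the longest line (in bytes) of a signal file
# and whether it is below the stated buffer limit.
longest_line <- function(path, limit = 65535L) {
  con <- gzfile(path, "r")  # gzfile() also reads uncompressed files
  on.exit(close(con))
  longest <- 0L
  # Read in chunks so that large files are not held in memory at once.
  while (length(chunk <- readLines(con, n = 1000L)) > 0) {
    longest <- max(longest, nchar(chunk, type = "bytes"))
  }
  list(longest = longest, fits = longest < limit)
}

# Demonstration on a small temporary gzipped file in the signal layout.
tmp <- tempfile(fileext = ".txt.gz")
out <- gzfile(tmp, "w")
writeLines(c("AFFYID RSID pos AlleleA AlleleB 12999A2_A 12999A2_B",
             "SNP_A-4295769 rs915677 14433758 C T 0.318183 0.002809"), out)
close(out)

check <- longest_line(tmp)
```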

Author(s)

Hin-Tak Leung htl10@users.sourceforge.net

References

http://www.wtccc.org.uk

Examples

## Not run: 
answer <-
  read.wtccc.signals("NBS_22_signals.txt.gz", c("SNP_A-4284341","rs4239845"))
summary(answer)
##               Length Class  Mode
## SNP_A-4284341 2970   -none- numeric
## rs4239845     2970   -none- numeric

head(answer$"SNP_A-4284341")
##                A        B
## 12999A2 1.446261 0.831480
## 12999A3 1.500956 0.551987
## 12999A4 1.283652 0.722847
## 12999A5 1.549140 0.604957
## 12999A6 1.213645 0.966151
## 12999A8 1.439892 0.509547

## End(Not run)

chopsticks documentation built on Nov. 8, 2020, 7:51 p.m.