2. Cell coordinates and cell indices

Description

This part describes how Affymetrix cells, also known as probes or features, are addressed.

Cell coordinates

In Affymetrix data files, cells are uniquely identified by their cell coordinates, i.e. (x,y). For an array with N*K cells in N rows and K columns, the x coordinate is an integer in [0,K-1], and the y coordinate is an integer in [0,N-1]. The cell in the upper-left corner has coordinate (x,y)=(0,0) and the one in the lower-right corner has coordinate (x,y)=(K-1,N-1).

Cell indices and cell-index offsets

To simplify addressing of cells, a coordinate-to-index function is used so that each cell can be addressed by a single integer instead of a pair of integers. Affymetrix defines the cell index, i, of cell (x,y) as

i = K*y + x + 1,

where one is added to give indices in [1,N*K]. This definition means that cells are ordered row by row, that is, from left to right and from top to bottom, starting at the upper-left corner. For example, with a chip layout (N,K)=(1600,1600) the cell at (x,y)=(0,0) has index i=1, and the cell at (x,y)=(1599,1599) has index i=2560000. A cell at (x,y)=(1498,3) has index i=6299.
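The definition above can be sketched in a few lines of R. The helper name xyToCellIndex() is hypothetical and is not part of affxparser; it only restates the formula i = K*y + x + 1 for zero-based (x,y) coordinates and a known number of columns K.

```r
# Hypothetical helper (not an affxparser function): one-based cell index
# from zero-based (x,y) coordinates, for an array with K columns.
xyToCellIndex <- function(x, y, K) {
  stopifnot(length(x) == length(y), all(x >= 0 & x < K), all(y >= 0))
  K * y + x + 1
}

# The three cells used in the text, on a 1600-by-1600 chip
i <- xyToCellIndex(x = c(0, 1599, 1498), y = c(0, 1599, 3), K = 1600)
print(i)  # the indices 1, 2560000 and 6299 quoted above
```

The function is vectorized, so whole sets of coordinates can be converted in one call.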

Given the cell index i, the coordinate (x,y) can be calculated as

y = floor((i-1)/K)

x = (i-1)-K*y.

Continuing the above example, the coordinate for cell i=1 is found to be (x,y)=(0,0), for cell i=2560000 it is (x,y)=(1599,1599), and for cell i=6299 it is (x,y)=(1498,3).
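The inverse mapping can be sketched in R in the same way. The helper name cellIndexToXY() is hypothetical and is not part of affxparser; it uses integer division (%/%) in place of floor((i-1)/K), which gives the same result for i >= 1.

```r
# Hypothetical helper (not an affxparser function): zero-based (x,y)
# coordinates from one-based cell indices, for an array with K columns.
cellIndexToXY <- function(i, K) {
  stopifnot(all(i >= 1))
  y <- (i - 1) %/% K      # same as floor((i - 1) / K)
  x <- (i - 1) - K * y
  cbind(x = x, y = y)
}

# The three cells used in the text, on a 1600-by-1600 chip
xy <- cellIndexToXY(c(1, 2560000, 6299), K = 1600)
print(xy)  # rows (0,0), (1599,1599) and (1498,3)

# Round trip: applying i = K*y + x + 1 recovers the original indices
stopifnot(all(1600 * xy[, "y"] + xy[, "x"] + 1 == c(1, 2560000, 6299)))
```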

Converting between cell indices and (x,y) coordinates in R

Converting between cell indices and cell coordinates is not needed in order to use the methods in this package. If you nevertheless need to do so, see xy2indices() and indices2xy() in the affy package.

Note on the zero-based "index" field of Affymetrix CDF files

An Affymetrix CDF file provides information on which cells should be grouped together. To identify these groups of cells, the cells are specified by their (x,y) coordinates, which are stored as zero-based coordinates in the CDF file.

All methods of the affxparser package make use of these (x,y) coordinates, and some methods make it possible to read them as well. However, it is much more common that the methods return cell indices calculated from the (x,y) coordinates as explained above.

In order to conveniently work with cell indices in R, the convention in affxparser is to use one-based indices, hence the addition (and subtraction) of ones in the above equations. This is all taken care of by affxparser.

Note that, in addition to (x,y) coordinates, a CDF file also contains a zero-based "index" for each cell. This "index" is redundant to the (x,y) coordinate and can be calculated analogously to the above cell index while leaving out the addition (subtraction) of ones. Importantly, since this "index" is redundant (and exists only in CDF files), we have decided to treat this field as an internal field. Methods of affxparser neither provide access to nor make use of this internal field.
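As a small illustration of the relationship, the sketch below assumes that the CDF "index" field is zero-based, as the heading of this note states, and that it equals K*y + x, i.e. the cell index formula without the added one. The variable names are illustrative only and nothing here reads an actual CDF file.

```r
# Assumption: the CDF "index" field is zero-based and equals K*y + x.
K <- 1600
x <- c(0, 1599, 1498)
y <- c(0, 1599, 3)

cdfIndex  <- K * y + x      # zero-based "index" as stored in the CDF file
cellIndex <- cdfIndex + 1   # one-based cell index as used by affxparser

print(cbind(x, y, cdfIndex, cellIndex))
```

Under this assumption the two quantities always differ by exactly one, which is why the field carries no information beyond the (x,y) coordinates.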

Author(s)

Henrik Bengtsson


affxparser documentation built on Nov. 8, 2020, 7:26 p.m.