Spreadsheet Implementation Technology: Basics and Extensions Peter Sestoft https://mitpress.mit.edu/books/spreadsheet-implementation-technology
1.3 Cell reference formats, p. 3
| | A1 | R1C1 |
|-----|:-----|:-----------|
| abs | $B$4 | R4C2 |
| rel | B4 | R[4]C[2]
|
Two main points:
Two subtleties the table above glosses over:
By default, an A1-formatted cell reference is relative. Use a dollar sign $
to indicate absolute.
By default, an R1C1-formatted cell reference is absolute. Relative references are indicated by putting the offset inside square brackets. Except when the offset if zero, when it is left out. So RC
is how to say "this cell" -- not R[0]C[0]
.
We like R1C1 format, especially for relative references, because it is invariant when you copy a formula. And it is just more sane, in every possible sense, actually.
Meaning of cell references assuming found in cell B2
| A1 | R1C1 | Meaning |
|:-----|:-------------|:----------------------------|
| A1 | R[-1]C[-1]
| rel: prev row, prev col |
| A2 | RC[-1]
| rel: this row, prev col |
| B1 | R[-1]C
| rel: prev row, this col |
| B2 | RC
| rel: this cell |
| C3 | R[1]C[1]
| rel: next row, next col |
| $A$1 | R1C1
| abs: row 1, col 1 or A |
| $A2 | RC1
| mixed: this row, col 1 or A |
| C$3 | R3C[1]
| mixed: row 3, next col |
From Peter Sestoft's book Spreadsheet Implementation. Section 2.1.1, p28-->.
A subset of the core concepts that seem immediately relevant to us:
colRef
and rowRef
specifying the column and row PLUS a pair of Booleans colAbs
and rowAbs
which indicate whether the associated reference is absolute vs. relative.(row, col)
. For Sestoft, it's zero-based.The contents of a cell must be either:
=
followed by an expressionHow do we go from the string representation of a formula (concrete syntax) to the internal representation (abstract syntax)? Formula text is first scanned or tokenized, then it is parsed into an abstract syntax tree (AST). Sestoft presents a full grammar for cell contents (p34) but here I just focus on the cell references.
Here's what cell references look like in the string representation:
[WORKBOOK.xlsx]'WORKSHEET NAME'!A1
Note: Sestoft never mentions the possibility of the filename appearing at the front, but that's definitely possible in Excel.
Here's the relevant subset of the Sestoft cell contents grammar:
Expr ::=
Raref
| Raref : Raref
Sheetref ::=
Name ! Raref
| Name ! Raref : Raref
Raref ::=
Column Row
| $ Column Row
| Row $ Column
| $ Column $ Row
| R Offset C Offset
Offset ::=
<empty>
| Uint
| [ Int ]
where Column
is a column name A, B, ...; Row
is a row number 1, 2, ...; Uint
is a non-negative integer; and Int
is an integer.
Sestoft doesn't allow 3-D cell references, e.g. Sheet1:Sheet12!B2
, and I don't plan to deal with them either.
Sestoft doesn't allow for full-column references, e.g. B:B
, but I think I need to. Not sure why he doesn't mention full-row references.
Sestoft describes full-on scanning/tokenization + parsing for string representations of cell contents (p33); this returns an AST for each cell as an object of class Cell
. In cellranger
we only deal with cell references and I'll just use regular expressions.
Revisiting classes relevant to cellranger
:
colAbs
and rowAbs
are logical indicators of absolutenesscolRef
and rowRef
specify row and column, in an absolute or relative senseThe Addr method for a RARef returns a CellAddr. English: single cell reference goes in, (row, column) pair comes out. FYI, the result of this is always used in a context where sheet info is available separately, e.g. from the CellRef object that contained the associated RARef.
The constructor for a CellAddr takes the row and column of the host cell and a RARef and returns absolute cell address that the RARef points to.
The ToString method for CellAddr take an absolute cell address and returns a typical string representation. Looks like Sestoft always returns in A1 format in this case. Why?
The Show method for a RARef generates one of the usual string representations of a cell reference. Sestoft's Show takes row and column (so he Shows only after applying Addr method, I guess?) and a format (A1 vs R1C1) as input.
A CellRef is a valid Expr in Sestoft's Corecalc. When you Eval it, the cell address gets resolved (row, column, sheet) and the formula in that cell is evaluated. We obviously won't do all of that in cellranger
(i.e. the formula retrieval and evaluation), but should do the cell resolution part. I think we need to a function that takes a host cell and a (potentially relative) cell reference and gives back another (absolute) cell reference. File and sheet reference (or lack thereof) should propagate.
The Show method for a CellRef builds up its output from the the Show method for a RARef and the sheet reference, if such exists.
Example given in 1.4 Formulas, functions, and arrays p. 5
Consider formula =SUM(A$1:A1)
in the absolute cell B1
.
I should run through what happens when you move/copy that formula into B2
or B3
or column C
. Come back to this.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.