# Parse Chemical Formulas

### Description

Count the charge and number of elements in a chemical formula.

### Usage


### Arguments

| Argument | Description |
| --- | --- |
| `formula` | character, a chemical formula |
| `multiplier` | numeric, multiplier for the elemental counts in each formula |
| `sum` | logical, add together the elemental counts in all formulas? |
| `count.zero` | logical, include zero counts for elements? |

### Details

`makeup` parses a chemical formula expressed in string notation, returning the numbers of each element in the formula. The formula may carry a charge, indicated by a + or - sign, possibly followed by a magnitude, after the uncharged part of the formula. The formula may have multiple subformulas enclosed in parentheses (but the parentheses may not be nested), each one optionally followed by a numeric coefficient. The formula may have one suffixed subformula, separated by * or :, optionally preceded by a numeric coefficient. All numbers may contain a decimal point.

`makeup` calls a sequence of supporting functions depending on specific characters present in the formula. If the formula has a charge, it is first parsed using `count.charge`. If the formula has subformulas, in parentheses or as a suffix, they are separated and counted using `count.formulas`. Finally, the elements in each subformula are counted using `count.elements`.
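As a rough illustration of the first stage, the charge can be split from the rest of the formula with a regular expression. The function below is a base-R sketch that assumes only the rule stated above (a trailing + or - sign, optionally followed by a magnitude). `split_charge_sketch` is a hypothetical name, not part of CHNOSZ, and the code is not the implementation of `count.charge`; it only returns a list of the same shape.

```r
# Hypothetical helper (not part of CHNOSZ): separate a trailing charge
# from the uncharged part of a formula.
split_charge_sketch <- function(formula) {
  # a sign followed by an optional magnitude, anchored at the end of the string
  m <- as.integer(regexpr("[+-][0-9.]*$", formula))
  if (m == -1) {
    # no trailing sign: the formula is treated as uncharged
    return(list(charge = 0, uncharged = formula))
  }
  tail <- substring(formula, m)      # e.g. "+3", "-2" or "+"
  magnitude <- substring(tail, 2)    # text after the sign, possibly empty
  value <- if (magnitude == "") 1 else as.numeric(magnitude)
  if (substring(tail, 1, 1) == "-") value <- -value
  list(charge = value, uncharged = substring(formula, 1, m - 1))
}

split_charge_sketch("Fe+3")     # charge 3, uncharged "Fe"
split_charge_sketch("Fe3+")     # charge 1, uncharged "Fe3"
split_charge_sketch("C-4(O-2)") # charge 0, formula unchanged
```

Because the sign must be at the very end of the string, a signed count inside parentheses is not mistaken for a charge in this sketch.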

`count.elements` processes a simple chemical formula that must adhere to the following pattern: it starts with an elemental symbol; every elemental symbol starts with an uppercase letter and is followed by another elemental symbol, a number (possibly fractional, possibly signed), or nothing (the end of the formula).

Any sequence of one uppercase letter followed by zero or more lowercase letters is recognized as an elemental symbol by `count.elements`, but `makeup` will issue a warning for elemental symbols that are not present in `thermo$element`.
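The pattern described above maps naturally onto a regular expression. The following base-R sketch is one way to express it; `count_elements_sketch` is a hypothetical name, the code is not the CHNOSZ implementation of `count.elements`, and unlike `makeup` it does not check symbols against `thermo$element`.

```r
# Hypothetical helper (not part of CHNOSZ): count the elements in a simple
# formula with no charge, parentheses or suffix.
count_elements_sketch <- function(formula) {
  # one token = uppercase letter, zero or more lowercase letters,
  # then an optional signed, possibly fractional number
  pattern <- "[A-Z][a-z]*([+-]?[0-9]*\\.?[0-9]+)?"
  tokens <- regmatches(formula, gregexpr(pattern, formula))[[1]]
  # the tokens must cover the whole string, otherwise the formula is malformed
  if (length(tokens) == 0 || !identical(paste(tokens, collapse = ""), formula)) {
    stop("formula does not match the expected pattern: ", formula)
  }
  symbols <- regmatches(tokens, regexpr("^[A-Z][a-z]*", tokens))
  numbers <- substring(tokens, nchar(symbols) + 1)
  # a symbol without a number counts once
  counts <- rep(1, length(tokens))
  has_number <- numbers != ""
  counts[has_number] <- as.numeric(numbers[has_number])
  # add together repeated symbols, keeping the order of first appearance
  groups <- factor(symbols, levels = unique(symbols))
  vapply(split(counts, groups), sum, numeric(1))
}

count_elements_sketch("CH3COOH") # C = 2, H = 4, O = 2
count_elements_sketch("C-4O")    # C = -4, O = 1
```

The coverage check is what enforces the requirement that the formula start with an elemental symbol: a leading lowercase letter or digit is left out of every token, so the reassembled tokens no longer equal the input.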

`makeup` can handle numeric and length > 1 values for the `formula` argument. If the argument is numeric, it identifies row number(s) in `thermo$obigt` from which to take the formulas of species. If `formula` has length > 1, the function returns a list containing the elemental counts in each of the formulas. If `count.zero` is TRUE, the elemental counts for each formula include zeros for elements that are present only in the other formulas.
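The effect of zero-padding can be pictured with plain named vectors. In this base-R sketch the element counts for H2O and CO2 are written by hand rather than produced by `makeup`, and the element order is an assumption that may differ from what the package returns.

```r
# Hand-written element counts standing in for the output of makeup()
counts <- list(c(H = 2, O = 1),   # H2O
               c(C = 1, O = 2))   # CO2

# every element that appears in any formula
all_symbols <- unique(unlist(lapply(counts, names)))

# give each formula an entry for every element, zero where it is absent
padded <- lapply(counts, function(x) {
  out <- setNames(numeric(length(all_symbols)), all_symbols)
  out[names(x)] <- x
  out
})

# with equal-length vectors the formulas stack into a matrix
do.call(rbind, padded)
```

Padding matters when the counts are combined row by row, since vectors of different lengths and names cannot be bound into a matrix directly.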

The `multiplier` argument must have either length = 1 or length equal to the number of formulas. The elemental count in each formula is multiplied by the respective value. If `sum` is TRUE, the elemental counts in all formulas (after any multiplying) are summed together to yield a single bulk formula.
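The arithmetic behind `multiplier` and `sum` amounts to scaling each vector of counts and then adding entries that share a name. The base-R sketch below uses hand-written counts for H2O, H+, an electron and O2, with charge recorded under the name `Z`; it illustrates the calculation only and is not the code used by `makeup`.

```r
# Hand-written element counts; Z stands for charge (an assumption of this sketch)
counts <- list(c(H = 2, O = 1),   # H2O
               c(H = 1, Z = 1),   # H+
               c(Z = -1),         # electron
               c(O = 2))          # O2
multiplier <- c(-1, 2, 2, 0.5)

# multiply each formula's counts by its own coefficient
scaled <- Map(function(x, m) x * m, counts, multiplier)

# add together the entries that share an element name
all_counts <- unlist(scaled)
bulk <- vapply(split(all_counts, names(all_counts)), sum, numeric(1))
bulk
```

Every entry of `bulk` is zero here, which is how a balanced reaction (in both mass and charge) shows up in the summed counts.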

### Value

`count.charge` returns a list with named elements `charge` and `uncharged`, indicating, respectively, the numeric value of the charge and the original formula string excluding the charge. `count.formulas` returns a numeric vector with names referring to each of the subformulas, or to the whole formula if there are no subformulas. `count.elements` and `makeup` return numeric vectors with names referring to each of the elemental symbols in the formula. For `makeup`, if more than one formula is provided, a list of numeric vectors is returned, unless `sum` is TRUE.

### See Also

Many other functions in CHNOSZ rely on `makeup` for their operation: `mass` and `entropy` for calculating properties of chemical compounds from their elements; `basis` and `i2A` for constructing stoichiometric matrices (with `count.zero=TRUE`); `subcrt` for checking mass balance of chemical reactions; and others.

### Examples

```
# the composition of a simple compound
makeup("CO2") # 1 carbon, 2 oxygen
# the formula of lawsonite, with a parenthetical part and a suffix
makeup("CaAl2Si2O7(OH)2*H2O")
# fractional coefficients are ok
redfield <- c(106, 16, 1)
reddiv10 <- makeup("C10.6N1.6P0.1")
stopifnot(10*reddiv10 == redfield)
# the coefficient for charge is a number with a *preceding* sign
# e.g., ferric iron, with a charge of +3 is expressed as
makeup("Fe+3")
# transcribing the formula the way it appears in many
# publications produces a likely unintended result:
# 3 iron atoms and a charge of +1
makeup("Fe3+")
# these all represent a single negative charge, i.e., electron
makeup("-1")
makeup("Z0-1")
makeup("Z-1+0")
# hypothetical compounds with negative numbers of elements
makeup("C-4(O-2)") # -4 carbon, -2 oxygen
makeup("C-4O-2") # -4 carbon, 1 oxygen, -2 charge
makeup("C-4O-2-2") # -4 carbon, -2 oxygen, -2 charge
# the 'sum' argument can be used to check mass and charge
# balance in a chemical reaction
formula <- c("H2O", "H+", "Z0-1", "O2")
(mf <- makeup(formula, c(-1, 2, 2, 0.5), sum=TRUE))
stopifnot(all(mf==0))
```