CHNOSZ is a package for thermodynamic calculations, primarily with applications in geochemistry and biochemistry. It can be used to calculate the standard molal thermodynamic properties and chemical affinities of reactions relevant to geobiochemical processes, and to visualize the equilibrium activities of species on chemical speciation and predominance diagrams. The package can be used interactively and in batch mode, through the use of R source files containing a sequence of commands.
The major features of the package are outlined in the Overview given below, with links to specific help topics.
If you are a new user, the ‘anintro’ vignette (An introduction to CHNOSZ) may offer a more comfortable way to get started with using the package.
The help pages have been given either keywords or “concept index entries” which are visible to the ?? (help.search) operator. Use ??primary to browse the most commonly used functions and ??secondary to see other high-level, but less often-used functions. ??protein shows functions for working with proteins, and ??extra lists functions with extra functionality (beyond the main workflow).
For help on the thermodynamic database, use
??utilities (one of the standard R keywords) can be used to locate utility functions in the package; these include useful tools for modifying the database, converting units, reading protein sequence files, parsing chemical formulas, etc.
All thermodynamic data and examples are provided on an as-is basis.
It is up to you to check not only the accuracy of the data, but also the suitability of the data AND computational techniques for your problem.
By combining data taken from different sources, it is possible to build an inconsistent and/or nonsensical calculation.
An attempt has been made to provide a primary database (OBIGT.csv) that is internally consistent, but no guarantee can be made.
Where possible, data with known or suspected inconsistencies have been placed into a secondary database (OBIGT-2.csv) that should be regarded as experimental.
If there is any doubt about the accuracy or suitability of data for a particular problem, please consult the primary sources (see
Do not assume that by adding any species to your calculation (or to any of the examples), you will necessarily obtain a reasonable answer.
Do not assume that the examples are correct, or that they can be applied to your problem.
As with the data, please compare the construction and output of the examples to the primary sources, cited in the reference list in each help page.
Examples without a reference (and some with references) demonstrate experimental features of CHNOSZ.
Major features in CHNOSZ:
Thermodynamic database - assembles literature values of the standard thermodynamic properties and equations of state parameters of minerals, aqueous organic and inorganic species, gases and liquids (
Group additivity for proteins - estimate the standard thermodynamic properties and equations of state parameters for unfolded proteins from their amino acid composition; includes an additive calculation of ionization state of proteins as a function of temperature and pH (
File and internet access - read protein sequences from FASTA files, and download sequence information from UniProt (
Equations of state - calculate the standard thermodynamic properties of proteins or other species in the database, and reactions between them, as a function of temperature and pressure (
Stoichiometry - count elements in chemical formulas of species, check and optionally correct mass balance of chemical reactions (
System of interest - define the basis species for a system together with one or more species of interest; compute the stoichiometries of the formation reactions of the species of interest (
Chemical affinity - calculate the chemical affinities of the formation reactions of the species of interest at a single point, or as a function of one or more of chemical activities of the basis species, temperature and/or pressure (
Chemical activity - calculate the equilibrium activities of the species of interest as a function of the same variables used in the affinity calculation, using a reference state transformation (either the Boltzmann distribution or a reaction matrix approach). (
Activity diagrams - plot the equilibrium activities at a single point (as barplots), or as a function of one (species activity diagrams) or two (predominance diagrams) variables (
Buffer calculations (experimental) - compute activities of basis species that are determined by a buffer of one or more species (e.g., pyrite-pyrrhotite-magnetite; acetic acid-CO2) (
Activity statistics (experimental) - calculate summary statistics for equilibrium activities of species (
Multidimensional optimization (new in 0.9-3) (experimental) - using an iterative gridded optimization, find a combination of chemical activities of basis species, temperature and/or pressure that maximize or minimize the value of a target statistic (
Mass transfer calculations (experimental) - calculate changes in solution composition and formation of secondary species as a function of incremental reaction of a mineral (or protein) (
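The stoichiometry feature listed above (counting elements in formulas and checking the mass balance of a reaction) can be illustrated with a minimal base-R sketch. This is not the package's own implementation: the function names count_elements and reaction_balance are hypothetical, and the parser assumes simple formulas without parentheses or charges.

```r
# Hypothetical helper (not part of CHNOSZ): count the elements in a
# simple chemical formula such as "C6H12O6" or "C2H5OH".
count_elements <- function(formula) {
  # Each token is an element symbol followed by an optional integer
  tokens <- regmatches(formula, gregexpr("[A-Z][a-z]?[0-9]*", formula))[[1]]
  elements <- sub("[0-9]*$", "", tokens)
  digits <- sub("^[A-Za-z]+", "", tokens)
  # A missing count means one atom
  counts <- ifelse(digits == "", 1, as.numeric(digits))
  # Add up symbols that occur more than once (e.g. H in "C2H5OH")
  sapply(split(counts, elements), sum)
}

# Hypothetical helper: net amount of each element in a reaction;
# reactants have negative coefficients, products positive.
reaction_balance <- function(formulas, coeffs) {
  counts <- lapply(formulas, count_elements)
  elements <- sort(unique(unlist(lapply(counts, names))))
  totals <- setNames(numeric(length(elements)), elements)
  for (i in seq_along(counts)) {
    n <- counts[[i]]
    totals[names(n)] <- totals[names(n)] + coeffs[i] * n
  }
  totals
}

# Fermentation reaction used in the examples of this help page:
# fructose = 2 ethanol + 2 CO2
print(count_elements("C6H12O6"))
print(reaction_balance(c("C6H12O6", "C2H5OH", "CO2"), c(-1, 2, 2)))
```

A balanced reaction gives zero for every element; any nonzero entry identifies the element that is out of balance.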
Here are some tips for new users:
Install the package from CRAN using install.packages or its GUI menu equivalent.
To begin working with the package after installation, type library(CHNOSZ) at the command line (or select the name of the package from the GUI menu).
Running the examples shown in the various help topics is a great way to become more familiar with the usage of the functions. From help.start, select ‘Packages’ then ‘CHNOSZ’ and then select a function of interest. Individual examples can be run by pasting the example block directly into the R console.
Type the command examples() to run all of the examples provided in CHNOSZ. This takes a few minutes depending on your system. If things go as expected, the entire set will run without any warnings or errors.
Some of the examples require internet or file access or user intervention, or are intentionally written to demonstrate conditions that lead to errors. This code is hidden from R's package checking mechanism using the dontrun tag. You can experiment with dontrun examples by pasting the code to the R console.
A couple of other things to note about the examples: 1) There are some stopifnot statements that represent expected outcomes of the calculations; if the expectation is not met, the stopifnot statement causes an error. These tests are useful for checking the code during package development cycles, but are usually not of critical importance for the set-up of the problem (though they do sometimes employ useful programming tricks). 2) Commands written with an enclosing pair of parentheses (z <- "like this one") are used to display the result of an assignment operation (<-), the value of which is needed later in the calculation. In interactive use, the outermost pair of enclosing parentheses is generally not needed.
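The two conventions described above can be tried in base R without loading the package. The following is a minimal sketch; the variable names z, z2 and res are illustrative only.

```r
# An assignment wrapped in parentheses stores the value and prints it
(z <- "like this one")

# Without the parentheses the assignment is silent
z2 <- "like this one"

# stopifnot() returns silently when every condition is TRUE
stopifnot(identical(z, z2))

# When an expectation is not met, stopifnot() raises an error;
# try() catches it here so that the rest of a script can continue
res <- try(stopifnot(nchar(z) == 0), silent = TRUE)
print(inherits(res, "try-error"))
```

Removing the outer parentheses from the first command changes only what is printed, not the value stored in z.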
Also try out demos() to run the demos (some of these are longer running examples that are not part of the main help pages).
To learn how to update the thermodynamic database, look at its documentation in
Starting with version 1.0.5-1, the package depends on R version 3.1.0 or greater (for the extendInt argument of uniroot, used in rho.IAPWS95).
As of version 1.0.4 (release 1.0.5 on CRAN), the package depends on R version 3.0.0 or greater (earlier versions of R use Stangle to extract R code from vignettes when installing the source package, which fails when processing hotspring.Rnw, now built with knitr instead of Sweave).
Before version 1.0.4, the recommended version of R was 2.14.0 or greater (to find vignettes in the vignettes directory, and for availability of parallel in the standard library).
As of version 0.9-9, the package depends on R version 2.12.0 or greater (so useDynLib in the NAMESPACE can find the shared library on Windows).
Starting with version 0.9-6 of the package, the dependency was given as R version 2.10.0 or greater (to read compressed data files).
Before version 0.9-6 of the package, the dependency was given as R version 2.7.0 or greater (major update of the X11 device in 2.7.0).
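The R version requirements quoted above can be compared with the running R installation using base R. This is a sketch only; the min_r vector simply restates the numbers given in the text, and its name is illustrative.

```r
# Minimum R versions quoted in the text, keyed by CHNOSZ version
min_r <- c("0.9-6" = "2.10.0", "0.9-9" = "2.12.0",
           "1.0.4" = "3.0.0", "1.0.5-1" = "3.1.0")

# getRversion() returns a version object, so the comparison is made
# component by component rather than as plain text
r_now <- getRversion()
ok <- sapply(min_r, function(v) r_now >= v)

# TRUE means the running R satisfies that CHNOSZ version's requirement
print(ok)
```

Comparing version objects avoids the pitfall of text comparison, in which "2.7.0" would sort after "2.10.0".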
Without accessing the compressed data files in extdata it should be possible to run CHNOSZ on Unix-alikes under R versions 2.4.0 or greater (availability of the stringsAsFactors argument of
This package would not exist without the fearless leadership and encouragement of Professor Harold C. Helgeson. Hal and his associates are in some way responsible for many of the equations and data contained in this package. A direct contribution of code is the file H2O92D.f, taken from the SUPCRT92 distribution, with only cosmetic modifications (masking of WRITE and STOP statements) made for compatibility with an R environment. The revised Helgeson-Kirkham-Flowers equations of state are used in this package, together with the thermodynamic properties and parameters for many species taken from numerous papers coauthored by Helgeson.
Work on this package at U.C. Berkeley from ~2003–2008 was supported by research grants solicited by HCH from the U.S. National Science Foundation and Department of Energy. In 2009–2011, the major research project stimulating development of this package at Arizona State University was funded by the National Science Foundation under grant EAR-0847616. The files in extdata/bison are excerpts of results of BLAST calculations made on the Saguaro high performance computer at ASU.
subcrt does not correctly identify the stable polymorph of some minerals at high temperature.
The values generated by buffer may not be applied correctly by affinity in calculating the affinities of the formation reactions. (The values returned by affinity(..., return.buffer=TRUE) do appear to be correct in the examples.)
There is an unidentified inconsistency in transfer causing the reaction boundaries in one of the examples (apc("closed")) to be offset from the stability diagram. On the other hand, feldspar("closed") appears to work correctly.
Values of activity coefficients may be affected by an unidentified bug in
### Getting Started

## the 'thermo' object contains thermodynamic data and is also where
## user's settings (definition of chemical system) are stored
data(thermo)

## standard thermodynamic properties of species
subcrt("H2O")
subcrt("alanine")
# names of proteins have an underscore
subcrt("LYSC_CHICK")
# custom temperature range
T <- seq(0, 500, 100)
subcrt("H2O", T=T, P=1000)
# temperature - pressure grid
P <- seq(1000, 4000, 1000)
subcrt("H2O", T=T, P=P, grid="P")

## information about species
# query the database using formulas
info("C6H12O6")
info("SiO2")
# query using names
info("quartz")
si <- info(c("glucose", "mannose"))
# show the equations of state parameters
info(si)
# approximate matches for names or formulas
info("acid ")
info("SiO2 ")

## standard thermodynamic properties of reactions
# fermentation example
info(c("fructose", "ethanol"))
subcrt(c("fructose", "C2H5OH", "CO2"), c(-1, 2, 2))
# weathering example -- also see transfer()
subcrt(c("k-feldspar", "H2O", "H+", "kaolinite", "K+", "SiO2"), c(-2, -1, -2, 1, 2, 4))
# partial reaction auto-completion is possible
basis(c("SiO2", "H2O", "K+", "H+", "O2"))
subcrt(c("k-feldspar", "kaolinite"), c(-2, 1))

## chemical affinities
# set basis species and their activities or fugacities
basis(c("CO2", "H2O", "O2"), c(-3, 0, -80))
# set species of interest
species(c("CH4", "C2H4O2", "CO2"))
# chemical affinities of formation reactions
# take off $values for complete output
affinity()$values
affinity(O2=c(-90, -60, 4))$values