#' @title crickets data
#'
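#' @description Pulse rates of the mating songs of male tree crickets
#' of two species, measured at a range of temperatures.
#'
#' @name crickets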
#'
#' @docType data
#' @format This data frame contains:
#' \describe{
#' \item{species:}{Species, see details}
#' \item{temp:}{temperature}
#' \item{pps:}{pulses per second}
#' }
#' @keywords datasets
#'
#'
#' @details
#'
#' Walker (1962) studied the mating songs of male
#' tree crickets. Each wingstroke by a cricket produces a pulse of
#' song, and females may use the number of pulses per second to
#' identify males of the correct species. Walker (1962) wanted to
#' know whether the chirps of the crickets Oecanthus
#' exclamationis and Oecanthus niveus had different pulse
#' rates. He measured the pulse rate of the crickets (variable
#' `pps`) at a variety of temperatures (`temp`). See
#' \url{http://www.biostathandbook.com/ancova.html} for details.
#'
#' @examples
#'
#' data(crickets)
#' coplot(pps ~ temp | species, data=crickets)
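#'
#' # A minimal sketch of the comparison described in 'details': a common
#' # slope in temp plus a species shift (an ANCOVA-type model). The model
#' # choice is illustrative only, not necessarily the analysis in the source.
#' m <- lm(pps ~ temp + species, data = crickets)
#' coef(m)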
"crickets"
##
## beets
##
#' @title beets data
#'
#' @description Yield and sugar percentage in sugar beets from a split plot
#' experiment. There are 3
#' blocks and in each of these the harvest time defines the "whole plot" and
#' the sowing time defines the "split plot". Each plot was 25 square meters and
#' the yield is recorded in kg. See 'details' for the experimental layout.
#'
#' @name beets
#'
#' @docType data
#' @format The format is: chr "beets"
#'
#' @details
#' \preformatted{
#' Experimental plan
#' Sowing times            1        4. april
#'                         2       12. april
#'                         3       21. april
#'                         4       29. april
#'                         5       18. may
#' Harvest times           1        2. october
#'                         2       21. october
#' Plot allocation:
#'                Block 1     Block 2     Block 3
#'             +-----------|-----------|-----------+
#'       Plot  | 1 1 1 1 1 | 2 2 2 2 2 | 1 1 1 1 1 | Harvest time
#'        1-15 | 3 4 5 2 1 | 3 2 4 5 1 | 5 2 3 4 1 | Sowing time
#'             |-----------|-----------|-----------|
#'       Plot  | 2 2 2 2 2 | 1 1 1 1 1 | 2 2 2 2 2 | Harvest time
#'       16-30 | 2 1 5 4 3 | 4 1 3 2 5 | 1 4 3 2 5 | Sowing time
#'             +-----------|-----------|-----------+
#' }
#'
#' @references Ulrich Halekoh, Søren Højsgaard (2014). A Kenward-Roger
#' Approximation and Parametric Bootstrap Methods for Tests in Linear Mixed
#' Models - The R Package pbkrtest. Journal of Statistical Software,
#' 59(9), 1-30. \url{https://www.jstatsoft.org/v59/i09/}
#'
#' @keywords datasets
#'
#' @examples
#' data(beets)
#'
#' beets$bh <- with(beets, interaction(block, harvest))
#' summary(aov(yield ~ block + sow + harvest + Error(bh), beets))
#' summary(aov(sugpct ~ block + sow + harvest + Error(bh), beets))
#'
"beets"
#' Gene expression signatures for p53 mutation status in 250 breast cancer
#' samples
#'
#'
#' Perturbations of the p53 pathway are associated with more aggressive and
#' therapeutically refractory tumours. We preprocessed the data using Robust
#' Multichip Analysis (RMA). The dataset has been truncated to the 1000 most
#' informative genes (as selected by Wilcoxon test statistics) to simplify
#' computation. The genes have been standardized to have zero mean and unit
#' variance (i.e. z-scored).
#'
#' The factor \code{code} defines whether there was a mutation in the p53
#' sequence (code=case) or not (code=control).
#'
#' @name breastcancer
#'
#' @docType data
#'
#' @format A data frame with 250 observations on 1001 variables. The
#' first 1000 columns are numerical variables; the last column
#' (named \code{code}) is a factor with levels \code{case} and
#' \code{control}.
#'
#' @references Miller et al (2005, PubMed
#' ID:16141321)
#'
#' @source Chris Holmes, \email{c.holmes@@stats.ox.ac.uk}
#'
#' @keywords datasets
#'
#' @examples
#'
#' data(breastcancer)
#' bc <- breastcancer
#' pairs(bc[,1:5], col=bc$code)
#'
#' train <- sample(1:nrow(bc), 50)
#' table(bc$code[train])
#' \dontrun{
#' library(MASS)
#' z <- lda(code ~ ., data=bc, prior = c(1,1)/2, subset = train)
#' pc <- predict(z, bc[-train, ])$class
#' pc
#' bc[-train, "code"]
#' table(pc, bc[-train, "code"])
#' }
#'
"breastcancer"
##
## carcass
##
#' Lean meat contents of 344 pig carcasses
#'
#' Measurement of lean meat percentage of 344 pig carcasses together with
#' auxiliary information collected at three Danish slaughter houses
#'
#' @name carcass
#'
#' @aliases carcass carcassall
#' @format carcassall: A data frame with 344 observations on the following 17
#' variables.
#' \describe{
#' \item{\code{weight}}{Weight of carcass}
#' \item{\code{lengthc}}{Length of carcass from back toe to head (when
#' the carcass hangs in the back legs)}
#' \item{\code{lengthf}}{Length of carcass from back toe to front leg
#' (that is, to the shoulder)}
#' \item{\code{lengthp}}{Length of carcass from back toe to the pelvic bone}
#' \item{\code{Fat02, Fat03, Fat11, Fat12, Fat13, Fat14, Fat16}}{Thickness of fat
#' layer at different locations on the back of the carcass (FatXX
#' refers to thickness at, or rather next to, rib no. XX). Notice that
#' 02 is closest to the head.}
#' \item{\code{Meat11, Meat12, Meat13}}{Thickness of meat layer at different
#' locations on the back of the carcass, see description above}
#' \item{\code{LeanMeat}}{Lean meat percentage determined by dissection}
#' \item{\code{slhouse}}{Slaughter house; a factor with levels \code{slh1} and \code{slh2}.}
#' \item{\code{sex}}{Sex of the pig; a factor with levels \code{castrate} and \code{female}.}
#' \item{\code{size}}{Size of the carcass; a factor with levels \code{normal} and \code{large}.
#' Here, `normal` refers to carcass weight under 80 kg; `large` refers to carcass weights between 80 and 110 kg.}
#'
#' }
#'
#' @details Notice that large pigs were slaughtered at only one slaughter house.
#' @note carcass: Contains only the variables Fat11, Fat12, Fat13,
#' Meat11, Meat12, Meat13, LeanMeat
#' @source Busk, H., Olsen, E. V., Brøndum, J. (1999) Determination of
#' lean meat in pig carcasses with the Autofom classification
#' system, Meat Science, 52, 307-314
#' @keywords datasets
#' @examples
#' data(carcass)
#' head(carcass)
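#'
#' # A minimal sketch: list the variables that are in carcassall but
#' # not in the reduced carcass data (see the note above). This assumes
#' # carcassall can be loaded with data() in the same way as carcass.
#' data(carcassall)
#' setdiff(names(carcassall), names(carcass))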
#'
"carcass"
"carcassall"
#' Diet of Atlantic cod in the Gulf of St. Lawrence (Canada)
#'
#' Stomach content data for Atlantic cod (\emph{Gadus morhua}) in the Gulf of
#' St. Lawrence, Eastern Canada. Note: many prey items were of no interest for
#' this analysis and were regrouped into the "Other" category.
#'
#' Cod are collected either by contracted commercial fishing vessels
#' (\code{ship.type} 90 or 99) or by research vessels. Commercial vessels are
#' identified by a unique \code{ship.id}.
#'
#' Either one research vessel or several commercial vessels conduct a survey
#' (\code{trip}), during which a trawl, gillnets or hooked lines are set
#' several times. Most trips are random stratified surveys (depth-based
#' stratification).
#'
#' Each trip takes place within one of the \code{region}s. The \code{trip}
#' label is only guaranteed to be unique within a region and the \code{set}
#' label is only guaranteed to be unique within a \code{trip}.
#'
#' For each fish caught, the \code{fish.length} is recorded and the fish is
#' allocated a \code{fish.id}, but the \code{fish.id} is only guaranteed to be
#' unique within a \code{set}. A subset of the fish caught are selected for
#' stomach analysis (stratified random selection according to fish length; unit
#' of stratification is the set for research surveys, the combination ship.id
#' and stratum for surveys conducted by commercial vessels, although strata are
#' not shown in codstom).
#'
#' The basic experimental unit in this data set is a cod stomach (one stomach
#' per fish). Each stomach is uniquely identified by a combination of
#' \code{region}, \code{ship.type}, \code{ship.id}, \code{trip}, \code{set},
#' and \code{fish.id}.
#'
#' For each prey item found in a stomach, the species and mass of the prey item
#' are recorded, so there can be multiple observations per stomach. There may
#' also be several prey items with the same \code{prey.type} in the one stomach
#' (for example many \code{prey.types} have been recoded \code{Other}, which
#' produced many instances of \code{Other} in the same stomach).
#'
#' If a stomach is empty, a single observation is recorded with
#' \code{prey.type} \code{Empty} and a \code{prey.mass} of zero.
#'
#' @name codstom
#' @docType data
#'
#' @format A data frame with 10000 observations on the following 10 variables.
#' \describe{
#' \item{\code{region}}{a factor with levels \code{SGSL} \code{NGSL}
#' representing the southern and northern Gulf of St. Lawrence, respectively}
#' \item{\code{ship.type}}{a factor with levels \code{2} \code{3} \code{31}
#' \code{34} \code{90} \code{99}}
#' \item{\code{ship.id}}{a factor with levels \code{11558} \code{11712}
#' \code{136148} \code{136885}
#' \code{136902} \code{137325} \code{151225} \code{151935} \code{99433}}
#' \item{\code{trip}}{a factor with levels \code{10} \code{11}
#' \code{12} \code{179} \code{1999}
#' \code{2} \code{2001} \code{20020808} \code{3} \code{4} \code{5}
#' \code{6} \code{7} \code{8}
#' \code{88} \code{9} \code{95}}
#' \item{\code{set}}{a numeric vector}
#' \item{\code{fish.id}}{a numeric vector}
#' \item{\code{fish.length}}{a numeric vector, length in mm}
#' \item{\code{prey.mass}}{a numeric vector, mass of item in stomach, in g}
#' \item{\code{prey.type}}{a factor with levels \code{Ammodytes_sp}
#' \code{Argis_dent}
#' \code{Chion_opil} \code{Detritus} \code{Empty} \code{Eualus_fab}
#' \code{Eualus_mac} \code{Gadus_mor} \code{Hyas_aran}
#' \code{Hyas_coar}
#' \code{Lebbeus_gro} \code{Lebbeus_pol} \code{Leptocl_mac}
#' \code{Mallot_vil}
#' \code{Megan_norv} \code{Ophiuroidea} \code{Other} \code{Paguridae}
#' \code{Pandal_bor} \code{Pandal_mon} \code{Pasiph_mult}
#' \code{Sabin_sept}
#' \code{Sebastes_sp} \code{Them_abys} \code{Them_comp} \code{Them_lib}}
#' }
#' @source Small subset from a larger dataset (more stomachs, more variables,
#' more \code{prey.types}) collected by D. Chabot and M. Hanson, Fisheries &
#' Oceans Canada \email{chabotd@@dfo-mpo.gc.ca}.
#'
#' @keywords datasets
#' @examples
#'
#' data(codstom)
#' str(codstom)
#' # removes multiple occurrences of same prey.type in stomachs
#' codstom1 <- summaryBy(prey.mass ~
#' region + ship.type + ship.id + trip + set + fish.id + prey.type,
#' data = codstom,
#' FUN = sum)
#'
#' # keeps a single line per stomach with the total mass of stomach content
#' codstom2 <- summaryBy(prey.mass ~ region + ship.type + ship.id + trip + set + fish.id,
#' data = codstom,
#' FUN = sum)
#'
#' # mean prey mass per stomach for each trip
#' codstom3 <- summaryBy(prey.mass.sum ~ region + ship.type + ship.id + trip,
#' data = codstom2, FUN = mean)
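#'
#' # A minimal sketch: one label per stomach, built from the identifying
#' # variables listed in the description. The name stomach is ours and
#' # is not a column of codstom.
#' stomach <- with(codstom,
#'     paste(region, ship.type, ship.id, trip, set, fish.id, sep = "."))
#' length(unique(stomach))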
#'
#' \dontrun{
#' # wide version, one line per stomach, one column per prey type
#' library(reshape)
#' codstom4 <- melt(codstom, id = c(1:7, 9))
#' codstom5 <- cast(codstom4,
#' region + ship.type + ship.id + trip + set + fish.id + fish.length ~
#' prey.type, sum)
#' k <- length(names(codstom5))
#' prey_col <- 8:k
#' out <- codstom5[,prey_col]
#' out[is.na(out)] <- 0
#' codstom5[,prey_col] <- out
#' codstom5$total.content <- rowSums(codstom5[, prey_col])
#' }
#'
"codstom"
#' crimeRate
#'
#' Crime rates per 100,000 inhabitants in states of the USA for different crime
#' types in 1977.
#'
#' @name crimeRate
#'
#' @docType data
#' @format This data frame contains:
#' \describe{
#' \item{state:}{State of the USA}
#' \item{murder:}{crime of murder}
#' \item{rape:}{}
#' \item{robbery:}{}
#' \item{assault:}{}
#' \item{burglary:}{residential theft}
#' \item{larceny:}{unlawful taking of personal property (pocket picking)}
#' \item{autotheft:}{}
#' }
#' @keywords datasets
#' @examples
#'
#' data(crimeRate)
#'
"crimeRate"
#' crime_rate
#'
#' Crime rates per 100,000 inhabitants in states of the USA for different crime
#' types in 1977.
#'
#' @name crime_rate
#'
#' @docType data
#' @format This data frame contains:
#' \describe{
#' \item{murder:}{crime of murder}
#' \item{rape:}{}
#' \item{robbery:}{}
#' \item{assault:}{}
#' \item{burglary:}{residential theft}
#' \item{larceny:}{unlawful taking of personal property (pocket picking)}
#' \item{autotheft:}{}
#' }
#' @keywords datasets
#' @examples
#'
#' data(crime_rate)
#'
"crime_rate"
##
## dietox
##
#' Growth curves of pigs in a 3x3 factorial experiment
#'
#' The \code{dietox} data frame has 861 rows and 8 columns.
#'
#' @details Data contains weight of slaughter pigs measured weekly for 12
#' weeks. Data also contains the start weight (i.e. the weight at week
#' 1). The treatments are 3 different levels of Evit = vitamin E (dose: 0,
#' 100, 200 mg dl-alpha-tocopheryl acetate/kg feed) in combination with 3
#' different levels of Cu=copper (dose: 0, 35, 175 mg/kg feed) in the feed.
#' The cumulated feed intake is also recorded. The pigs are litter mates.
#'
#'
#' @format This data frame contains the following columns:
#'
#' \describe{
#' \item{Weight}{Weight in kg}
#' \item{Feed}{Cumulated feed intake in kg}
#' \item{Time}{Time (in weeks) in the experiment}
#' \item{Pig}{Factor; id of each pig}
#' \item{Evit}{Factor; vitamin E dose; see 'details'.}
#' \item{Cu}{Factor, copper dose; see 'details'}
#' \item{Start}{Start weight in experiment, i.e. weight at week 1.}
#' \item{Litter}{Factor, id of litter of each pig}
#' }
#'
#' @source Lauridsen, C., Højsgaard, S., Sørensen, M.T. C. (1999) Influence of
#' Dietary Rapeseed Oil, Vitamin E, and Copper on Performance and
#' Antioxidant and Oxidative Status of Pigs. J. Anim. Sci. 77:906-916
#' @keywords datasets
#' @examples
#'
#' data(dietox)
#' head(dietox)
#' if (require(ggplot2)){
#'   # ggplot() replaces qplot(), which is deprecated in recent ggplot2
#'   ggplot(dietox, aes(Time, Weight, colour = Pig)) +
#'     geom_point() + geom_line() +
#'     theme(legend.position = "none") + facet_grid(Evit ~ Cu)
#' } else {
#'   coplot(Weight ~ Time | Evit * Cu, data=dietox)
#' }
#'
"dietox"
##
## fatacid
##
#' @title Fish oil in pig food
#'
#' @description Fish oil in pig food
#'
#' @name fatacid
#' @docType data
#'
#' @format A data frame.
#'
#' @details A fish oil fatty acid \code{x14} has been added in
#' different concentrations to the food for pigs in a
#' study. Interest is in studying how much of the fatty acid can
#' be found in the tissue. The concentrations of \code{x14} in the
#' food are \code{dose} = 0.0, 4.4, 6.2 and 9.3.
#'
#' The pigs are fed with this food until their weight is 60 kg. From
#' then until they are slaughtered at 100 kg, their food does
#' not contain the fish oil.
#' At 60 kg (sample=1) and 100 kg (sample=2) muscle
#' biopsies are taken and the concentration of x14 is
#' determined.
#' Measurements on the same pig are correlated, and pigs are additionally
#' related through litters.
#'
#' @references Data courtesy of Charlotte Lauridsen, Department of
#' Animal Science, Aarhus University, Denmark.
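#'
#' @examples
#'
#' data(fatacid)
#' head(fatacid)
#' \dontrun{
#' # A minimal sketch, assuming the columns are named x14, dose and
#' # sample as in 'details'. These names are an assumption; check
#' # names(fatacid) first. Mean concentration by dose and biopsy:
#' aggregate(x14 ~ dose + sample, data = fatacid, FUN = mean)
#' }
#'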
"fatacid"
#' @title Forced expiratory volume in children
#'
#' @description Dataset to examine if respiratory function in children
#' was influenced by smoking.
#'
#' @name fev
#' @docType data
#'
#' @format A data frame with 654 observations on the following 5 variables.
#' \describe{
#' \item{\code{Age}}{Age in years.}
#' \item{\code{FEV}}{Forced expiratory volume in liters per second.}
#' \item{\code{Ht}}{Height in centimeters.}
#' \item{\code{Gender}}{Gender.}
#' \item{\code{Smoke}}{Smoking status.}
#' }
#'
#'
#' @references I. Tager and S. Weiss and B. Rosner and F. Speizer (1979). Effect
#' of Parental Cigarette Smoking on the Pulmonary Function of
#' Children. American Journal of Epidemiology. 110:15-26
#'
#' @examples
#'
#' data(fev)
#' summary(fev)
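#'
#' # A minimal sketch: compare FEV between smoking groups, then adjust
#' # for age in a linear model. This is one of many possible analyses
#' # and is not necessarily the one used in the reference.
#' boxplot(FEV ~ Smoke, data = fev)
#' coef(lm(FEV ~ Age + Smoke, data = fev))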
#'
"fev"
#' Heat development in cement under hardening.
#'
#' Heat development in cement under hardening related to the chemical
#' composition.
#'
#' @name haldCement
#' @docType data
#'
#' @format A data frame with 13 observations on the following 5 variables.
#' \describe{
#' \item{\code{x1}}{Percentage (weight) of `[3CaO][Al2O3]`}
#' \item{\code{x2}}{Percentage (weight) of `[3CaO][SiO2]`}
#' \item{\code{x3}}{Percentage (weight) of `[4CaO][Al2O3][Fe2O3]`}
#' \item{\code{x4}}{Percentage (weight) of `[2CaO][SiO2]`}
#' \item{\code{y}}{Heat development measured in calories per
#' gram cement after 180 days}
#' }
#'
#' @references Anders Hald (1949); Statistiske Metoder; Akademisk Forlag (in
#' Danish), page 509.
#' @keywords datasets
#' @examples
#'
#' data(haldCement)
#'
#' if( interactive() ){
#' pairs( haldCement )
#' }
#' m <- lm(y ~ x1 + x2 + x3 + x4, data=haldCement)
#' summary(m)
#'
#' # Notice: The model explains practically all variation in data;
#' # yet none of the explanatory variables appear to be statistically
#' # significant.
#'
"haldCement"
#' Yield from Danish agricultural production of grain and root crop.
#'
#' @name cropyield
#' @docType data
#' @format A data frame with 97 rows and 7 columns.
#' \describe{
#' \item{\code{year}}{From 1901 to 1997.}
#' \item{\code{precip}}{Precipitation in millimeters.}
#' \item{\code{yield}}{Million feed units (see details).}
#' \item{\code{area}}{Area in 1000 ha for grains and root crop.}
#' \item{\code{fertil}}{1000 tons fertilizer.}
#' \item{\code{avgtmp1}}{Average temperature April-June (3 months).}
#' \item{\code{avgtmp2}}{Average temperature July-October (4 months).}
#' }
#'
#' @details A feed unit is the amount of energy in a kg of barley.
#' @references Danmarks statistik (Statistics Denmark).
#' @keywords datasets
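#' @examples
#'
#' data(cropyield)
#' # A minimal sketch: total yield over time, and yield per unit area
#' # (yield is in million feed units and area in 1000 ha, per the
#' # format section). The plots are illustrative only.
#' plot(yield ~ year, data = cropyield, type = "l")
#' with(cropyield, plot(year, yield / area, type = "l"))
#'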
"cropyield"
#' Milk yield data for manually milked cows.
#'
#' Milk yield data for cows milked manually twice a day (morning and evening).
#'
#' There are data for 222 cows. Some cows appear more than once in the dataset
#' (in different lactations) and there are 288 different lactations.
#'
#' @aliases milkman_rdm1
#'
#' @name milkman
#' @docType data
#'
#' @format
#' A data frame with 161836 observations on the following 12 variables.
#' \describe{
#' \item{\code{cowno}}{a numeric vector; cow identification}
#' \item{\code{lactno}}{a numeric vector; lactation number}
#' \item{\code{ampm}}{a numeric vector; milking time: 1: morning; 2: evening}
#' \item{\code{dfc}}{a numeric vector; days from calving}
#' \item{\code{my}}{a numeric vector; milk yield (kg)}
#' \item{\code{fatpct}}{a numeric vector; fat percentage}
#' \item{\code{protpct}}{a numeric vector; protein percentage}
#' \item{\code{lactpct}}{a numeric vector; lactose percentage}
#' \item{\code{scc}}{a numeric vector; somatic cell counts}
#' \item{\code{race}}{a factor with levels \code{RDM} \code{Holstein} \code{Jersey}}
#' \item{\code{ecmy}}{a numeric vector; energy corrected milk}
#' \item{\code{cowlact}}{Combination of cowno and lactno; necessary
#' because the same cow may appear more than once in the dataset (in
#' different lactations)}
#' }
#'
#' @keywords datasets
#'
#' @references Friggens, N. C.; Ridder, C. and Løvendahl, P. (2007).
#' On the Use of Milk Composition Measures to Predict the Energy Balance of Dairy Cows.
#' J. Dairy Sci. 90:5453–5467 doi:10.3168/jds.2006-821.
#'
#' This study was part of the Biosens project and used data from the
#' “Malkekoens energibalance og mobilisering” project; both were
#' funded by the Danish Ministry of Food, Agriculture and Fisheries
#' and the Danish Cattle Association.
#'
#' @examples
#'
#' data(milkman)
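#'
#' # A minimal sketch: count distinct cows and distinct lactations.
#' # cowlact combines cowno and lactno (see the format section), so it
#' # separates repeated lactations of the same cow.
#' length(unique(milkman$cowno))
#' length(unique(milkman$cowlact))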
#'
"milkman"
"milkman_rdm1"
#' @title NIRmilk
#'
#' @description Near infra red light (NIR) measurements are made at
#' 152 wavelengths on 17 milk samples. While milk runs through a
#' glass tube, infra red light is sent through the tube and the
#' amount of light passing through the tube is measured at
#' different wavelengths. Each milk sample was additionally
#' analysed for fat, lactose, protein and dry matter.
#'
#' @name NIRmilk
#' @docType data
#' @format This data frame contains 17 rows and 158 columns. The
#' first column is the sample number. The columns `Xklm` contain
#' the transmittance (fraction of electromagnetic power passing
#' through the sample) at wavelength `klm`. The
#' response variables are fat, protein, lactose and dm (dry
#' matter).
#' @keywords datasets
#' @examples
#'
#' data(NIRmilk)
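#'
#' # A minimal sketch: pick out the transmittance columns and plot one
#' # spectrum per milk sample. It assumes that only the wavelength
#' # columns have names starting with "X"; spec is our own name.
#' spec <- as.matrix(NIRmilk[, grep("^X", names(NIRmilk))])
#' matplot(t(spec), type = "l", lty = 1,
#'         xlab = "wavelength index", ylab = "transmittance")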
#'
"NIRmilk"
#' @title nir_milk
#'
#' @description Near infra red light (NIR) measurements are made at
#' 152 wavelengths on 17 milk samples. While milk runs through a
#' glass tube, infra red light is sent through the tube and the
#' amount of light passing through the tube is measured at
#' different wavelengths. Each milk sample was additionally
#' analysed for fat, lactose, protein and dry matter.
#'
#' @name nir_milk
#' @docType data
#' @format A list with two components:
#' \describe{
#' \item{x}{Data frame with the amount of infra red light at different
#' wavelengths (the column names are the wavelengths; just remove the
#' leading X).}
#' \item{y}{Data frame with the response variables fat, protein,
#' lactose and dm (dry matter).}
#' }
#'
#' @keywords datasets
#' @seealso [NIRmilk]
#' @examples
#'
#' data(nir_milk)
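#'
#' # A minimal sketch: recover numeric wavelengths from the column names
#' # of the x component by stripping the leading "X". It assumes every
#' # column of x is a wavelength; wl is our own name.
#' wl <- as.numeric(sub("^X", "", names(nir_milk$x)))
#' range(wl)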
#'
"nir_milk"
#' Weight and size of 20 potatoes
#'
#' Weight and size of 20 potatoes. Weight in grams; size in millimeters. There
#' are two sizes: \code{length} is the longest length and \code{width} is the
#' shortest length across a potato.
#'
#' @name potatoes
#' @docType data
#'
#' @format A data frame with 20 observations on the following 3 variables.
#' \describe{
#' \item{\code{weight}}{a numeric vector}
#' \item{\code{length}}{a numeric vector}
#' \item{\code{width}}{a numeric vector}
#' }
#'
#' @author Søren Højsgaard, \email{sorenh@@math.aau.dk}
#' @source My own garden; autumn 2015.
#' @keywords datasets
#' @examples
#'
#' data(potatoes)
#' plot(potatoes)
#'
"potatoes"
#' Mathematics marks for students
#'
#' The \code{mathmark} data frame has 88 rows and 5 columns.
#'
#' @name data-mathmark
#'
#' @aliases mathmark math
#' @format This data frame contains the following columns: mechanics, vectors,
#' algebra, analysis, statistics.
#' @author Søren Højsgaard, \email{sorenh@@math.aau.dk}
#' @references David Edwards, An Introduction to Graphical Modelling, Second
#' Edition, Springer Verlag, 2000
#' @keywords datasets
#' @usage data(mathmark)
#'
#' @examples
#'
#' data(mathmark)
#'
"mathmark"
"math"