Cast a data.table of trawl data to an array
x
    A data.table with column names to be used in
formula
    Formula describing array dimensions, in order as would be given by
valueName
    Column name whose elements will fill the array. Passed to
valFill
    Value to use for filling in missing combinations; defaults to NA. Passed to
fixAbsent
    Logical (default TRUE) to indicate the need to fill one value for no sampling (
allNA_noSamp
    A character indicating the column/dimension which, if all its elements are NA, indicates a no-sampling event, as opposed to an absence. When
valAbsent
    Value to be used in lieu of
grandNamesOut
    Grand dimension names for output array (e.g.,
...
    Other arguments to be passed to
Many columns in bottom trawl data can be described as summarizing 3 aspects of metadata: when, where, and what. The same logic is expressed in the function trawlAgg, which prompts users to think of aggregating trawl data as aggregating at different specificities for time, space, and biological dimensions. In this function's default for formula, the "where" is described by "stratum" (a sampling site), "when" by "year", and "what" by "spp" (species). The "K" value is a replicate, which could mean either "when" or "what" (and is similar to "haulid" in trawlAgg, which describes it as being indicative of both time and space). Given those identifying dimensions, we can then appropriately contextualize a measured value, e.g. "weight". Not all cases need these same dimensions in formula (e.g., if the measured value is bottom temperature ("btemp"), the "what" dimension is not needed), which is why this function doesn't impose as much structure on which categories of columns should comprise formula.
However, it can be useful to think of that structure for formula when trying to understand the distinction between elements to be filled with valFill vs. valAbsent.
For species data, there is an important distinction between a species not being present, and no sampling occurring. For example, entries for species data often do not include 0's, but 0's are implied for Species X when a site is sampled and no value is reported for Species X, even though a value is reported for other species in this instance and Species X is reported in other sampling events. In this case, the observation is 0, not NA.
In the context just described, valFill would be NA (the default); to change Species X (-esque) values from NA to 0 (under appropriate conditions), set fixAbsent to TRUE (default) and valAbsent to 0 (default). More generally, the allNA_noSamp argument names the array dimension(s) such that, if all elements are NA while varying allNA_noSamp and holding the other dimensions constant, the NA values are appropriate and those NA's should not be switched to valAbsent when fixAbsent=TRUE. For the species example given above, the default allNA_noSamp="spp" would be appropriate. In general, it may be fair to say that allNA_noSamp should be set to the "what" dimension(s) (as described above), and that valAbsent should be set to the value taken on by valueName when a measurement is attempted for a particular factor level of valueName that is absent.
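The absence vs. no-sampling rule described above can be sketched in base R. This is only an illustration of the logic, not the code trawlCast runs; the array wt, its dimension names, and all values are made up, with "spp" playing the role of allNA_noSamp and 0 the role of valAbsent.

```r
# Toy array: 2 strata x 2 years x 3 species (hypothetical names and values)
wt <- array(
  NA_real_,
  dim = c(2, 2, 3),
  dimnames = list(
    stratum = c("s1", "s2"),
    year = c("2001", "2002"),
    spp = c("sppX", "sppY", "sppZ")
  )
)
wt["s1", "2001", c("sppY", "sppZ")] <- c(1.5, 0.7) # s1 sampled in 2001; sppX not reported
wt["s2", "2002", "sppX"] <- 2.0                     # s2 sampled in 2002; sppY and sppZ not reported
# s1 in 2002 and s2 in 2001 have no records at all: treated as no sampling

# For each stratum-year (dimensions 1 and 2), are all species NA?
noSamp <- apply(is.na(wt), c(1, 2), all)

# Recycle the stratum-year flag along the species dimension
noSampFull <- array(noSamp, dim = dim(wt))

# Switch NA to 0 only where sampling occurred; leave no-sampling NA's alone
fixed <- wt
fixed[is.na(wt) & !noSampFull] <- 0

fixed["s1", "2001", ] # sppX becomes 0 because other species were reported
fixed["s1", "2002", ] # stays all NA because nothing was reported
```

Here the species dimension is the one that is varied while stratum and year are held constant, which mirrors the default allNA_noSamp="spp".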
As implied in the previous Details, casting data expands the number of explicit valueName elements in the data set. This function casts to an array because casting to a data.frame or data.table will take up far more RAM. The difference in RAM increases with the number of identifying variables and how many unique levels they have (but also depends on whether those identifying variables are encoded as characters, factors, integers, doubles, etc).
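A rough way to see the memory difference is to build the same fully expanded data both ways in base R and compare object.size(). This is a sketch with made-up identifier levels and random values; the exact sizes depend on the R version and on how the identifiers are encoded, so no specific numbers are claimed.

```r
# Hypothetical identifying variables: 50 strata x 20 years x 100 species
ids <- list(
  stratum = paste0("stratum_", 1:50),
  year = as.character(1990:2009),
  spp = paste0("species_", 1:100)
)
n <- prod(lengths(ids)) # number of explicit combinations after casting
set.seed(1)
vals <- runif(n)

# Array: one numeric per combination; identifiers are stored once, as dimnames
arr <- array(vals, dim = unname(lengths(ids)), dimnames = ids)

# Long format: every identifier is repeated on every row
long <- expand.grid(ids, stringsAsFactors = FALSE)
long$wt <- vals

# Compare the memory footprint of the two representations
print(object.size(arr), units = "Kb")
print(object.size(long), units = "Kb")
```

Adding a further identifying variable adds another full-length column to the long format, but only another short dimnames vector to the array.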
An array with dimensions equal to the number of unique values in each column in formula.
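To illustrate how the output dimensions relate to the identifying columns, the sketch below casts a small, made-up long table with base R's tapply(). It is a stand-in for trawlCast (whose internals are not shown here), and the column names and values are assumptions chosen to match the Details.

```r
# Hypothetical long-format data: 5 observed stratum-year-species combinations
long <- data.frame(
  stratum = c("s1", "s1", "s2", "s2", "s2"),
  year = c(2001, 2002, 2001, 2001, 2002),
  spp = c("sppX", "sppX", "sppX", "sppY", "sppZ"),
  wt = c(1.2, 0.4, 3.1, 0.9, 2.2)
)
idCols <- c("stratum", "year", "spp")

# Cast to a 3-dimensional array; unobserved combinations are NA
arr <- tapply(long$wt, long[idCols], mean)

# Number of unique values in each identifying column
n_unique <- sapply(long[idCols], function(col) length(unique(col)))

dim(arr)  # one dimension per identifying column
n_unique  # matches dim(arr): 2 strata, 2 years, 3 species
```

The cast array has 12 elements even though only 5 combinations were observed, which is the expansion the Details refer to.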
# Trim the cleaned EBS data, keep species with a picture, and pick 9 species
mini.t <- trawlTrim(copy(clean.ebs), c.add="Picture")[Picture=="y"][pick(spp, 9)]

# Aggregate to one mean value per species, stratum, and year
mini.a <- trawlAgg(
	mini.t,
	FUN=meanna,
	bio_lvl="spp",
	space_lvl="stratum",
	time_lvl="year",
	metaCols="common",
	meta.action="unique1"
)

# Cast to a time x stratum x species array
mini.c <- trawlCast(mini.a, time_lvl~stratum~spp, grandNamesOut=c("t","j","i"))

# For each species (3rd dimension), count NA, zero, and positive elements
(smry <- t(apply(mini.c, 3,
	function(x){
		c(
			"NA"=sum(is.na(x)),
			"0"=sum(!is.na(x)&x==0),
			">0"=sum(!is.na(x)&x>0)
		)
	}
)))

## Not run:
par(mfrow=c(3,3), mar=c(0.15,0.15,0.15,0), ps=8)
for(i in 1:nrow(smry)){
	tspp <- rownames(smry)[i]
	sppImg(tspp,
		mini.a[spp==tspp,unique(common)],
		side=3, line=-2, xpd=TRUE
	)
}
## End(Not run)