This function takes an input raster, that is, an image of a categorical variable, and generates an image (or images) of other data using the input raster as a key or index variable. It is used to render maps of variables that would not otherwise be possible, such as those from Nearest Neighbour algorithms on continuous data.
inRdata |
a raster* object with the spatial data—typically a map of indices to the lookup table. |
iData |
or imputation data; the lookup table that takes map values (e.g. |
outFilename |
a file to hold the resulting output raster object. |
fx |
(optional) a formula object specifying which column(s) in |
x |
(optional) a list of columns from |
y |
(optional) the pivot column that connects the model and the imputation data; if neither fx nor this is specified then it is
assumed to be the first column in |
It is perhaps easiest to explain the concept of imputation using an example: consider the case where the input raster represents the ID
of the nearest neighbour to that pixel; it is possible to impute environmental data by looking up each ID in the original dataset and
assigning that pixel the value (of the environmental variable) found at that site. So, if this pixel is nearest (in phase space) to site
No. 153, then we infer, or impute, that it is also most likely to have similar environmental characteristics. This is significantly
better than generating a map of classes and then inferring values from the mean of the class, as there is a huge amount of information lost
in mapping N sites to k classes, especially where N >> k.
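The lookup described above can be sketched in base R without the raster package. This is only an illustration of the idea, not this function's implementation; the objects `idMap` and `sites` and the column `elev` are hypothetical.

```r
# Hypothetical 2 x 3 "raster" of nearest-neighbour site IDs (a plain matrix here)
idMap <- matrix(c(153,   7,  7,
                   42, 153, 42), nrow = 2, byrow = TRUE)

# Hypothetical lookup table: one row per site, with one environmental variable
sites <- data.frame(siteID = c(7, 42, 153),
                    elev   = c(310, 455, 520))

# For every pixel, find the row of `sites` with the same ID, then take its value
rowIdx  <- match(idMap, sites$siteID)
elevMap <- matrix(sites$elev[rowIdx], nrow = nrow(idMap))

print(elevMap)
```

Each pixel keeps the value of its own nearest site, so no information is averaged away as it would be when mapping sites to class means.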
This function is functionally similar to the SQL/database command JOIN; that is, it joins two groups of data using a common column, such
that every time a value y occurs in the first table, some or all of the additional columns x in the second table are appended
to the result. It is a glorified form of lookup table in which the vector of lookup values is all the pixels in the image.
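The JOIN analogy can be made concrete with base R's `merge()`. This is a sketch under stated assumptions: the tables `pixels` and `lookup` and their columns are hypothetical, and the function documented here may not use `merge()` internally.

```r
# Hypothetical first table: one row per pixel, holding the pivot value y
pixels <- data.frame(pixel = 1:5, siteID = c(3, 1, 3, 2, 1))

# Hypothetical second table: the columns x to append, keyed by the same pivot
lookup <- data.frame(siteID  = 1:3,
                     ecoType = c("bog", "fen", "upland"),
                     slope   = c(0.5, 1.2, 7.8))

# Left join on the common column; merge() may reorder rows,
# so restore the original pixel order afterwards
joined <- merge(pixels, lookup, by = "siteID", all.x = TRUE)
joined <- joined[order(joined$pixel), ]

print(joined)
```

Restoring the pixel order matters for an image, because each row must end up back at its own position in the raster.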
Of course, it is in principle possible to use this function to impute data that has been generated by some other type of model; however, the other methods included in this package are all able to generate continuous variable output directly. Imputation has only the benefit that multiple outputs can be produced from a single rendering simply by imputing a different (suite of) variable(s). This computational benefit may nonetheless be relevant for categorical data.
Note that the notation used for fx may not be intuitive: the y variable, usually the ‘dependent’ variable, is used as the
pivot, which can intuitively seem like the independent variable; in a like way the x variables, which are usually the
‘independent’ variables, are output here. Use caution when specifying the formula; the fact that this function expects only a single term
on the left and multiple terms on the right is a good clue as to which variables should be where.
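One way to check which side of a formula is which is to split it with base R before passing it on. The sketch below uses the same form of formula as this page's example; the variable names `pivot` and `outputs` are introduced here for illustration only, and how the function parses fx internally is not shown in this documentation.

```r
# Same form as the formula used in this page's example
fx <- formula('siteID ~ ecoType + bedrockD + parentMaterial')

# Left-hand side: the single pivot column that matches the map values
pivot <- all.vars(fx[[2]])

# Right-hand side: the columns to be imputed
outputs <- all.vars(fx[[3]])

print(pivot)
print(outputs)
```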
a raster.brick of the imputed data with as many layers as specified.
An analysis that can be useful is to look at the frequency with which each site is used as a nearest neighbour. This is straightforward using the output of the imputation map. Example code is given below.
In an effort to streamline usage, this function will attempt to coerce non-numeric data into something that can be written using the
raster package. To this end, if the data is found to be other than numeric, it is converted to numeric using the command
as.numeric(factor(x)), which, as has been observed before in this documentation, returns the indices of the factor levels (see the
warning section for factor). It should be possible to recover the values from the indices using this same typecast;
however, there is a risk that there could be some glitch or error, and a mis-mapping could result between factor indices and actual
values.
It would be much safer to do your own typecast before passing the data to impute! 'Nuf said...
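A small base R illustration of the factor index behaviour, and of doing the typecast yourself while keeping the levels for later recovery. The vector `ecoType` here is a hypothetical stand-in for a non-numeric column.

```r
# Hypothetical non-numeric column
ecoType <- c("fen", "bog", "upland", "bog")

# Do the typecast yourself and keep the factor so the levels are not lost
ecoFactor <- factor(ecoType)
ecoIndex  <- as.numeric(ecoFactor)   # indices into levels(ecoFactor), not values

# Recover the original labels from the indices
recovered <- levels(ecoFactor)[ecoIndex]

# The gotcha: numeric-looking labels are replaced by their level indices
codes <- as.numeric(factor(c("10", "20", "10")))

print(ecoIndex)
print(recovered)
print(codes)
```

Keeping `ecoFactor` (or at least its levels) alongside the written raster is what makes the mapping from index back to value unambiguous.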
factor and ecoGroup for more information on the factor index gotcha.
generateModels and writeTile for more information on building models for imputation purposes.
nnErrMap for outputting nearest neighbour distances, and generating accuracies from these.
# Read the example tile shipped with the package
egTile <- readTile(file.path(system.file("extdata", "egTile", package = "NPEL.Classification"),''),
                   layers=c('base','grnns','wetns','brtns','dem','slp','asp','hsd'))

# Build a nearest neighbour model keyed on siteID
fx <- formula('siteID ~ brtns + grnns + wetns + dem + slp + asp + hsd')
nnData <- cbind(siteID=factor(1:nrow(siteData)),siteData)
nnData <- get_all_vars(fx, nnData)
models <- generateModels(nnData, suppModels[!suppModels %in% contModels], fx)

# Render the map of nearest neighbour site IDs to a temporary file
fNN <- paste0(dirname(tempfile()),'/Tmp_nn.tif')
egData <- writeTile(models[[1]], egTile, fNN, layers='class')

# Impute three variables using siteID as the pivot
fImpute <- paste0(dirname(tempfile()),'/Tmp_nnImpute.tif')
egImpute <- impute(egData, nnData, fImpute, formula('siteID~ecoType+bedrockD+parentMaterial'))
plot(egImpute)

## Frequency/sensitivity of nearest neighbour site dependency
freq <- table(getValues(egData))
plot(freq/sum(freq), ylab='freq')
hiFreq <- freq[freq > 100]
index <- as.integer(rownames(hiFreq))
print(cbind(freq=hiFreq, nnData[nnData$siteID %in% index,]))

# Clean up the temporary files
unlink(fNN)
unlink(fImpute)