knitr::opts_chunk$set(cache=FALSE, fig.width=7, fig.height=5.5) options(width=120)
library(ipv) library(rgdal) library(rgeos) library(tidyverse, verbose=FALSE) data("venta.dep") data("alquiler.dep")
(Beware: if you are using a publicly available version of this package, such data may have been obscured, distorted and shuffled to preserve confidentiality. You will not be able to reproduce the results in this vignette, although the data provided will suffice to test the use of functions.)
The package contains some ancillary functions which are not meant to be called directly by the user(pols
and
CompFechas
) and one (cloromap
) which affords easy representation of a variable onto the map of the CAPV (Autonomous Community of the Basque Country) or parts of it.
Tomemos las observaciones de inmuebles a la venta y contabilicemos el número de
ellas que existen por cada área funcional (codificada en la variable AF
):
Let+s take the observations corresponding to houses for sale and count the number
in each "functional are" (coded in variable AF
):
Obs <- venta.dep %>% group_by(AF) %>% summarise( Obs= n() ) %>% as.data.frame()
The Obs
data frame has the following structure:
head(Obs)
In order to represent these numbers on a map, we need an spatial dataframe
(spframe) with polygons for the spatial entities concerned (here, functional areas, as
defined in AF
). We can obtain these polygons as follows:
dsn <- system.file("extdata/CB_AREAS_FUNCIONALES_5000_ETRS89.shp", package = "ipv")[1] af <- readOGR(dsn)
We have to check that the number and correspondence of areas in Obs
and in af
is consistent:
af@data unique(Obs$AF) cbind(af@data$A_FUNC_CAS, levels(sort(unique(Obs$AF))))
There is one-to-one correspondence. We rename the levels in venta.dep
with the names in af
.
levels(Obs$AF) <- af@data$A_FUNC_CAS
Now we can invoke:
cloromap(sp=af, var.sp="A_FUNC_CAS", df=Obs, var.df="AF", var="Obs", legend.title="Obs.\ndisponibles", graf.title="Obs. por área funcional", midpoint=36000)
Lo que es crucial para que lo anterior funcione es la correspondencia 1-1 entre
los niveles de la variable var.sp
de la dataframe espacial sp
y la variable
var.df
de la dataframe ordinaria df
. El argumento midpoint
da el valor "neutro" (codificado con blanco en la escala cromática divergente empleada), que habitualmente querremos hacer coincidir con la mediana de los valores, pero podemos fijar a nuestro antojo.
Quite often the names identifying locations are not homogeneous across sources, which
makes it difficult to merge information from different files. The function Fusion
tries to simplity life a bit in such situation. It can be used in both ways, with one or two dataframes.
Whe used with one dataframe (which has to be 'X', de first argument), it scans the column
named in varX
, performs a match with the column locX
in the translation dataframe canon
, and overwrites varX
with the canonical names in canon
(or fills an NA if a match is not possible).
This sort of invocation with a single dataframe is useful to aggregate. Assume a number of areas, some of which make part of the CAPV (Basque Autonomous Community) while others do not.
If we want to aggregate CAPV and Non-CAPV areas, we could construct a translation dataframe
canon
as follows:
canon <- data.frame(X=c("Bilbao","San Sebastian","Vitoria","Teruel","Soria"), Y=c("Bilbao","Donostia","Vitoria/Gazteiz", "Teruel","Soria"), Canon=c("CAPV","CAPV","CAPV","NoCAPV","NoCAPV")) canon
Now considera a dataframe such as:
X <- data.frame(a=c("Bilbao","San Sebastian","Vitoria","Teruel"), b=c(100,70,35,12)) X
In order to reclassify observations in CAPV and Non-CAPV, we can invoke:
Fusion(X, varX="a", locX=1, locCanon=3, canon=canon)
We may also invoke Fusion
with two data frames, X
and Y
, each of them containing, among others, a variable "naming" records: these could be, for instance, names of municipalities, of streets, of districts, or neighbourhoods.
For instance, besides X
above we may have:
Y <- data.frame(e=c("Bilbao","Donostia","Vitoria/Gazteiz","Teruel","Soria"), g=c("F","G","H","J","K")) Y
Here the names of the communities (in column e
) do not match those in column a
of X
above, although
we are referring to the same entities (e.g., Donostia is the same as San Sebastian). Merging directly the
two dataframes X
and Y
would fail, but Fusion
may be used:
Fusion(X,Y,varX="a",varY="e",locX=1, locY=2, locCanon=2, canon=canon)
Above we are saying that we want to merge X
and Y
using the columns a
and e
, after replacing their values by the canonical names in column locCanon
of canon
. What this in effect does is to take the names in the dataframe
Y
as canonical. If locCanon
is not stated, the canonical names would be taken from the column namec Canon
in dataframe canon
; this would not make much sense in this case:
Fusion(X,Y,varX="a",varY="e",locX=1, locY=2, canon=canon)
A translation dataframe canon
can contain any number of columns, affording flexibility in dealing with data from different sources, in different languages, character codes, etc.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.