Aim : assessing landscape variables and find appropriate variables to be included in analysis.
Requirements
library(corrplot)
add to analysis_nonpublic.R
landscape <- data.table::fread(paste(pathtodata, "/data_assembly/raw_data/31018_5_Dataset/31018_5_data.csv", sep="")) landscape <- data.table(BEplotZeros(landscape, column = "EP_PlotID", plotnam = "EP")) setnames(landscape, old = c("EP_PlotID", "EP"), new = c("oldEP", "Plotn")) landscape2 <- data.table::fread(paste(pathtodata, "/data_assembly/raw_data/LandUse-31319_31369/LandUse/LandscapeMetrics_acrossYears_acrossScales_Long_UpdatedOct22.csv", sep="")) landscape2 <- data.table(BEplotZeros(landscape2, column = "plot_id", plotnam = "EP")) landscape2[, plot_id := NULL] setnames(landscape2, old = c("EP"), new = c("Plotn")) landscape2_long <- landscape2 landscape2[, (c("Land.te", "Land.pr")) := lapply(.SD, FUN = as.numeric), .SDcols = c("Land.te", "Land.pr")] landscape2 <- melt(landscape2, id.vars = c("Plotn", "Year", "scale")) landscape2 <- dcast(landscape2, Plotn ~ variable + Year + scale, value.var = "value", fill = NA)
1 km radius whenever possible. Sensitivity with other radii.
Various measures possible. Use PCA to determine.
Question 1 : Do the years correlate? Use Landscape configuration measures if yes.
If yes : we can use earlier years.
# landscape2_long[, .(Plotn, Year, scale, Land.shdi, Land.te)] ylandscape2 <- landscape2[, .( Land.shdi_2012_1000, Land.shdi_2015_1000, Land.shdi_2018_1000, Land.te_2012_1000, Land.te_2015_1000, Land.te_2018_1000)] m <- cor(ylandscape2, method = "spearman") par(mfrow = c(1, 2)) corrplot(m, diag = F, order = "hclust", type = "lower") corrplot(m, diag = F, order = "hclust", type = "lower", method = "number") dev.off() test <- merge(landscape2[, .(Plotn, Land.shdi_2012_1000)], landscape[, .(Plotn, Shannon.1000)]) plot(test$Land.shdi_2012_1000, test$Shannon.1000) # correlation of 2008 and 2012 measure rm(test)
Years strongly correlate. Decide for a year. The earlier the better, because some response variables were measured more recently. LUI is included from 2006 - 2016.
Use from 2012.
ylandscape2 <- landscape2[, .( Land.shdi_2012_500, Land.shdi_2012_1000, Land.shdi_2012_1500, Land.shdi_2012_2000, Land.te_2012_500, Land.te_2012_1000, Land.te_2012_1500, Land.te_2018_2000)] m <- cor(ylandscape2, method = "spearman") par(mfrow = c(1, 2)) corrplot(m, diag = F, order = "hclust", type = "lower") corrplot(m, diag = F, order = "hclust", type = "lower", method = "number") dev.off()
All positively correlate, between 0.45 and 0.95. larger radii cluster. 500m radius clusters rather across measures than across radii. Use 1000m based on ecological thinking rather than based on mathematical necessity. Will run sensitivity analysis.
Idea : Find 2-3 measures to include, representative of most important axes. Hypothesis : use grassland permanency, as well as a measure of "landscape LUI".
plandscape2 <- landscape2[, .(Plotn, Land.te_2012_1000)] plandscape <- landscape[, .(Plotn, Shannon.1000, Arable.1000, Forest.1000, Grassland.perm.1000, Grassland.1000)] plandscape <- merge(plandscape2, plandscape, by = "Plotn") rm(plandscape2) # Plot information is taken as rownames of a data.frame (instead of data.table). plandscape_rownames <- plandscape$Plotn plandscape <- as.data.frame(plandscape[, !"Plotn", with=FALSE]) rownames(plandscape) <- plandscape_rownames rm(plandscape_rownames) # run the pca pca_landscape <- prcomp(plandscape, scale = T) plot_grid(plotlist = list(ggplot(data.frame(PCs = paste0("PC", seq(1, 6)), var = (pca_landscape$sdev)^2), aes(x = PCs, y = var)) + geom_bar(stat = "identity"), autoplot(pca_landscape, x = 1, y = 2, loadings = T, loadings.label = T), autoplot(pca_landscape, x = 2, y = 3, loadings = T, loadings.label = T), autoplot(pca_landscape, x = 3, y = 4, loadings = T, loadings.label = T) ), nrow = 2) # saved as "include_new_variables_landscape_pca_1000m.pdf"
grassland cover vs. grassland cover diversity
the choice of measures should not depend (too much) on the chosen radius.
# # # # # # 2 km # # # # # plandscape2 <- landscape2[, .(Plotn, Land.te_2012_2000)] plandscape <- landscape[, .(Plotn, Shannon.2000, Arable.2000, Forest.2000, Grassland.perm.2000, Grassland.2000)] plandscape <- merge(plandscape2, plandscape, by = "Plotn") rm(plandscape2) # Plot information is taken as rownames of a data.frame (instead of data.table). plandscape_rownames <- plandscape$Plotn plandscape <- as.data.frame(plandscape[, !"Plotn", with=FALSE]) rownames(plandscape) <- plandscape_rownames rm(plandscape_rownames) # run the pca pca_landscape <- prcomp(plandscape, scale = T) plot_grid(plotlist = list(ggplot(data.frame(PCs = paste0("PC", seq(1, 6)), var = (pca_landscape$sdev)^2), aes(x = PCs, y = var)) + geom_bar(stat = "identity"), autoplot(pca_landscape, x = 1, y = 2, loadings = T, loadings.label = T), autoplot(pca_landscape, x = 2, y = 3, loadings = T, loadings.label = T), autoplot(pca_landscape, x = 3, y = 4, loadings = T, loadings.label = T) ), nrow = 2) # saved as : include_new_variables_landscape_pca_2000m.pdf # # # # # # 500 m # # # # # plandscape2 <- landscape2[, .(Plotn, Land.te_2012_500)] plandscape <- landscape[, .(Plotn, Shannon.500, Arable.500, Forest.500, Grassland.perm.500, Grassland.500)] plandscape <- merge(plandscape2, plandscape, by = "Plotn") rm(plandscape2) # Plot information is taken as rownames of a data.frame (instead of data.table). plandscape_rownames <- plandscape$Plotn plandscape <- as.data.frame(plandscape[, !"Plotn", with=FALSE]) rownames(plandscape) <- plandscape_rownames rm(plandscape_rownames) # run the pca pca_landscape <- prcomp(plandscape, scale = T) plot_grid(plotlist = list(ggplot(data.frame(PCs = paste0("PC", seq(1, 6)), var = (pca_landscape$sdev)^2), aes(x = PCs, y = var)) + geom_bar(stat = "identity"), autoplot(pca_landscape, x = 1, y = 2, loadings = T, loadings.label = T), autoplot(pca_landscape, x = 2, y = 3, loadings = T, loadings.label = T), autoplot(pca_landscape, x = 3, y = 4, loadings = T, loadings.label = T) ), nrow = 2) # saved as : include_new_variables_landscape_pca_500m.pdf # # # # # # all radii # # # # # plandscape2 <- landscape2[, .(Plotn, Land.te_2012_500, Land.te_2012_1000, Land.te_2012_2000)] plandscape <- landscape[, .(Plotn, Shannon.500, Shannon.1000, Shannon.2000, Arable.500, Arable.1000, Arable.2000, Forest.500, Forest.1000, Forest.2000, Grassland.500, Grassland.1000, Grassland.2000)] plandscape <- merge(plandscape2, plandscape, by = "Plotn") rm(plandscape2) # Plot information is taken as rownames of a data.frame (instead of data.table). plandscape_rownames <- plandscape$Plotn plandscape <- as.data.frame(plandscape[, !"Plotn", with=FALSE]) rownames(plandscape) <- plandscape_rownames rm(plandscape_rownames) # run the pca pca_landscape <- prcomp(plandscape, scale = T) plot_grid(plotlist = list( ggplot(data.frame(PCs = factor(paste0("PC", seq(1, length((pca_landscape$sdev)))), levels = paste0("PC", seq(1, length((pca_landscape$sdev))))), var = (pca_landscape$sdev)^2), aes(x = PCs, y = var)) + geom_bar(stat = "identity"), autoplot(pca_landscape, x = 1, y = 2, loadings = T, loadings.label = T), autoplot(pca_landscape, x = 2, y = 3, loadings = T, loadings.label = T), autoplot(pca_landscape, x = 3, y = 4, loadings = T, loadings.label = T) ), nrow = 2) # saved as : include_new_variables_landscape_pca_500_to_2000m.pdf
Sensitivity : not perfectly the same, but main directions. Arable vs. Land.te/ Shannon.
Shannon diversity and edge density. Landscape diversity and landscape configuration. Measures are correlated.
plandscape2 <- landscape2[, .(Plotn, Land.te_2012_1000)] plandscape <- landscape[, .(Plotn, Shannon.1000)] plandscape <- merge(plandscape2, plandscape, by = "Plotn") rm(plandscape2) plot(plandscape$Land.te_2012_1000, plandscape$Shannon.1000) # Plot information is taken as rownames of a data.frame (instead of data.table). plandscape_rownames <- plandscape$Plotn plandscape <- as.data.frame(plandscape[, !"Plotn", with=FALSE]) rownames(plandscape) <- plandscape_rownames rm(plandscape_rownames) # run the pca pca_landscape <- prcomp(plandscape, scale = T) ggplot(data.frame(PCs = paste0("PC", seq(1, 2)), var = (pca_landscape$sdev)^2), aes(x = PCs, y = var)) + geom_bar(stat = "identity") autoplot(pca_landscape, x = 1, y = 2, loadings = T, loadings.label = T)
Higher distance means that the landscape around the plots is very differently structured, in terms of how many different land uses are there, and how large the patches are. E.g. high diversity would indicate many different land uses. High total edges would indicate that there are many small patches (because many small patches have more edges than few large patches).
will be included. PCA corroborated idea that historic land-use explains other dimension than present land-use.
histLUI <- landscape[, .(Plotn, Grassland.perm.1000)]
We know from study https://doi.org/10.1038/s41467-021-23931-1 that % arable land is correlated with landscape LUI. The measure is treated as "lanscape LUI" there.
Diversity of landscape use and total edge are correlated. Makes sense because the more diverse land-uses are present in a landscape, the more edges we can observe. Is not completely the same tough: total edge measures landscape fragmentation, the diversity measures the number and abundance of land-uses at the landscape scale.
If we take them separately, we can interpret separately. Higher edges means smaller patches. But actually some of the functions (e.g. Groundwater recharge) need more space --> better if fragementation is not too high. And some functions might profit from a diverse surrounding in terms of habitat, or not.
We take as separate covariates, because we don't only want to correct for them, but even might interpret their effects.
plandscape2 <- landscape2[, .(Plotn, Land.te_2012_1000)] plandscape <- landscape[, .(Plotn, Grassland.perm.1000, Arable.1000, Shannon.1000)] plandscape <- merge(plandscape2, plandscape, by = "Plotn") rm(plandscape2) m <- cor(plandscape[, -1], method = "spearman") corrplot(m, diag = F, order = "hclust", type = "lower") corrplot(m, diag = F, order = "hclust", type = "lower", method = "number")
OK to include them in a model together (from a local perspective). Will be checked again at distance level.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.