knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(sf)
library(mapview)
library(leaflet)
library(dplyr)
library(spud)

Spatial usage data is information about where an application is used. Usage data from location-based services has unique properties and challenges. Each record is associated with at least one user, time, and spatial position. However, one or more of these properties is often unavailable or removed from the data in the interest of user privacy. So what can we learn from this data?

Introduction

The goal of this package is to give researchers and app developers an easy way to visualize and analyze how and where people use a given service.

For my thesis I am looking at how to reduce the social isolation of forced-migrants (e.g. refugees and asylum seekers) through the use of location-based services. Such a tool will be particularly useful after I collect data through a real-world study to identify patterns that are not otherwise obvious.

Loading data

The spud package is designed to import data from a csv that looks something like this:

x = read.table(system.file("extdata", "dummy_data.csv", package = "spud"), header = TRUE, sep = ",")
knitr::kable(data.frame(head(x, 10)))

This is achieved via the read.spud method, which takes the source file's name and the coordinate reference system of the data as parameters.

spud = read.spud(file = "dummy_data.csv", crs = 4326)

This method returns a simple feature collection:

spud

Classes

The spud package provides an object-oriented framework for quickly and intuitively exploring spatial usage data. I chose to implement R6 classes because I wanted to learn something new and I read that R6 classes are nice as they resemble classes in other languages.

The spud package's two classes represent the most important objects in a location-based service: the application and its users.

App class

The App class is a simple container for application usage data with methods that allow us to visualize the data from a holistic perspective. It is initialized from a simple feature collection, such as the one returned by the read.spud method:

app = App$new(name = "My fancy app", usage_data = spud)

The spud package also provides a convenience method read.spud_app to load data directly from a file into an App instance:

app = read.spud_app(file = "dummy_data.csv", crs = 4326, name = "My fancy app")

App has a print method defined that allows us to quickly know which app we're looking at:

print(app)

Once we have our App object initialized, we might want to know who its users are:

app$users() # The unique ids of our users
app$user_count() # The total number of users

More interestingly, we might want to know where our app has been used. In any app, a user can take certain actions. When we see where our app is used, it is helpful to know what functionality was being used in those locations. Therefore spud provides an actions_map method that shows where and for what the app was being used at the same time:

app$actions_map()

This method uses the leaflet package by default, but can also be run using the mapview package. The mapview-flavored map shows more information on each data point but uses a less detailed basemap.

app$actions_map(flavor = "mapview")

We might also want to know what actions users take and where they tend to be when they first use the app. For this we have the first_actions_map method, which plots where each user took their first action:

app$first_actions_map()

User class

The User class is a also simple container for application usage data, but with methods focused on the perspective of just one user. It can also be initialized from a file or simple feature collection, but often we simply ask an App to get one of its Users for us:

user = app$get_user(user_id = 12)

Like the App class, the User class also has a print method defined that identifies who we are looking at:

print(user)

This method makes use of the first_action method, which returns information on the very first location record to be recorded for this user:

user$first_action()

The User class has the same actions_map method as the App class. It works the same way but only displays the user's own data:

user$actions_map()

For those who are interested in the temporal dimension as well, spud also allows you to see the user's path around their environment:

user$path_map()

General map methods

Both classes make use of shared base plotting methods, so as not to duplicate code unnecessarily.

There are two plot_usage_actions methods, one for each mapping package:

plot_usage_actions_leaflet(data = spud)
plot_usage_actions_mapview(data = spud)

The plotting of user paths is also generalized, to allow multiple user paths to be plotted on the same map in the future:

plot_user_path(data = spud, user_id = 1)

Development notes

Package check

The package checks cleanly on my machine except for one note, which complains about the dplyr syntax in the plot_user_path method. dplyr allows you to refer to previously undeclared variables in its filter and arrange functions, but the package checker thinks these are global variables without a proper definition.

Future work

There are many ways I would like to expand this package if I have time. Here are a few of the tasks that didn't fit into the initial development time frame:



lbraun/spud documentation built on May 9, 2019, 5:50 a.m.