run_rbc: Run RBC

View source: R/run_rbc.R

Description

A function that attempts to determine long-term migration statuses, and pre-crossing and post-crossing residence statuses, for all border crossings where these statuses are not known.

Usage

run_rbc(crossing_data, init_res_status_data = NULL, window_size = 487,
  threshold_year = 365, parallel = FALSE, n_core = 2, max_ram = 2,
  include_error_columns = FALSE, mc.cleanup = FALSE)

Arguments

crossing_data

Pre-processed, grouped crossing data containing journeys, movements and other raw crossing data. The data should contain the columns 'journeyId', 'personId', 'date_crossing', 'is_arrival', 'journey_sequence' and 'journeyId_prev'.

init_res_status_data

Optional. A data frame of raw initial residence statuses. It should contain the columns 'personId' and 'res_status_initial', plus 'date_finalised' if applicable. It supplements crossing_data by providing the initial residence status of the people who made the border crossings (journeys).
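
As a rough illustration of the expected shape, the sketch below builds a small made-up data frame with the three documented columns and checks that none is missing. The helper missing_columns is hypothetical (it is not part of migrbc), and the 0/1 coding of 'res_status_initial' and the Date type of 'date_finalised' are assumptions.

```r
# Hypothetical helper: names of required columns that are absent from 'df'.
missing_columns <- function(df, required) {
  setdiff(required, names(df))
}

# Made-up initial residence status data; the 0/1 coding and the Date
# type are assumptions, not taken from the package.
init_status <- data.frame(
  personId           = c(1, 2),
  res_status_initial = c(1, 0),
  date_finalised     = as.Date(c("2000-06-01", "2000-09-15"))
)

required <- c("personId", "res_status_initial", "date_finalised")

# Returns an empty character vector when every required column is present.
missing_columns(init_status, required)
```

The same check could be applied to each data frame in crossing_data, using the column names listed for that argument.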

window_size

The maximum length of the scanning period. Can be an integer giving the number of days, the result of a call to function difftime, or an object of class Duration.

threshold_year

The length of the yearly test period. It can be an integer giving the number of days, the result of a call to function difftime, or an object of class Duration.
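
Both window_size and threshold_year accept the same three forms. The sketch below writes the default values from the Usage section in each form. It assumes that the Duration class is the one provided by the lubridate package, and the variable names are illustrative only.

```r
# Integer number of days (the defaults shown in the Usage section).
window_days    <- 487L
threshold_days <- 365L

# The same lengths as difftime objects, built with base R.
window_difftime    <- as.difftime(window_days, units = "days")
threshold_difftime <- as.difftime(threshold_days, units = "days")

# The same lengths as Duration objects; assumes lubridate is installed.
if (requireNamespace("lubridate", quietly = TRUE)) {
  window_duration    <- lubridate::ddays(window_days)
  threshold_duration <- lubridate::ddays(threshold_days)
}

# Convert a difftime back to a plain number of days for comparison.
as.numeric(window_difftime, units = "days")
```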

parallel

Logical. Whether to use parallel processing to speed up the calculation of migration statuses. Defaults to FALSE.

n_core

The number of cores to use, if parallel is TRUE. Defaults to 2. Higher values will typically result in faster calculations on computers with more than two cores.
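
One possible way to choose values for parallel and n_core is to ask the machine how many cores it reports. The sketch below uses parallel::detectCores() from base R; the policy of leaving one core free is an assumption for illustration, not a recommendation from the package.

```r
# Number of cores reported by the machine; may be NA on some platforms.
available <- parallel::detectCores()

# Hypothetical policy: leave one core free, but never go below 1.
n_core <- if (is.na(available)) 1L else max(1L, available - 1L)

# Only request parallel processing when more than one core would be used.
use_parallel <- n_core > 1L
```

These values could then be passed to run_rbc as parallel = use_parallel and n_core = n_core.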

max_ram

Optional. Limits the RAM that this function can use. Defaults to 2 GB.

include_error_columns

Optional. If TRUE, the error_data item of the returned result contains two extra columns, error_code and error_message.

mc.cleanup

Optional. If set to TRUE, all children that have been forked by this function are killed (by sending SIGTERM) before this function returns. Under normal circumstances mclapply waits for the children to deliver results, so this option usually only has an effect when mclapply is interrupted. If set to FALSE, child processes are collected but not forcefully terminated. As a special case, this argument can be set to the number of the signal that should be used to kill the children instead of SIGTERM.

Value

A list containing two items: a data frame of classified journeys (journeys) and a data frame of journeys that have been marked as errors (error_data). Both items have the same columns: 'journeyId', 'journeyId_prev', 'personId', 'date_crossing', 'is_arrival', 'journey_sequence', 'days_to_next_crossing', 'res_status_before', 'res_status_after', 'is_long_term_mig', 'date_finalised_res_before', 'date_finalised_res_after' and 'date_finalised_LTM'. The value (0 or 1) in the column 'is_long_term_mig' is the key classification result: it identifies the journey that caused the person to become a long-term migrant.
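
A typical next step is to pull out the journeys flagged as long-term migrations. The sketch below does this on a small made-up data frame standing in for the classified journeys. Only a few of the documented columns are included, and the values, including the 0/1 coding of 'is_arrival', are assumptions for illustration.

```r
# Illustrative stand-in for the classified journeys; only a few of the
# documented columns are included, with made-up values.
journeys <- data.frame(
  journeyId        = 1:4,
  personId         = c(1, 1, 2, 2),
  is_arrival       = c(1, 0, 1, 0),
  is_long_term_mig = c(1, 0, 0, 1)
)

# Journeys classified as long-term migrations.
ltm <- journeys[journeys$is_long_term_mig == 1, ]

# Count of long-term migrations, split by the is_arrival flag.
ltm_by_direction <- table(is_arrival = ltm$is_arrival)
```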

Examples

## Generate test data: 100 people, each with 10 journeys.

## Suppress log messages on the screen.
migrbc::initialize_logger(log_level = 1)

number_of_people <- 100
person_data <- migrbc::setup_random_test_data(
    number_of_people, 
    initial_date = '2001-01-01', 
    numJourneys = 10,
    min = 0, 
    max = 100)
head(person_data)

cross_spaces <- migrbc::pre_process(person_data)

## Run without parallel processing.
res <- migrbc::run_rbc(cross_spaces, 
                       window_size = 487, 
                       threshold_year = 365, 
                       parallel = FALSE)

## run in parallel with n_core = 2
cross_spaces <- migrbc::pre_process(person_data, n_groups = 2)
res <- migrbc::run_rbc(cross_spaces, 
                       window_size = 487, 
                       threshold_year = 365, 
                       parallel = TRUE,
                       n_core = 2)

head(res$journeys)
head(res$error_data)

migrbc documentation built on July 1, 2020, 8:14 p.m.