sampledf: divide the data into test and training set, and prepare the...


View source: R/sampledf.R

Description

Divide the data into test and training sets, and prepare the dataframe for modeling.

Usage

sampledf(
  originaldata,
  fraction = 0.8,
  country2digit = NA,
  grepstring_rm = "ID|LATITUDE|LONGITUDE|ROAD_0|geometry|countryfullname",
  rm_neg_col = F
)

Arguments

originaldata

original dataframe

fraction

fraction of the data used for the training set

country2digit

two-letter country code. If NA or not in the database, data for the whole world are returned. Sampling uses an equal fraction per country.

grepstring_rm

pattern matching the variables to be removed, grepl style (a regular expression), e.g. 'ID|LATITUDE|LONGITUDE|ROAD_0|geometry|countryfullname'

rm_neg_col

if TRUE, columns containing only negative values or zeros are removed. Default is FALSE.
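
The "equal fraction per country" sampling described for country2digit can be illustrated with a minimal base R sketch. This is not the APMtools implementation: the helper name split_by_group, the column name country, and the toy data are all hypothetical, and the rounding rule is an assumption.

```r
# Hypothetical helper, not part of APMtools: split rows into training and
# test sets, drawing the same fraction of rows within each group.
split_by_group <- function(df, group_col, fraction = 0.8) {
  # Row indices grouped by the value of the grouping column
  idx_by_group <- split(seq_len(nrow(df)), df[[group_col]], drop = TRUE)
  # Within each group, draw round(fraction * n) row indices for training
  train_idx <- unlist(lapply(idx_by_group, function(idx) {
    idx[sample.int(length(idx), size = round(fraction * length(idx)))]
  }), use.names = FALSE)
  # Everything not drawn for training goes to the test set
  test_idx <- setdiff(seq_len(nrow(df)), train_idx)
  list(training = df[train_idx, , drop = FALSE],
       test = df[test_idx, , drop = FALSE])
}

set.seed(1)
# Toy data (assumed): 10 rows for "ES" and 20 rows for "NL"
toy <- data.frame(country = rep(c("ES", "NL"), times = c(10, 20)),
                  value = rnorm(30))
parts <- split_by_group(toy, "country", fraction = 0.8)
table(parts$training$country)
```

Sampling within each group keeps the country proportions of the training set close to those of the full data, which a single global sample does not guarantee.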

Details

NA values are removed. This function is used for preprocessing and can be improved.
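
The preprocessing steps named on this page (removing columns by a grepl pattern, removing NA values, and optionally removing columns with only negative values or zeros) might look roughly like the base R sketch below. The helper clean_predictors and the toy data are hypothetical, and dropping whole rows that contain an NA is an assumption about how NA values are handled.

```r
# Hypothetical helper, not the APMtools implementation.
clean_predictors <- function(df, grepstring_rm, rm_neg_col = FALSE) {
  # Drop columns whose names match the regular expression
  df <- df[, !grepl(grepstring_rm, names(df)), drop = FALSE]
  # Drop rows that contain any NA (assumed interpretation of "NA values are removed")
  df <- df[complete.cases(df), , drop = FALSE]
  if (rm_neg_col) {
    # Keep non-numeric columns and numeric columns with at least one positive value
    keep <- vapply(df, function(col) !is.numeric(col) || any(col > 0), logical(1))
    df <- df[, keep, drop = FALSE]
  }
  df
}

# Toy data (assumed): `flag` holds only zeros and negative values
toy <- data.frame(ID = 1:4,
                  LATITUDE = c(52.1, 40.4, 41.9, 48.8),
                  no2 = c(21.5, NA, 30.2, 18.7),
                  pop_1000 = c(120, 80, 45, 60),
                  flag = c(0, -1, 0, -999))
cleaned <- clean_predictors(toy, "ID|LATITUDE", rm_neg_col = TRUE)
names(cleaned)
```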

Value

inde_var: the matrix containing the response and predictors.

Examples

# Split `merged` for Spain ('ES'), using 80% of the rows for training
a = sampledf(merged, fraction = 0.8, country2digit = 'ES')
test = a$test
training = a$training
inde_var = a$indevar

mengluchu/APMtools documentation built on Jan. 27, 2022, 2:41 a.m.