r_data_frame: Data Frame Production (From Variable Functions)

Description

Produce a tbl_df data frame, letting the user lazily pass unnamed wakefield variable functions (optionally without call parentheses).
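
To illustrate what "lazily passing" a function means here, the following is a minimal base R sketch of the general technique (capturing the unevaluated arguments and supplying n to each one). It is not wakefield's implementation: the helper name lazy_frame is hypothetical, it returns a plain data.frame rather than a tbl_df, and it does not reproduce wakefield's column naming.

```r
# Hypothetical helper (not part of wakefield): accepts generator functions
# either bare (runif) or as partial calls (rnorm(mean = 50)) and supplies
# n as the first argument of each.
lazy_frame <- function(n, ...) {
  exprs <- as.list(substitute(list(...)))[-1L]  # unevaluated arguments
  env <- parent.frame()                         # where the caller's names live

  cols <- lapply(exprs, function(e) {
    if (is.call(e)) {
      # Partial call such as rnorm(mean = 50): insert n, then evaluate
      full <- as.call(c(list(e[[1L]], n), as.list(e)[-1L]))
      eval(full, env)
    } else {
      # Bare symbol such as runif: look up the function and call it with n
      f <- eval(e, env)
      f(n)
    }
  })

  # Use supplied names where given; otherwise fall back to the function name
  nms <- names(exprs)
  if (is.null(nms)) nms <- rep("", length(exprs))
  defaults <- vapply(
    exprs,
    function(e) deparse(if (is.call(e)) e[[1L]] else e),
    character(1),
    USE.NAMES = FALSE
  )
  nms[nms == ""] <- defaults[nms == ""]
  names(cols) <- nms

  as.data.frame(cols, stringsAsFactors = FALSE)
}

# Bare function and partial call, mirroring the style of r_data_frame()
lazy_frame(5, runif, Score = rnorm(mean = 50, sd = 10))
```

The sketch evaluates each argument in the caller's environment, so functions defined by the user are found in the same way as built-in ones.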

Usage

r_data_frame(n, ..., rep.sep = "_")

Arguments

n

The length to pass to the randomly generated vectors.

rep.sep

A separator to use for repeated variable names. For example, if age is used three times (r_data_frame(age, age, age)), the name "Age" would be assigned to all three columns; with the default separator this results in the column names c("Age_1", "Age_2", "Age_3"). To turn off this behavior use rep.sep = NULL, which results in the column names c("Age", "Age.1", "Age.2") in the data.frame.

...

A set of optionally named arguments. wakefield variable functions require neither a name nor call parentheses.
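
The rep.sep naming rule described above can be sketched in base R as follows. This is an illustration of the rule only, not wakefield's own code: the helper name number_repeats is hypothetical, and using make.unique() for the rep.sep = NULL case is an assumption chosen because it yields the ".1", ".2" suffixes described above.

```r
# Hypothetical helper (not part of wakefield): number repeated names
# with a separator, or fall back to base R de-duplication.
number_repeats <- function(nms, rep.sep = "_") {
  if (is.null(rep.sep)) {
    # Assumption: base R de-duplication gives "Age", "Age.1", "Age.2"
    return(make.unique(nms))
  }
  # Flag every name that occurs more than once
  dup <- nms %in% nms[duplicated(nms)]
  # Running count within each group of identical names
  idx <- ave(seq_along(nms), nms, FUN = seq_along)
  nms[dup] <- paste(nms[dup], idx[dup], sep = rep.sep)
  nms
}

number_repeats(c("ID", "Age", "Age", "Age"))
number_repeats(c("Age", "Age", "Age"), rep.sep = NULL)
```

Names that occur only once (such as "ID") are left unchanged in this sketch.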

Value

Returns a tbl_df.

Author(s)

Josh O'Brien and Tyler Rinker <tyler.rinker@gmail.com>.

References

https://stackoverflow.com/a/29617983/1000343

See Also

r_list, r_series, r_dummy

Examples

r_data_frame(n = 30,
    id,
    race,
    age,
    sex,
    hour,
    iq,
    height,
    died,
    Scoring = rnorm,
    Smoker = valid
)

r_data_frame(n = 30,
    id,
    race,
    age(x = 8:14),
    Gender = sex,
    Time = hour,
    iq,
    grade, grade, grade,  #repeated measures
    height(mean=50, sd = 10),
    died,
    Scoring = rnorm,
    Smoker = valid
)

r_data_frame(n = 500,
    id,
    age, age, age,
    grade, grade, grade
)

## Repeated Measures/Time Series
r_data_frame(n=100,
    id,
    age,
    sex,
    r_series(likert, 3),
    r_series(likert, 4, name = "Item", integer = TRUE)
)

## Expanded Dummy Coded Variables
r_data_frame(n=100,
    id,
    age,
    r_dummy(sex, prefix=TRUE),
    r_dummy(political)
)

## `peek` to view all columns
## `plot` (`table_heat`) for a graphic representation
library(dplyr)
r_data_frame(n=100,
    id,
    dob,
    animal,
    grade, grade,
    death,
    dummy,
    grade_letter,
    gender,
    paragraph,
    sentence
) %>%
   r_na() %>%
   peek %>%
   plot(palette = "Set1")

Example output

# A tibble: 30 x 10
   ID    Race       Age Sex    Hour        IQ Height Died  Scoring Smoker
   <chr> <fct>    <int> <fct>  <times>  <dbl>  <dbl> <lgl>   <dbl> <lgl> 
 1 01    Hispanic    24 Male   00:00:00    99     65 FALSE   1.43  TRUE  
 2 02    White       40 Male   00:00:00    82     67 TRUE   -0.338 FALSE 
 3 03    White       44 Male   00:00:00   103     68 TRUE   -0.630 FALSE 
 4 04    White       18 Female 01:00:00    96     69 TRUE    1.14  TRUE  
 5 05    White       49 Female 02:00:00    86     65 TRUE    1.26  TRUE  
 6 06    White       84 Female 02:30:00    85     68 FALSE   0.962 TRUE  
 7 07    White       52 Male   03:00:00    95     75 FALSE   0.324 FALSE 
 8 08    White       59 Female 04:30:00    79     69 TRUE    0.996 FALSE 
 9 09    White       25 Male   04:30:00   106     69 TRUE    1.32  FALSE 
10 10    Native      45 Female 05:00:00   116     69 TRUE    1.32  TRUE  
# … with 20 more rows
Warning message:
`tbl_df()` is deprecated as of dplyr 1.0.0.
Please use `tibble::as_tibble()` instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated. 
# A tibble: 30 x 13
   ID    Race    Age Gender Time     IQ Grade_1 Grade_2 Grade_3 Height Died 
   <chr> <fct> <int> <fct>  <tim> <dbl>   <dbl>   <dbl>   <dbl>  <dbl> <lgl>
 1 01    White    13 Male   01:0…    91    85.6    91.8    86.1     79 TRUE 
 2 02    White     8 Female 03:3…    86    88      89.1    92.1     56 FALSE
 3 03    White     8 Male   04:3…    92    84.6    74.3    88       39 TRUE 
 4 04    White    12 Male   05:0…    89    90.1    88.5    85.9     41 TRUE 
 5 05    White    14 Male   05:3…   105    97      90.2    87.7     49 TRUE 
 6 06    White     8 Male   06:0…   113    89.7    87.9    96       63 TRUE 
 7 07    White    14 Male   07:0…   109    89      85.4    83.8     57 TRUE 
 8 08    Hisp…    11 Male   08:0…    92    90.3    88.1    91.4     65 TRUE 
 9 09    White    14 Male   09:3…   110    80.8    89.5    91.6     54 FALSE
10 10    Hisp…     9 Male   10:3…   121    80.7    87.6    93.4     40 FALSE
# … with 20 more rows, and 2 more variables: Scoring <dbl>, Smoker <lgl>
# A tibble: 500 x 7
   ID    Age_1 Age_2 Age_3 Grade_1 Grade_2 Grade_3
   <chr> <int> <int> <int>   <dbl>   <dbl>   <dbl>
 1 001      41    82    25    94.1    87.4    85.5
 2 002      69    85    49    86.7    84.2    97.9
 3 003      18    63    39    86.4    90.6    84  
 4 004      53    47    49    84.4    87.4    91.8
 5 005      70    84    82    90.1    82.6    93.7
 6 006      39    43    45    83.5    85      85.2
 7 007      60    32    45    92      91.4    88.9
 8 008      49    19    81    92.7    91.6    79.1
 9 009      82    74    89    87.6    95.5    82.8
10 010      23    47    43    90.1    88.2    86.2
# … with 490 more rows
# A tibble: 100 x 10
   ID      Age Sex    Likert_1   Likert_2  Likert_3  Item_1 Item_2 Item_3 Item_4
   <chr> <int> <fct>  <ord>      <ord>     <ord>      <int>  <int>  <int>  <int>
 1 001      36 Female Disagree   Agree     Strongly…      5      5      3      3
 2 002      45 Female Disagree   Agree     Strongly…      3      1      4      5
 3 003      66 Female Strongly…  Strongly… Agree          3      1      5      4
 4 004      66 Male   Strongly…  Strongly… Neutral        4      3      5      5
 5 005      27 Female Agree      Strongly… Agree          3      1      2      3
 6 006      89 Male   Strongly…  Disagree  Agree          5      1      3      3
 7 007      28 Female Strongly…  Agree     Neutral        2      4      3      1
 8 008      72 Male   Strongly…  Strongly… Strongly…      5      3      3      2
 9 009      52 Female Strongly…  Agree     Disagree       3      4      1      4
10 010      20 Male   Agree      Neutral   Strongly…      3      4      4      4
# … with 90 more rows
# A tibble: 100 x 6
   ID      Age Sex_Male Sex_Female Democrat Republican
   <chr> <int>    <int>      <int>    <int>      <int>
 1 001      41        0          1        1          0
 2 002      86        0          1        1          0
 3 003      25        1          0        1          0
 4 004      21        1          0        1          0
 5 005      58        1          0        1          0
 6 006      87        1          0        0          1
 7 007      36        1          0        1          0
 8 008      34        0          1        1          0
 9 009      52        0          1        1          0
10 010      62        0          1        0          1
# … with 90 more rows

Attaching package: 'dplyr'

The following object is masked from 'package:wakefield':

    id

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

Source: local data frame [100 x 11]

    ID        DOB     Animal Grade_1 Grade_2 Death Dummy Grade_Lett Gender  Paragraph   Sentence
1  001 2006-12-01 Roseate Sp    81.7      89  TRUE     0         B+ Female Lorem ipsu It makes a
2  002 2007-05-03        Ant    85.9    88.4  TRUE     1         B+ Female Varius odi But when w
3  003 2007-11-12    Aye Aye    83.2    90.1  TRUE     1         A-   Male Ex a, sagi The price 
4  004 2007-03-04        Ant    92.3    84.7  TRUE     1         C+ Female In ut. Mau The top fi
5  005 2006-08-23 Radiated T    82.8    86.9  TRUE     0         B+ Female Justo lobo You said t
6  006 2006-07-14      Dhole    89.9    78.7  TRUE     1         A- Female Leo himena Through ou
7  007 2006-10-11 Giant Afri    82.1    88.6  TRUE     0          B Female Vestibulum Candy, wha
8  008 2006-07-30 Roseate Sp    90.5    88.8  TRUE     1          A Female Sociis mae That's the
9  009 2006-11-22 Indian Sta    91.7    <NA> FALSE     0       <NA> Female Sed bibend Let me men
10 010 2006-08-26 Flying Squ    87.3    87.4  <NA>     1          A Female Nisi preti Well, of c
.. ...        ...        ...     ...     ...   ...   ...        ...    ...        ...        ... 

wakefield documentation built on Sept. 14, 2020, 1:07 a.m.