View source: R/rowwise_summaries.R
summarise_numerical_variables | R Documentation |
Summarises numerical variables with repeated measurements either by field (across all available measurements) or by instance (across all measurements taken at each assessment visit). The available summary options are mean, minimum, maximum, sum and number of non-missing values.
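The difference between summarising by field and by instance can be illustrated with a minimal base R sketch. The toy data frame and its column names below are hypothetical and only mimic a field/instance/array naming pattern. The use of `na.rm = TRUE` is an assumption for illustration, not a statement of how the package computes its means.

```r
# Toy data: hypothetical columns for field 4080, named <name>_f<field>_<instance>_<array>
df <- data.frame(
  eid = 1:3,
  sbp_f4080_0_0 = c(120, NA, 130),
  sbp_f4080_0_1 = c(124, 118, NA),
  sbp_f4080_1_0 = c(NA, 122, NA)
)

# By field: one summary per participant across every measurement of field 4080
field_cols <- grep("_f4080_", names(df), value = TRUE)
mean_by_field <- rowMeans(df[field_cols], na.rm = TRUE)

# By instance: one summary per participant for a single assessment visit (instance 0)
instance_0_cols <- grep("_f4080_0_", names(df), value = TRUE)
mean_instance_0 <- rowMeans(df[instance_0_cols], na.rm = TRUE)
```

In this sketch the by-field summary pools all three columns, whereas the by-instance summary uses only the two columns recorded at instance 0.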
summarise_numerical_variables(
ukb_main,
data_dict = NULL,
ukb_data_dict = get_ukb_data_dict(),
summary_function = "mean",
summarise_by = "Field",
.drop = FALSE
)
ukb_main |
A UK Biobank main dataset data frame. Column names must match
those under the |
data_dict |
a data dictionary specific to the UKB main dataset file,
created by |
ukb_data_dict |
The UKB data dictionary (available online at the UK Biobank data showcase). This should be a data frame where all columns are of type
|
summary_function |
The summary function to be applied. Options: "mean", "min", "max", "sum" or "n_values" |
summarise_by |
Whether to summarise by "Field" or by "Instance". |
.drop |
If |
Note that when summary_function = "sum", missing values are converted to zero. Therefore, if a set of values are all missing, the sum will be summarised as 0. See the documentation for rowSums for further details.
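The behaviour described in this note matches what base R's `rowSums()` does when called with `na.rm = TRUE`. The small matrix below is made up for illustration. The final step, which restores `NA` for rows with no observed values, is a possible workaround sketched here and not something the package is stated to do.

```r
# Two rows: one partly missing, one entirely missing
m <- rbind(
  c(1, NA, 3),
  c(NA, NA, NA)
)

# With na.rm = TRUE, missing values contribute nothing, so an all-missing row sums to 0
sums <- rowSums(m, na.rm = TRUE)

# Counting non-missing values per row separates a true zero from "nothing observed"
n_values <- rowSums(!is.na(m))
sums_or_na <- ifelse(n_values == 0, NA, sums)
```

Requesting the "n_values" summary alongside "sum" gives the same information needed to tell the two cases apart.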
A data frame with new columns summarising numerical variables. The names of these new columns are prefixed by the value of summary_function and end with 'x' followed by the FieldID, with or without the instance being summarised. For example, if summarising FieldID 4080 instance 0, the new column would be named 'mean_systolic_blood_pressure_automated_reading_x4080_0'.
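The naming pattern can be reproduced with `paste0()`. This is only a sketch of the pattern as described: the variable names are hypothetical, and the by-field form (without the instance suffix) is an assumption inferred from the "with or without the instance" wording.

```r
# Hypothetical inputs describing the column being summarised
summary_function <- "mean"
field_description <- "systolic_blood_pressure_automated_reading"
field_id <- 4080
instance <- 0

# By instance: <summary_function>_<description>_x<FieldID>_<instance>
colname_by_instance <- paste0(
  summary_function, "_", field_description, "_x", field_id, "_", instance
)

# By field (assumed form): the instance suffix is dropped
colname_by_field <- paste0(
  summary_function, "_", field_description, "_x", field_id
)
```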
library(magrittr)

# get dummy UKB data and data dictionary
dummy_ukb_data_dict <- get_ukb_dummy("dummy_Data_Dictionary_Showcase.tsv")
dummy_ukb_codings <- get_ukb_dummy("dummy_Codings.tsv")

dummy_ukb_main <- read_ukb(
  path = get_ukb_dummy("dummy_ukb_main.tsv", path_only = TRUE),
  ukb_data_dict = dummy_ukb_data_dict,
  ukb_codings = dummy_ukb_codings
) %>%
  dplyr::select(eid, tidyselect::contains("systolic_blood_pressure")) %>%
  tibble::as_tibble()

# summarise mean values by Field, keep original variables
summarise_numerical_variables(
  dummy_ukb_main,
  ukb_data_dict = dummy_ukb_data_dict
)

# summarise mean values by Field, drop original variables
summarise_numerical_variables(
  dummy_ukb_main,
  ukb_data_dict = dummy_ukb_data_dict,
  .drop = TRUE
)

# summarise min values by Instance, drop original variables
summarise_numerical_variables(
  dummy_ukb_main,
  ukb_data_dict = dummy_ukb_data_dict,
  summary_function = "min",
  summarise_by = "Instance",
  .drop = TRUE
)