plot_all_NG_biomarkers: Print a Forestplot for All Nightingale Biomarkers

View source: R/plot_all_NG_biomarkers.R

Description

Save a forestplot of all Nightingale biomarker associations in a 2-page, predefined layout (utilizes forestplot).

Usage

plot_all_NG_biomarkers(
  df,
  machine_readable_name = machine_readable_name,
  name = NULL,
  estimate = estimate,
  se = se,
  pvalue = NULL,
  colour = NULL,
  shape = NULL,
  logodds = FALSE,
  psignif = 0.05,
  ci = 0.95,
  filename = NULL,
  paperwidth = 15,
  paperheight = sqrt(2) * paperwidth,
  xlims = NULL,
  layout = "2020",
  ...
)

Arguments

df

A data frame with the data to plot. It must contain at least three variables: a character column with the names to be displayed on the y-axis (see parameter name), a numeric column with the value (or the log of the value) to display (see parameter estimate), and a numeric column with the corresponding standard errors (see parameter se). It may contain additional columns, e.g. the corresponding p-values (see parameter pvalue), in which case, in conjunction with the threshold given in psignif, the non-significant results will be displayed as hollow points. Other variables may be used as aesthetics to define the colour and the shape of the points to be plotted.
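As a rough illustration of the expected shape of df, the sketch below builds a small data frame by hand. The biomarker names and all numbers are invented placeholders, and the column names beta, pvalue and trait are arbitrary choices; any names work as long as they are passed to the matching arguments.

```r
# Hypothetical toy input: names and numbers are invented for illustration
# and are not taken from any real dataset or from the package metadata.
df_toy <- data.frame(
  machine_readable_name = c("Biomarker_A", "Biomarker_B", "Biomarker_C"),
  beta = c(0.12, 0.20, -0.15),    # effect estimates (passed as `estimate`)
  se = c(0.05, 0.04, 0.06),       # standard errors (passed as `se`)
  pvalue = c(0.016, 0.0001, 0.012), # optional, used for hollow/filled points
  trait = "BMI",                  # optional, could be mapped to `colour`
  stringsAsFactors = FALSE
)

# Check that the columns needed for the mandatory arguments are present
required <- c("machine_readable_name", "beta", "se")
stopifnot(all(required %in% names(df_toy)))
str(df_toy)
```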

machine_readable_name

the variable in df containing the machine readable names of Nightingale blood biomarkers. The names in this variable must match those in the machine_readable_name variable of df_NG_biomarker_metadata. This argument is automatically quoted and evaluated in the context of the df data frame.

name

the variable in df that contains the y-axis names. If NULL, names from df_NG_biomarker_metadata are used. This argument is automatically quoted and evaluated in the context of the df data frame. See Note.

estimate

the variable in df that contains the values (or log of values) to be displayed. This argument is automatically quoted and evaluated in the context of the df data frame. See Note.

se

the variable in the df data frame that contains the standard error values. This argument is automatically quoted and evaluated in the context of the df data frame. See Note.

pvalue

the variable in df that contains the p-values. Defaults to NULL. When explicitly defined, in conjunction with the p-value threshold provided in psignif, the non-significant entries will be drawn as hollow points. This argument is automatically quoted and evaluated in the context of the df data frame. See Note.

colour

the variable in df by which to colour the different groups of points. This argument is automatically quoted and evaluated in the context of the df data frame. See Note.

shape

the variable in df by which to shape the different groups of points. This argument is automatically quoted and evaluated in the context of the df data frame. See Note.

logodds

logical (defaults to FALSE) specifying whether the estimate parameter should be treated as log odds/hazards ratio (TRUE) or not (FALSE). When logodds = TRUE the estimates and corresponding confidence intervals will be exponentiated and a log scale will be used for the x-axis.

psignif

numeric, defaults to 0.05. The p-value threshold for statistical significance. Entries with p-values larger than psignif will be drawn with a hollow point.
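The thresholding described for pvalue and psignif can be sketched in base R as follows. The p-values are invented, and treating a p-value exactly equal to psignif as non-significant is an assumption made here, not something the documentation states.

```r
# Invented p-values for illustration
pvalue <- c(0.001, 0.049, 0.05, 0.20)
psignif <- 0.05

# TRUE marks entries that would count as significant (filled points);
# FALSE marks entries that would be drawn hollow.
# Assumption: p == psignif is treated as non-significant.
is_significant <- pvalue < psignif
print(is_significant)
```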

ci

A number between 0 and 1 (defaults to 0.95) giving the confidence level of the interval to be drawn.
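To make the roles of estimate, se, ci and logodds concrete, here is a minimal sketch that assumes a normal-approximation (Wald) interval. The documentation does not state how the interval is computed internally, so this is an illustration of the idea rather than the package's implementation; the numbers are invented.

```r
# Invented inputs for illustration
estimate <- 0.25
se <- 0.10
ci <- 0.95

# Two-sided critical value of the standard normal distribution
# (assumption: a Wald-type interval)
z <- stats::qnorm(1 - (1 - ci) / 2)

lower <- estimate - z * se
upper <- estimate + z * se

# With logodds = TRUE the estimate and interval limits are exponentiated,
# turning log odds (or log hazard) ratios into ratios.
ratio <- exp(c(estimate = estimate, lower = lower, upper = upper))
print(ratio)
```

Because exponentiation is monotone, the ordering lower < estimate < upper is preserved on the ratio scale, although the interval is no longer symmetric around the estimate.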

filename

a character string giving the name of the file.

paperwidth

page width in inches

paperheight

page height in inches

xlims

NULL or a numeric vector of length 2 specifying the common x limits across all biomarker subgroups.

layout

one of the predefined layouts in df_grouping_all_NG_biomarkers, or a custom layout tibble that follows the structure of the predefined layouts.

...

ggplot2 graphical parameters such as title, ylab, xlab, xtickbreaks etc. to be passed along.

Details

The function uses a custom grouping specified by df_grouping_all_NG_biomarkers. The input df and df_grouping_all_NG_biomarkers are joined by machine_readable_name, while another variable in df may be used for the y-axis labels, as specified by the name parameter.
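The join described above can be pictured with a toy example in base R. The layout columns (machine_readable_name, group_custom, column, page) mirror the custom layout built in the Examples section; the biomarker names and numbers are invented, and the use of a left join from the layout is an assumption, since the exact join type is not documented here.

```r
# Hypothetical association results (invented names and values)
results <- data.frame(
  machine_readable_name = c("Biomarker_A", "Biomarker_B", "Biomarker_C"),
  beta = c(0.12, 0.20, -0.15),
  se = c(0.05, 0.04, 0.06),
  stringsAsFactors = FALSE
)

# Hypothetical layout that places two of the biomarkers on page 1
layout_toy <- data.frame(
  machine_readable_name = c("Biomarker_A", "Biomarker_B"),
  group_custom = c("Group 1", "Group 2"),
  column = c(1, 2),
  page = 1,
  stringsAsFactors = FALSE
)

# Assumption: keep every layout row and attach the matching results
joined <- merge(layout_toy, results, by = "machine_readable_name", all.x = TRUE)
print(joined)
```

Under this assumption, Biomarker_C has no layout entry and so does not appear in the joined table.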

Value

If filename is NULL, a list of plot objects (one for each page in layout) is returned.

Author(s)

Maria Kalimeri, Ilari Scheinin, Vilma Jagerroos

Examples

## Not run: 
library(ggforestplot)
library(dplyr) # provides %>%, left_join() and select() used below

# Join the built-in association demo dataset with a variable that contains
# the machine readable names of Nightingale biomarkers. (Note: if you
# have built your association data frame using the Nightingale CSV result file,
# then your data frame should already contain machine readable names.)
df <-
  df_linear_associations %>%
  left_join(
    select(
      df_NG_biomarker_metadata,
      name,
      machine_readable_name
    ),
    by = "name"
  )

# Print effect sizes for Nightingale biomarkers in a 2-page pdf
plot_all_NG_biomarkers(
  df = df,
  machine_readable_name = machine_readable_name,
  # Notice that when name is not defined explicitly, names from
  # df_NG_biomarker_metadata are used
  estimate = beta,
  se = se,
  pvalue = pvalue,
  colour = trait,
  filename = "biomarker_linear_associations.pdf",
  xlab = "1-SD increment in BMI\nper 1-SD increment in biomarker concentration",
  layout = "2016"
)

# A custom layout can also be provided
layout <- df_NG_biomarker_metadata %>%
  dplyr::filter(
    .data$group == "Fatty acids",
    .data$machine_readable_name %in% df$machine_readable_name
  ) %>%
  dplyr::mutate(
    group_custom = .data$subgroup,
    column = dplyr::case_when(
      .data$group_custom == "Fatty acids" ~ 1,
      .data$group_custom == "Fatty acid ratios" ~ 2
    ),
    page = 1
  ) %>%
  # Select by column name; .data$ inside select() is deprecated in
  # recent tidyselect versions
  dplyr::select(
    "machine_readable_name",
    "group_custom",
    "column",
    "page"
  )

plot_all_NG_biomarkers(
  df = df,
  machine_readable_name = machine_readable_name,
  # Notice that when name is not defined explicitly, names from
  # df_NG_biomarker_metadata are used
  estimate = beta,
  se = se,
  pvalue = pvalue,
  colour = trait,
  xlab = "1-SD increment in BMI\nper 1-SD increment in biomarker concentration",
  layout = layout
)

# log odds for type 2 diabetes
df <-
  df_logodds_associations %>%
  left_join(
    select(
      df_NG_biomarker_metadata,
      name,
      machine_readable_name
    ),
    by = "name"
  ) %>%
  # Convert the study variable to a factor so that its levels set the
  # order of display.
  dplyr::mutate(
    study = factor(
      study,
      levels = c("Meta-analysis", "NFBC-1997", "DILGOM", "FINRISK-1997", "YFS")
    )
  )

# Print effect sizes for Nightingale biomarkers in a 2-page pdf
plot_all_NG_biomarkers(
  df = df,
  machine_readable_name = machine_readable_name,
  # Notice that when name is not defined explicitly, names from
  # df_NG_biomarker_metadata are used
  estimate = beta,
  se = se,
  pvalue = pvalue,
  colour = study,
  logodds = TRUE,
  filename = "biomarker_t2d_associations.pdf",
  xlab = "Odds ratio for incident type 2 diabetes (95% CI)\nper 1-SD increment in metabolite concentration",
  layout = "2016",
  # Restrict limits as some studies are very weak and they take over the
  # overall range.
  xlims = c(0.5, 3.2)
)

## End(Not run)

NightingaleHealth/ggforestplot documentation built on April 10, 2020, 7:01 p.m.