plotQ: Generate barplots from qlists.

View source: R/pophelper.R

Description

Generate separate or joined barplots (group-level) from qlists.

Usage

plotQ(qlist = NULL, imgoutput = "sep", clustercol = NA, sortind = NA,
  grplab = NA, selgrp = NA, ordergrp = FALSE, subsetgrp = NA,
  grpmean = FALSE, panelspacer = 0.1, showsp = TRUE, sppos = "right",
  splab = NA, splabsize = NULL, splabangle = NULL, splabcol = "grey30",
  splabface = "plain", spbgcol = NA, showtitle = FALSE, titlelab = NA,
  titlehjust = 0, titlevjust = 0.5, titlesize = NULL,
  titlecol = "grey30", titleface = "plain", titlespacer = 1.4,
  titleangle = 0, showsubtitle = FALSE, subtitlelab = NA,
  subtitlehjust = 0, subtitlevjust = 0.5, subtitlesize = NULL,
  subtitlecol = "grey30", subtitleface = "plain", subtitlespacer = 1.5,
  subtitleangle = 0, grplabspacer = 0, grplabheight = NA,
  grplabpos = 0.25, grplabsize = NA, grplabangle = NA, grplabjust = NA,
  grplabcol = "grey30", grplabalpha = 1, grplabface = "plain",
  showindlab = FALSE, sharedindlab = TRUE, useindlab = FALSE,
  indlabwithgrplab = FALSE, indlabspacer = 1.5, indlabheight = 0.2,
  indlabsep = " ", indlabsize = NULL, indlabangle = 90,
  indlabvjust = 0.5, indlabhjust = 1, indlabcol = "grey30",
  pointsize = NA, pointcol = "grey30", pointbgcol = "grey30",
  pointtype = "|", pointalpha = 1, linepos = 0.75, linesize = NA,
  linecol = "grey30", linetype = 1, linealpha = 1, showdiv = TRUE,
  divgrp = NA, divcol = "white", divtype = "21", divsize = 0.25,
  divalpha = 1, showlegend = FALSE, legendlab = NA, legendpos = "right",
  legendkeysize = 4, legendtextsize = 3, legendmargin = c(0.5, 0.5, 0.5,
  0), barsize = 1, barbordersize = 0, barbordercolour = NA,
  showyaxis = FALSE, showticks = FALSE, ticksize = 0.1,
  ticklength = 0.03, outputfilename = NA, imgtype = "png", height = NA,
  width = NA, dpi = 300, units = "cm", theme = "theme_grey",
  basesize = 5, font = "", na.rm = TRUE, quiet = FALSE,
  panelratio = c(3, 1), exportplot = TRUE, returnplot = FALSE,
  returndata = FALSE)

Arguments

qlist

A qlist (list of dataframes). An output from readQ.

imgoutput

A character with options: 'sep' or 'join'. If set to 'sep', each run is plotted as a separate image file. If set to 'join', multiple runs are joined into a single image.

clustercol

A vector of colours for clusters. If NA, colours are automatically generated. K=1 to K=12 are custom unique colours while K>12 are coloured by function gplots::rich.colors().

sortind

A character indicating how individuals are sorted. Default is NA (same order of individuals as in the input file). Other options are 'all' (sorting by values of all clusters), any one cluster (eg. 'Cluster1') or 'label' (sorting by individual labels). See details.

grplab

A dataframe with one or more columns (group label sets), and rows equal to the number of individuals. See details.

selgrp

A single character denoting a selected group label set. The selected label must be a group label title used in grplab. See details.

ordergrp

A logical indicating if individuals must be grouped into contiguous blocks based on grplab starting with selgrp.

subsetgrp

A character or character vector with group names to subset or reorder groups. Only applicable when grplab is in use. Default is NA. See details.

grpmean

A logical indicating if q-matrix must be converted from individual values to group mean values. Applicable only when grplab is in use and mean is calculated over selgrp.

panelspacer

A numeric indicating the spacing between barplot panels in cm. Defaults to 0.1cm.

showsp

A logical indicating if strip panels on right side must be shown. Strip panel by default displays file name and K value. Defaults to TRUE.

sppos

A character indicating position of strip panel. One of 'right' or 'left'. Defaults to 'right'.

splab

A character or character vector denoting items displayed in the strip panels. Length must be equal to number of runs.

splabsize

A numeric indicating the size of the strip panel label. Computed automatically when set to NULL. Note that overall text size can be controlled using basesize.

splabangle

A numeric indicating angle/rotation of the strip panel label. Defaults to NULL, in which case the angle is automatically set to -90.

splabcol

A character indicating the colour of the strip panel label. Defaults to "grey30".

splabface

A character indicating the font face of strip panel label. One of 'plain', 'italic', 'bold' or 'bold.italic'. Defaults to 'plain'. Applicable only when showsp=T.

spbgcol

A character denoting the background colour of the strip panel. Defaults to white.

showtitle

A logical indicating if plot title must be shown on the top. Defaults to FALSE. If TRUE and titlelab=NA, file name is displayed by default.

titlelab

A character or character vector for title text. Defaults to NA, and when showtitle=TRUE displays file name.

titlehjust

A numeric denoting the horizontal justification of the title. Defaults to 0 (left).

titlevjust

A numeric denoting the vertical justification of the title. Defaults to 0.5 (center).

titlesize

A numeric indicating the size of the title text. Computed automatically when set to NULL. Note that overall text size can be controlled using basesize.

titlecol

A colour character for title. Defaults to "grey30".

titleface

A character indicating the font face of title label. One of 'plain', 'italic', 'bold' or 'bold.italic'. Defaults to 'plain'. Applicable only when showtitle=T.

titlespacer

A numeric indicating the space below the title. Defaults to 1.4.

titleangle

A numeric indicating the angle/rotation of the title. Defaults to 0.

showsubtitle

A logical indicating if plot subtitle must be shown on the top. Defaults to FALSE. If TRUE and subtitlelab=NA, file name is displayed by default.

subtitlelab

A character or character vector for subtitle text. Defaults to NA, and when showsubtitle=TRUE displays file name.

subtitlehjust

A numeric denoting the horizontal justification of the subtitle. Defaults to 0 (left).

subtitlevjust

A numeric denoting the vertical justification of the subtitle. Defaults to 0.5 (center).

subtitlesize

A numeric indicating the size of the subtitle text. Computed automatically when set to NULL. Note that overall text size can be controlled using basesize.

subtitlecol

A colour character for subtitle. Defaults to "grey30".

subtitleface

A character indicating the font face of subtitle label. One of 'plain', 'italic', 'bold' or 'bold.italic'. Defaults to 'plain'. Applicable only when showsubtitle=T.

subtitlespacer

A numeric indicating the space below the subtitle. Defaults to 1.5.

subtitleangle

A numeric indicating the angle/rotation of the subtitle. Defaults to 0.

grplabspacer

A numeric indicating the space between the plot panels and the group label area in cm. Defaults to 0cm. Applicable only when grplab is in use.

grplabheight

A numeric indicating the height of the group label area in cm. When NA (the default), 0.4cm is used, multiplied by the number of group label sets. Applicable only with grplab. See details.

grplabpos

A numeric indicating the y position of the group labels. Applicable only with group labels. Defaults to 0.25.

grplabsize

A numeric indicating the size of the group labels. Default range between 1.5 - 2.5 depending on number of individuals. This text size is not affected by basesize.

grplabangle

A numeric indicating the angle/rotation of group labels. 0 is horizontal while 90 is vertical. Default is 0.

grplabjust

A numeric indicating the justification of group labels. Defaults to 0.5 if grplabangle=0 or 1 if grplabangle between 20 and 135.

grplabcol

A colour character for the colour of group labels. Defaults to "grey30".

grplabalpha

A numeric between 0 and 1 denoting transparency of group labels. Defaults to 1.

grplabface

A character specifying font face. Either 'plain', 'italic', 'bold' or 'bold.italic'.

showindlab

A logical indicating if individual labels must be shown. See details.

sharedindlab

A logical indicating if only one set of shared individual labels must be shown below all plots. Applicable only when imgoutput="join". Individual labels are visible only when showindlab=TRUE.

useindlab

A logical indicating if individual labels must be read from the rownames of qlist dataframes and used as individual labels. See details.

indlabwithgrplab

A logical indicating if individual labels must be concatenated with grplab. Applies only when grplab is in use. Relevant for sorting by label.

indlabspacer

A numeric denoting space between the individual label and the plot area. Defaults to 1.5.

indlabheight

A numeric indicating space below the individual label panel. Increase to 0.1, 0.2 etc if labels are clipped off.

indlabsep

A character used as separator when concatenating individual labels and group labels. Defaults to space indlabsep=" ".

indlabsize

A numeric indicating the size of the individual labels. Computed automatically when set to NULL. Note that overall text size can be controlled using basesize.

indlabangle

A numeric indicating the angle/rotation of individual labels. 0 is horizontal while 90 is vertical. Defaults to 90.

indlabvjust

A numeric denoting vertical justification of the individual labels. Defaults to 0.5.

indlabhjust

A numeric denoting the horizontal justification of the individual labels. Defaults to 1.

indlabcol

A colour character for the colour of individual labels. Defaults to "grey30".

pointsize

A numeric indicating the size of points on label marker line. Default range between 1.2 - 3.2 depending on number of individuals.

pointcol

A colour character for the colour of points on the label marker line. Defaults to "grey30".

pointbgcol

A colour character for the background of marker point for certain point types.

pointtype

A character or number for the type of points on the label marker line. Defaults to |. Same as pch in standard R.

pointalpha

A numeric between 0 and 1 denoting transparency of the points. Defaults to 1.

linepos

A numeric indicating the y position of the label marker line and the points. Applicable only with group labels. Defaults to 0.75.

linesize

A numeric indicating the thickness of the label marker line. Default range between 0.3 and 0.6 depending on number of individuals.

linecol

A colour character for the label marker line. Defaults to "grey30".

linetype

A numeric indicating the type of line for marker line. Same as lty in standard R. Default value is 1.

linealpha

A numeric between 0 and 1 denoting transparency of the marker line. Defaults to 1.

showdiv

A logical indicating if divider lines between groups must be drawn. Applicable only when group labels are in use.

divgrp

A character or character vector with one or more group label titles denoting which groups are used to draw divider lines. This must be a group label title used in grplab. If not provided, the value in selgrp is used by default.

divcol

A character or hexadecimal colour denoting the colour of the divider line. Default is white.

divtype

A numeric or character indicating the type of line for the divider line. Same as lty in standard R. Default value is '21'.

divsize

A numeric indicating the thickness of the divider line. Default is 0.25.

divalpha

A numeric between 0 and 1 denoting transparency of the divider line. Defaults to 1.

showlegend

A logical indicating if legend denoting cluster colours must be plotted. Defaults to FALSE.

legendlab

A character or character vector for legend cluster labels. Length must be equal to the maximum number of clusters.

legendpos

A character 'right' or 'left' denoting position of the legend. Defaults to 'right'.

legendkeysize

A numeric indicating size of the legend key. Defaults to 4.

legendtextsize

A numeric indicating size of the legend text. Defaults to 3.

legendmargin

A numeric vector of length 4 indicating top, right, bottom and left margins of the legend.

barsize

A numeric indicating the width of the bars. Defaults to 1.

barbordersize

A numeric indicating border size of bars. Defaults to 0. Visible only when barbordercolour is not NA.

barbordercolour

A single colour for bar border. Defaults to NA. Visible only when barbordersize is larger than zero and set to a colour other than NA.

showyaxis

A logical indicating if y-axis labels should be displayed or not. Defaults to FALSE. Y-axis size is same as indlabsize.

showticks

A logical indicating if ticks on axis should be displayed or not. Defaults to FALSE. Applies to x and y axis. Y-axis ticks are visible only when showyaxis=TRUE. Tick colour is same as indlabcol.

ticksize

A numeric indicating size of ticks. Defaults to 0.1. Applies to both x and y axis.

ticklength

A numeric indicating length of tick marks in cm. Defaults to 0.03. Applies to both x and y axis.

outputfilename

A character or character vector denoting output file name without file extension. See details.

imgtype

A character indicating output image file type. Possible options are "png","jpeg","tiff" or "pdf".

height

A numeric indicating the height of a single run panel. By default, automatically generated based on number of runs. Separate plots use 1.8cm and joined plots use 1.2cm for single panel. See details.

width

A numeric indicating the width of the whole plot. By default, automatically generated based on number of individuals. Ranges between 5cm and 30cm.

dpi

A numeric indicating the image resolution in pixels per inch (PPI). Defaults to 300. If imgtype="pdf", dpi is fixed at 300.

units

A character indicating the units of height and width. Default set to "cm". Other options are 'px', 'in' or 'mm'.

theme

A character indicating ggplot theme to be used. Use like "theme_grey", "theme_bw" etc.

basesize

A numeric indicating overall text size. Defaults to 5 suitable for export. Set to 11 for returned plot.

font

A character indicating font family to be used in the plots. Uses default system fonts by default for jpeg, png and tiff. Uses 'Helvetica' as default for pdf. Use package extrafont to import custom fonts. See vignette for examples.

na.rm

A logical indicating if NAs are removed from data, else ggplot prints warning messages for NAs. If set to TRUE, NAs are removed before plotting and ggplot NA warning is suppressed.

quiet

A logical indicating if any messages are printed to console.

panelratio

A two value integer vector denoting ratio of plot panel to grplab panel. Defaults to c(3,1). Applicable only when grplab is in use.

exportplot

A logical indicating if a plot image must be exported into the working directory.

returnplot

A logical indicating if ggplot plot objects must be returned. See 'Value'.

returndata

A logical indicating if processed data must be returned. See 'Value'.

Details

sortind
This argument takes one character as input. The default NA means individuals are plotted in the same order as the input. Individuals can be ordered by any one cluster, for example sortind="Cluster1" or sortind="Cluster2". To order by all clusters, as with the 'Sort by Q' option in the STRUCTURE software, use sortind="all". When using sortind="label", individuals are sorted by individual labels (along with grplab if present). Individual labels can be displayed using showindlab=T. When using sortind with grplab, individuals are sorted within the groups.
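
As a rough base-R illustration of what sorting by a single cluster means, the sketch below reorders a small made-up Q-matrix by Cluster1. It is not the pophelper implementation: the data frame q is hypothetical, and ascending order is an assumption of the sketch, since the sort direction used by plotQ is not stated here.

```r
# Hypothetical Q-matrix: 4 individuals, 2 clusters (each row sums to 1)
q <- data.frame(Cluster1 = c(0.6, 0.2, 0.8, 0.3),
                Cluster2 = c(0.4, 0.8, 0.2, 0.7))

# Reorder individuals (rows) by their Cluster1 value.
# Ascending order is an assumption made for this sketch.
q_sorted <- q[order(q$Cluster1), ]

# Row names keep track of the original position of each individual
rownames(q_sorted)
```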

grplab
grplab must be a data.frame. One or more label sets can be provided. Each label set must be a character vector equal to the number of individuals present in the qlist. For example, we can provide one group label set as such:
grplab=data.frame(labs=c("Grp A","Grp A","Grp B","Grp B"),stringsAsFactors=F)

Two group label sets can be provided as such:
grplab=data.frame(labs=c("Grp A","Grp A","Grp B","Grp B"),loc=c("Loc 1","Loc 2","Loc 2","Loc 2"),stringsAsFactors=F)
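
The requirements above (a data.frame, character columns, one row per individual) can be checked before calling plotQ. The following is a minimal sketch in base R; check_grplab is a hypothetical helper written for this illustration and is not part of pophelper.

```r
# Hypothetical Q-matrix with 4 individuals
q <- data.frame(Cluster1 = c(0.2, 0.3, 0.6, 0.8),
                Cluster2 = c(0.8, 0.7, 0.4, 0.2))

# Two group label sets, one row per individual
grplab <- data.frame(labs = c("Grp A", "Grp A", "Grp B", "Grp B"),
                     loc  = c("Loc 1", "Loc 2", "Loc 2", "Loc 2"),
                     stringsAsFactors = FALSE)

# Hypothetical helper (not a pophelper function): TRUE if grplab is a
# data.frame of character columns with one row per individual in q
check_grplab <- function(grplab, q) {
  is.data.frame(grplab) &&
    nrow(grplab) == nrow(q) &&
    all(vapply(grplab, is.character, logical(1)))
}

check_grplab(grplab, q)
```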

selgrp
When multiple group label sets are in use, selgrp defines which group label set is used for group ordering (ordergrp), subsetting (subsetgrp) and group mean (grpmean). selgrp is also used for plotting divider lines and sorting (sortind). If selgrp is not specified, the first group label set is used by default.

ordergrp
When using grplab, labels may not be in contiguous blocks. Setting ordergrp=TRUE regroups individuals into contiguous blocks for all group label sets, starting with selgrp.
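
To illustrate what regrouping into contiguous blocks means, here is a small base-R sketch for a single group label set. It is only an illustration under stated assumptions, not the pophelper code: the vector labs is made up, and ordering the groups by first appearance is an assumption of the sketch.

```r
# Hypothetical group labels that are not in contiguous blocks
labs <- c("Grp A", "Grp B", "Grp A", "Grp B")

# Index that gathers each group into one block. Groups are ordered by
# first appearance (an assumption); order() keeps ties in original order.
idx <- order(match(labs, unique(labs)))

idx        # positions of individuals after regrouping
labs[idx]  # labels are now in contiguous blocks
```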

subsetgrp
This argument takes one or more characters as input. Use only group labels present in one of the group label sets in grplab. For example, given a group label set 'labs' with two groups in the order 'Grp A' and 'Grp B', use subsetgrp=c("Grp B","Grp A") to change the order of the groups. Use subsetgrp="Grp B" to subset only Grp B. When using multiple group label sets, use selgrp to declare which group label set to subset.
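
The effect of subsetgrp on the order of individuals can be sketched in base R as follows. This is an illustration only, not the pophelper implementation; the vector labs is hypothetical.

```r
# Hypothetical group labels, one per individual
labs <- c("Grp A", "Grp A", "Grp B", "Grp B")

# Reorder groups: take all 'Grp B' individuals first, then 'Grp A'
subsetgrp <- c("Grp B", "Grp A")
idx_reorder <- unlist(lapply(subsetgrp, function(g) which(labs == g)))

# Subset a single group: keep only 'Grp B' individuals
idx_subset <- which(labs == "Grp B")

labs[idx_reorder]
labs[idx_subset]
```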

outputfilename
Default is outputfilename=NA which means that output file names are automatically generated. When imgoutput="sep", the names of the qlist are used to create output labels. When imgoutput="join", one output label is created for all input files in this format: JoinedNFiles-YYYYMMDDHHMMSS, where N stands for number of runs joined, and the ending stands for current system date and time. If outputfilename is provided, when imgoutput="sep", outputfilename must be a character vector equal to length of input runs. When imgoutput="join", outputfilename must be a character of length one. File extensions like .png etc must not be provided.
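
The automatic file name format for joined plots described above can be reproduced in base R. This is a sketch of the naming pattern only; how pophelper builds the name internally is not shown here, and n_runs is a made-up value.

```r
# Number of runs being joined (hypothetical value)
n_runs <- 2

# Current system date and time as YYYYMMDDHHMMSS
stamp <- format(Sys.time(), "%Y%m%d%H%M%S")

# Name in the documented format JoinedNFiles-YYYYMMDDHHMMSS (no extension)
fname <- paste0("Joined", n_runs, "Files-", stamp)
fname
```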

height
Argument height denotes the height of one run panel. With joined plots, the height is multiplied by the number of runs. The height does not include the label panel. The grplabheight is used to define the full height of the label panel. If grplabheight is not provided, it is calculated based on the number of group label sets. The final image height is: final_image_height = (height*num_of_runs)+grplabheight. It is possible to set either height or width and leave the other as default.
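
As a worked example of the height formula, the sketch below computes the final image height for three joined runs with one group label set, using the 1.2cm panel height and 0.4cm label height mentioned in the argument descriptions. The function is a hypothetical helper written for this illustration.

```r
# Hypothetical helper implementing the documented formula:
# final_image_height = (height * num_of_runs) + grplabheight
final_image_height <- function(height, num_of_runs, grplabheight = 0) {
  (height * num_of_runs) + grplabheight
}

# Three joined runs at 1.2cm each plus a 0.4cm group label panel
h <- final_image_height(height = 1.2, num_of_runs = 3, grplabheight = 0.4)
h  # roughly 4 (cm)
```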

indlab
When showindlab=T, individual labels are displayed. When showindlab=F, individual labels are not displayed on the graph, although they are present in the underlying data. Therefore, showindlab only controls the display of labels on the plot and has no effect on the labels in the data.
With the default useindlab=F, labels are created numerically in the original order of the data, with zero padding. For example, if there are 10 individuals, labels are 01, 02 up to 10. If there are 100 individuals, labels are 001, 002 up to 100. Zero padding ensures that labels sort correctly. When useindlab=T, labels are taken from the rownames of the qlist dataframes. These are usually 1,2,3.. if read in using readQ(), which can be an issue when sorting by labels with sortind="label". For STRUCTURE files with individual labels, the labels can be read in automatically using readQ(indlabfromfile=T).
When group labels (grplab) are in use, they are added to the individual labels in both cases, useindlab=T and useindlab=F, separated by indlabsep. The default indlabsep=" " adds a space between the individual label and grplab. For example, group labels 'popA', 'popA'... will be '01 popA', '02 popA'... when useindlab=F and usually '1 popA', '2 popA'... when useindlab=T. When multiple group labels are in use, they are similarly concatenated one after the other to the individual names in the order in which the group labels were provided.
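
The zero padding and concatenation described above can be mimicked in base R. This is a minimal sketch of the labelling scheme, not the pophelper code; the group labels and the number of individuals are made up.

```r
# 10 hypothetical individuals, all in group 'popA'
n <- 10
grp <- rep("popA", n)
indlabsep <- " "

# Zero-padded numeric labels: width is the number of digits in n
indlab <- formatC(seq_len(n), width = nchar(n), flag = "0")

# Individual label followed by group label, separated by indlabsep
indlab_grp <- paste(indlab, grp, sep = indlabsep)

# Without padding, "10" sorts before "2"; with padding the order is kept
sort(as.character(seq_len(n)))
sort(indlab)
```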

See the vignette for more details.

Value

When returnplot=TRUE, plot object(s) are returned. When grplab=NA, a ggplot2 object is returned. When grplab is in use, a gtable (output from gridExtra::arrangeGrob()) list is returned. When returndata=TRUE, the input qlist is modified (sorted, subsetted etc) and returned. If grplab is in use, a list of modified qlist and grplab is returned. If returnplot=TRUE and returndata=TRUE are both set, then a named list (plot,data) is returned. The plot item contains the ggplot2 object or gtable and the data contains qlist (and grplab).
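
A short sketch of inspecting the returned value follows. It assumes pophelper is installed and reuses the custom data frame from the examples on this page; exportplot=FALSE is set so that no image file is written. The exact structure printed may differ between pophelper versions, so the sketch only inspects the top level rather than relying on specific element names.

```r
# Runs only if pophelper is available
if (requireNamespace("pophelper", quietly = TRUE)) {
  library(pophelper)

  # Custom qlist with one run, as in the examples on this page
  temp <- list("custom" = data.frame(Cluster1 = c(0.2, 0.3, 0.6, 0.8),
                                     Cluster2 = c(0.8, 0.7, 0.4, 0.2)))

  # Return both the plot object and the processed data, without exporting
  res <- plotQ(temp, exportplot = FALSE, returnplot = TRUE, returndata = TRUE)

  # Inspect the top level of the returned list
  str(res, max.level = 1)
}
```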

See Also

plotQMultiline

Examples

slist <- readQ(list.files(path=system.file("files/structure",package="pophelper"),full.names=TRUE))

# plot one separate figure
plotQ(qlist=slist[1])

# plot two separate figures
plotQ(qlist=slist[1:2])

# plot a joined figure with multiple plots
plotQ(qlist=slist[1:2],imgoutput="join")

# sort individuals
plotQ(qlist=slist[c(1,3)],sortind="all")
plotQ(qlist=slist[c(1,3)],sortind="Cluster1")
plotQ(qlist=slist[c(1,3)],sortind="label")
plotQ(qlist=slist[c(1,3)],sortind="all",imgoutput="join",sharedindlab=F)

# read group labels
md <- read.delim(system.file("files/metadata.txt", package="pophelper"), header=T,stringsAsFactors=F)

# plot with one group label set
plotQ(qlist=slist[1],grplab=md[,2,drop=F])
plotQ(qlist=slist[1:2],grplab=md[,2,drop=F],imgoutput="join")

# sort within groups
plotQ(qlist=slist[1:2],grplab=md[,2,drop=F],imgoutput="join",sortind="all",sharedindlab=F)
plotQ(qlist=slist[1:2],grplab=md[,2,drop=F],imgoutput="join",sortind="Cluster1",sharedindlab=F)
plotQ(qlist=slist[1:2],grplab=md[,2,drop=F],imgoutput="join",sortind="label")

# reorder groups
plotQ(qlist=slist[1],grplab=md[,2,drop=F],subsetgrp=c("CatB","CatA"))

# multiple group label sets and ordergrp
plotQ(qlist=slist[1],grplab=md,ordergrp=TRUE)
plotQ(qlist=slist[1:2],grplab=md,ordergrp=TRUE,imgoutput="join")

# sort in second label group set cat
plotQ(qlist=slist[1],grplab=md,selgrp="cat",sortind="all")

# use default individual labels
plotQ(slist[1],showindlab=T,width=15)

# use custom individual labels
inds <- read.delim(system.file("files/structureindlabels.txt",package="pophelper"),header=FALSE,stringsAsFactors=FALSE)
rownames(slist[[1]]) <- inds$V1
plotQ(slist[1],showindlab=T,useindlab=T,width=15)

# change cluster colours
plotQ(slist[1],clustercol=c("steelblue","coral"))

# plot a custom dataframe
temp <- list("custom"=data.frame(Cluster1=c(0.2,0.3,0.6,0.8),Cluster2=c(0.8,0.7,0.4,0.2)))
plotQ(temp)

royfrancis/pophelper documentation built on Aug. 21, 2018, 4:48 a.m.