panelDesc: micromapST panel description data.frame structure
In micromapST: Linked Micromap Plots for U. S. and Other Geographic Areas

panelDesc

R Documentation

micromapST panel description data.frame structure

Description

The panelDesc data.frame provides the micromapST function with the information required to process the statsDFrame data and panelData data.frames and to generate the required linked micromap plot.
It specifies which columns in the statsDFrame data.frame contain the data for each glyph column, along with the column types, labels, reference values and reference text. When a glyph needs more complex data (boxplot and time series), it also gives the name of the data structure that holds that data.

  Example
    # wflungUS must already exist in the workspace; it supplies the US reference rate.
    panelDesc = data.frame(
        type=c("mapcum","id","dotconf","dotconf"),
        lab1=c("","","White Males","White Females"),
        lab2=c("","","Rate and 95% CI","Rate and 95% CI"),
        lab3=c("","","Deaths per 100,000","Deaths per 100,000"),
        col1=c(NA,NA,"Rate",9),
        col2=c(NA,NA,4,11),
        col3=c(NA,NA,5,12),
        colSize=c(NA,NA,5,5),
        refVals=c(NA,NA,NA,wflungUS[,1]),
        refTexts=c(NA,NA,NA,"US Rate"),
        panelData=c("","","","")
    )

The panelDesc data.frame does not have to be named "panelDesc"; any name will do. It defines how many columns to create, the type of glyph in each column, and where the data required by each glyph is located: either a column number or name in the statsDFrame or, for boxplot and time series glyphs, the name of a supplemental data structure given in the panelData entry. It also defines the column titles and each column's reference value and label for the linked micromap generation.

In the following description the term "AREA" represents the geographic unit being mapped and associated with data in the statsDFrame. The naming used must match the border group specified. If the border group of "USStatesDF" is used, the areas are the U.S. States and DC, and 51 data rows must be present. If the border group of "USSeerDF" is used, the areas are U.S. SEER areas as defined by NCI and the number of data rows can be 9, 11, 13, 17 or 18. In all cases, the abbreviations and names defined in the border group dataset must be used in preparing the statsDFrame and panelData structures.
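The row-count and naming requirement for the "USStatesDF" case can be checked before plotting. The following is a minimal sketch in base R only. It assumes the areas are identified by row names, and it uses the base R `state.abb` vector plus "DC" as a stand-in for the border group's own abbreviation table, which is not consulted here. The `statsDFrame` contents are made up for illustration.

```r
# Stand-in for the border group's abbreviation list: the 50 state
# abbreviations shipped with base R plus DC (an assumption, not the
# package's own name table).
expected_areas <- c(state.abb, "DC")

# Hypothetical statsDFrame with one row per area, keyed by row name.
set.seed(1)
statsDFrame <- data.frame(Rate = round(runif(51, 20, 80), 1),
                          row.names = expected_areas)

# 51 rows must be present, and every row must carry a known abbreviation.
rows_ok  <- nrow(statsDFrame) == 51
names_ok <- setequal(rownames(statsDFrame), expected_areas)
```

Both flags should be TRUE before the data is handed to the plotting function.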

Glyph Types

The type vector defines the type of glyph to be used for each column. The available glyphs are:

Map types: "map", "mapcum", "maptail", "mapmedian"
State or Area ID and/or Name: "id"
Ranking: "rank"
Graphical Type: "dot", "dotse", "dotconf", "dotsignf", "bar", "arrow", "ts", "tsconf", "scatdot", "segbar", "normbar", "ctrbar", "boxplot"

The following provides a description of each panel type:

map: - US map with active areas colored
mapcum: - US map with active areas colored and previously active area highlighted generating an accumulation from top to bottom
maptail: - US map with active areas colored and previously active area highlighted until the median area, then the reverse to the end (areas that have not been active are highlighted.)
mapmedian: - US map with active areas colored. Maps above the median area have areas with values above the median highlighted. Maps below the median area have areas with values below the median highlighted. This helps define the above and below median area groups.
id: - generates a column with a colored identifier (a square) and the area or area name or abbreviation.
rank: - number the area in rank order, sequentially.
arrow: - an arrow between two values with a head.
bar: - a single bar chart.
boxplot: - a boxplot per area with box, upper and lower whiskers and outliers.
dot: - a dot for a single value.
dotse: - a dot for a single value and its standard error.
dotconf: - a dot for a single value and its confidence interval.
dotsignf: - a dot for a single value with an indicator of its significance.
ts: - a time series line for up to 30 sets of "x" and "y" values for each area. The TS glyph can have X-Axis labels formatted as numbers or dates.
tsconf: - a time series line for up to 30 sets of "x", "y", upper "y" and lower "y" values, drawn with a confidence interval band for each area. The TSConf glyph can have X-Axis labels formatted as numbers or dates.
segbar: - a horizontal stacked (segmented) bar plot starting at 0 for 2 to 9 bars.
normbar: - a stacked bar plot where the data is normalized for each area by dividing the bar segment values by the sum of the values for all of the bars. Up to 9 bars are supported.
ctrbar: - a stacked bar plot where the bar segments are centered around 0. Up to 9 bars are supported.
scatdot: - a set of points for each area with an "x" and "y" value.
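Because the type vector is plain text, a misspelled glyph name is easy to miss. The sketch below checks a type vector against the glyph names listed above. The `valid_types` vector is copied from this page, and the check itself is an illustration, not the package's own validation.

```r
# Glyph names taken from the list above.
valid_types <- c("map", "mapcum", "maptail", "mapmedian", "id", "rank",
                 "dot", "dotse", "dotconf", "dotsignf", "bar", "arrow",
                 "ts", "tsconf", "scatdot", "segbar", "normbar", "ctrbar",
                 "boxplot")

# A type vector as it would appear in a panelDesc data.frame.
type <- c("mapcum", "id", "dotconf", "dotconf")

# Any entries that are not recognized glyph names (empty when all are valid).
unknown <- setdiff(type, valid_types)

# A deliberately wrong vector, to show what a failure looks like.
unknown_bad <- setdiff(c("map", "pie"), valid_types)
```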

Labels (Column Headers and Footers)

micromapST supports up to 3 column labels or titles: lab1, lab2 and lab3, where lab1 and lab2 are header titles for the column. lab3 is the footer title for the column. All titles are optional. lab3 is used to indicate the unit of measure at the bottom of the columns, but is not limited to this use. For example:

     lab1=c("Col1-Title", "Col2-Title", "Col3-Title" ) # 1st title for columns
     lab2=c("Col1-Sub",   "Col2-Sub",   "Col3-Sub"   ) # 2nd title for columns
     lab3=c("Col1-Footer","Col2-Footer","Col3-Footer") # Footer title for columns

lab4 provides a Y axis title for the column and is used only with time series or scatter dot glyphs. All label and title vectors are optional and are required only when a title or label is needed.

Data References

Depending on the type of glyph selected for the column, 1 to 3 data values may be required for each area. The col1, col2 and col3 vectors serve as indexes to columns in the statsDFrame data.frame passed in the arguments of the micromapST function call. Each value can be either the number of the column in the statsDFrame data.frame or the column name. If no index is required, the entry should be set to NA.

If the glyph requires one value, then only the col1 index is used and the col2 and col3 indexes are set to NA, if present. If 2 values are required, then col1 and col2 indexes are used and the col3 index is set to NA, if present. If 3 values are required, then col1, col2, and col3 indexes are used.

The statsDFrame column indexes can be provided as an integer or the column name. If the integer value is less than 1 or greater than the number of columns in statsDFrame or a column name is used that does not exist in statsDFrame, the micromapST function will stop and generate an error message.
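The index rules above can be expressed as a small check. This is a sketch under stated assumptions: `check_col` is a hypothetical helper, not part of micromapST, and it returns TRUE or FALSE where the package itself stops with an error. Because a data.frame column that mixes names and numbers is stored as character, numeric-looking strings such as "9" are treated as column numbers.

```r
# Hypothetical helper: TRUE when idx is NA (unused), a column number within
# 1..ncol(stats), or the name of an existing column in stats.
check_col <- function(idx, stats) {
  if (is.na(idx)) return(TRUE)                  # NA means "not used"
  num <- suppressWarnings(as.integer(idx))      # "9" -> 9, "Rate" -> NA
  if (!is.na(num)) return(num >= 1 && num <= ncol(stats))
  idx %in% names(stats)                         # otherwise treat as a name
}

# Made-up statsDFrame with three columns.
stats <- data.frame(Count = 1:2, Rate = c(3.5, 4.1), SE = c(0.2, 0.3))

ok_name    <- check_col("Rate", stats)     # existing column name
ok_number  <- check_col(2, stats)          # column number in range
ok_unused  <- check_col(NA, stats)         # unused entry
bad_number <- check_col("9", stats)        # beyond the last column
bad_name   <- check_col("Missing", stats)  # no such column
```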

Glyph Name	Meaning	col1	col2	col3	panelData
arrow	Arrow	Beginning values	Ending values (arrow head)	NA	NA
bar	Horizontal bar	Bar end values (length)	NA	NA	NA
segbar	Horizontal stacked bar	Values for first (left-most) bar segment (length)	Values for the last (right-most) bar segment (length)	NA	NA
normbar	Horizontal stacked bar, normalized to total 100%	Values for first (left-most) bar segment (length)	Values for last (right-most) bar segment (length)	NA	NA
ctrbar	Horizontal stacked bar, centered on the middle bar	Values for first (left-most) bar segment (length)	Values for last (right-most) bar segment (length)	NA	NA
boxplot	Horizontal box plot	NA	NA	NA	Name of output list from call to boxplot(...plot=F)
dot	Dot	Values for dots	NA	NA	NA
dotconf	Dot with confidence interval line	Values for dots	Values of lower limits	Values for upper limits	NA
dotse	Dot with line length +/- standard error	Values for dots	Standard errors	NA	NA
dotsignf	Dot overprinted if not significant	Values for dots	P value associated with dot	NA	NA
scatdot	Scatter plot of dots	Values on horizontal (x) axis	Values on vertical (y) axis	NA	NA
ts	Time Series (line) plot	NA	NA	NA	Name of array with dimensions of c(51,t,2), where t = # of time points (max 15), x values in [,,1], y values in [,,2]
tsconf	Time Series (line) plot with confidence limits	NA	NA	NA	Name of array with dimensions of c(51,t,4); as ts, with the lower limit in [,,3] and the upper limit in [,,4]

The panelData vector is only used when a glyph requires more data per area than can be provided by the statsDFrame columns. The only glyphs that use this vector are boxplots and time series.

In the case of the boxplot glyph, the boxplot function with plot=F is used to generate the boxplot statistical details for each area. The name of the resulting list of 51 sets of boxplot statistics (one for each area) is placed in the panelData data.frame element for the boxplot column.
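As an illustration of the boxplot step, the sketch below calls the base R `boxplot` function with `plot = FALSE` on made-up data for three areas only (a real "USStatesDF" plot needs all 51). The data frame `county` and its columns are hypothetical. The result is the list whose name would be entered in panelData.

```r
# Made-up sub-area values: 10 observations for each of three areas.
set.seed(42)
areas  <- c("AL", "AK", "AZ")
county <- data.frame(area = rep(areas, each = 10),
                     rate = rnorm(30, mean = 50, sd = 10))

# One set of boxplot statistics per area; nothing is drawn.
bp <- boxplot(split(county$rate, county$area), plot = FALSE)

# bp$names holds the area labels (in sorted order) and bp$stats holds the
# five summary values per area, one column per area.
bp_names <- bp$names
bp_dim   <- dim(bp$stats)
```

Splitting by area abbreviation keeps the abbreviations in `bp$names`, which is what links each set of statistics to its area.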

For the time series and time series with confidence interval glyphs, a 3 dimensional array of data is required. The first dimension ([area,,]) represents the areas. The second dimension ([,t,]) has one entry per data point and its length ranges from 2 to n. There is no upper limit, but 200-250 samples is a practical limit. The third dimension ([,,v]) provides the values at data point t for each area: [,,1] is the x axis value, which for time series is usually just the values 1 to n used to order the y values, and [,,2] is the median y value. For time series with confidence intervals, [,,3] is the lower y value and [,,4] is the upper y value.
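A minimal sketch of the array layout for the plain time series glyph, using base R only. The area list (`state.abb` plus "DC"), the number of time points and the random y values are all assumptions made for illustration.

```r
# 51 areas (stand-in list), 5 time points, 2 values (x and y) per point.
areas <- c(state.abb, "DC")
n_t   <- 5
tsArr <- array(NA_real_, dim = c(length(areas), n_t, 2),
               dimnames = list(areas, NULL, NULL))

# [,,1]: x values, here simply 1..n_t for every area.
tsArr[, , 1] <- matrix(1:n_t, nrow = length(areas), ncol = n_t, byrow = TRUE)

# [,,2]: y values, made up for the example.
set.seed(7)
tsArr[, , 2] <- matrix(runif(length(areas) * n_t, 10, 20),
                       nrow = length(areas))
```

`tsArr["DC", , ]` then returns the 5 by 2 matrix of x and y values for one area.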

Reference Lines

Reference lines can be created in arrow, bar, dot, dotconf, dotse, and segbar glyphs by specifying the reference values in the refVals= vector. A label appearing at the bottom of the column can be specified using the refTexts= vector in the panelDesc data.frame.

Format

The parameters in the panelDesc data.frame structure are:

type=

The type of glyph for each column of panels is specified by one of the following keywords in the type vector:

Map types: "map", "mapcum", "maptail", "mapmedian"
State ID and/or Name: "id"
Ranking: "rank"
Glyph Type: "dot", "dotse", "dotconf", "dotsignf", "bar", "arrow", "ts", "tsconf", "scatdot", "segbar", "normbar", "ctrbar", "boxplot"

The following provides a description of each panel type:

map: - a non-highlighted map
mapcum: - maps show the accumulated areas top to bottom
maptail: - maps show the accumulated areas from the top and bottom toward median area.
mapmedian: - the maps above the median highlight the areas above the median area and maps below the median highlight areas below the median area based on the sorting variable.
id: - generates a column with a color identifier (a filled in square) and the area abbreviation or name. The plotNames parameter in the micromapST call controls whether the area's full name or 2 character abbreviation is displayed.
rank: - sequentially number areas from 1 (highest rank) to "n" (lowest rank)
arrow: - an arrow from value 1 to value 2 with value 2 the head of the arrow.
bar: - a bar for a single set of values. The values can be positive or negative.
boxplot: - a boxplot for each area using a list generated by the boxplot function with plot=F. The name of the boxplot list is passed to micromapST using the panelData vector.
dot: - a dot for a single value using one set of values.
dotse: - a dot for a single value and its standard error using two values.
dotconf: - a dot for a single value and its confidence interval using three values.
dotsignf: - a dot for a single value overlaid if value is not significant using two values: value for dot and P value.
ts: - a time series line plot for each area. The glyph uses the panelData vector to get the name of a three (3) dimensional array holding the data for the plot. The array contains one entry per area, 1 to 30+ data points and the x and y values. See section on panelData below for more details. A reasonable upper limit to the number of points is between 200-300. Only a few will be selected to be used as X-Axis labels. The format of the X-Axis label is controlled by the "xIsDate" attribute on the array being set to TRUE. If the "xIsDate" attribute is not set to TRUE, the X-Axis will be formatted as numeric and axis scaling can be performed. If the "xIsDate" attribute is TRUE, the default date format of "…" is used; if … or less than 90, a short date format of "…" will be used. The x-axis date feature will override the specification of the axisScale call parameters on time series glyph columns.
tsconf: - a time series line and confidence interval band for each area. The glyph uses the panelData vector to get the name of a three (3) dimensional array holding the data for the plot. The array contains one entry per area, 1 to n data points and the x, y, lower y and upper y values. See section on panelData below for more details. A reasonable upper limit to the number of points is between 200-300. The format of the X-Axis label is controlled by the "xIsDate" attribute on the array. If "xIsDate" is set to TRUE, the X-Axis values will be formatted using the default date format of "…" or the short date format of "…". If it is not set to TRUE, the X-Axis will be formatted as numeric and axis scaling can be performed. The x-axis date feature will override the specification of the axisScale call parameters on time series glyph columns.
segbar: - a horizontal stacked (segmented) bar plot starting at 0 using data in the statsDFrame data.frame. The col1 and col2 columns are used to indicate the first and last columns in the statsDFrame data.frame that contain the contiguous bar segment values (lengths). For example: The data for a 5 segment bar glyph is in columns 4 through 8 in the statsDFrame (5 columns). col1 is set to 4 to identify the first column and col2 is set to 8 to identify the last column in the sequence. Column names may be used, but the column identified in col1 must precede the column identified in col2.
normbar: - a stacked bar plot where the data is normalized for each area by dividing the bar segment values by the sum of the values for all of the bars. The stacked bar plot for each area then ranges from 0 to 100% (edge to edge). The col1 and col2 columns are used to identify the first and last columns for bar data in the statsDFrame in the same way as for the "segbar" glyph (see above).
ctrbar: - a stacked bar plot where the bar segments are centered around the middle of the data. If there is an even number of segments, the 0 point is between the lower half and the upper half of the segments. If there is an odd number of segments, the center is the midpoint of the middle segment. The other segments are plotted to the left and right of the center point. The col1 and col2 columns are used to indicate the first and last columns in the statsDFrame data.frame that contain the contiguous bar segment values. (See "segbar" type above for more information.)
scatdot: - a set of 51 points with an x and y value per area. All points are plotted in each panel with the key areas in the panel highlighted. col1 indicates statsDFrame column containing the x values and col2 indicates the column containing the y values.
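The normalization described for the normbar glyph is simple arithmetic. The sketch below applies it to a made-up matrix of segment values for two areas. It only illustrates the calculation stated above and is not taken from the package code.

```r
# Made-up segment values: one row per area, one column per bar segment.
seg <- rbind(AL = c(10, 30, 60),
             AK = c(5, 5, 10))

# Divide each segment by its area's total so every row sums to 100%.
norm_pct <- 100 * seg / rowSums(seg)
```

Here the AL row is unchanged (10, 30, 60) because its segments already sum to 100, while the AK row becomes 25, 25, 50.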

Example: type=c("id","map","rank", "boxplot") specifies a micromapST plot with four columns, left to right, containing the area label, a map, the rank and a boxplot.

col1=, col2=, col3=

Vectors of index numbers or names of columns in statsDFrame data.frame to be used as data for graphics. The uses of these three vectors are defined below:

any "map" type, id, boxplot, ts, and tsconf: glyphs do not use the col1, col2, or col3 vectors to locate data in the statsDFrame data.frame. If these vectors are present, the corresponding entries should be NA for the respective columns.
dot: uses col1 to specify a single data column in statsDFrame data.frame to be plotted.
bar: uses col1 to specify the data column in statsDFrame data.frame for the length of the bar. The data value can be positive or negative.
dotse: uses col1 and col2 to specify the data columns in statsDFrame data.frame to be used as the estimate and standard error values, respectively.
dotsignf: uses col1 and col2 to specify the data columns in statsDFrame data.frame to be used as the value for the dot and its associated P value.
arrow: uses col1 and col2 to specify the data columns in statsDFrame data.frame for the beginning and end values of the arrow.
segbar, normbar, ctrbar: use col1 and col2 to specify the first and last columns in the statsDFrame data.frame. The statsDFrame data.frame columns from col1 to col2 are used for the length values of each bar in the glyph. col1 must precede col2 in the statsDFrame data.frame. The minimum number of data columns is 2 columns with a maximum of 9 columns.
scatdot: uses col1 and col2 to specify the x and y values, respectively, for a dot for each of the 51 areas (the states and DC) in a scatter dot plot.
dotconf: uses col1, col2, and col3 to specify the data columns in statsDFrame data.frame for the estimate value, lower confidence interval, and upper confidence interval values.

See the table above.
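For the stacked bar glyphs, col1 and col2 mark the ends of a contiguous run of columns. The sketch below resolves the two entries to column numbers and checks the ordering and the 2 to 9 column limit. The data frame, its column names and the `to_index` helper are hypothetical, and the check is an illustration, not the package's own code.

```r
# Made-up statsDFrame with 8 columns; segment data sits in columns 4 to 8.
statsDFrame <- data.frame(matrix(1:16, nrow = 2, ncol = 8))
names(statsDFrame) <- c("Pop", "Rate", "SE",
                        "Seg1", "Seg2", "Seg3", "Seg4", "Seg5")

# col1 given as a name, col2 given as a number.
col1 <- "Seg1"
col2 <- 8

# Hypothetical helper: turn a name or number into a column number.
to_index <- function(idx, df) {
  if (is.character(idx)) match(idx, names(df)) else as.integer(idx)
}

first <- to_index(col1, statsDFrame)
last  <- to_index(col2, statsDFrame)
n_seg <- last - first + 1

# col1 must come before col2, with 2 to 9 segment columns in total.
range_ok <- first < last && n_seg >= 2 && n_seg <= 9

# The contiguous block of segment values.
segments <- statsDFrame[, first:last]
```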

colSize=

A numeric vector used to specify the proportional width of a glyph column in relation to all other glyph columns. If used, values must be included for all glyph columns except for the map and id glyphs, which are fixed width columns. The width of a glyph column is determined by dividing its colSize value by the sum of all colSize values, which yields the percentage of the available width allocated to that column. For example: colSize=c(NA,NA,10,10,5,15) does not affect columns 1 and 2. The percentages for columns 3 through 6 are 25%, 25%, 12.5% and 37.5%. If 4 inches of space is available, the columns will be allocated: 1, 1, 0.5, and 1.5 inches. The column widths are still regulated by the minimum and maximum column widths set in the package. If a value is missing for a glyph other than map or id, the package will use a value equal to the average of the provided values.
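The width arithmetic in the example can be reproduced in a few lines of base R. This sketch only repeats the calculation described above. It ignores the package's minimum and maximum column widths, and the 4 inches of available space is the figure used in the text.

```r
# colSize from the example; the map and id columns are NA (fixed width).
colSize <- c(NA, NA, 10, 10, 5, 15)

# Share of the available width for each sized column.
share <- colSize / sum(colSize, na.rm = TRUE)

# Widths in inches when 4 inches are available for the sized columns.
available_in <- 4
widths <- share * available_in
```

Columns 3 through 6 get shares of 0.25, 0.25, 0.125 and 0.375, which is 1, 1, 0.5 and 1.5 inches.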

lab1=, lab2=

Character vectors that provide the two lines of column labels (titles) at the top of each column. If no label is required, use "" for a blank line.

lab3=

Character vector used as a label at the bottom of each column. This is typically used to show units of measure. If no label is required, use "" for a blank line.

lab4=

Character vector used as the vertical (y) axis label for ts, tsconf, and scatdot glyphs. If no label is required, use "" for a blank line.

refVals=

A list of object names providing the reference values for each graphic column. The reference value is displayed as a dashed vertical line for each panel in the specified column.

refTexts=

A list of 1 or 2 labels to be displayed at the bottom of each column to identify the reference value.

panelData=

List of object names containing the boxplot data list and/or an array of time series data for each area. If boxplot and time series data are not used in a column, then associated object names should be NA.

For boxplot data, each row name in the boxplot list must be the 2 character area abbreviation for the area associated with the data. There must be the same number of rows as in the name table and statsDFrame table. Each row must be data produced by the boxplot function. The area location identifier used in the statsDFrame data must be placed in the boxplot$names (names) attribute for that set of boxplot data so that the individual boxplots can be associated with each area.

For the time series glyph (ts), the data must be a three (3) dimensional array. The first dimension [st,,] represents one entry for each area (1 to 51). The second dimension [,t,] indexes up to 30+ data points for the area. The third dimension [,,v] holds the values at each data point. [,,1] is the x value and [,,2] is the median y value for the data point. The rownames associated with the first dimension must be the area location ids used in the statsDFrame table, to link the elements of this structure to the presentation order of the areas.

For the time series with confidence intervals glyph (tsconf), the array is extended to include: [,,3] and [,,4] for the lower y and upper y values.

For time series data, the order of the first dimension of the array must match the area order in the statsDFrame. For example, the data in dataArray[1,,] is for the area identified in statsDFrame[1,].
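The sketch below builds a small four-layer array for the confidence interval glyph and then reorders its first dimension to match the statsDFrame row order. Only three areas are used to keep it short, and all names and values are made up. It assumes the areas are identified by row names in both structures.

```r
# Made-up statsDFrame whose rows are in a different order than the array.
statsDFrame <- data.frame(Rate = c(3, 1, 2),
                          row.names = c("VA", "DC", "MD"))

areas <- c("DC", "MD", "VA")
n_t   <- 4
conf  <- array(0, dim = c(length(areas), n_t, 4),
               dimnames = list(areas, NULL, NULL))

conf[, , 1] <- matrix(1:n_t, nrow = 3, ncol = n_t, byrow = TRUE)  # x values
conf[, , 2] <- matrix(c(10, 20, 30), nrow = 3, ncol = n_t)        # y values
conf[, , 3] <- conf[, , 2] - 2                                    # lower y
conf[, , 4] <- conf[, , 2] + 2                                    # upper y

# Reorder the areas so that conf[1, , ] belongs to statsDFrame[1, ].
conf <- conf[rownames(statsDFrame), , , drop = FALSE]
```

After the reorder, the first slice of the array holds the values for "VA", the first row of the statsDFrame.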

The Date feature allows the caller to request that the TS X-Axis labels be formatted as dates. This requires that the TS array contain valid date data as the X data. These are numbers based on 1970-1-1 being day zero in the computer calendar. There are many functions in R to convert between character and date variables. Before this feature existed, users had to use workarounds such as year numbers or year and fraction numbers. Once you have inserted the date X values into the array [,,1], modify the class of the array to add the "Date" class. micromapST will inspect the array and find the "Date" class, flag it for internal operations and remove it. The date format of "…" is used by default; in some cases the date format will be changed to "…". The date feature is only available on the Time Series Glyphs.
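The day-count representation mentioned above can be seen with base R's `as.Date`. This sketch only shows the conversion between calendar dates and day numbers. The dates are arbitrary examples, and no micromapST function is called.

```r
# Arbitrary example dates.
x_dates <- as.Date(c("1970-01-01", "1970-01-02", "2000-01-01"))

# Days since 1970-01-01; these are the numbers that would go into [,,1].
x_num <- as.numeric(x_dates)

# Converting the day numbers back to dates recovers the originals.
back <- as.Date(x_num, origin = "1970-01-01")
```

`x_num` is 0, 1 and 10957 for these three dates.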

If axisScale is set to "s" or "sn", the setting will be ignored for any TS glyph using the date feature.

Details

The panelDesc data.frame is used to describe the content of the micromapST plot to the function. It contains the index of the data in the statsDFrame data.frame, the types of graphics to be used in each column, titles, column headers, reference values and labels, etc.

Note

A descriptor may be omitted if none of the panel plots need it.

Author(s)

Daniel B. Carr, George Mason University, Fairfax VA, with contributions from Jim Pearson and Linda Pickle of StatNet Consulting, LLC, Gaithersburg, MD
