dsldFreqPCoord: dsldFreqPCoord


dsldFreqPCoord R Documentation

dsldFreqPCoord

Description

Wrapper for the freqparcoord function from the freqparcoord package.

Usage

dsldFreqPCoord(data, m, sName = NULL, method = "maxdens",
    faceting = "vert", k = 50, klm = 5 * k, keepidxs = NULL,
    plotidxs = FALSE, cls = NULL, plot_filename = NULL)

Arguments

data

Data frame or matrix.

m

Number of lines to plot for each group. A negative value in conjunction with the method maxdens indicates that the lowest-density lines are to be plotted. If method is locmax, then m is forced to 1.

sName

Column for the grouping variable, if any (if none, all the data is treated as a single group); the column must be a vector or factor. The column must not be among the displayed columns (dispcols in freqparcoord). If method is locmax, the grouping variable (grpvar in freqparcoord) is forced to NULL.

method

What to display: 'maxdens' for plotting the most (or least) typical lines, 'locmax' for cluster hunting, or 'randsamp' for plotting a random sample of lines.

faceting

How to display groups, if present. Use 'vert' for vertical stacking of group plots, 'horiz' for horizontal ones, or 'none' to draw all lines in one plot, color-coding by group.

k

Number of nearest neighbors to use for density estimation.

klm

If method is "locmax", number of nearest neighbors to use for finding local maxima for cluster hunting. Generally needs to be much larger than k, to avoid "noise fitting."

keepidxs

If not NULL, the indices of the rows of data that are plotted will be stored in a component idxs of the return value. The rows themselves will be in a component xdisp, ordered by data[,dispcols[1]].

plotidxs

If TRUE, lines in the display will be annotated with their case numbers, i.e. their row numbers within data. Use only with small values of m, as overplotting may occur.

cls

Cluster, if any (see the parallel package) for parallel computation.

plot_filename

Name of the file that will hold the saved graph image. If NULL, the graph will be generated and displayed without being saved.

If a filename is provided, the graph will not be displayed, only saved.
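The arguments above can be combined as in the following sketch, which varies the call shown in the Examples section. It is illustrative rather than authoritative: the output path, the .png extension, and the choice of m = 25 are assumptions made for this example, and the calls run only if the dsld and freqparcoord packages are installed.

```r
# Sketch only: variations on the documented example call.
# The output path and the .png extension are assumptions for illustration.
out_file <- file.path(tempdir(), "lsa_freqpcoord.png")

if (requireNamespace("dsld", quietly = TRUE) &&
    requireNamespace("freqparcoord", quietly = TRUE)) {
  library(dsld)
  data(lsa)
  lsa1 <- lsa[, c("fam_inc", "ugpa", "gender", "lsat", "race1")]

  # negative m with method = "maxdens": the 25 lowest-density lines per group
  print(dsldFreqPCoord(lsa1, -25, "race1"))

  # all groups drawn in one plot, color-coded by group
  print(dsldFreqPCoord(lsa1, 25, "race1", faceting = "none"))

  # save the graph to a file instead of displaying it
  dsldFreqPCoord(lsa1, 25, "race1", plot_filename = out_file)
}
```

The explicit print() calls are needed here only because the plots are created inside an if block, where R does not auto-print results.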

Details

The dsldFreqPCoord function wraps freqparcoord, which uses a frequency-based parallel coordinates method to visualize multiple variables simultaneously in graph form.

This is done by plotting either the "most typical" or "least typical" (i.e. highest or lowest estimated multivariate density values respectively) cases to discern relations between variables.

The Y-axis represents the centered and scaled values of the columns.
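The idea of ranking cases by estimated density on centered and scaled columns can be sketched in base R. This is a simplified illustration, not the code freqparcoord actually uses: it takes the distance to the k-th nearest neighbor as a stand-in for the density estimate, and the built-in mtcars data, the chosen columns, and the values of k and m are assumptions made for the example.

```r
# Illustrative only: a simple k-nearest-neighbor density ranking in base R.
# This is NOT the freqparcoord implementation, just the underlying idea.
x <- scale(mtcars[, c("mpg", "hp", "wt")])  # center and scale each column

k <- 5                                      # number of nearest neighbors
d <- as.matrix(dist(x))                     # pairwise Euclidean distances

# Distance from each row to its k-th nearest neighbor.
# Position 1 of the sorted distances is the row itself (distance 0).
kth_dist <- apply(d, 1, function(row) sort(row)[k + 1])

# A smaller k-th neighbor distance means a higher estimated density.
m <- 3
most_typical  <- order(kth_dist)[1:m]                     # highest density
least_typical <- order(kth_dist, decreasing = TRUE)[1:m]  # lowest density

x[most_typical, ]   # scaled values of the m most typical cases
x[least_typical, ]  # scaled values of the m least typical cases
```

The rows of x printed at the end are the kind of values that would be drawn as lines in the parallel coordinates display, one vertical axis per column.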

Value

Object of type 'gg' (ggplot2 object), with components idxs and xdisp added if keepidxs is not NULL (see argument keepidxs above).
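A minimal sketch of working with the return value follows. It assumes that the dsld, freqparcoord, and ggplot2 packages are installed and that the returned object behaves like an ordinary ggplot2 plot, so that layers such as a title can be added; the value m = 10 and the title text are arbitrary choices for illustration.

```r
# Sketch only: inspect the extra components stored when keepidxs is not NULL.
p <- NULL

if (requireNamespace("dsld", quietly = TRUE) &&
    requireNamespace("freqparcoord", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  library(dsld)
  data(lsa)
  lsa1 <- lsa[, c("fam_inc", "ugpa", "gender", "lsat", "race1")]

  # keepidxs = TRUE (any non-NULL value) asks for idxs and xdisp to be kept
  p <- dsldFreqPCoord(lsa1, 10, "race1", keepidxs = TRUE)

  print(head(p$idxs))   # row indices of the plotted cases
  print(head(p$xdisp))  # the plotted rows themselves

  # Assumption: the result accepts ordinary ggplot2 additions
  print(p + ggplot2::ggtitle("Most typical cases by race1"))
}
```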

Author(s)

N. Matloff, T. Abdullah, B. Ouattara, J. Tran, B. Zarate

References

https://cran.r-project.org/web/packages/freqparcoord/index.html

Examples

library(dsld)
data(lsa)
lsa1 <- lsa[, c('fam_inc', 'ugpa', 'gender', 'lsat', 'race1')]
# plot the 75 most typical cases in each race1 group
dsldFreqPCoord(lsa1, 75, 'race1')
# A number of interesting trends appear among the most "typical" law students
# in the dataset: remarkably little variation among typical
# African-Americans; typical Hispanic men have low GPAs and poor LSAT
# scores, but with more variation; typical Asian and Black students were
# female; Asians and Hispanics have the most variation in family income
# background.

dsld documentation built on Sept. 14, 2024, 1:08 a.m.