Polynomial-based alternative to t-SNE, UMAP etc.
prVis(xy, labels = FALSE, yColumn = ncol(xy), deg = 2, scale = FALSE,
   nSubSam = 0, nIntervals = NULL, outliersRemoved = 0, pcaMethod = "prcomp",
   bigData = FALSE, saveOutputs = "lastPrVisOut", cex = 0.5, alpha = 0)
addRowNums(np, savedPrVisOut, specifyArea = FALSE)
colorCode(colName = "", n = 256, exps = "", savedPrVisOut = "lastPrVisOut", cex = 0.5)
xy |
Data frame with labels, if any, in the last column or in the column specified by yColumn. |
labels |
If TRUE, the data have class labels. The column specified by yColumn must be an R factor, unless nIntervals is non-NULL, in which case Y will be discretized to make labels. |
yColumn |
Column number of the label column. The default is the last column of xy. |
deg |
Degree of polynomial expansion. |
scale |
If TRUE, call |
nSubSam |
Number of random rows of |
nIntervals |
If the Y column is continuous, transform it into a factor with this many levels. |
outliersRemoved |
Number of outliers to remove from the plot, identified using Mahalanobis distance. Values between 0 and 1 are interpreted as proportions of the data (0.99 means remove the 99 percent of the data with the largest Mahalanobis distances). |
pcaMethod |
A string that specifies the method of eigenvector computation, either prcomp or RSpectra. |
bigData |
A boolean that specifies whether the data frame should be processed using the bigmemory package. If so, the data will be stored using as.big.matrix. |
saveOutputs |
Name of the file where the prVis object will be saved. Use the empty string to skip saving the results. |
cex |
Controls the point size for plotting. |
alpha |
A number between 0 and 1 that specifies the level of transparency for alpha blending. If alpha is specified, ggplot2 will be used to create the plot. |
np |
Number of points in the plot to label with row numbers. If no value is provided, row numbers will be added to all data points in the selection. |
savedPrVisOut |
The name of the file where the output of a previous call to prVis was stored. |
area |
A vector in the form of [x_start, x_finish, y_start, y_finish]. x_start, x_finish, y_start, and y_finish should all be between 0 and 1. These values correspond to proportions of the graph from left to right and bottom to top. [0, 1, 0, 1] specifies the entirety of the graph. [0, 0.5, 0.5, 1] specifies the upper-left quadrant. x_start and y_start must be less than x_finish and y_finish, respectively. |
colName |
The name of the column of continuous data that will be used for color coding |
n |
The number of shades used to color code the values of colName. n and exps should not both be specified at the same time. |
exps |
A vector of string expressions that will be used to create a
factor column for coloring. If the user specifies colName, they
cannot also specify exps. Each expression corresponds to a group
of the factor that will be created. The format for expressions is
listed below as a context-free grammar. |
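The outliersRemoved argument above refers to Mahalanobis distance. As a rough illustration of that idea only, the following base R sketch drops the rows with the largest distances from a toy matrix. It is not taken from the prVis source; the names x, nDrop, keep and xTrim are hypothetical, and the way prVis actually selects outliers may differ.

```r
# Sketch only: one way to drop points by Mahalanobis distance; not prVis internals.
set.seed(2)
x <- matrix(rnorm(500 * 2), ncol = 2)                  # toy data, 500 rows
d <- mahalanobis(x, colMeans(x), cov(x))               # squared distance of each row from the center
nDrop <- 10                                            # hypothetical number of outliers to remove
keep <- order(d, decreasing = TRUE)[-seq_len(nDrop)]   # all rows except the nDrop most distant
xTrim <- x[keep, , drop = FALSE]                       # data with the outliers removed
```

A proportion could be handled in the same sketch by setting nDrop to round(p * nrow(x)) for some p between 0 and 1.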
A number of "nonlinear" analogs of Principal Components Analysis (PCA)
have emerged, such as ICA, t-SNE, UMAP and so on. Intuitively, an
approach based on polynomials may be effective too. Specifically,
prVis first expands xy to polynomial terms, then applies
PCA to the result.
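The expand-then-PCA idea can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the prVis implementation: it uses stats::poly for the polynomial expansion, which may differ from what prVis does internally, and the toy data and the names x, px and pc are hypothetical.

```r
# Sketch only: base R stand-in for "expand to polynomial terms, then PCA".
set.seed(1)
x <- matrix(rnorm(200 * 3), ncol = 3)            # toy numeric data: 200 rows, 3 features
px <- poly(x, degree = 2, raw = TRUE)            # all monomials of total degree 1 or 2
pc <- prcomp(px, center = TRUE, scale. = TRUE)   # PCA on the expanded features
plot(pc$x[, 1:2], cex = 0.5, xlab = "PC1", ylab = "PC2")  # first two components
```

With three features and degree 2, the expansion has nine columns (three linear terms, three squares and three pairwise products), so the PCA operates on a 200 by 9 matrix.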
Once a plot is displayed, addRowNums can be used to add
row-number IDs of random points, to gain further insight into the data.
colorCode can be used to display color coding for user-specified
expressions using the exps argument. colorCode can also be used to color
based upon a column of continuous data by using the colName argument.
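To make the notion of coloring by a continuous column with n shades concrete, here is a small base R sketch. It is an assumption about how such a mapping could be done, not the colorCode implementation; the vector v, the blue-to-red palette and the names bin, pal and cols are hypothetical.

```r
# Sketch only: mapping a continuous column to n shades; not colorCode internals.
set.seed(3)
v <- runif(100)                                  # hypothetical continuous column
n <- 256                                         # number of shades
bin <- cut(v, breaks = n, labels = FALSE)        # integer bin index in 1..n for each value
pal <- colorRampPalette(c("blue", "red"))(n)     # n colors running from blue to red
cols <- pal[bin]                                 # one color per data point
plot(seq_along(v), v, col = cols, pch = 16, cex = 0.5)
```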
If saveOutputs is set, a file is created containing an R list
called outputList. Two components are always included in outputList:
gpOut, the generated polynomial matrix, and prout, the
return value from the call to prcomp. Additional information regarding
the Y column may be included in outputList in colName, yCol, or yname.
data(peFactors)  # prgeng data, included in pkg
pe1 <- peFactors[,c(1,8,9)]
z <- prVis(pe1,nSubSam=5000,labels=FALSE)
# get a bunch of streaks; why?
# call addRowNums() (not shown); discover that points on the same streak
# tend to have same combination of sex, education and occupation; moving
# along a streak mainly consists of varying age; call colorCode() (not
# shown) to explore
print('see data/SwissRoll for another example')