partitionMap: Partition Maps

Embeds observations in a low-dimensional space using the multiclass output of a Random Forest.

partitionMap(X, Y, XTEST = NULL, YTEST = NULL,  method = "pm", dimen = 2, 
                   force = TRUE, ntree = 100,
                   plottrain = TRUE, addjitter = 0.03, ...)

`X`	matrix with predictor variables in the training dataset
`Y`	response variable, a factor with multiple classes
`XTEST`	The matrix of predictor variables for the test dataset (optional)
`YTEST`	class labels of test observations, used to color the test embeddings in the plot (optional). If not supplied, test observations are shown in grey
`method`	`pm` for "partitionMap" and `ha` for "Homogeneity Analysis"
`dimen`	dimension of embedding, typically 2 or 3
`force`	use force-based variation of "partitionMap" algorithm? no effect if `method="ha"`
`ntree`	number of trees to use for randomForest prediction
`plottrain`	plot embedding for training data?
`addjitter`	amount of jitter to add to the plots to avoid overlapping observations (set `addjitter=0` for no jitter)
`...`	other arguments to be passed to randomForest

A list with values

`Samples`	low-dimensional co-ordinates of embedded training samples
`Rules`	low-dimensional co-ordinates of embedded Rules (nodes in the trees)
`Z`	a binary matrix with as many rows as training samples and as many columns as rules. A value of `1` in row i and column j indicates that observation i is part of rule j
`Samplestest`	low-dimensional co-ordinates of embedded test samples
`Ztest`	a binary matrix with as many rows as test samples and as many columns as rules. A value of `1` in row i and column j indicates that observation i in the test data is part of rule j
`rf`	the trained Random Forest classifier
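
To make the row/column convention of `Z` and `Ztest` concrete, the following base-R sketch builds a small membership matrix by hand. The values and the sample and rule names are hypothetical and are not output of `partitionMap`; only the layout (samples in rows, rules in columns, `1` for membership) is taken from the descriptions above.

```r
# Hypothetical toy membership matrix: 4 samples (rows) x 3 rules (columns).
# Z[i, j] == 1 means observation i is part of rule j.
Z <- matrix(c(1, 0, 1,
              1, 1, 0,
              0, 1, 0,
              0, 0, 1),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("sample", 1:4), paste0("rule", 1:3)))

# Number of rules each observation is part of
rules_per_sample <- rowSums(Z)

# Number of observations covered by each rule
samples_per_rule <- colSums(Z)

# Names of the observations that are part of rule 2
members_rule2 <- rownames(Z)[Z[, "rule2"] == 1]
```

With a fitted object, the same calls would presumably apply to `pm$Z` and `pm$Ztest`, assuming they are returned as ordinary numeric matrices.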

Nicolai Meinshausen <meinshausen@stats.ox.ac.uk>

Nicolai Meinshausen (2011). Partition Maps. Journal of Computational and Graphical Statistics (JCGS) 20(4), 1007-1028.

	
##---- load Soybean data ----
	data(Soybean)
	X <- Soybean[,-1]
	Y <- Soybean$Y 
	
##---- divide into training and test data ----
	indtrain <- rep(0,nrow(X))
	indtrain[sample(1:length(indtrain), ceiling(nrow(X)/3*2))] <- 1
	XTEST <- X[indtrain==0,]
	YTEST <- Y[indtrain==0]
	X <- X[indtrain==1,]
	Y <- Y[indtrain==1]


##---- compute Partition Map solution ----
	pm <- partitionMap(X,Y,XTEST=XTEST,method="pm",force=TRUE,
                                dimen=2,ntree=80,plottrain=TRUE)


##---- plot the embedded training and test samples ----
	par(mfrow=c(1,3))
	plot(pm$Samples,col=Y,pch=20,cex=1.5,main="Training Data",
                                    xlab="Dimension 1",ylab="Dimension 2")
	points(pm$Rules,pch=".")
	plot(pm$Samplestest,col=YTEST,pch=20,cex=1.5,main="Test Data",
                                     xlab="Dimension 1",ylab="Dimension 2")
	points(pm$Rules,pch=".")
	plot(pm$Samples,col=Y,pch=20,cex=1.5,xlab="",ylab="",type="n",axes=FALSE)
	## legend colors follow the factor level codes, matching col=Y in the plots above
	legend(quantile(pm$Samples[,1],0),quantile(pm$Samples[,2],1),levels(Y),
                              col=1:nlevels(Y),fill=1:nlevels(Y),border=0)
	par(mfrow=c(1,1))
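
The split in the example above draws a fresh random sample on every run, so the train/test partition changes between runs. A minimal sketch of a reproducible two-thirds/one-third split follows. It assumes that fixing the seed with `set.seed` is acceptable, and it uses the built-in `iris` data as a stand-in for `Soybean` so that it runs in base R; the seed value and the variable names are arbitrary.

```r
# Stand-in data (assumption): iris replaces Soybean so the sketch runs in base R
X_all <- iris[, -5]
Y_all <- iris$Species

set.seed(1)  # arbitrary seed; fixes the random split

# Draw two thirds of the row indices for training, as in the example above
n <- nrow(X_all)
train_idx <- sample(seq_len(n), ceiling(n / 3 * 2))

# Training part: sampled rows; test part: all remaining rows
X_train <- X_all[train_idx, ]
Y_train <- Y_all[train_idx]
X_test  <- X_all[-train_idx, ]
Y_test  <- Y_all[-train_idx]
```

Indexing by `train_idx` and `-train_idx` gives the same partition as the 0/1 indicator vector used above, without the intermediate `indtrain` vector.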