View source: R/NNS_term_matrix.R

Generates a term matrix for text classification use in NNS.reg.


NNS.term.matrix(x, oos = NULL, names = FALSE)


`x` |
Text. A two-column dataset should be used. Concatenate text from the original sources to comply with this format. Also note the possibility of factors in |

`oos` |
Out-of-sample text dataset to be classified. |

`names` |
Column names for |
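The examples on this page wrap the classification column in `as.numeric(as.character(...))`. The following base R sketch illustrates why that pattern matters when a column is stored as a factor. The vector `dv` is a hypothetical stand-in for a classification column, not data from the package.

```r
# Hypothetical classification column stored as a factor
dv <- factor(c("1", "-1", "1"))

# Direct conversion returns the internal level codes, not the labels
codes <- as.numeric(dv)

# Converting through character first recovers the original numeric values
values <- as.numeric(as.character(dv))

print(codes)
print(values)
```

Only the second conversion preserves the class labels as written, which is presumably why the examples use it before passing data to `NNS.term.matrix`.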

Returns the text as independent variables `"IV"` and the classification as the dependent variable `"DV"`. Out-of-sample independent variables are returned with `"OOS"`.
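To give a rough picture of what a term matrix with `"IV"` and `"DV"` components might look like, here is a minimal base R sketch. It assumes that each independent variable is the count of one unique word per document; this is an illustration of the general idea, not the implementation used by `NNS.term.matrix`. The helper `term_counts` and the whitespace tokenization are hypothetical.

```r
# Hypothetical helper: count occurrences of each unique word in each document
term_counts <- function(text) {
  # Lowercase, trim, and split each document into words on whitespace
  tokens <- strsplit(trimws(tolower(text)), "\\s+")
  # Vocabulary: every unique word across all documents, sorted
  vocab <- sort(unique(unlist(tokens)))
  # One row per document, one column per vocabulary word
  counts <- do.call(rbind, lapply(tokens, function(words) {
    tabulate(match(words, vocab), nbins = length(vocab))
  }))
  colnames(counts) <- vocab
  counts
}

text <- c("sunny windy", "rainy cloudy")
dv <- c(1, -1)

iv <- term_counts(text)

# Mirror the component names described above (an assumption about layout)
result <- list(IV = iv, DV = dv)
print(result)
```

Each row of `result$IV` corresponds to one document and each column to one word, so the matrix can be paired row by row with the classifications in `result$DV`.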

Viole, F. and Nawrocki, D. (2013) "Nonlinear Nonparametric Statistics: Using Partial Moments" http://amzn.com/1490523995


x <- data.frame(cbind(c("sunny", "rainy"), c(1, -1)))
NNS.term.matrix(x)
### Concatenate text with a space separator, then cbind with "DV"
x <- data.frame(cbind(c("sunny", "rainy"), c("windy", "cloudy"), c(1, -1)))
x <- data.frame(cbind(paste(x[ , 1], x[ , 2], sep = " "), as.numeric(as.character(x[ , 3]))))
NNS.term.matrix(x)
### NYT Example
## Not run:
require(RTextTools)
data(NYTimes)
### Concatenate columns 3 and 4 containing text, with column 5 as DV
NYT <- data.frame(cbind(paste(NYTimes[ , 3], NYTimes[ , 4], sep = " "),
                        as.numeric(as.character(NYTimes[ , 5]))))
NNS.term.matrix(NYT)
## End(Not run)


