A package for text mining, especially devoted to the Italian language
Package: TextWiller
Type: Package
Version: 2.0
Date: 2016-05-12
License: GPL (>= 2)
## Not run:
# install.packages("devtools") # if you don't already have it.
library(devtools)
install_github("livioivil/TextWiller")
library(TextWiller)
## End(Not run)
### normalize texts
normalizzaTesti(c('ciao bella!','www.associazionerospo.org','noooo, che grandeeeeee!!!!!','mitticooo', 'mai possibile?!?!'))
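To illustrate the kind of cleaning involved, here is a minimal base-R sketch of two hypothetical normalization rules (repeated-letter squeezing and punctuation collapsing); `normalize_sketch` is an illustrative helper, not the package's actual implementation of `normalizzaTesti`:

```r
# A toy normalizer sketching the general idea (assumed rules, not TextWiller's):
normalize_sketch <- function(x) {
  x <- tolower(x)
  # squeeze runs of 3+ identical letters down to one ("grandeeeeee" -> "grande")
  x <- gsub("([a-z])\\1{2,}", "\\1", x)
  # collapse repeated punctuation ("!!!!!" -> "!")
  x <- gsub("([!?.,])\\1+", "\\1", x)
  x
}
normalize_sketch(c("noooo, che grandeeeeee!!!!!", "mitticooo"))
# "no, che grande!" "mittico"
```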
# get the sentiment of a document
sentiment(c("ciao bella!","farabutto!","fofi sei figo!"))
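Conceptually, dictionary-based sentiment scoring counts positive and negative words per document. The following base-R sketch uses tiny illustrative word lists (not the package's actual lexicon or algorithm):

```r
# Toy dictionary-based sentiment scorer; word lists are illustrative only:
sentiment_sketch <- function(texts) {
  pos <- c("bella", "figo", "grande")
  neg <- c("farabutto", "brutto")
  vapply(texts, function(txt) {
    words <- strsplit(tolower(gsub("[[:punct:]]", " ", txt)), "\\s+")[[1]]
    as.numeric(sum(words %in% pos) - sum(words %in% neg))
  }, numeric(1), USE.NAMES = FALSE)
}
sentiment_sketch(c("ciao bella!", "farabutto!"))  # 1 -1
```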
# Classify users' gender by (Italian) first names
classificaUtenti(c('livio','alessandra','andrea'))
classificaUtenti(c('alessandroBianchi', 'mariagiovanna', 'corriereDelMezzogiorno'), scan_interno=TRUE)
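The underlying idea is vocabulary matching: scan each user string for known names. A hedged base-R sketch (the name lists and `classify_sketch` helper are illustrative assumptions, not the package's vocabularies):

```r
# Toy vocabulary-based classifier: substring-match user strings against
# small illustrative name lists:
classify_sketch <- function(users) {
  male   <- c("livio", "andrea", "alessandro")
  female <- c("alessandra", "maria", "giovanna")
  vapply(tolower(users), function(u) {
    if (any(vapply(male, grepl, logical(1), x = u, fixed = TRUE))) "M"
    else if (any(vapply(female, grepl, logical(1), x = u, fixed = TRUE))) "F"
    else "unknown"
  }, character(1), USE.NAMES = FALSE)
}
classify_sketch(c("livio", "mariagiovanna", "corriereDelMezzogiorno"))
# "M" "F" "unknown"
```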
# and classify location
data(vocabolarioLuoghi)
classificaUtenti(c('Bosa','Pordenone, Italy','Milan'),vocabolarioLuoghi)
# find re-tweets (RT) by texts similarity:
data(TWsperimentazioneanimale)
RTHound(TWsperimentazioneanimale[1:10,"text"], S = 3, L = 1,
hclust.dist = 100, hclust.method = "complete",
showTopN=3)
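The general technique behind retweet hunting can be sketched in base R: compute pairwise edit distances between texts, cluster them hierarchically, and cut the tree so near-duplicates land in the same group. This is a simplified illustration, not RTHound's actual algorithm:

```r
# Group near-duplicate texts by Levenshtein distance (toy example data):
texts <- c("RT @a: big news today!",
           "RT @b: big news today!",
           "something completely different")
d  <- adist(texts)                       # pairwise edit distances
hc <- hclust(as.dist(d), method = "complete")
groups <- cutree(hc, h = 10)             # texts within 10 edits share a group
groups  # first two texts cluster together
```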
#extract short urls and get the long ones
## Not run: urls=urlExtract("Influenza Vaccination | ONS - Oncology Nursing Society http://t.co/924sRKGBU9 See All http://t.co/dbtPJRMl00")
#extract users:
## Not run: patternExtract(c("@luca @paolo: buon giorno!", "@matteo: a te!"), pattern="@\\w+")
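Pattern extraction of this kind can be sketched with base R's `gregexpr`/`regmatches`; `extract_sketch` below is an illustrative stand-in, not the package's implementation:

```r
# Extract all regex matches from each text (base-R sketch):
extract_sketch <- function(texts, pattern) {
  regmatches(texts, gregexpr(pattern, texts))
}
extract_sketch(c("@luca @paolo: buon giorno!", "@matteo: a te!"), "@\\w+")
# [[1]] "@luca" "@paolo"   [[2]] "@matteo"
```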