size.penalize() renamed to samplesize.penalize(), for CRAN
bug in size.penalize() fixed
improved performance of dist.minmax()
oppose() update, to allow having just one text per set
a solid clean-up in a few functions
several minor improvements here and there
version 0.7.4, 2020/12/5
option for shading in rolling.classify()
performance.measures() greatly improved
supervised classifiers updated, to be compliant with cross-validation
SVM output fixed
bugs in rolling.classify() fixed
bugs in load.corpus() causing codepage mismatches fixed
general code cleanup
version 0.7.3, 2020/08/11
perform.svm() improved to work with R >4.0.0
oppose() no longer requires at least 2 texts per set
better color management in rolling.classify()
CPU performance improvements
version 0.7.2, 2020/04/20
fixes required by CRAN to meet R >3.6.3 requirements
CPU performance improvements
improvements in performance.measures()
confusion matrices fixed
oppose() update, to allow having just one text per set
version 0.7.1, 2019/11/4
improvements in crossv(): confusion matrix fully operational
new function performance.measures(), providing recall, precision, f1, etc.
performance measures made available via classify()
new function size.penalize() to assess minimal sample size
extension to the generic plot() function, to plot size.penalize() results
version 0.7.0, 2019/01/22
Unicode (UTF-8) made the default encoding, also for Windows
version 0.6.9, 2019/01/20
check.encoding() and change.encoding() introduced
GUI allows for changing the working directory with one click
metadata handling through a dedicated variable
{Steffen Pielström joins!}
version 0.6.8, 2018/06/14
support for CJK (Chinese-Japanese-Korean) significantly improved
a fix for exporting networks to Gephi ver. 0.9.2
support for rmarkdown: stylo(), classify(), oppose()
version 0.6.7, 2018/05/12
supports the following taggers: TaKIPI (for Polish), Alpino (Dutch)
the Imposters method reimplemented, via the new function imposters()
fine tuning the parameters of the Imposters method via imposters.optimize()
version 0.6.6, 2018/04/13
Cosine Delta implemented and available via GUI
Min-Max distance implemented
Entropy distance implemented
version 0.6.5, 2017/11/03
support for interactive network visualisations via stylo.network()
corrected Spanish pronouns
fixes in documentation
countless minor fixes
version 0.6.4, 2016/09/08
citation hint updated; to see the changes type: citation("stylo")
the impostors method almost implemented, see help(perform.impostors)
confusion table for supervised classification via classify()
a separate function for cross-validation, see help(crossv)
a significant change in SVM wrapper: the procedure automatically
gets rid of the variables with all 0s in the training set
the file inst/CITATION updated to meet recent CRAN requirements
man files for perform.delta, perform.svm etc. updated: new executable
examples added, so that one can perform a supervised test without any corpus
perform.knn(), perform.svm() etc. improved, in order to handle custom
vectors of classes provided by a user
an improved output of the oppose() function
version 0.6.3, 2015/12/20
significant performance improvement in make.table.of.frequencies()
PCA values (rotation, explained variance, etc.) saved in final results
version 0.6.2, 2015/11/11
the package 'stringi' used to optimize n-gram computing
three datasets added to the package
data(novels), a collection of 9 novels by
the Bronte sisters and Jane Austen (full text)
data(galbraith), a table of frequencies of 26
novels by 5 authors, including Galbraith's "The Cuckoo's Calling"
data(lee), a table of frequencies of 28 American
novels by 8 authors, including the new novel by Harper Lee
new version of make.table.of.frequencies(),
which speeds up the tasks radically
delete.markup(), delete.stop.words(), make.samples(),
make.frequency.list(), txt.to.features(), txt.to.words.ext()
remodelled so that they can be applied to single texts and/or to corpora
countless improvements in most of the functions
version 0.6.1, 2015/09/27
UTF-8 issue in txt.to.words.ext() fixed, at CRAN's request
version 0.6.0, 2015/08/17
support for Georgian
plot size in rolling.classify() improved
distance measure engine thoroughly restructured
custom distance measures allowed
cosine distance introduced
new functions: dist.cosine(), dist.delta(), dist.argamon(),
dist.eder(), dist.simple()
extracting POS tags via the function parse.pos.tags()
version 0.5.9-3, 2015/07/2
support for Coptic
customizable graph size in rolling.classify()
custom graph filename
integration with CLARIN-PL stylometric infrastructure
version 0.5.9, 2015/01/30
non-ASCII chars in the source code neutralized
(required by CRAN)
new sequential methods available: rolling SVM,
rolling NSC, and rolling Delta
bug in load.corpus.and.parse() fixed
bug in rolling.delta() fixed
network related bug in stylo() neutralized
classification procedures as separate functions:
perform.delta(), perform.svm(), perform.knn(),
perform.naivebayes(), perform.nsc()
classification output enhanced
doc files for new functions added
version 0.5.7, 2014/08/13
culling implemented as a separate function
custom stop words deletion: delete.stop.words()
a thoroughly re-written oppose() to use
the same tokenizing, corpus loading,
sampling etc. functions as stylo() and classify()
zeta.chisquare(), zeta.craig(), and zeta.eder()
derived as separate functions
gui.oppose() derived as a separate function
distinctive words visualization in oppose() improved
draw.polygons derived as a separate function
(hidden to the end user, though)
cross-validation in classify() improved
fixed bug in cross-validation for naivebayes
a very unpleasant bug in oppose() fixed:
the preferred and avoided words were calculated
using the I set only
help files significantly improved
version 0.5.6, 2014/04/20
support for Unicode on Windows
support for a few non-Latin scripts
experimental support for CJK (Chinese-Japanese-Korean)
the function txt.to.words() remodelled
loading corpus files improved
printing variables on screen improved
better class inheritance
an issue with hclust and "ward", "ward.D" fixed
man files extended and updated
version 0.5.5, 2014/04/03
cross-validation in classify()
lots of bugs fixed
version 0.5.4, 2014/02/25
tSNE implemented
preserve.case option
more flexible function for splitting input text
version 0.5.3, 2014/01/2
custom regular expressions to tokenize input texts
support for external corpora or frequencies
support for external set of features (e.g. frequent words)
class "stylo.results" for formatting final results
class "stylo.corpus" for formatting loaded corpora
class "stylo.data" for formatting tables and vectors
PCA coordinates piped to final results
optional choice between relative/raw frequencies
xml support improved (bug fixed)
codepage bug in oppose() fixed
version 0.5.2, 2013/09/07
CRAN-related issue with .Rbuildignore fixed
network analysis support significantly improved
improvements in man pages
version 0.5.1, 2013/08/07
bug fixes, minor improvements
different options for k-NN and SVM
submitted to CRAN for the first time (!)
version 0.5.0-58, 2013/08/06
batch mode improved
several clustering algorithms available
version 0.5.0-50, 2013/07/24
man pages revised and improved
version 0.5.0-49, 2013/07/18
poster presentation at DH2013 (Lincoln, NE)
minor improvements
version 0.5.0-48, 2013/06/26
namespace issues solved
documentation corrected (typos)
version 0.5.0-45, 2013/06/12
arguments can be passed from command-line
man pages cleaned and extended
global variables abandoned
innumerable minor improvements
version 0.5.0-43, 2013/04/31
thousands of changes and improvements
documentation improved and augmented
stylo R package (un)officially released
version 0.5.0-30, 2013/04/26
changes in names of some functions
code cleaning, improvements, improvements, ...
version 0.5.0-23, 2013/05/24
first prototype of an R package
version 0.5.0-1, 2013/04/03
first attempt to port the stylo script into R package
version 0.4.9-2, 2013/05/27
code OS-independent
minor cleaning
version 0.4.9-1, 2013/04/02
experimental support for network analysis (output to Gephi)
bugs fixed
version 0.4.9, 2013/03/06
added option to dump samples for closer post-analysis inspection
version 0.4.8, 2012/12/29
customizable plot area, font size, etc.
thoroughly rewritten code for margins assignment
scatterplots represented either by points, or by labels, or by both
(customizable label offset)
saving the words (features) actually used
saving the table of actually used frequencies
version 0.4.7, 2012/11/25
new output/input extensions: optional custom list of files
to be analyzed, saving distance table(s) to external files
support for TXM Textometrie Project
color cluster analysis graphs (at last!)
version 0.4.6, 2012/09/09
code revised, cleaned, bugs fixed
version 0.4.5-4, 2012/09/03
added 2 new PCA visualization flavors
version 0.4.5-3, 2012/08/31
new GUI written
version 0.4.5-2, 2012/08/27
added functionality for normal sampling
version 0.4.5-1, 2012/08/22
support for Dutch added
{Mike Kestemont joins!}
version 0.4.5, 2012/07/07
option for choosing corpus files
code cleaned; bugs fixed
version 0.4.4, 2012/05/31
the core code rewritten
I/II set division abandoned
GUI remodeled
GUI tooltips added
different input formats supported (xml etc.)
config options loaded from external file
the code forked into (1) the Stylo script, supporting exploratory
analyses (MDS, Cons. Trees, ...), (2) the Classify script for
machine-learning methods (Delta, SVM, NSC, Bayes)
version 0.4.3, 2012/04/28
feature selection (word and character n-grams)
version 0.4.2, 2012/02/10
three ways of splitting words in English
bugs fixed
GUI code rearranged and simplified
version 0.4.1, 2011/06/27
better output
better uploading of text files
new options for culling and ranking of candidates
version 0.4.0, 2011/06/20
the official world-premiere, at DH2011 (Stanford, CA)
version 0.3.9b, 2011/06/1
the code simplified; minor cleaning
version 0.3.9, 2011/05/21
uploading wordlist from external source
thousands of improvements
the code simplified
version 0.3.8, 2010/11/01
skip top frequency words option added
version 0.3.7, 2010/11/01
better graphs
attempt at better graph layout
version 0.3.6, 2010/07/31
more graphic options
dozens of improvements
version 0.3.5, 2010/07/19
module for color graphs
module for PCA
version 0.3.4, 2010/07/12
module for uploading corpus files improved
version 0.3.3, 2010/06/03
the core code simplified and improved (faster!)
version 0.3.2, 2010/05/10
reordered GUI
minor cleaning
version 0.3.1, 2010/05/10
the z-scores module improved
version 0.3.0, 2009/12/26
better counter of "good guesses"
option for randomly generated samples
minor improvements
version 0.2.99, 2009/12/25
platform-independent outputfile saving
version 0.2.98, 2009/12/24
GUI thoroughly integrated with initial variables
version 0.2.10, 2009/11/28
corrected MFW display in graph
more analysis description in outputfile
version 0.2.9, 2009/11/22
auto graphs for MDS and CA
version 0.2.8a, 2009/11/21
remodeled GUI
version 0.2.8, 2009/11/20
GUI: radiobuttons, checkbuttons
version 0.2.7, 2009/11/19
language-determined pronoun selection
version 0.2.6, 2009/11/18
dialog box (GUI)
{Jan Rybicki joins!}
version 0.2.5, 2009/11/16
module for different distance measures
thousands of improvements (I/O, interface, etc.)
version 0.2.2, 2009/10/25
numerous little improvements
deleting pronouns
version 0.2.1, 2009/08/23
module for culling
module for bootstrapping
version 0.2.0, 2009/08/23
module for uploading plain text files
version 0.1.9, 2009/08/1
innumerable improvements
the code simplified
{this version was completed on a train from Leipzig
to Krakow (a looong trip...), after a very successful
R course taught by Stefan Gries at ESU "C&T",
Leipzig, Germany (26-31/08/2009)}