NEWS.md
In stylo: Stylometric Multivariate Analyses

size.penalize() renamed to samplesize.penalize(), for CRAN
bug in size.penalize() fixed
improved performance of dist.minmax()
oppose() update, to allow having just one text per set
a solid clean-up in a few functions
several minor improvements here and there

option for shading in rolling.classify()
performance.measures() greatly improved
supervised classifiers updated, to be compliant with cross-validation
SVM output fixed
bugs in rolling.classify() fixed
bugs in load.corpus() causing codepage mismatches fixed
general code cleanup

perform.svm() improved to work with R >4.0.0
oppose() not restricted anymore to have at least 2 texts per set
better color management in rolling.classify()
CPU performance improvements

fixes required by CRAN to meet R >3.6.3 requirements
CPU performance improvements
improvements in performance.measures()
confusion matrices fixed
oppose() update, to allow having just one text per set

improvements in crossv(): confusion matrix fully operational
new function performance.measures(), providing recall, precision, f1, etc.
performance measures made available via classify()
new function size.penalize() to assess minimal sample size
extension to the generic plot() function, to plot size.penalize() results

Unicode (UTF-8) made the default encoding, also for Windows

check.encoding() and change.encoding() introduced
GUI allows for changing the working directory with one click
metadata handling through a dedicated variable
{Steffen Pielström joins!}

support for CJK (Chinese-Japanese-Korean) significantly improved
a fix for exporting networks to Gephi ver. 0.9.2
support for rmarkdown: stylo(), classify(), oppose()

supports the following taggers: TaKIPI (for Polish), Alpino (Dutch)
the Imposters method reimplemented, via the new function imposters()
fine tuning the parameters of the Imposters method via imposters.optimize()

Cosine Delta implemented and available via GUI
Min-Max distance implemented
Entropy distance implemented
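
For orientation, the Min-Max measure is commonly defined as one minus the ratio of summed element-wise minima to summed element-wise maxima of two frequency vectors (the Ruzicka similarity). The base R sketch below illustrates that definition only; the helper `minmax_distance()` and the toy frequencies are hypothetical, and the package's own dist.minmax() may differ in details.

```r
# Illustrative sketch of the Min-Max (Ruzicka) distance, not the package's code.
# `freqs` is assumed to be a numeric matrix with one text per row
# and one feature (e.g. word) per column.
minmax_distance <- function(freqs) {
  n <- nrow(freqs)
  d <- matrix(0, n, n, dimnames = list(rownames(freqs), rownames(freqs)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      shared <- sum(pmin(freqs[i, ], freqs[j, ]))  # overlap of the two profiles
      total  <- sum(pmax(freqs[i, ], freqs[j, ]))  # union of the two profiles
      d[i, j] <- 1 - shared / total
    }
  }
  as.dist(d)
}

# Toy frequencies, invented for this example
freqs <- rbind(
  text_a = c(the = 4, of = 2, and = 1),
  text_b = c(the = 2, of = 2, and = 3)
)
d <- minmax_distance(freqs)
print(d)  # 1 - 5/9, i.e. about 0.444
```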

support for interactive network visualisations via stylo.network()
corrected Spanish pronouns
fixes in documentation
countless minor fixes

citation hint updated; to see the changes type: citation("stylo")
the impostors method almost implemented, see help(perform.impostors)
confusion table for supervised classification via classify()
a separate function for cross-validation, see help(crossv)
a significant change in SVM wrapper: the procedure automatically gets rid of the variables with all 0s in the training set
the file inst/CITATION updated to meet recent CRAN requirements
man files for perform.delta, perform.svm etc. updated: new executable examples added, so that one can perform a supervised test without any corpus
perform.knn(), perform.svm() etc. improved, in order to handle custom vectors of classes provided by a user
an improved output of the oppose() function

significant performance improvement in make.table.of.frequencies()
PCA values (rotation, explained variance, etc.) saved in final results

the package 'stringi' involved to optimize n-gram computing
three datasets added to the package
- data(novels), a collection of 9 novels by the Bronte sisters and Jane Austen (full text)
- data(galbraith), a table of frequencies of 26 novels by 5 authors, including Galbraith's "The Cuckoo's Calling"
- data(lee), a table of frequencies of 28 American novels by 8 authors, including the new novel by Harper Lee
new version of make.table.of.frequencies(), which speeds up the tasks radically
delete.markup(), delete.stop.words(), make.samples(), make.frequency.list(), txt.to.features(), txt.to.words.ext() remodelled so that they can be applied to single texts and/or to corpora
countless improvements in most of the functions

UTF-8 issue in txt.to.words.ext() fixed, at CRAN's request

support for Georgian
plot size in rolling.classify() improved
distance measure engine thoroughly restructured
custom distance measures allowed
cosine distance introduced
new functions: dist.cosine(), dist.delta(), dist.argamon(), dist.eder(), dist.simple()
extracting POS tags via the function parse.pos.tags()
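
As a rough illustration of what a cosine distance computes, the base R sketch below uses the common definition (one minus the cosine similarity of two frequency vectors). The helper `cosine_distance()` and the toy matrix are hypothetical and written for this note; the package's dist.cosine() may differ in details such as input checks.

```r
# Illustrative sketch of cosine distance, not the package's code.
# `freqs` is assumed to be a numeric matrix with one text per row.
cosine_distance <- function(freqs) {
  norms <- sqrt(rowSums(freqs^2))                         # vector length of each row
  similarity <- (freqs %*% t(freqs)) / (norms %o% norms)  # pairwise cosine similarity
  as.dist(1 - similarity)
}

# Toy data, invented for this example; text_c is text_a scaled by 2
freqs <- rbind(
  text_a = c(1, 0, 1),
  text_b = c(1, 1, 0),
  text_c = c(2, 0, 2)
)
d <- cosine_distance(freqs)
print(d)
```

Because the measure ignores vector length, text_a and text_c come out at distance 0 even though their raw counts differ.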

support for Coptic
customizable graphs size in rolling.classify()
custom graph filename
integration with CLARIN-PL stylometric infrastructure

non-ASCII chars in the source code neutralized (required by CRAN)
random sampling substantially improved

bug fixes: options for assign.plot.colors()

bug fixes: 'start.at' parameter in stylo()

bug fixes (mostly: colors on dendrograms)

new sequential methods available: rolling SVM, rolling NSC, and rolling Delta
bug in load.corpus.and.parse() fixed
bug in rolling.delta() fixed
network related bug in stylo() neutralized
classification procedures as separate functions: perform.delta(), perform.svm(), perform.knn(), perform.naivebayes(), perform.nsc()
classification output enhanced
doc files for new functions added

culling implemented as a separate function
custom stop words deletion: delete.stop.words()
a thoroughly re-written oppose() to use the same tokenizing, corpus loading, sampling etc. functions as stylo() and classify()
zeta.chisquare(), zeta.craig(), and zeta.eder() derived as separate functions
gui.oppose() derived as a separate function
distinctive words visualization in oppose() improved
draw.polygons derived as a separate function (hidden to the end user, though)
cross-validation in classify() improved
fixed bug in cross-validation for naivebayes
a very unpleasant bug in oppose() fixed: the preferred and avoided words were calculated using the I set only
help files significantly improved

support for Unicode on Windows
support for a few non Latin scripts
experimental support for CJK (Chinese-Japanese-Korean)
the function txt.to.words() remodelled
loading corpus files improved
printing variables on screen improved
better class inheritance
an issue with hclust and "ward", "ward.D" fixed
man files extended and updated

cross-validation in classify()
lots of bugs fixed

tSNE implemented
preserve.case option
more flexible function for splitting input text

custom regular expressions to tokenize input texts
support for external corpora or frequencies
support for external set of features (e.g. frequent words)
class "stylo.results" for formatting final results
class "stylo.corpus" for formatting loaded corpora
class "stylo.data" for formatting tables and vectors
PCA coordinates piped to final results
optional choice between relative/raw frequencies
xml support improved (bug fixed)
codepage bug in oppose() fixed

CRAN-related issue with .Rbuildignore fixed
network analysis support significantly improved
improvements in man pages

bug fixes, minor improvements
different options for k-NN and SVM
submitted to CRAN for the first time (!)

batch mode improved
several clustering algorithms available

man pages revised and improved

poster presentation at DH2013 (Lincoln, NE)
minor improvements

namespace issues solved
documentation corrected (typos)

arguments can be passed from command-line
man pages cleaned and extended
global variables abandoned
innumerable minor improvements

thousands of changes and improvements
documentation improved and augmented
stylo R package (un)officially released

changes in names of some functions
code cleaning, improvements, improvements, ...

first prototype of an R package

first attempt to port the stylo script into R package

code OS-independent
minor cleaning

experimental support for network analysis (output to Gephi)
bugs fixed

added option to dump samples for closer post-analysis inspection

customizable plot area, font size, etc.
thoroughly rewritten code for margins assignment
scatterplots represented either by points, or by labels, or by both (customizable label offset)
saving the words (features) actually used
saving the table of actually used frequencies

new output/input extensions: optional custom list of files to be analyzed, saving distance table(s) to external files
support for TXM Textometrie Project
color cluster analysis graphs (at last!)

code revised, cleaned, bugs fixed

added 2 new PCA visualization flavors

new GUI written

added functionality for normal sampling

support for Dutch added
{Mike Kestemont joins!}

option for choosing corpus files
code cleaned; bugs fixed

the core code rewritten
I/II set division abandoned
GUI remodeled
GUI tooltips added
different input formats supported (xml etc.)
config options loaded from external file
the code forked into (1) the Stylo script, supporting explanatory analyses (MDS, Cons. Trees, ...), (2) the Classify script for machine-learning methods (Delta, SVM, NSC, Bayes)

feature selection (word and character n-grams)

three ways of splitting words in English
bugs fixed
GUI code rearranged and simplified

better output
better text files uploading
new options for culling and ranking of candidates

the official world-premiere, at DH2011 (Stanford, CA)

the code simplified; minor cleaning

uploading wordlist from external source
thousands of improvements
the code simplified

skip top frequency words option added

better graphs
attempt at better graph layout

more graphic options
dozens of improvements

module for color graphs
module for PCA

module for uploading corpus files improved

the core code simplified and improved (faster!)

reordered GUI
minor cleaning

the z-scores module improved

better counter of "good guesses"
option for randomly generated samples
minor improvements

platform-independent outputfile saving

GUI thoroughly integrated with initial variables

corrected MFW display in graph
more analysis description in outputfile

auto graphs for MDS and CA

remodeled GUI

GUI: radiobuttons, checkbuttons

language-determined pronoun selection

dialog box (GUI)
{Jan Rybicki joins!}

module for different distance measures
thousands of improvements (I/O, interface, etc.)

numerous little improvements
deleting pronouns

module for culling
module for bootstrapping

module for uploading plain text files

innumerable improvements
the code simplified
{this version was completed on a train from Leipzig to Krakow (a looong trip...), after a very successful R course taught by Stefan Gries at ESU "C&T", Leipzig, Germany (26-31/08/2009)}

loop for different MFW settings

some bash and awk scripts translated into R


stylo documentation built on May 29, 2024, 1:37 a.m.


'stylo' news

version 0.7.5, 2024/04/02

version 0.7.4, 2020/12/5

version 0.7.3, 2020/08/11

version 0.7.2, 2020/04/20

version 0.7.1, 2019/11/4

version 0.7.0, 2019/01/22

version 0.6.9, 2019/01/20

version 0.6.8, 2018/06/14

version 0.6.7, 2018/05/12

version 0.6.6, 2018/04/13

version 0.6.5, 2017/11/03

version 0.6.4, 2016/09/08

version 0.6.3, 2015/12/20

version 0.6.2, 2015/11/11

version 0.6.1, 2015/09/27

version 0.6.0, 2015/08/17

version 0.5.9-3, 2015/07/2

version 0.5.9, 2015/01/30

version 0.5.8-3, 2014/10/26

version 0.5.8-2, 2014/10/19

version 0.5.8-1, 2014/09/23

version 0.5.8, 2014/09/3

version 0.5.7, 2014/08/13

version 0.5.6, 2014/04/20

version 0.5.5, 2014/04/03

version 0.5.4, 2014/02/25

version 0.5.3, 2014/01/2

version 0.5.2, 2013/09/07

version 0.5.1, 2013/08/07

version 0.5.0-58, 2013/08/06

version 0.5.0-50, 2013/07/24

version 0.5.0-49, 2013/07/18

version 0.5.0-48, 2013/06/26

version 0.5.0-45, 2013/06/12

version 0.5.0-43, 2013/04/31

version 0.5.0-30, 2013/04/26

version 0.5.0-23, 2013/05/24

version 0.5.0-1, 2013/04/03

version 0.4.9-2, 2013/05/27

version 0.4.9-1, 2013/04/02

version 0.4.9, 2013/03/06

version 0.4.8, 2012/12/29

version 0.4.7, 2012/11/25

version 0.4.6, 2012/09/09

version 0.4.5-4, 2012/09/03

version 0.4.5-3, 2012/08/31

version 0.4.5-2, 2012/08/27

version 0.4.5-1, 2012/08/22

version 0.4.5, 2012/07/07

version 0.4.4, 2012/05/31

version 0.4.3, 2012/04/28

version 0.4.2, 2012/02/10

version 0.4.1, 2011/06/27

version 0.4.0, 2011/06/20

version 0.3.9b, 2011/06/1

version 0.3.9, 2011/05/21

version 0.3.8, 2010/11/01

version 0.3.7, 2010/11/01

version 0.3.6, 2010/07/31

version 0.3.5, 2010/07/19

version 0.3.4, 2010/07/12

version 0.3.3, 2010/06/03

version 0.3.2, 2010/05/10

version 0.3.1, 2010/05/10

version 0.3.0, 2009/12/26

version 0.2.99, 2009/12/25

version 0.2.98, 2009/12/24

version 0.2.10, 2009/11/28

version 0.2.9, 2009/11/22

version 0.2.8a, 2009/11/21

version 0.2.8, 2009/11/20

version 0.2.7, 2009/11/19

version 0.2.6, 2009/11/18

version 0.2.5, 2009/11/16

version 0.2.2, 2009/10/25

version 0.2.1, 2009/08/23

version 0.2.0, 2009/08/23
