Data described in Baayen and Milin (2010).
A data frame with 275996 observations on the following 24 variables.
ReadingTime: a numeric vector of self-paced reading times
Subject: a factor with participant identifiers
Sex: a factor with levels m (male) and f (female)
Age: a numeric vector specifying the participant's age
NPoems: a numeric vector of the self-reported maximum number of poems read annually, according to a four-choice question
MultipleChoiceRT: a numeric vector with the response latency to the four-choice question
Trial: a numeric vector specifying the rank of the item in the subject's experimental list
NumberOfWordsIntoLine: a numeric vector specifying the position of the item in the line of poetry being read
PositionBegMidEnd: a factor specifying whether the word was initial (beg), medial (mid), or final (end) in the sentence
SentenceLength: a numeric vector specifying sentence length
Poem: a factor whose levels are identifiers for the poems
Word: a factor whose levels are identifiers for the words
WordFrequencyInPoem: a numeric vector specifying the frequency of the word in the poem
RhymeFreqInPoem: a numeric vector specifying the frequency of the word's rhyme in the poem
OnsetFreqInPoem: a numeric vector specifying the frequency of the word's onset in the poem
WordLength: a numeric vector specifying the length of the word in letters
FamilySize: a numeric vector specifying the count of morphological family members
InflectionalEntropy: a numeric vector specifying Shannon's entropy calculated over the probability distribution of a word's inflected variants
LemmaFrequency: a numeric vector specifying the frequency of occurrence of the word in the lemma subsection of the CELEX lexical database
WordFormFrequency: a numeric vector specifying the frequency of occurrence of the word's inflected form in the word form subsection of the CELEX lexical database
NumberOfMeanings: a numeric vector specifying the number of synsets in WordNet in which the word is listed
IsFunctionWord: a factor specifying whether the word is a function word (TRUE) or not (FALSE)
HasPunctuationMark: a factor specifying whether the word is followed by a punctuation mark, with levels FALSE (absent) and TRUE (present)
NumberOfMorphemes: a numeric vector specifying the scaled number of morphemes in a word
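InflectionalEntropy is described as Shannon's entropy over the probability distribution of a word's inflected variants. As a minimal sketch of that calculation, the code below uses hypothetical counts that are not taken from the poems data, and it assumes a base-2 logarithm, which this page does not specify. The helper name shannon_entropy is likewise illustrative only.

```r
# Hypothetical counts for the inflected variants of one word
# (illustrative values, NOT taken from the poems data).
counts <- c(walk = 60, walks = 20, walked = 15, walking = 5)

# Turn the counts into a probability distribution over the variants.
p <- counts / sum(counts)

# Shannon entropy: H = -sum(p * log2(p)).
# The base-2 logarithm is an assumption; the source does not state the base.
shannon_entropy <- function(p) {
  p <- p[p > 0]  # zero-probability variants contribute nothing to the sum
  -sum(p * log2(p))
}

shannon_entropy(p)
```

A word whose inflected variants are equally probable has the highest entropy for that number of variants, and a word with a single variant has entropy 0.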
Baayen, R. H. and Milin, P. (2010) Analyzing reaction times. International Journal of Psychological Research, 3.2, pp. 12-28.
data(poems)
par(mfrow=c(2,4))
qqnorm(poems$ReadingTime)
qqnorm(poems$WordFormFrequency)
qqnorm(poems$LemmaFrequency)
qqnorm(poems$FamilySize)
qqnorm(poems$MultipleChoiceRT)
qqnorm(poems$NPoems)
qqnorm(poems$NumberOfMeanings)
poems$LogReadingTime = log(poems$ReadingTime)
poems$LogWordFormFrequency = log(poems$WordFormFrequency+1)
poems$LogLemmaFrequency = log(poems$LemmaFrequency+1)
poems$RecFamilySize = -100/(poems$FamilySize+1)
poems$LogMultipleChoiceRT = log(poems$MultipleChoiceRT)
poems$LogNPoems = log(poems$NPoems)
poems$LogNumberOfMeanings = log(poems$NumberOfMeanings+1)
## Not run:
p = poems[,c("Age", "LogNPoems", "LogMultipleChoiceRT", "NumberOfWordsIntoLine", "SentenceLength",
    "WordFrequencyInPoem", "RhymeFreqInPoem", "OnsetFreqInPoem", "WordLength",
    "NumberOfMorphemes",
    "RecFamilySize", "InflectionalEntropy", "LogLemmaFrequency", "LogWordFormFrequency",
    "LogNumberOfMeanings")]
pc = prcomp(p, center=TRUE, scale.=TRUE)
round(pc$rotation[,1:7],2)
#                         PC1   PC2   PC3   PC4   PC5   PC6   PC7
#Age                     0.00  0.01  0.00  0.03  0.61  0.49 -0.01
#LogNPoems               0.00 -0.01  0.01 -0.01 -0.70 -0.02  0.00
#LogMultipleChoiceRT     0.00  0.00  0.00  0.01 -0.37  0.87 -0.02
#NumberOfWordsIntoLine   0.03 -0.19 -0.39 -0.56  0.01  0.02 -0.05
#SentenceLength         -0.09 -0.20 -0.40 -0.52  0.01  0.01 -0.11
#WordFrequencyInPoem    -0.30 -0.36  0.14  0.11  0.00 -0.01 -0.06
#RhymeFreqInPoem        -0.24 -0.54  0.15  0.07  0.01  0.00  0.11
#OnsetFreqInPoem        -0.20 -0.56  0.14  0.06  0.01  0.00  0.13
#WordLength              0.41 -0.16  0.18 -0.08  0.00  0.00  0.15
#NumberOfMorphemes       0.17 -0.13  0.24 -0.03  0.01 -0.01 -0.83
#RecFamilySize          -0.35  0.20 -0.02 -0.11  0.00  0.01  0.34
#InflectionalEntropy     0.30 -0.19 -0.42  0.36 -0.01 -0.01 -0.02
#LogLemmaFrequency      -0.43  0.13 -0.21  0.18 -0.01 -0.01 -0.27
#LogWordFormFrequency   -0.45  0.16 -0.12  0.10  0.00 -0.01 -0.25
#LogNumberOfMeanings     0.11 -0.15 -0.55  0.44 -0.01 -0.01  0.01
poems$PC1 = pc$x[,1]
poems$PC2 = pc$x[,2]
poems$PC3 = pc$x[,3]
poems$PC4 = pc$x[,4]
poems$PC5 = pc$x[,5]
poems$PC6 = pc$x[,6]
poems$PC7 = pc$x[,7]
library(lme4)
poems.lmer = lmer(LogReadingTime ~
    PC1 + PC2 + PC3 + PC4 + PC5 + PC6 + PC7 +
    HasPunctuationMark*Sex + Trial + PositionBegMidEnd +
    (1|Poem) + (1|Word) + (1|Subject),
    #(1+LogWordFormFrequency+NumberOfMorphemes|Subject) ,
    data=poems, REML=FALSE)
print(summary(poems.lmer), corr=FALSE)
chf <- diag(c(diag(
getME(poems.lmer, "Tlist")[[2]]),
getME(poems.lmer, "Tlist")[[1]],
getME(poems.lmer, "Tlist")[[3]]))
chf[1:3, 1:3] <- getME(poems.lmer, "Tlist")[[2]]
sv <- svd(chf)
round(sv$d^2/sum(sv$d^2)*100, 1)
## End(Not run)
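The examples replace correlated predictors with principal component scores before fitting the mixed model. A minimal sketch on simulated data shows why this can help: the component scores are mutually uncorrelated, and the share of variance carried by each component can be read off the standard deviations that prcomp() returns. The variables a, b and c and the object names x, pc_sim and var_pct are hypothetical and not part of the poems data.

```r
set.seed(1)
n <- 200
z <- rnorm(n)  # shared latent factor that makes a and b strongly correlated

# Simulated predictors (hypothetical, for illustration only).
x <- data.frame(
  a = z + rnorm(n, sd = 0.3),
  b = z + rnorm(n, sd = 0.3),
  c = rnorm(n)
)

# Center and scale, then rotate onto the principal components.
pc_sim <- prcomp(x, center = TRUE, scale. = TRUE)

# Percentage of total variance carried by each component.
var_pct <- pc_sim$sdev^2 / sum(pc_sim$sdev^2) * 100
round(var_pct, 1)

# The component scores are uncorrelated (identity matrix up to rounding),
# unlike the original columns a and b.
round(cor(pc_sim$x), 10)
round(cor(x), 2)
```

Because the scores are uncorrelated, entering them together as predictors avoids the collinearity that the original variables would introduce. The cost is interpretability, since each component mixes several of the original variables.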