movies: movies


Description

A movies data frame.

Usage

data("movies")

Format

A data frame with 58788 observations on the following 24 variables.

title: a character vector
year: a numeric vector
length: a numeric vector
budget: a numeric vector
rating: a numeric vector
votes: a numeric vector
r1: a numeric vector
r2: a numeric vector
r3: a numeric vector
r4: a numeric vector
r5: a numeric vector
r6: a numeric vector
r7: a numeric vector
r8: a numeric vector
r9: a numeric vector
r10: a numeric vector
mpaa: a character vector
Action: a numeric vector
Animation: a numeric vector
Comedy: a numeric vector
Drama: a numeric vector
Documentary: a numeric vector
Romance: a numeric vector
Short: a numeric vector

Details

The initial movies data frame from which the histogram variables are built.
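As a rough illustration of what a histogram variable is, the sketch below bins one numeric column and records, for each genre, the share of movies falling in each bin. It uses base R only and a small made-up data frame (`toy`, `hist_var` and the values are hypothetical). It is not the algorithm used by `PrepHistogram`, whose binning rules may differ.

```r
# Toy stand-in for the movies data: one numeric variable and a genre label.
# (Hypothetical values; the real data come from data("movies").)
toy <- data.frame(
  length = c(90, 100, 120, 95, 10, 15, 8, 110),
  genre  = factor(c("Drama", "Drama", "Drama", "Drama",
                    "Short", "Short", "Short", "Short"))
)

k <- 3  # number of histogram bins (the Examples use k = 5)

# Common bin edges over the whole range, so that genres are comparable.
breaks <- seq(min(toy$length), max(toy$length), length.out = k + 1)
bins   <- cut(toy$length, breaks = breaks, include.lowest = TRUE)

# One row per genre, one column per bin; each row sums to 1.
hist_var <- prop.table(table(toy$genre, bins), margin = 1)
print(round(hist_var, 2))
```

Each genre is then described by a vector of relative frequencies rather than by a single number, which is the kind of object a histogram PCA works on.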

Source

https://cran.r-project.org/web/packages/ggplot2movies/index.html

References

Makosso-Kallyth, Sun; Diday, Edwin (2012). Adaptation of interval PCA to symbolic histogram variables. Advances in Data Analysis and Classification, 6(2), 147-159. Springer.

Examples

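library(GraphPCA)  # provides PrepHistogram(), Ridi(), HistPCA() and Visu()

data(movies)
ab <- na.omit(movies)  # drop rows with missing values

# Build one subset per genre and label it with that genre.
Action <- subset(ab, Action == 1)
Action$genre <- as.factor("Action")

Drama <- subset(ab, Drama == 1)
Drama$genre <- as.factor("Drama")

Animation <- subset(ab, Animation == 1)
Animation$genre <- as.factor("Animation")

Comedy <- subset(ab, Comedy == 1)
Comedy$genre <- as.factor("Comedy")

Documentary <- subset(ab, Documentary == 1)
Documentary$genre <- as.factor("Documentary")

Romance <- subset(ab, Romance == 1)
Romance$genre <- as.factor("Romance")

Short <- subset(ab, Short == 1)
Short$genre <- as.factor("Short")

# Stack the subsets; column 25 is the new genre label.
ab <- rbind(Action, Drama, Animation, Comedy, Documentary, Romance, Short)

# Histogram variables (k = 5 bins) for columns 3 to 7
# (length, budget, rating, votes and r1), grouped by genre.
Hist1 <- PrepHistogram(X = sapply(ab[, 3], unlist), Z = ab[, 25], k = 5)$Vhistogram
Hist2 <- PrepHistogram(X = sapply(ab[, 4], unlist), Z = ab[, 25], k = 5)$Vhistogram
Hist3 <- PrepHistogram(X = sapply(ab[, 5], unlist), Z = ab[, 25], k = 5)$Vhistogram
Hist4 <- PrepHistogram(X = sapply(ab[, 6], unlist), Z = ab[, 25], k = 5)$Vhistogram
Hist5 <- PrepHistogram(X = sapply(ab[, 7], unlist), Z = ab[, 25], k = 5)$Vhistogram

# Ridit scores for each histogram variable.
ss1 <- Ridi(Hist1)$Ridit
ss2 <- Ridi(Hist2)$Ridit
ss3 <- Ridi(Hist3)$Ridit
ss4 <- Ridi(Hist4)$Ridit
ss5 <- Ridi(Hist5)$Ridit

# Histogram PCA: the first call prints the result, the second stores it.
HistPCA(list(Hist1, Hist2, Hist3, Hist4, Hist5), score = list(ss1, ss2, ss3, ss4, ss5))

res_pca <- HistPCA(list(Hist1, Hist2, Hist3, Hist4, Hist5), score = list(ss1, ss2, ss3, ss4, ss5))

# Plot the principal-component intervals.
Visu(res_pca$PCinterval)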
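The genre-by-genre subsetting in the example above can also be written as a loop over the indicator column names. The following is a minimal sketch on a made-up data frame (`toy`, `genres` and `stacked` are hypothetical names, not part of GraphPCA). It also makes visible that a movie flagged in several genres ends up in the stacked data once per genre.

```r
# Toy data with two 0/1 genre indicators (hypothetical values).
toy <- data.frame(
  title  = c("A", "B", "C"),
  rating = c(6.1, 7.4, 5.0),
  Action = c(1, 0, 1),
  Drama  = c(0, 1, 1)
)
genres <- c("Action", "Drama")

# For each genre, keep the rows flagged 1 and label them with that genre.
parts <- lapply(genres, function(g) {
  part <- toy[toy[[g]] == 1, , drop = FALSE]
  part$genre <- rep(g, nrow(part))
  part
})

# Stack the per-genre pieces and turn the label into a factor.
stacked <- do.call(rbind, parts)
stacked$genre <- factor(stacked$genre, levels = genres)

print(stacked)
```

Here movie "C" is both Action and Drama, so it contributes to both groups. Whether that double counting is desirable depends on the analysis.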

Example output

dev.new(): using pdf(file="Rplots1.pdf")
$Correlation
           Composante 1 Composante 2 Composante 3 Composante 4 Composante 5
Variable 1   -0.5099416   -0.8404558    0.1744720   0.05321523  -0.01792261
Variable 2   -0.8964693    0.3773781   -0.1639273   0.15665688  -0.05015003
Variable 3    0.7038555   -0.1734192   -0.2734637  -0.35609178  -0.52242657
Variable 4   -0.9289358    0.2128625    0.2691358  -0.13842813   0.01309501
Variable 5    0.5361541    0.3119180    0.7760655   0.10674359  -0.03967453

$VecteurPropre
     VecteurPropre 1 VecteurPropre 2 VecteurPropre 3 VecteurPropre 4
[1,]      -0.3517065     -0.86672330       0.2518212       0.2166830
[2,]      -0.5863505      0.36906619      -0.2243777       0.6049232
[3,]       0.1251775     -0.04611537      -0.1017769      -0.3738813
[4,]      -0.6625360      0.22700148       0.4017000      -0.5828772
[5,]       0.2790562      0.24274385       0.8452924       0.3279992
     VecteurPropre 5
[1,]     -0.12135838
[2,]     -0.32203402
[3,]     -0.91217226
[4,]      0.09169345
[5,]     -0.20273212

$Tableaumean
           [,1]       [,2]       [,3]       [,4]       [,5]
[1,]  0.8589656  4.8334938 -1.2980342  6.4569054  0.8820262
[2,]  3.9674694 -0.9520108  0.1997447  1.2762980 -2.2291313
[3,] -2.3470024  3.1952925  0.2271465  0.3074271 -3.5771625
[4,] -1.7619329 -1.1216316 -0.7631946 -1.6726714  0.8781062
[5,] -1.7185055 -2.7609073  0.7498192 -2.1782000  2.6901585
[6,]  4.2544378 -0.4661359 -0.0372088 -1.4127500 -0.8770593
[7,] -3.2534320 -2.7281007  0.9217272 -2.7770091  2.2330622

$PourCentageComposante
       eigenvalue percentage of variance cumulative percentage of variance
comp 1 17.0370073             58.3193212                          58.31932
comp 2  7.6204727             26.0856140                          84.40494
comp 3  3.8902740             13.3167835                          97.72172
comp 4  0.4888048              1.6732262                          99.39494
comp 5  0.1767567              0.6050552                         100.00000

$PCinterval
               PCMin.1    PCMax.1     PCMin.2   PCMax.2     PCMin.3    PCMax.3
Action      -10.113984 -4.5470300  1.39507050  4.163102  1.03254495  4.1738482
Drama        -2.845231 -1.7144001 -5.14551528 -2.955786 -0.74352695  0.3851163
Animation    -3.080693 -1.3624714  1.68626325  3.122643 -5.15027823 -3.3124228
Comedy        2.293623  2.7765038  0.80464185  1.158964 -0.26174764  0.1737415
Documentary   3.789631  5.2323098  0.01871686  1.170279  0.57833480  2.4404718
Romance      -1.219747  0.1469407 -5.60454015 -3.178133 -0.78202294  0.5237466
Short         4.658957  5.9855912  1.01321658  2.351077 -0.03976013  0.9819545
               PCMin.4    PCMax.4    PCMin.5   PCMax.5
Action      -2.5006393  2.7427340 -1.0334247 0.9063838
Drama       -1.7200808 -0.8118580 -0.1290639 0.5527385
Animation   -0.8653339  0.8391873 -0.7037315 0.3077986
Comedy       0.2438441  0.7322409  0.7836489 1.0959520
Documentary -0.9371000  0.5953836 -0.8374283 0.1746056
Romance      0.6293318  1.7498372 -0.6727035 0.1047213
Short       -1.1149256  0.4173790 -1.1190396 0.5695426

dev.new(): using pdf(file="Rplots2.pdf")
dev.new(): using pdf(file="Rplots3.pdf")
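The percentage columns of `$PourCentageComposante` in the output above can be recomputed from the eigenvalues alone. This short check uses base R and the eigenvalues copied from that table; the variable names are illustrative.

```r
# Eigenvalues copied from $PourCentageComposante in the output above.
eigenvalue <- c(17.0370073, 7.6204727, 3.8902740, 0.4888048, 0.1767567)

pct     <- 100 * eigenvalue / sum(eigenvalue)  # percentage of variance
cum_pct <- cumsum(pct)                         # cumulative percentage

print(round(data.frame(eigenvalue, pct, cum_pct), 4))
```

The first two components together account for about 84.4% of the variance, in agreement with the cumulative column of the table.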

GraphPCA documentation built on May 2, 2019, 1:08 p.m.