Description Usage Arguments Details Value Author(s) References Examples
readCatdata
inputs the categorical data, accommodating complete or
missing observations.
This corresponds to objects of the classes vector
or matrix
which
represent a table of frequencies assumed to follow a product-multinomial
distribution.
For complete data, only the argument TF
is required. Linear,
log-linear and functional linear models may be subsequently fitted,
respectively, using the functions linML()
, loglinML()
and
funlinWLS()
.
For missing data, all arguments TF
, Zp
and
Rp
are required. Before proceeding to model fitting, inferences
for the saturated product-multinomial parameters are conducted using
either the function satMarML()
or satMcarWLS()
.
1 | readCatdata(TF, Zp, Rp)
|
TF |
a matrix including the table of frequencies (see details). |
Zp |
a matrix with indicators of the partially classified data (see details). |
Rp |
a matrix with the number of response classes corresponding to the missingness patterns (see details). |
Whenever TF
is a vector it represents one population assumed
to follow a multinomial distribution. Whenever TF
is a matrix,
each row represents a subpopulation, the set of which is assumed to follow a
product-multinomial distribution.
For complete data, TF
is usually a simple matrix (table of
frequencies) with S rows (subpopulations) and R columns (response
categories).
For missing categorical data, each row of TF
is assumed to have
first the R frequencies associated to the response categories of the
fully classified data, and thereafter the frequencies associated to the
response classes of the partially classified data.
The term response class is used to indicate that the corresponding observed data
may only be classified into a set of response categories and not to the individual
categories within this set. This is informed via indicator vectors with R rows,
the elements of which are equal to 1 for the positions corresponding to the
response categories in the response class and to 0 otherwise.
For computational simplicity, it is assumed that the response classes corresponding to
each missingness pattern belong to a partition of the response categories.
Zp
contains all the indicator vectors, which were first combined
columnwise within subpopulations (without mixing different missingness
patterns) and then between subpopulations.
As Rp
is a matrix that contains the number of response classes of
each missingness pattern (column) of each subpopulation (row), Zp
is a matrix with R rows and sum(Rp)
columns.
If the subpopulations do not have the same number of missingness patterns and/or
response classes, the matrices TF
and Rp
shall be
completed with "0".
The generic functions print
and summary
are used to print the results and to obtain a
summary thereof. The latter is particularly useful for checking whether the missing categorical
data input was conducted in the right way.
An object of the class readCatdata
is a list containing at least the following components:
R |
number of response categories. |
S |
number of subpopulations. |
Tt |
number of missingness patterns of each subpopulation. |
Nst |
frequencies under each missingness pattern of each subpopulation. |
nstm |
sample size of each missingness pattern of each subpopulation. |
nsmm |
sample size of each subpopulation. |
pst |
vectors of proportions computed under each missingness pattern of each subpopulation. |
Vpst |
covariance matrices of the proportions. |
theta |
vector of estimates for all product-multinomial probabilities under the saturated model (for complete data only). |
Vtheta |
corresponding covariance matrix (only for complete data). |
Frederico Zanqueta Poleto(frederico@poleto.com)
Julio da Motta Singer (jmsinger@ime.usp.br)
Carlos Daniel Paulino (daniel.paulino@math.ist.utl.pt)
with the collaboration of
Fabio Mathias Correa (fmcorrea@uesc.br)
Enio Galinkin Jelihovschi (eniojelihovs@gmail.com)
Paulino, C.D. e Singer, J.M. (2006). Analise de dados categorizados (in Portuguese). Sao Paulo: Edgard Blucher.
Poleto, F.Z. (2006). Analise de dados categorizados com omissao (in Portuguese). Dissertacao de mestrado. IME-USP. http://www.poleto.com/missing.html.
Poleto, F.Z., Singer, J.M. e Paulino, C.D. (2007). Analyzing categorical data with complete or missing responses using the Catdata package. Unpublished vignette. http://www.poleto.com/missing.html.
Poleto, F.Z., Singer, J.M. e Paulino, C.D. (2012). A product-multinomial framework for categorical data analysis with missing responses. To appear in Brazilian Journal of Probability and Statistics. http://imstat.org/bjps/papers/BJPS198.pdf.
Singer, J. M., Poleto, F. Z. and Paulino, C. D. (2007). Catdata: software for analysis of categorical data with complete or missing responses. Actas de la XII Reunion Cientifica del Grupo Argentino de Biometria y I Encuentro Argentino-Chileno de Biometria. http://www.poleto.com/SingerPoletoPaulino2007GAB.pdf.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 | #Example 1.5 of Paulino and Singer (2006)
#S=4 subpopulations, R=4 response categories, with complete data
e15.TF<-rbind(c(19, 5, 4, 2),
c( 5, 8, 0,17),
c(11, 6, 7, 6),
c( 2, 5, 1,22))
e15.catdata<-readCatdata(TF=e15.TF)
e15.catdata #shows proportions and standard errors
#Example 13.4 of Paulino and Singer (2006)
#S=1 subpopulation, R=4 response categories, with missing data
#2 missingness patterns with 2 response classes each
e134.TF<-c(12,4,5,2, 50,31, 27,12)
e134.Zp<-cbind(kronecker(diag(2),rep(1,2)),kronecker(rep(1,2),diag(2)))
e134.Rp<-c(2,2)
e134.catdata<-readCatdata(TF=e134.TF,Zp=e134.Zp,Rp=e134.Rp)
e134.catdata #Proportions of the complete data
summary(e134.catdata) #A more detailed analysis of the missing data input
#Example 13.2 of Paulino and Singer (2006)
#S=1 subpopulation, R=9 response categories, with missing data
#2 missingness patterns with 4 response classes each
e132.TF<-c(7,11,2,3,9,5,0,10,4, 8,7,3,0, 0,7,14,7)
e132.Zp<-cbind(rbind(cbind(kronecker(rep(1,2),diag(3)),rep(0,6)),
cbind(matrix(0,3,3),rep(1,3)) ),
rbind(cbind(rep(1,3),matrix(0,3,3)),
cbind(rep(0,6),kronecker(rep(1,2),diag(3)))))
e132.Rp<-c(4,4)
e132.catdata<-readCatdata(TF=e132.TF,Zp=e132.Zp,Rp=e132.Rp)
summary(e132.catdata)
#Example 1 of Poleto et al (2012)
#S=2 subpopulation, R=9 response categories, with missing data
#in each subpopulation: 2 missingness patterns with 3 response classes each
smoking.TF<-rbind(c(167,17,19,10,1,3,52,10,11, 176,24,121, 28,10,12),
c(120,22,19, 8,5,1,39,12,12, 103, 3, 80, 31, 8,14))
smoking.Zp<-t(rep(1,2))%x%cbind(diag(3)%x%rep(1,3), rep(1,3)%x%diag(3))
smoking.Rp<-rbind(c(3,3),c(3,3))
smoking.catdata<-readCatdata(TF=smoking.TF,Zp=smoking.Zp,Rp=smoking.Rp)
smoking.catdata
#Example 2 of Poleto et al (2012)
#S=10 subpopulation, R=8 response categories, with missing data
#in each subpopulation: 6 missingness patterns, 3 patterns with 4 response
#classes each, and other 3 patterns with 2 response classes
obes.TF<-rbind(
c(90, 9, 3, 7, 0,1, 1, 8,16, 5,0, 0, 9,3,0,0,129,18,6,13,32, 5,33,11,70,24),
c(150,15, 8, 8, 8,9, 7,20,38, 3,1,11,16,6,1,3, 42, 2,3,13,45, 7,33, 4,55,14),
c(152,11, 8,10, 7,7, 9,25,48, 6,2,14,13,5,0,3, 36, 5,4, 3,59,17,31, 9,40, 9),
c(119, 7, 8, 3,13,4,11,16,42, 4,4,13,14,2,1,4, 18, 3,3, 1,82,24,23, 6,37,14),
c(101, 4, 2, 7, 8,0, 6,15,82, 9,8,12, 6,1,0,1, 13, 1,2, 2,95,23,34,12,15, 3),
c( 75, 8, 2, 4, 2,2, 1, 8,20, 0,0, 4, 7,2,0,1,109,22,7,24,23, 5,27, 5,65,19),
c(154,14,13,19, 2,6, 6,21,25, 3,1,11,16,3,0,4, 47, 4,1, 8,47, 7,23, 5,39,13),
c(148, 6,10, 8,12,0, 8,27,36, 0,7,17, 8,1,1,4, 39, 6,7,13,53,16,25, 9,23, 8),
c(129, 8, 7, 9, 6,2, 7,14,36, 9,4,13,31,4,2,6, 19, 1,2, 2,58,37,21, 1,23,10),
c(91, 9, 5, 3, 6,0, 6,15,83,15,6,23, 5,0,0,1, 11, 1,2, 3,89,32,43,15,14, 5))
obes.Zp<-t(rep(1,10))%x%cbind(diag(4)%x%rep(1,2),
diag(2)%x%rep(1,2)%x%diag(2), rep(1,2)%x%diag(4),
diag(2)%x%rep(1,4),rep(1,2)%x%diag(2)%x%rep(1,2), rep(1,4)%x%diag(2))
obes.Rp<-rep(1,10)%x%t(c(4,4,4,2,2,2))
obes.catdata<-readCatdata(TF=obes.TF,Zp=obes.Zp,Rp=obes.Rp)
obes.catdata #Proportions of the complete data
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.