Nothing
#' Kernel regressions based causal paths in Panel Data.
#'
#' We assume that panel data have space (space=individual region)
#' and time (e.g., year) dimensions. We use upper case
#' X to denote a common prefix in the panel data.
#' Xs =name of the space variable, e.g., state or individual.
#' The range of values for s is 1 to nspace.
#' Xt =name of the time variable, e.g., year.
#' The range of values for t is 1 to ntime.
#' Xy =the dependent variable(s) value at time t in state s.
#' Since panel data causal analysis can take a long computer time,
#' we allow the user to choose subsets of time and space values
#' called chosenTimes and chosenSpaces, respectively.
#' Various input parameters starting with "nam" specify the names
#' of variables in the panel study.
#'
#' The algorithm calls some function fn(mtx) where mtx is the data matrix,
#' and fn is causeSummary2NoP(mtx). The causal paths between (y, xj)
#' pairs of variables in mtx are computed following 3 sophisticated
#' criteria involving exact stochastic dominance. Type "?causeSummary2"
#' on the R console to get details (omitted here for brevity).
#' Panel data consist of a time
#' series of cross-sections and are also called longitudinal data.
#' We provide estimates of causal path directions and strengths
#' for both the time-series and cross-sectional
#' views of panel data. Since our regressions are kernel type
#' with no functional forms, fixed effects for time and space
#' are being suppressed when computing the causality.
#'
#' @param da {panel dat having a named column for space and time}
#' @param fn {an R function causeSummary2NoP(mtx)}
#' @param rowfnout {the number of rows output by fn}
#' @param colfnout {the number of columns output by fn}
#' @param fnoutNames {the column names of output by fn, for example,
#' fnoutNames=c("cause","effect","strength","r","p-val")}
#' @param namXs {title of the column in da having the space variable}
#' @param namXt {title of the column in da having the time variable}
#' @param namXy {title of the column in da having the dependent y variable}
#' @param namXc {title(s) of the column(s) in da
#' having control variable(s), default=0 means none specified}
#' @param namXjmtx {title(s) of the column(s) in da having regressor(s)}
#' @param chosenTimes {subset of values of time variable chosen for quick
#' results, There are NchosenTimes values chosen in the subset.
#' default=NULL means all time identifiers in the data are included.}
#' @param chosenSpaces {subset of values of space variable chosen for quick
#' results, There are NchosenSpaces values chosen in the subset.
#' default=NULL means all space identifiers are included. The degrees
#' of freedom for Studentized statistic for Granger causality tests
#' are df=(NchosenSpaces -1).}
#' @param ylag {time lag in Granger causality study of time dimension
#' the default ylag=0 is not really zero. It means ylag=
#' min(4, round(NchosenTimes/5,0)),
#' where NchosenTimes is the length of chosenTimes vector}
#' @param verbo {print detail results along the way, default=FALSE}
#'
#' @description
#' The algorithm of this function uses an internal function
#' fminmax=function(x){min(x)==max(x)}.
#' The subsets mtx2 of the original data da for a specific time or space
#' can become degenerate if the columns of mtx2 have no variability.
#' The apply function of R is applied to the columns of mtx2 as follows.
#' "ap1=apply(mtx2,2,fminmax)." Now, "sumap1=sum(ap1)" counts how many
#' columns of the data matrix are degenerate. We have a degeneracy
#' problem only if sumap1 is >1 or =1. For example, the panel consists
#' of data on 50 United States and 20 years. Now, consumer price index
#' (cpi) data may be common for all states. That is, the min(cpi)
#' equals max(cpi) for all states. Then, the variance of cpi is zero,
#' and we have degeneracy. When this happens, the regressor cpi should
#' not be involved in determining causal paths.
#' We identify degeneracy using "fminmax=function(x){min(x)==max(x)}"
#'
#' @returns The causeSum2Panel(.) produces many output matrices and vectors.
#' The first
#' "outt" gives a 3-dimensional array of panel causal path output focused on
#' time series for each space value using fixed space value.
#' It reports causal path directions, and strengths for (y, xj) pairs.
#' The second output array, called
#' "outs", gives similar 3D panel causal path output focused on
#' space cross sections using fixed time value.
#' The third output matrix called
#' "outdif" gives causal paths using Granger causality for each
#' pair (y, xj). They are not causal strengths but differences
#' between Rsquare values of two flipped kernel regressions.
#' The summary of Granger causality answer is an output matrix called
#' grangerAns (first row average of differences in R-squares and
#' second row has its test statistic with degrees of freedom n-1),
#' and grangerStat for related t-statistic for formal inference.
#' based on column means and variances of "outdif". This function
#' also produces a matrix summarizing "outt" and "outs" into two-dimensional
#' matrices reporting averages of signed strengths as "strentime"
#' and "strenspace", Also, "pearsontime" reports the Pearson
#' correlation coefficients for various time values and their average in the
#' last column. It determines the overall direction of
#' the causal relation between y and xj.
#' For example, a negative average correlation means y and xj are negatively
#' correlated (xj goes up, y goes down). Similarly, "pearsonspace"
#' summarizes "outs" correlations.
#' @author Prof. H. D. Vinod, Economics Dept., Fordham University, NY.
#' @seealso See \code{\link{causeSummary2}}
#' @seealso See \code{\link{causeSummary}} is subject to trapezoidal approximation.
#' @references Vinod, H. D. 'Generalized Correlation and Kernel Causality with
#' Applications in Development Economics' in Communications in
#' Statistics -Simulation and Computation, 2015,
#' \doi{10.1080/03610918.2015.1122048}
#' @references Vinod, H. D. 'New exogeneity tests and causal paths,'
#' Chapter 2 in 'Handbook of Statistics: Conceptual Econometrics
#' Using R', Vol.32, co-editors: H. D. Vinod and C.R. Rao. New York:
#' North Holland, Elsevier Science Publishers, 2019, pp. 33-64.
#'
#' @references Vinod, H. D. Causal Paths and Exogeneity Tests
#' in {Generalcorr} Package for Air Pollution and Monetary Policy
#' (June 6, 2017). Available at SSRN:
#' \url{https://www.ssrn.com/abstract=2982128}
#'
#' @references Vinod, Hrishikesh D., R Package GeneralCorr
#' Functions for Portfolio Choice
#' (November 11, 2021). Available at SSRN:
#' https://ssrn.com/abstract=3961683
#'
#' @references Vinod, Hrishikesh D., Stochastic Dominance
#' Without Tears (January 26, 2021). Available at
#' SSRN: https://ssrn.com/abstract=3773309
#'
#' @concept Granger causality
#' @concept stochastic dominance orders
#' @concept summary index
#' @note The function prints to the screen some summaries of the three
#' output matrices. It reports how often a variable is a cause in
#' various pairs as time series or as cross sections. It also reports the
#' average strengths of causal paths for "outt" and "outs" matrices.
#' We compute the difference between two R-square values to find which
#' causal direction is more plausible. This involves kernel regressions
#' of y on its own lags and lags of a regressor. Unlike the usual Granger
#' causality we estimate better-fitting nonlinear kernel regressions.
#' If the averages in "outdif" matrix are negative, the Granger causal
#' paths go from y to xj. This may be unexpected when the model assumes
#' that y depends on x1 to xp, that is, the causal paths go from xj to y.
#' In studying the causal pairs, the function creates mixtures of
#' names y and xj. Character vectors containing the mixed names are
#' are column names or row names depending on the context. For example,
#' the output matrix grangerAns column names help identify the relevant
#' regressor name. The first row of the grangerAns matrix has column averages
#' of outdiff matrix to help get an overall estimate of the Granger-causal paths.
#' The second row of the grangerAns has the Studentized test
#' statistic for formal testing of the significance of Granger causal paths.
#' Collecting the results for the time dimension strengths with
#' suitable sign (negative strength means cause reversal xj->y) is output
#' named strentime. The corresponding Pearson
#' correlations as an output is named pearsontime.
#' Collecting the results for the space dimension strengths with
#' suitable sign (negative strength means cause reversal xj->y) is output
#' named strenspace. The corresponding Pearson
#' correlations are named pearsonspace. A grand summary of average
#' strengths and correlations is output matrix named grandsum.
#' It is intended to provide an overall picture of causal paths in Panel
#' data. These paths should not be confused with Granger causal paths which
#' always involve time lags and causes are presumed to precede effects in time.
#'
#' @examples
#'
#'
#' \dontrun{
#' library(plm);data(Grunfeld)
#' options(np.messages=FALSE)
#' namXs="firm"
#' print("initial values identifying the space variable")
#' head(da[,namXs],3)
#' print(str(da[,namXs]))
#' chosenSpaces=(3:10)
#' if(is.numeric(da[,namXs])){
#' chosenSpaces=as.numeric(chosenSpaces)}
#' if(!is.numeric(da[,namXs])){
#' chosenSpaces=as.character(chosenSpaces)}
#'
#' namXt="year"
#' print("initial values identifying the time variable")
#' head(da[,namXt],3)
#' print(str(da[,namXt]))
#' chosenTimes=1940:1949
#' if(is.numeric(da[,namXt])){
#' chosenTimes=as.numeric(chosenTimes)}
#' if(!is.numeric(da[,namXt])){
#' chosenTimes=as.character(chosenTimes)}
#'
#' namXy="inv"
#' namXc=0
#' namXjmtx=c("value","capital")
#' p=length(namXjmtx)
#' fn=causeSummary2NoP
#' fnout=matrix(NA,nrow=p,ncol=5)
#' fnoutNames=c("cause","effect","strength","r","p-val")
#' causeSum2Panel(da, fn=causeSummary2NoP,
#' rowfnout=p, colfnout=5,
#' fnoutNames=c("cause","effect","strength","r","p-val"),
#' namXs=namXs,
#' namXt=namXt,
#' namXy=namXy,
#' namXc=namXc,
#' namXjmtx=namXjmtx,
#' chosenTimes=chosenTimes,
#' chosenSpaces=chosenSpaces,
#' verbo=FALSE)
#' }
#'
#'
#' @export
causeSum2Panel=function(da, fn=causeSummary2NoP, rowfnout, colfnout, fnoutNames,
namXs,
namXt,
namXy,
namXc=0,
namXjmtx,
chosenTimes=NULL,
chosenSpaces=NULL,
ylag=0,
verbo=FALSE)
{
Xs=da[,namXs]
uniXs=unique(Xs)
nspace=length(uniXs)
if(length(chosenSpaces)==0) chosenSpaces=uniXs #select All
Xt=da[,namXt]
uniXt=unique(Xt)
if(length(chosenTimes)==0) chosenTimes=uniXt
ntime=length(uniXt)
Xy=da[,namXy]
if(namXc != 0) Xcmtx=da[,namXc]
Xjmtx=da[,namXjmtx]
p=length(namXjmtx)
if (verbo){
print(c("number of regressors in data=",p))
print(c("number of (states, individuals) spaces=",nspace))
print(c("number of times=",ntime))
}#end if verbo
NchosenSpaces=length(chosenSpaces)
NchosenTimes=length(chosenTimes)
if(verbo){
print("how many spaces and times chosen for analysis")
print(c(NchosenSpaces,NchosenTimes))
}#endif verbo
if (ylag> ntime/3) print("Granger Warning: ylag>ntime/3")
if(ylag==0) ylag=min(4, round(NchosenTimes/5,0))
print(c("Granger Causality: max time lag used=",ylag))
#now define the space range for analysis
spaces=1:NchosenSpaces
times=1:NchosenTimes
#create mixed names with y and each regressor
mixnam=rep(NA, p)
for (j in 1:p){
mixnam[j]=paste("y-",namXjmtx[j],sep="")
} #end j loop
mixnam2s=mixnam #initialize if no degeneracy is found
mixnam2t=mixnam #initialize if no degeneracy is found
mixnam0=mixnam #original mix names
if(verbo){print("Causal paths with fixed-time cross sections")
print("mixed names of y and regressors")
print(mixnam) }
# places to store output for various space values
outs=array(NA,c(rowfnout,colfnout,NchosenSpaces))
dimnames(outs)=list(mixnam2s,fnoutNames,chosenSpaces)
outt=array(NA,c(rowfnout,colfnout,NchosenTimes))
dimnames(outt)=list(mixnam2t,fnoutNames,chosenTimes)
outdif=matrix(NA,nrow=NchosenSpaces,ncol=p)
nammtx=c(namXy, namXjmtx)
if(verbo)print("Next: Granger type time-lag Causality, fix Xs space value ")
for (i in 1:NchosenSpaces){
# i=1 is time series for the first chosen space (state) entity
mtx=subset(da,da[,namXs]==chosenSpaces[i])
mtx2=mtx[,nammtx] #select data columns for y x1 x2 .. xp
if (verbo){
print(head(mtx,2))
print(tail(mtx,2))
print(head(mtx2,2)) #selected data columns for y x1 x2 .. xp
print(tail(mtx2,2))
}#endif verbo
fminmax=function(x){min(x,na.rm=TRUE)==max(x,na.rm=TRUE)} #checking degenerate data
ap0=apply(mtx2,2,fminmax) # this uses names y x1 x2 .. xp
ap1=ap0[-1] #oimt the first for y
# print(c("ap0",ap0)) ################
# print(c("ap1",ap1)) ##################
sumap1=sum(ap1)
if(verbo)print(c("how many variables have zero variability min=max",sumap1))
if(verbo)print(c("which variable has zero variability",nammtx[ap0]))
if(sumap1 >= 1) {
mixnam[ap1]="NA"
mixnam2s=mixnam
# dimnames(outs)=list(mixnam2s,fnoutNames,chosenSpaces)
if(verbo) print(c("variable has no variation",nammtx[ap1]))
} #endif sumap1>=1, mixnam2s allows for degenerate data
nammtx2=setdiff(nammtx, nammtx[ap0])
# nRHS=length(nammtx2)
nRHS=sum(!is.na(nammtx2))
mtx3=mtx2[,nammtx2]
fir=fn(mtx3)
# if(i==1) { print(" output of fn for the first chosenSpace")
# print(fir)
# print(" outs matrix designed to store above output: should match dimensions")
# print(outs[c(1:(p-sumap1)),,i])
# }#endif i==1
outs[c(1:(p-sumap1)),,i]=fir
# print("outs")
# print(fir) ################## ##################
if(verbo)print("call Granger Causality function: GcRsqX12")
for (j in 2:nRHS){
dif1j=GcRsqX12(mtx3[,1],mtx3[,j],px1=ylag,px2=ylag,pwanted=ylag)[3]
# print(dif1j) #third value is difference between Rsquare values
outdif[i,(j-1)]=as.numeric(dif1j)
}# end j loop for Granger causality
}# end the for loop for i-th chosen space/indivi/state
colnames(outdif)=mixnam
if(verbo)print("Next:Cross-section Causality between variables fix time unit")
#begin loop for i over chosen time units for cross sections at each time
for (i in 1:NchosenTimes){
mtx=subset(da,da[,namXt]==chosenTimes[i])
mtx2=mtx[,nammtx]
if (verbo){
print(head(mtx,2))
print(tail(mtx,2))
print(head(mtx2,2))
print(tail(mtx2,2))
}#endif verbo
fminmax=function(x){min(x)==max(x)}
ap0=apply(mtx2,2,fminmax)#how many variables have zero variability min=max
ap1=ap0[-1]
# print(c("ap0",ap0)) ################
# print(c("ap1",ap1)) ##################
sumap1=sum(ap1)
if(verbo)print(c("how many variables have zero variability min=max",sumap1))
if(verbo)print(c("which variable has zero variability",nammtx[ap0]))
if(sumap1 >= 1) {
mixnam[ap1]="NA"
mixnam2t=mixnam
# dimnames(outt)=list(mixnam2t,fnoutNames,chosenTimes)
print(c("variable(s) have no variation",nammtx[ap0])) } #endif
nammtx2=setdiff(nammtx, nammtx[ap0])
# print(nammtx2) # after omitting the degenerate regressor ############
mtx3=mtx2[,nammtx2]
# print(head(mtx3,2)) ##################
# print(tail(mtx3,2)) ##################
fir=fn(mtx3)
outt[c(1:(p-sumap1)),,i]=fir
}# end for loop over i
# print("outt fir")
# print(fir) ################## ##################
print("How frequent is a variable designated as cause in outs, dim= space")
mtxScause=outs[,1,]
print(table(mtxScause))
mtxstren=outs[,3,]
stravg=rep(NA, p) #average strength of causal relation
names(stravg)=mixnam0 #mixture of two names at a time
for ( j in 1:p){
stre=mean(as.numeric(mtxstren[j,]),na.rm=TRUE)
stravg[j]=stre
}
print("outs:Average causal strengths for mixed names defining variable pairs")
names(stravg)=mixnam2s
# print(stravg)
print("How frequent is a variable designated cause in outt?")
mtxTcause=outt[,1,]
print(table(mtxTcause))
print("Cause in Granger causality from following sign(s)")
print(mixnam0)
grangerStat=rep(NA,p)
grangerSD=rep(NA,p)
grangerAns0=apply(outdif,2,mean,na.rm=TRUE)
print(grangerAns0)
print("negative sign means y causes xj by Granger method")
if (nspace>5){grangerSD=apply(outdif,2,sd,na.rm=TRUE)
grangerStat=sqrt(nspace)*grangerAns0/grangerSD
} #endif
# print(c("p=",p)) ###########
# print(mixnam2t) ######
mtxstren=outt[,3,]
# print(head(mtxstren),2) #############
stravg=rep(NA, p) #average Granger strength of causal relation
names(stravg)=mixnam #mixture of two names at a time
for ( j in 1:p){
stre=mean(as.numeric(mtxstren[j,]),na.rm=TRUE)
stravg[j]=stre
}
# print(mixnam2t) ######
names(stravg)=mixnam2t
print("Granger Average causal strengths for mixed names defining variable pairs")
print(stravg)
######
###### begin outt summarization
######
print("Causality for various times next")
strent=outt[,3,] #third col. has abs(strength) values needs sign info
pearsont=outt[,4,]# 4-th column has pearson correlations
dim1=dim(outt)[1]
dim3=dim(outt)[3]
dimn1=dimnames(outt)[[1]]
dimn3=dimnames(outt)[[3]]
# print("c(dim1,dim3,dimn1,dimn3)")
# print(c(dim1,dim3,dimn1,dimn3))
strent2=matrix(NA,nrow=dim1, ncol=dim3) #place holder
pearsont2=matrix(NA,nrow=dim1, ncol=dim3) #place holder
causeT=outt[,1,] #If first column cause==dependent variable, minus strength
for (j in 1:dim3){
for (i in 1:dim1){
strent2[i,j]=as.numeric(strent[i,j])
pearsont2[i,j]=as.numeric(pearsont[i,j])
}}
#print("pearson time dimension next")
#print(pearsont2)
strenT=strent2 #place to store
for (j in 1:dim3){
for (i in 1:dim1){
strenT[i,j]=as.numeric(strent2[i,j])
##If first column cause==dependent variable namXy then change sign of strength
if (!is.na(causeT[i,j])){
if (causeT[i,j]==namXy) strenT[i,j]=-as.numeric(strent2[i,j])
}}}
rownames(strenT)=dimn1
colnames(strenT)=dimn3
rownames(pearsont2)=dimn1
colnames(pearsont2)=dimn3
# print(strenT)
avgstrent=apply(strenT,1,mean,na.rm=TRUE)
avgPearsont=apply(pearsont2,1,mean,na.rm=TRUE)
strentime=cbind(strenT,avgstrent)
pearsontime=cbind(pearsont2,avgPearsont)
colnames(strentime)=c(dimn3,"avg")
colnames(pearsontime)=c(dimn3,"avg")
print("strentime or causal strengths by time dimention")
print(strentime)
print("pearsontime or Pearson correlations by time Last col. has average")
print(pearsontime)
print("Above Causality for various times (averaging over space)")
######
###### begin outs summarization wrt s=space
######
print("Causality for various spaces next (averaging over time)")
strens=outs[,3,]
pearsons=outs[,4,]#
dim1=dim(outs)[1]
dim3=dim(outs)[3]
dimn1=dimnames(outs)[[1]]
dimn3=dimnames(outs)[[3]]
#print("c(dim1,dim3,dimn1,dimn3)")
#print(c(dim1,dim3,dimn1,dimn3))
strens2=matrix(NA,nrow=dim1, ncol=dim3) #place holder
pearsons2=matrix(NA,nrow=dim1, ncol=dim3) #place holder
causeS=outs[,1,]
for (j in 1:dim3){
for (i in 1:dim1){
strens2[i,j]=as.numeric(strens[i,j])
pearsons2[i,j]=as.numeric(pearsons[i,j])
}}
#print("pearsons2 next")
#print(pearsons2)
strenS=strens2 #place to store
for (j in 1:dim3){
for (i in 1:dim1){
strenS[i,j]=as.numeric(strens2[i,j])
# print(c(strens[i,j],strens2[i,j],namXy))
if (!is.na(causeS[i,j])){
if (causeS[i,j]==namXy) strenS[i,j]=-as.numeric(strens2[i,j])
}}}
rownames(strenS)=dimn1
colnames(strenS)=dimn3
#print(strenS)
avgstrens=apply(strenS,1,mean,na.rm=TRUE)
avgPearsons=apply(pearsons2,1,mean,na.rm=TRUE)
strenspace=cbind(strenS,avgstrens)
pearsonspace=cbind(pearsons2,avgPearsons)
colnames(strenspace)=c(dimn3,"avg")
colnames(pearsonspace)=c(dimn3,"avg")
rownames(pearsonspace)=dimn1
print("strenspace: causal strengths by space dimention, last col.has average")
print(strenspace)
print("pearsonspace or Pearson correlations by space dimension")
print(pearsonspace)
print("report the grand summary in 4 columns")
strtime=(strentime[,NCOL(strentime)])
cortime=(pearsontime[,NCOL(pearsontime)])
strspace=(strenspace[,NCOL(strenspace)])
corspace=(pearsonspace[,NCOL(pearsonspace)])
grandsum=cbind(strtime,cortime,strspace,corspace)
print(grandsum)
print("Nonlinear Granger Causality average of difference in two R-squares")
print(grangerAns0)
print("Granger Causality test statistic mean of differences in two R-squares")
print(grangerStat)
grangerAns=rbind(grangerAns0,grangerStat)
rownames(grangerAns)=c("mean of differeces in Rsq","test statistic")
print(grangerAns)
list(outt=outt, outs=outs, outdif=outdif,
grangerAns=grangerAns,
strentime=strentime,strenspace=strenspace,
pearsontime=pearsontime, pearsonspace=pearsonspace,
grandsum=grandsum)
} #end function
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.