Builds the data object for the yourcast
function from files in the working directory or another specified
directory, and checks for errors.
dpath |
String. Name of the directory where data files are
stored. If |
tag |
String. Group of characters placed before CSID code in
filenames to indicate which files in |
index.code |
String indicating how the CSID index variable is
coded in the input data. Between 0 and 4 of the following two
characters are used in this order: |
datalist |
A list of cross section dataframes already loaded into
the workspace to be added to the |
A.names, G.names, T.names |
String. Filename of optional
two-column data files that list all valid numerical codes
(in the first column) and corresponding alphanumeric names
(optionally in the second column) for the indices
corresponding to geographic areas in |
proximity |
Data file with codes to construct the symmetric
matrix (geographic region by geographic region) of proximity
scores for geographic smoothing used by the ‘map’ and ‘bayes’
methods. The larger the relative score, the more proximate
that pair of countries is in the prior; a zero element means
the two geographic areas are unrelated (the diagonal is
ignored). Each row of the |
year.var |
Boolean. Should be |
sample.frame |
Optional four element vector containing, in order,
the start and end time periods to be used for the observed
data and the start and end time periods to be forecast. All
cross sections need not begin at the starting date, but each
must contain all years after its first observed
value. Variables to be forecasted should be coded as
|
summary |
Boolean. If |
verbose |
Boolean. If |
lag |
Number of years by which covariate data should be lagged from its
current position in the cross section files. See ‘Details’ for more
information. Default: |
formula |
Formula. The formula that one will use in the subsequent run of
|
vars.nolag |
Vector of strings. Variables to be included in the dataobj but not lagged. These variables do not need to be included in the formula, and if found there will be ignored when the other covariates are lagged. |
Creates the dataobj input for yourcast from files in the working
directory or another specified directory. Checks that all cross
sections in the data list are titled properly and that all years up
to the last predicted year are included in the dataframes (if the
sample.frame argument is specified). Please note, however, that all
cross sections from the same geographic area must have the same
observation and prediction years in the dataframe (even if NA) for
the graphing software plot.yourcast to work.
The cross section files must be named according to the CSID
identifiers for country code and age group, preceded by the
specified tag (default: "csid") so that yourprep() can distinguish
them from other files in the dpath. For example, for the USA
(country code 2450) time series of 45 year old individuals, the file
name should be ‘csid245045.txt’ if the tag is left as the default.
Files must have an extension so that the program can recognize how
the data are coded. Currently, fixed width text files (‘*.txt’),
comma-separated values (‘*.csv’), and Stata v.5-10 (‘*.dta’) files
are supported, and multiple file types may be used in the same run
of the program. ‘*.Rdata’ objects can be included with the datalist
option after they are loaded to a list in the workspace. yourprep()
includes diagnostics to ensure that objects are properly named and
not included accidentally, but users should examine the specified
dpath before running yourprep() to minimize errors.
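The naming rule above can be sketched in base R. This is only an illustration: cs_filename() and cs_pattern are hypothetical names, not part of YourCast, and the sketch assumes a four-digit area code and a two-digit age group as in the ‘csid245045.txt’ example (other index.code settings would differ).

```r
# Hypothetical helper (not part of YourCast): builds a cross section file
# name from a tag, a geographic code, and an age group.
# Assumes a four-digit area code and a two-digit age group, as in the
# 'csid245045.txt' example.
cs_filename <- function(geo, age, tag = "csid", ext = "txt") {
  sprintf("%s%04d%02d.%s", tag, as.integer(geo), as.integer(age), ext)
}

# USA (code 2450), 45 year olds, default tag
usa_45 <- cs_filename(2450, 45)

# Regular expression for tagged files with a supported extension; the same
# pattern could be passed to list.files() to inspect a directory before
# running yourprep()
cs_pattern <- "^csid[0-9]+\\.(txt|csv|dta)$"
is_cs_file <- grepl(cs_pattern, c(usa_45, "notes.txt", "csid245045.xls"))
```

Checking candidate file names against such a pattern is one way to spot stray files in the dpath before they reach the diagnostics.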
Each cross section file should contain labeled columns of
time-series data for the dependent variable(s) (e.g., disease, pop)
and the covariates that will be used in the forecast. The rownames
for the dataframe should be the observation year (if the year is
coded as a separate variable, set year.var=TRUE). The files must
contain the full time series that will be specified in the
sample.frame argument in yourcast after the first observed year. For
instance, if sample.frame=c(1950,2000,2001,2030), then files would
have observations that start between 1950 and 2000 and include all
other years (even if the entries are NA) up to the last year of
prediction, i.e., 2030.
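The coverage requirement just described can be checked by hand before calling yourprep(). The following is a minimal sketch, not the package's own diagnostic: covers_sample_frame() is a hypothetical helper, and it assumes the dataframe's rownames are the observation years.

```r
# Hypothetical check (not part of YourCast): does a cross section start no
# later than the end of the observed period and contain every year from its
# first row through the last forecast year?
covers_sample_frame <- function(df, sample.frame) {
  years  <- as.integer(rownames(df))
  first  <- min(years)
  needed <- seq(first, sample.frame[4])
  first <= sample.frame[2] && all(needed %in% years)
}

sf  <- c(1950, 2000, 2001, 2030)
yrs <- 1960:2030

# A cross section starting in 1960 with every year through 2030 (all NA)
full <- data.frame(disease = rep(NA_real_, length(yrs)),
                   row.names = as.character(yrs))

# The same cross section with the year 2010 missing
gap <- full[rownames(full) != "2010", , drop = FALSE]

ok_full <- covers_sample_frame(full, sf)
ok_gap  <- covers_sample_frame(gap, sf)
```

Here ok_full is TRUE and ok_gap is FALSE, since rows may hold NA but may not be absent.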
Optional auxiliary files such as G.names should be named according
to the filename specified in the respective arguments. If specified,
these files must have extensions and be coded in one of the three
supported file types. Alternatively, these files will be loaded
automatically by yourprep() if they are saved in the dpath and
labeled with the tag specified by the user; in that case the default
names for these files must be used (e.g., ‘G.names’ and
‘proximity’). For example, if the tag is left as the default and
there is a file in the dpath labeled ‘csid.G.names.txt’, yourprep()
will load this automatically and save the input as the G.names
element of the ‘dataobj’ list. yourprep() arguments such as G.names
take precedence over ‘TAG.*’ files in the dpath.
yourprep() also includes a lagging utility (activated once one
specifies a lag length with the ‘lag’ argument). This utility is
useful when the data in each cross section are, for example, the
response and covariates for 50 year olds in each year but the
desired content for each cross section is the response for 50 year
olds and the covariates for 25 year olds 25 years prior to each year
(implying a lag of 25 years). In order to have yourprep() perform
this lagging automatically, include cross sections for each age
group with data starting the same number of years before the first
observation year as the requested lag period. Thus if lag=25 and the
first observation year is 1950, then the cross sections should all
start at 1925. Age groups younger than the length of the lag will
not retain covariate data (except perhaps an ‘index’ variable) in
the output object. The covariates lagged are the predictor variables
specified in the formula argument.
If data for a cohort 25 years (in this case) younger are not
available for some cohort over age 25, yourprep() will look for the
closest cohort available and issue a warning message.
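To make the lagging idea concrete, here is a minimal base R sketch of the 25-year example. It is not yourprep()'s internal implementation: lag_covariates() and the toy dataframes are hypothetical, and the sketch ignores the closest-cohort fallback described above.

```r
# Hypothetical helper (not part of YourCast): replaces the covariates of an
# older age group with those of a younger age group observed `lag` years
# earlier. Rownames of both dataframes are assumed to be observation years.
lag_covariates <- function(older, younger, lag, covars) {
  src <- as.character(as.integer(rownames(older)) - lag)  # source years
  out <- older
  for (v in covars) {
    # Source years absent from `younger` yield NA
    out[[v]] <- younger[src, v]
  }
  out
}

years <- 1925:1960  # data start 25 years before the first observation year
age25 <- data.frame(gdp = 1000 + seq_along(years),
                    row.names = as.character(years))
age50 <- data.frame(disease = seq_along(years) / 10,
                    gdp = 2000 + seq_along(years),
                    row.names = as.character(years))

lagged_all <- lag_covariates(age50, age25, lag = 25, covars = "gdp")

# Keep only the years from the first observation year (1950) onward
lagged <- lagged_all[as.integer(rownames(lagged_all)) >= 1950, , drop = FALSE]
```

In the result, the 1950 row for 50 year olds carries the 1925 covariate value of the 25 year olds, which is why the cross sections need to start at 1925 when lag=25.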
dataobj |
A list with several components:
|
Jon Bischof jbischof@fas.harvard.edu
http://gking.harvard.edu/yourcast
yourcast function and documentation (help(yourcast))
## Not run:
# Working directory automatically set to directory with cross
# section and auxiliary files to begin. Files for this example
# in 'data' folder of YourCast library.
#Old working directory to be restored later
oldwd <- getwd()
# Now setting wd to 'data' folder in YourCast library
setwd(system.file("data",package="YourCast"))
# Simple run of the function, using option that turns year variable
# into label in each cs. Use sample.frame argument for all diagnostics
# to work
dta <- yourprep(G.names="cntry.codes.txt", proximity="proximity.txt",
                year.var=TRUE, verbose=TRUE,
                sample.frame=c(1950,2000,2001,2030))
# With summary output (means of variables in each cross section)
dta <- yourprep(G.names="cntry.codes.txt", proximity="proximity.txt",
                year.var=TRUE, summary=TRUE)
# Function can also add datafiles already loaded into R as objects in
# the workspace with "datalist" option if put into a list and properly
# labeled. All diagnostics still performed
# 'csid204545', etc., are dataframes in workspace
# Labels changed to nonsense ones so as not to confuse with other files
data(csid204545)
data(csid204550)
data(csid204555)
datalist <- list("123456"=csid204545, "234567"=csid204550,
                 "345678"=csid204555)
# Verbose option turned on and datalist argument added
dta <- yourprep(G.names="cntry.codes.txt", proximity="proximity.txt",
                year.var=TRUE, verbose=TRUE, datalist=datalist)
# Setting working directory back
setwd(oldwd)
rm(oldwd)
## End(Not run)