knitr::opts_chunk$set(fig.width = 5, fig.height = 4, fig.align = "center")
options(digits = 4, show.signif.stars = FALSE)
set.seed(12345)
Below is a guide to the AmeliaView menus with references back to the user's guide. The same principles from the user's guide apply to AmeliaView; the only difference is how you interact with the program. Whether you use the GUI or the command-line version, the same underlying code is called, so the command-line-oriented discussion above is worth reading even if you intend to use the GUI.
The easiest way to load AmeliaView is to open an R session and type the following two commands:
library(Amelia)
AmeliaView()
This will bring up the AmeliaView window on any platform.
AmeliaView loads with a welcome screen whose buttons load data in many of the common formats. Each button brings up a window for choosing your dataset. These buttons are only a subset of the ways to load data in AmeliaView. Under the File menu (shown below), you will find more options, including the datasets included in the package (africa and freetrade). You will also find import commands for Comma-Separated Values (.CSV), Tab-Delimited Text (.TXT), Stata v.5-10 (.DTA), SPSS (.DAT), and SAS Transport (.XPORT). Note that when using a CSV file, Amelia assumes that your file has a header (that is, a row at the top of the data indicating the variable names).
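As a rough command-line illustration of the header assumption, the sketch below writes a tiny CSV and reads it back with base R. The data frame `toy` and its columns are made up for the example, and `read.csv()` is used only as a stand-in; it is not necessarily the function AmeliaView calls internally.

```r
# Write a small CSV with a header row to a temporary file (illustrative data).
csv_path <- tempfile(fileext = ".csv")
toy <- data.frame(year = c(1990, 1991, 1992),
                  gdp  = c(10.5, NA, 12.1))
write.csv(toy, csv_path, row.names = FALSE)

# header = TRUE treats the first row as variable names, which matches
# the assumption described above for CSV files.
loaded <- read.csv(csv_path, header = TRUE)
names(loaded)
```

If a file lacks a header row, its first data row would be consumed as variable names, so it is worth checking `names(loaded)` before imputing.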
You can also load data from an RData file. If the RData file contains more than one data.frame, a pop-up window will ask you to choose the dataset you would like to load. In the File menu, you can also change the underlying working directory. This is where AmeliaView will look for data by default and where it will save imputed datasets.
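Outside the GUI, a similar choice can be made by hand. The following is a minimal sketch in base R, assuming a hypothetical RData file that holds two data frames (`first`, `second`) and one other object; it only approximates what the pop-up window does.

```r
# Build an example RData file holding two data frames and one other object.
rdata_path <- tempfile(fileext = ".RData")
first  <- data.frame(a = 1:3)
second <- data.frame(b = c(2.5, NA))
note   <- "not a data frame"
save(first, second, note, file = rdata_path)

# Load into a separate environment so nothing in the workspace is overwritten.
env <- new.env()
load(rdata_path, envir = env)

# Keep only the names of objects that are data frames.
is_df <- vapply(ls(env),
                function(nm) is.data.frame(get(nm, envir = env)),
                logical(1))
df_names <- names(is_df)[is_df]
df_names

# Pick one of the candidates by name.
chosen <- get(df_names[1], envir = env)
```

The working directory mentioned above can be inspected and changed from the command line with `getwd()` and `setwd()`.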
Once a dataset is loaded, AmeliaView will show the variable dashboard. In this mode, you will see a table of variables, with the current options for each of them shown, along with a few summary statistics. You can reorder this table by any of these columns by clicking on the column headings. This might be helpful to, say, order the variables by mean or amount of missingness.
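A comparable summary table can be sketched at the command line. The example below uses R's built-in `airquality` data purely for illustration, and the column names of `dashboard` are assumptions; the statistics AmeliaView actually displays may differ.

```r
# Per-variable mean and count of missing values for a built-in dataset.
vars <- airquality
dashboard <- data.frame(
  variable = names(vars),
  mean     = vapply(vars, function(x) mean(x, na.rm = TRUE), numeric(1)),
  missing  = vapply(vars, function(x) sum(is.na(x)), integer(1)),
  row.names = NULL
)

# Reorder by amount of missingness, most missing first
# (analogous to clicking a column heading in the dashboard).
by_missing <- dashboard[order(-dashboard$missing), ]
by_missing
```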
You can set options for individual variables through the right-click context menu or through the "Variables" menu. For instance, clicking "Set as Time-Series Variable" will set the currently selected variable in the dashboard as the time-series variable. Certain options are disabled until other options are enabled. For instance, you cannot add a lagged variable to the imputation until you have set the time-series variable. Note that any factor in the data is marked as an ID variable by default, since a factor cannot be included in the imputation without being set as an ID variable, a nominal variable, or the cross-section variable. If there is a factor that fails to meet one of these conditions, a red flag will appear next to the variable name.
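For readers who want to see how these variable-level choices map onto the command line, here is a minimal sketch using the freetrade data shipped with the package. The particular roles assigned (year as time series, country as cross section, signed as nominal, a lag of tariff) are illustrative choices, not recommendations.

```r
library(Amelia)
data(freetrade)

# Variable-level options expressed as amelia() arguments:
#   ts   - the time-series variable
#   cs   - the cross-section variable
#   noms - variables treated as nominal
#   lags - variables whose lag is added to the imputation model
# m is kept small and p2s = 0 silences screen output to keep the example quick.
a.out <- amelia(freetrade, m = 2, p2s = 0,
                ts = "year", cs = "country",
                noms = "signed",
                lags = "tariff")

summary(a.out)
```

ID variables, which are kept in the data but left out of the imputation model, would be passed through the `idvars` argument in the same way.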
The "Variables" menu and the variable dashboard are the places to set variable-level options, but global options are set in the "Options" menu. For more information on these options, see vignette("using-amelia").
mydata, your output files will be mydata1.csv, mydata2.csv... etc.
mi tools.
Seed - Sets the seed for the random number generator used by Amelia. Useful if you need to have the same output twice.
Tolerance - Adjusts the tolerance that Amelia uses to check convergence of the EM algorithm. In very large datasets, if your imputation chains run a long time without converging, increasing the tolerance loosens the convergence criterion and ends chains after fewer iterations.
Empirical Prior - A prior that adds observations to your data in order to shrink the covariances. A useful place to start is around 0.5% of the total number of observations in the dataset.
Maximum Resample for Bounds - Amelia fits logical bounds by rejecting any draws that do not fall within the bounds. This value sets the number of times Amelia should attempt to resample to fit the bounds before setting the imputation to the bound.
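The numerical options above have command-line counterparts. The sketch below is one way they might be combined, assuming that calling `set.seed()` before `amelia()` serves the same purpose as the Seed option. The specific values (a tolerance of 0.001, bounds of 30 to 40 on tariff) are illustrative only.

```r
library(Amelia)
data(freetrade)

# Seed: fix the random number generator so the run can be repeated.
set.seed(12345)

# Logical bounds: one row per bounded variable, giving
# (column number, lower bound, upper bound). Column 3 is tariff.
bds <- matrix(c(3, 30, 40), nrow = 1, ncol = 3)

a.out <- amelia(freetrade, m = 2, p2s = 0,
                ts = "year", cs = "country",
                tolerance    = 0.001,                    # looser than the default
                empri        = 0.005 * nrow(freetrade),  # about 0.5% of the rows
                bounds       = bds,
                max.resample = 100)                      # resampling attempts before using the bound
```

Running the block twice with the same seed is a simple way to check that the output repeats.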
Once you have set all the relevant options, you can impute your data by clicking the "Impute!" button in the toolbar. In the bottom right corner of the window, you will see a progress bar that indicates the progress of the imputations. For large datasets this could take some time. Once the imputations are complete, you should see a "Successful Imputation!" message appear where the progress bar was. You can click on this message to open the folder containing the imputed datasets.
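A rough command-line analogue of imputing and saving the results is sketched below with the africa data. The file stem "mydata" and the use of a temporary directory are assumptions made for the example; `write.amelia()` is the package function for writing imputed datasets to disk.

```r
library(Amelia)
data(africa)

# Run three imputations quietly (p2s = 0 suppresses screen output).
a.out <- amelia(africa, m = 3, p2s = 0, ts = "year", cs = "country")

# Write one CSV per imputed dataset, numbered after the file stem.
out_stem <- file.path(tempdir(), "mydata")
write.amelia(obj = a.out, file.stem = out_stem)

# List the files that were written.
list.files(tempdir(), pattern = "^mydata[0-9]+\\.csv$")
```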
If there was an error during the imputation, the output log will pop up and give you the error message along with some information about how to fix the problem. Once you have fixed the problem, simply click "Impute!" again. Even if there was no error, you may want to view the output log to see how Amelia ran. To do so, simply click the "Show Output Log" button. The log also shows the call to the amelia() function in R. You can use this code snippet to run the same imputation from the R command line. You will have to replace the x argument in the amelia() call with the name of your dataset in the R session.
Upon the successful completion of an imputation, the Diagnostics menu will become available. Here you can use all of the diagnostics available at the command line.
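For reference, the command-line diagnostics can be called roughly as follows. This is a sketch on the freetrade data; the choice of tariff as the variable to inspect is arbitrary, and each call draws a plot on the active graphics device.

```r
library(Amelia)
data(freetrade)

a.out <- amelia(freetrade, m = 3, p2s = 0, ts = "year", cs = "country")

# Map of which cells are missing in the data.
missmap(a.out)

# Density of observed values against density of imputed values.
compare.density(a.out, var = "tariff")

# Overimputation: treat observed values as missing and compare
# their imputations with the true values.
overimpute(a.out, var = "tariff")

# Overdispersed starting values: check that EM chains started from
# different points converge to the same place.
disperse(a.out, dims = 1, m = 5)
```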
It is often useful to save an AmeliaView session so that you do not have to repeat your setup if you need to impute the same data again. The Save Session button does just that, saving all of the current settings (including the original and any imputed data) to an RData file. You can then reload your session, on the same computer or any other, by clicking the Load Session button and finding the relevant RData file. All of the settings will be restored, including any completed imputations. Thus, if you save the session after imputing, you can always load those imputations and view their diagnostics later.
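At the command line, a similar effect can be approximated with base R's `save()` and `load()`. The sketch below is only an analogy: the file name is hypothetical, and the resulting file is not claimed to match the format that AmeliaView's Save Session button writes.

```r
library(Amelia)
data(africa)

a.out <- amelia(africa, m = 2, p2s = 0, ts = "year", cs = "country")

# Save the original data and the imputation output together.
session_file <- file.path(tempdir(), "africa-imputations.RData")
save(africa, a.out, file = session_file)

# Later, possibly on another machine: restore into a fresh environment.
restored <- new.env()
load(session_file, envir = restored)
summary(restored$a.out)
```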