IASD: Model Selection for Index of Asymmetry Distribution

IASD R Documentation

Model Selection for Index of Asymmetry Distribution

Description

Calculates the AIC and AICc of a unimodal model (one normal distribution) and a bimodal model (a mixture of two normal distributions) fitted to the distribution of indices of asymmetry (IAS), and plots their densities, to help determine whether the IAS distribution is unimodal or bimodal.

Usage

IASD(df, dfCols = NA, fixSignApproximation = FALSE, 
	plotGraph = TRUE, plotToScreen = FALSE, filePrefix = NA, 
	xlimMin = NA, xlimMax = NA, ylimMin = 0, ylimMax = NA, 
	dHist = NA, dFunc = NA, meanStartSymmetric = NA, 
	sdStartSymmetric = NA, meanStartAsymmetric = NA, 
	sdStartAsymmetric = NA, positiveRatioStartAsymmetric = NA, 
	plotSelect = rep(TRUE, 4), showLegend = TRUE, 
	modelName = c("FA", "DA", "AS", "Skewed AS"), xlab = NA, 
	ylab = NA, main = NA, freqAxis = FALSE, lineColor = "black", 
	nsmall = 2, fileType = "TEXT", generateFiles = TRUE, ...)

Arguments

df

data frame containing the data to be investigated.

dfCols

Columns in df to be processed. If NA, the second through last columns are used for a multi-column data frame, and the sole column is used for a single-column data frame.

fixSignApproximation

If TRUE, the parameters of the normal distributions are determined from the absolute values of the data only, rather than by maximum likelihood estimation (MLE). Each column can be controlled separately by supplying a vector.

plotGraph

If TRUE, histograms and density plots are plotted and saved to PDF; if FALSE, they are not plotted.

plotToScreen

If TRUE, plotted graphs are also shown on the screen.

filePrefix

File names of the saved plots and of the AIC and AICc table files start with this value.

xlimMin

Minimum of the plot range. If NA, it is determined from the data. Each column can be controlled separately by supplying a vector.

xlimMax

Maximum of the plot range. If NA, it is determined from the data. Each column can be controlled separately by supplying a vector.

dHist

Width of the histogram bars. If NA, it is 1/20 of the plot range. Each column can be controlled separately by supplying a vector.

dFunc

Step size of the polyline used to plot the density functions. If NA, it is 1/200 of the plot range. Each column can be controlled separately by supplying a vector.

ylimMin

Minimum of the vertical axis of the plots. If NA, it is determined by R's hist() function. Each column can be controlled separately by supplying a vector.

ylimMax

Maximum of the vertical axis of the plots. If NA, it is determined by R's hist() function. Each column can be controlled separately by supplying a vector.

meanStartSymmetric

Start value of mean for mle() in the bimodal symmetric model. If NA, it is calculated from the absolute values of the data. Each column can be controlled separately by supplying a vector.

sdStartSymmetric

Start value of sd for mle() in the bimodal symmetric model. If NA, it is calculated from the absolute values of the data. Each column can be controlled separately by supplying a vector.

meanStartAsymmetric

Start value of mean for mle() in the bimodal asymmetric model. If NA, it is calculated from the absolute values of the data. Each column can be controlled separately by supplying a vector.

sdStartAsymmetric

Start value of sd for mle() in the bimodal asymmetric model. If NA, it is calculated from the absolute values of the data. Each column can be controlled separately by supplying a vector.

positiveRatioStartAsymmetric

Start value of positiveRatio for mle() in the bimodal asymmetric model. If NA, it is the proportion of positive values in the data. Each column can be controlled separately by supplying a vector.

plotSelect

Logical vector indicating, for each of the four models, whether its density graph is plotted.

showLegend

If TRUE, a legend is drawn on the graph.

modelName

Names of the four models.

xlab

Label of the x axis. If NA, the name of the column is used. Each column can be controlled separately by supplying a vector.

ylab

Label of the y axis. If NA, "Density" is used. Each column can be controlled separately by supplying a vector.

main

Title of the graph. If NA, "Histogram of (column name)" is used. Each column can be controlled separately by supplying a vector.

freqAxis

If TRUE, an axis for frequency is drawn on the right. Each column can be controlled separately by supplying a vector.

lineColor

Color of the density graphs. The four density graphs can be controlled separately by supplying a vector. If the first two colors are the same, the line pattern differs for each density graph.

nsmall

Number of digits to the right of the decimal point for AIC and AICc.

fileType

Type of the output files for the AIC and AICc results. If "TEXT", the output files are tab-separated text files. If "CSV", they are CSV files.

generateFiles

Do not use this option. If generateFiles is FALSE, no files are generated. This option exists only to satisfy CRAN's strict checks.

...

Other parameters are passed to the hist() function.

Details

Calculates the AIC and AICc for the following four models and plots their densities.

  1. unimodal symmetric distribution (normal distribution with mean = 0)
    N(0, sd^2)

  2. unimodal asymmetric distribution (normal distribution)
    N(mean, sd^2)

  3. bimodal symmetric distribution (mixture of two normal distributions whose means have opposite signs but the same absolute value, with equal weights)
    0.5*N(mean, sd^2) + 0.5*N(- mean, sd^2)

  4. bimodal asymmetric distribution (weighted mixture of two normal distributions whose means have opposite signs but the same absolute value)
    positiveRatio*N(mean, sd^2) + (1 - positiveRatio)*N(- mean, sd^2)
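
The four densities listed above can be written out directly with dnorm(). The following is a minimal sketch for illustration only: the function names are hypothetical and are not part of the IASD package, and the package's own density functions may be implemented differently.

```r
# Illustrative density functions for the four models.
# The function names are hypothetical; they are not part of the IASD package.
dUnimodalSymmetric <- function(x, sd) dnorm(x, mean = 0, sd = sd)

dUnimodalAsymmetric <- function(x, mean, sd) dnorm(x, mean = mean, sd = sd)

dBimodalSymmetric <- function(x, mean, sd) {
  0.5 * dnorm(x, mean, sd) + 0.5 * dnorm(x, -mean, sd)
}

dBimodalAsymmetric <- function(x, mean, sd, positiveRatio) {
  positiveRatio * dnorm(x, mean, sd) +
    (1 - positiveRatio) * dnorm(x, -mean, sd)
}

# Sanity check: a mixture density should integrate to approximately 1.
# A simple Riemann sum over a wide grid is enough for this purpose.
step <- 0.01
x <- seq(-40, 40, by = step)
area <- sum(dBimodalAsymmetric(x, mean = 10, sd = 2, positiveRatio = 0.4)) * step
```

Setting positiveRatio to 0.5 in the fourth function reproduces the third model, and setting mean to 0 in the third reproduces the first, which shows how the models nest.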

Tables of AIC and AICc are saved as tab-separated text files or CSV files, depending on the fileType argument. A plot of the histogram and the model densities is saved for each column.
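
For orientation, the sketch below computes AIC and AICc by hand for the two unimodal models, using the standard definitions AIC = 2k - 2 logLik and AICc = AIC + 2k(k + 1)/(n - k - 1), where k is the number of free parameters and n the sample size. It assumes closed-form maximum likelihood estimates and helper names (aic, aicc) chosen for this illustration; it is not the package's own code, and the bimodal models, which the package fits with mle(), are left out.

```r
# IAS column from the Examples section of this page.
ias <- c(8.3, 12.7, -12.7, -7.3, -8.1)
n <- length(ias)

# Standard definitions (helper names are hypothetical).
aic <- function(logLik, k) 2 * k - 2 * logLik
aicc <- function(logLik, k, n) aic(logLik, k) + 2 * k * (k + 1) / (n - k - 1)

# Model 1, N(0, sd^2): one free parameter; the MLE of sd is sqrt(mean(x^2)).
sd1 <- sqrt(mean(ias^2))
logLik1 <- sum(dnorm(ias, mean = 0, sd = sd1, log = TRUE))

# Model 2, N(mean, sd^2): two free parameters; the MLE of sd divides by n.
mean2 <- mean(ias)
sd2 <- sqrt(mean((ias - mean2)^2))
logLik2 <- sum(dnorm(ias, mean = mean2, sd = sd2, log = TRUE))

scores <- data.frame(
  model = c("unimodal symmetric", "unimodal asymmetric"),
  AIC = c(aic(logLik1, 1), aic(logLik2, 2)),
  AICc = c(aicc(logLik1, 1, n), aicc(logLik2, 2, n))
)
print(scores)
```

With only five observations the AICc correction is large relative to AIC, which is why both criteria are reported: the model with the smaller value is preferred, and for small samples AICc is usually the safer guide.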

If the start values for mle() (meanStartSymmetric, sdStartSymmetric, meanStartAsymmetric, sdStartAsymmetric, positiveRatioStartAsymmetric) are inappropriate, mle() does not work properly. If they are not assigned (NA), the start values of mean and sd are the mean and sd of the absolute values of the data, and the start value of positiveRatio is the proportion of positive values in the data.
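
The default start values described above can be reproduced by hand, which is useful when choosing explicit values for a column where the fit misbehaves. This is a sketch under one assumption: sd() (the sample standard deviation with an n - 1 denominator) is used here, and the package may compute the default differently.

```r
# IAS column from the Examples section of this page.
ias <- c(8.3, 12.7, -12.7, -7.3, -8.1)

meanStart <- mean(abs(ias))          # mean of the absolute values
sdStart <- sd(abs(ias))              # assumption: sample sd of the absolute values
positiveRatioStart <- mean(ias > 0)  # proportion of positive values

startValues <- list(mean = meanStart, sd = sdStart,
                    positiveRatio = positiveRatioStart)
str(startValues)
```

For this column the mean of the absolute values is 9.82 and two of the five values are positive, so the positiveRatio start value is 0.4. Values like these could be passed as meanStartAsymmetric, sdStartAsymmetric and positiveRatioStartAsymmetric.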

Value

AIC

AIC (Akaike's information criterion)

AICc

AICc (AIC with a correction for finite sample sizes)

modelName[1]

list for the unimodal symmetric model

modelName[2]

list for the unimodal asymmetric model

modelName[3]

list for the bimodal symmetric model

modelName[4]

list for the bimodal asymmetric model

mean

estimated value of mean

sd

estimated value of sd

positiveRatio

estimated value of positiveRatio

f

density function

Author(s)

Satoshi Takahashi

Examples

	df = data.frame(ID = c(1:5), IAS = c(8.3, 12.7, -12.7, -7.3, -8.1),
	   IAS2 = c(14.2, 8.8, -12.7, -8.6, -10.5),
	   IAS3 = c(1.04, 1.28, -0.78, -0.84, -0.85))
	# In your own code, omit 'generateFiles = FALSE' from the following IASD commands.
	result = IASD(df, generateFiles = FALSE) # calculate AIC and AICc
	result = IASD(df, dfCols = c(2,4), plotGraph = FALSE, generateFiles = FALSE) 
		# use data in the second and fourth columns, do not plot graphs
	result = IASD(df, filePrefix="P.microlepis", xlimMin = -15, 
		xlimMax = 15, dHist = c(1, 1, 0.1), generateFiles = FALSE)  
		# file name of each plot starts with "P.microlepis"; the plot range 
		# and the width of the histogram bars are changed

IASD documentation built on Sept. 8, 2023, 5:43 p.m.