Data

To showcase all of the functions in this package we will use four different datasets that have the attributes necessary to display the purpose of each function.

BinomFunc = function(theta, b) {dbinom(b, 20, prob = 1 / (1 + exp(- theta)), log = TRUE)}
BinomData <- scan(url("http://www.stat.umn.edu/geyer/3701/data/q1p7b.txt"))
head(BinomData,3)
GrowthData = read.csv("http://www.stat.umn.edu/geyer/3701/data/growth.csv")
GrowthData[1:3, 1:6]
b = load(url("http://www.stat.umn.edu/geyer/3701/data/q2p3.rda"))
MatrixData = a
head(MatrixData,3)
WeightedData = read.table(url("http://www.stat.umn.edu/geyer/3701/data/q1p4.txt"), header=TRUE)
head(WeightedData,3)
library(ggplot2)
library(dplyr)
library(NecklenJacobTools)

Functions

This package contains several different functions that cover a wide variety of processes including plotting, filtering, standardizing, and calculating values. Each function below will be described in detail with example(s) to show its possible use cases.

choosecol( )

ggwrapper( )

loglikelihood( )

myapply( )

standardCols( )

weighted( )

weightedGIEMO( )

choosecol(data, colName)

choosecol( ) allows the user to select a subset of columns in a data frame. This function takes two arguments, the data frame and a string that corresponds to the name of the column(s) you want to select for.

head(choosecol(GrowthData, "HT10"), 3)
head(choosecol(GrowthData, "HT10.5"), 3)

(Notice that the first function call returns two columns from the data frame because both colNames contain the string "HT10")

ggwrapper(data, X, Y)

ggwrapper( ) allows you to plot two variables from a dataset using a ggplot. This allows you to determine correlation between the dependent and independent variables. The function takes in the dataset first, then the x-var and then the y-var.

ggwrapper(GrowthData, GrowthData[,1], GrowthData[,15])

loglikelihood(data, func, int)

loglikelihood( ) can be used to calculate the maxmimum likelihood estimate given a specific dataset, function and interval (coincidentally these are the the 3 arguments to the function). When the function is provided the correct input, its output will be a single number, the MLE.

loglikelihood(BinomData,BinomFunc,c(-100,100))

myapply(matrix, margin, func)

myapply( ) takes in a 2-D array (matrix) and applies a function specified by you (i.e., mean, sd, etc.) to each row or each column. This is where the margin comes in: margin==1 means row-wise and margin==2 means column-wise.

myapply(MatrixData, 2, mean)

standardCols(matrix)

standardCols( ) standardizes any given matrix by replacing each element of the matrix by its z-score corresponding to the column mean and sd. Often times this type of standardization can be useful to help make the data more consistent across columns.

standardCols(MatrixData)

weighted(matrix)

weighted( ) calculates the weighted mean, variance, and standard deviation of a data vector based on a probability vector corresponding to each element. Each element is multiplied by its probability to determine its weight in the mean/var/sd.

weighted(WeightedData)

weightedGIEMO(matrix)

weightedGIEMO( ) is very similar to weighted( ), except that it does error checks to determine if the data vector is composed of only real numbers and whether the sum of the probability is 1. If any of these errors occur, error messages will be displayed to the user.

try(weightedGIEMO(list(x=rnorm(10), p=c(1:10))))
message(geterrmessage())
weightedGIEMO(WeightedData)

(Notice that the first method does not work because the probability vector does not sum to 1)



neckl004/NecklenJacobTools documentation built on May 25, 2019, 10:34 p.m.