CleanSemAceDataset: Produces a cleaned dataset that works well with when using...

Description

This function takes a data.frame of subject pairs and a ‘GroupSummary’ data.frame (which is created by the RGroupSummary function), and returns a cleaned data.frame that is used by the Ace function.

Usage

CleanSemAceDataset(dsDirty, dsGroupSummary, oName_S1, oName_S2, rName = "R")

Arguments

dsDirty

This is the data.frame to be cleaned.

dsGroupSummary

The data.frame containing information about which groups should be included in the analyses. It should be created by the RGroupSummary function.

oName_S1

The name of the manifest variable (in dsDirty) for the first subject in each pair.

oName_S2

The name of the manifest variable (in dsDirty) for the second subject in each pair.

rName

The name of the variable (in dsDirty) indicating the pair's relatedness coefficient.

Details

The function takes dsDirty and produces a new data.frame with the following features:

[A] Only three existing columns are retained: O1, O2, and R. They are assigned these names.

[B] A new column called GroupID is created to reflect each pair's group membership (which is based on the R value). These values are sequential integers, starting at 1. The group with the weakest R is 1. The group with the strongest R has the largest GroupID (this is typically the MZ twins).

[C] Any row is excluded if it has a missing data point for O1, O2, or R.

[D] The data.frame is sorted by the R value. This sometimes makes it easier to program against a multiple-group SEM API.
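
Features [A] through [D] can be illustrated with a minimal base R sketch on a toy data.frame. This is not the package's implementation. The toy data and the Height_S1 and Height_S2 column names are made up for illustration. The sketch also ignores dsGroupSummary entirely, and simply derives GroupID from the R values that remain after incomplete rows are dropped, which is an assumption about the order of operations.

```r
# Toy pair-level data; the column names and values are hypothetical.
dsDirty <- data.frame(
  Height_S1 = c(60, 62, NA, 70, 65, 68),
  Height_S2 = c(61, 64, 66, 71, NA, 67),
  R         = c(0.5, 0.25, 0.5, 1.0, 0.25, 0.5)
)
oName_S1 <- "Height_S1"
oName_S2 <- "Height_S2"
rName    <- "R"

# [A] Retain only the three columns, and assign the standard names.
ds <- dsDirty[, c(rName, oName_S1, oName_S2)]
colnames(ds) <- c("R", "O1", "O2")

# [C] Exclude any row with a missing value for O1, O2, or R.
ds <- ds[stats::complete.cases(ds), ]

# [B] Sequential integer GroupID; the weakest R is assigned 1.
# (Assumption: groups are defined by the distinct R values still present.)
ds$GroupID <- match(ds$R, sort(unique(ds$R)))

# [D] Sort by the R value.
ds <- ds[order(ds$R), ]
rownames(ds) <- NULL

ds
```

In this toy example the two rows containing an NA are dropped, and the remaining pairs with R values of 0.25, 0.5, and 1.0 fall into three groups. In real use, CleanSemAceDataset relies on dsGroupSummary to determine which groups are included, so its output can differ from this sketch.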

Value

A data.frame with one row per subject pair. The data.frame contains the following variables (which can NOT be changed by the user through optional parameters):

R

The pair's R value.

O1

The outcome variable for the first subject in each pair.

O2

The outcome variable for the second subject in each pair.

GroupID

Indicates the pair's group membership.

Author(s)

Will Beasley

Examples

library(NlsyLinks) #Load the package into the current R session.
dsLinks <- Links79PairExpanded #Start with the built-in data.frame in NlsyLinks
dsLinks <- dsLinks[dsLinks$RelationshipPath=='Gen2Siblings', ] #Use only NLSY79-C siblings

oName_S1 <- "MathStandardized_S1" #Stands for Outcome1
oName_S2 <- "MathStandardized_S2" #Stands for Outcome2
dsGroupSummary <- RGroupSummary(dsLinks, oName_S1, oName_S2)

dsClean <- CleanSemAceDataset( dsDirty=dsLinks, dsGroupSummary, oName_S1, oName_S2, rName="R" )
summary(dsClean)

dsClean$AbsDifference <- abs(dsClean$O1 - dsClean$O2)
plot(jitter(dsClean$R), dsClean$AbsDifference, col="gray70")

NlsyLinks documentation built on May 2, 2019, 4:36 p.m.