In DataScienceSalon/xmar: Extra-Marital Sex: Attitudes and Behaviors

library(dplyr)  # supplies %>%, select() and filter() used in later chunks
options(knitr.table.format = "html")
options(max.print = 75, scipen = 999, width = 800)
knitr::opts_chunk$set(echo=FALSE,
                 cache=FALSE,
               prompt=FALSE,
               tidy=TRUE,
               root.dir = "..",
               fig.height = 8,
               fig.width = 20,
               comment=NA,
               message=FALSE,
               warning=FALSE)
knitr::opts_knit$set(width = 100, figr.prefix = TRUE, figr.link = TRUE)
knitr::knit_hooks$set(inline = function(x) {
  prettyNum(x, big.mark=",")
})

load(file = "../data/GSS.Rdata")

xmar <- preprocess(GSS)

eda <- univariate(xmar$univariate)

Abstract

Introduction

There is no dearth of research on Americans' changing attitudes towards marriage, its primacy as a way of life, and its exclusivity. Approximately four-in-ten Americans say that the present institution of marriage is becoming obsolete [@Taylor2010]. According to the same 2010 Pew Research study, 72% of all adults in America were married in 1960. By 2008, this percentage had dropped to 52%. Alongside this apparent decline in marriage, Americans are slowly becoming more accepting of open relationships, consensual non-monogamy (CNM) practices, and the like. Nearly half of Americans would consider an open relationship, though fewer than 4% actually claim to be in one [@Avvo]. Nevertheless, nearly 22% of married men and 14% of married women admit to having had an affair at least once during their marriages, with or without spousal consent [@Johnson2017].

The purpose of this study is to examine the nature and evolution of attitudes and opinions with respect to (w.r.t.) extra-marital conduct. The data source comprises the opinions of over 65,000 Americans (44% women, 56% men) collected over a period of 44 years (1972-2016), courtesy of the General Social Survey (GSS) [@NORCa], a project of NORC at the University of Chicago that monitors societal change and studies the growing complexity of American society. Concretely, this report contrasts attitudes and opinions by various demographic factors over two periods: the years prior to 2000 and the years since 2000. Understanding the extent to which such attitudes have changed by gender, age, education, and geographic region is the primary aim of this investigation.

Research Questions

This observational study examines the nature of opinion w.r.t. extra-marital conduct, and the degree to which such opinion has or has not evolved since the year 2000, via four research questions.

Research Question 1: To what degree have opinions changed since the year 2000 by age group?

* Dependent Variable: Opinion w.r.t. extra-marital conduct
* Explanatory Variable: Age group
* Controlling Variable: Period, i.e., years prior to and since 2000

Research Question 2: To what degree have opinions changed since the year 2000 by gender?

* Dependent Variable: Opinion w.r.t. extra-marital conduct
* Explanatory Variable: Gender
* Controlling Variable: Period, i.e., years prior to and since 2000

Research Question 3: To what degree have opinions changed since the year 2000 by highest education level achieved?

* Dependent Variable: Opinion w.r.t. extra-marital conduct
* Explanatory Variable: Highest education level achieved
* Controlling Variable: Period, i.e., years prior to and since 2000

Research Question 4: To what degree have opinions changed since the year 2000 by region?

* Dependent Variable: Opinion w.r.t. extra-marital conduct
* Explanatory Variable: Region
* Controlling Variable: Period, i.e., years prior to and since 2000

Document Organization

The methods section describes the data, the sampling techniques, and the data preprocessing, and introduces the descriptive and inferential techniques used in the analyses. Hypotheses are stated, analyzed, and tested in the results section. The discussion section describes the significance of the findings vis-a-vis currently available research. Lastly, the conclusion synthesizes the key points.

Methods

Data

The General Social Survey (GSS), a project of NORC at the University of Chicago, provided the data upon which this report is based. Formerly known as the National Opinion Research Center, "NORC at the University of Chicago is an objective non-partisan research institution that delivers reliable data and rigorous analysis to guide critical programmatic, business, and policy decisions" [@NORC]. Principally funded by the National Science Foundation, the GSS has been monitoring the evolution and complexity of the opinions, behaviors, and attributes of American society since 1972. Targeting the adult population, age 18 and over, in the United States, the data cover a diverse range of issues including national spending priorities, marijuana use, crime and punishment, race relations, quality of life, confidence in institutions, and sexual behavior.

Data Sampling Strategy

The GSS employed a multi-stage area full probability sampling strategy.

At the first stage, NORC employed the Standard Metropolitan Statistical Areas (SMSAs) or non-metropolitan counties selected in NORC's Master Sample as the Primary Sampling Units (PSUs). These were further subdivided into three categories. Category 1, representing 56% of the US population, comprised the nation's largest metropolitan areas or Consolidated Statistical Areas (CSAs) with a population of at least 1,543,728 (0.5 percent of the 2010 Census U.S. population). The second category, covering 35% of the US population, included Core Based Statistical Areas and CSAs with at least 8 tracts that are predominantly street-style addresses. The third category spanned 9% of the US population and consisted of small counties with at most 8 tracts. The minimum size for tracts was defined as 300 respondents.

Similarly, the second-stage sampling was conducted in three categories. Category 1 comprised 216 type A tracts (city addresses) and 8 block-groups within type B (rural) tracts. For Category 2, the selection of 8 segments per first-stage selection resulted in 480 segments in the 2010 National Sample Design, but GSS used only 120 of them. For Category 3 first-stage selections, the 2010 National Sample Design selected only 5 segments per first-stage selection, but GSS used 4 and one half of them for a total of 56 segments.

The sampling strategy employed by GSS was designed to give each household an equal probability of being included in the sample. For each household selected, sampling procedures were undertaken to ensure that each individual in that household had an equal probability of being interviewed.

Data Sampling Bias

The GSS samples closely resemble distributions reported in the Census and other authoritative sources. However, survey non-response, sampling variation, and various other factors have introduced potential sources of bias and resulted in variance from Census distributions on some variables. For instance, all full-probability samples under-represent males, and block quota samples under-represent men in full-time employment. Weights were designed (and should be employed when using GSS 2004 or later) to assure proper representation of non-response sub-samples and to account for other factors. For a full discussion of distributional variation due to non-response, refer to the GSS Methodological Reports in the help and resources section at https://gssdataexplorer.norc.org/.

This full probability design was acknowledged by the National Science Foundation to be superior to simple random sampling, and generalizable to the population of adult citizens in America. Consisting of over 60,000 individual responses spanning over 40 years, the GSS data remain one of the nation's treasures, providing data for inference about American society to researchers, academics, students, politicians, and opinion makers. Yet, as respondents were randomly selected (not randomly assigned), inferences must be limited to association, not causation. That said, the extent to which the results for a research question can be generalized to the population requires that the inference conditions are met, specifically the success/failure condition for proportions. As such, the question of generalizability is re-addressed independently for each research question.

Data Variables

The following table lists the variables extracted from the GSS data.

r kfigr::figr(label = "variableTbl", prefix = TRUE, link = TRUE, type="Table"): GSS Variables

variables <- read.csv("../data/variables.csv")
knitr::kable(variables) %>%  
  kableExtra::kable_styling(bootstrap_options = c("hover", "condensed", "responsive"), full_width = T, position = "center")

Data Preprocessing

The GSS variables were extracted for years 1972 through 2016 and the following variables were created.

* Period: Two-level factor variable
    * Prior to 2000: Period from 1972 through 1999 inclusive
    * Since 2000: Period from 2000 through 2016
* Age Group: Four-level factor variable
    * 18-24
    * 25-44
    * 45-64
    * 65+
* Opinion: Two-level factor variable
    * Traditional: Always wrong and almost always wrong
    * Non-Traditional: Sometimes wrong and not wrong at all
* Region: Five-level factor variable
    * Northeast: Responses from the New England and Middle Atlantic regions
    * MidWest: Responses from the East North Central and West North Central regions
    * South: Responses from the South Atlantic, East South Central, and West South Central regions
    * Mountain
    * Pacific
* Education: Five-level factor variable
    * High School: Education years 0 through 12
    * Community College: Education years 13 through 14
    * UnderGraduate: Education years 15 through 16
    * Graduate: Education years 17 through 18
    * Post-Graduate: Education years 19 through 20

Responses with "NA", "Don't Know", and other non-response values for the GSS variables in scope were filtered from the data set.
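The recoding described above can be sketched with base R's `cut()` and `ifelse()`. This is a minimal illustration, not the `preprocess()` implementation used for this report; the column names (`year`, `age`, `educ`, `xmarsex`), the response labels, and the toy values are assumptions made for the example.

```r
# Toy data; column names and response labels are assumed for illustration only
gss <- data.frame(
  year    = c(1985, 1999, 2000, 2016),
  age     = c(18, 44, 45, 70),
  educ    = c(12, 14, 16, 20),
  xmarsex = c("Always wrong", "Almost always wrong",
              "Sometimes wrong", "Not wrong at all")
)

# Period: split at the year 2000
gss$Period <- factor(ifelse(gss$year < 2000, "Prior to 2000", "Since 2000"),
                     levels = c("Prior to 2000", "Since 2000"))

# Age Group: intervals are closed on the right, so (17, 24] holds ages 18-24
gss$AgeGroup <- cut(gss$age, breaks = c(17, 24, 44, 64, Inf),
                    labels = c("18-24", "25-44", "45-64", "65+"))

# Education: years of schooling mapped to the five levels listed above
gss$Education <- cut(gss$educ, breaks = c(-Inf, 12, 14, 16, 18, 20),
                     labels = c("High School", "Community College",
                                "UnderGraduate", "Graduate", "Post-Graduate"))

# Opinion: collapse the four responses into two levels
gss$Opinion <- factor(ifelse(gss$xmarsex %in% c("Always wrong", "Almost always wrong"),
                             "Traditional", "Non-Traditional"),
                      levels = c("Traditional", "Non-Traditional"))
```

Setting the factor levels explicitly keeps the level order stable, which matters later when a statistic is read by position from a summary table.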

Data Analysis

The data analysis outlined in the results section begins with a univariate exploratory data analysis of each study variable. Frequencies and proportions of responses for each categorical value are presented as well as confidence intervals for the population proportions at each categorical variable level.
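As an illustration of the univariate summary described above, the sketch below builds a table of counts, proportions, cumulative proportions, and 95% confidence intervals for one categorical variable. It is not the `univariate()` function used in this report; the toy data are invented, and the normal-approximation (Wald) interval is an assumption about how the intervals might be computed.

```r
# Toy factor; the values are invented for illustration
opinion <- factor(c(rep("Traditional", 90), rep("Non-Traditional", 30)),
                  levels = c("Traditional", "Non-Traditional"))

n <- as.vector(table(opinion))          # counts per level, in level order
p <- n / sum(n)                         # sample proportions
z_star <- qnorm(0.975)                  # critical value for a 95% interval
se <- sqrt(p * (1 - p) / sum(n))        # standard error of each proportion

stats <- data.frame(Level = levels(opinion),
                    N = n,
                    Proportion = p,
                    Cumulative = cumsum(p),
                    Lower = p - z_star * se,
                    Upper = p + z_star * se)
stats
```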

To ensure that the observed sample proportions were within a 5% margin of error (the half-width of the confidence interval) of the actual population proportions, minimum required sample sizes were computed for each variable at each level. The minimum sample size $N_{min}$ for level $l$ of a categorical variable was computed as follows:

$$N_{min} = \frac{p_l(1-p_l)}{\left(\frac{me}{z^*}\right)^2}$$
where:
$p_l$ is the sample proportion for the categorical variable at level $l$
$me$ is the margin of error
$z^*$ is the critical value on the z-distribution for the designated confidence level
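The minimum sample size formula can be evaluated directly in base R. The function below is a minimal sketch; the name `min_sample_size` and its default arguments are hypothetical and are not taken from the package behind this report.

```r
# Minimum sample size so that the margin of error does not exceed `me`
min_sample_size <- function(p, me = 0.05, conf = 0.95) {
  z_star <- qnorm(1 - (1 - conf) / 2)   # two-sided critical value, about 1.96 at 95%
  ceiling(p * (1 - p) / (me / z_star)^2)
}

min_sample_size(0.5)   # worst case, since p(1 - p) peaks at p = 0.5; gives 385
```

Because $p(1-p)$ is largest at $p = 0.5$, any level with at least 385 responses satisfies the 5% margin of error at the 95% confidence level regardless of its observed proportion.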

Having established the univariate descriptive and inferential statistics, each research question was explored using the following hypothesis testing approach:
1. State Hypotheses
2. Check Parametric Inference Conditions
3. Select appropriate statistical method / test statistic
4. Compute the p-value, the probability of observing data at least as extreme as the sample if the null hypothesis were true
5. Interpret and report results.

State Hypotheses

The research questions were aimed at discovering the extent to which opinions have changed over time. As such, the two-sided hypotheses in this study take the following form:
$H_0$: $p_1 - p_2 = 0$
$H_a$: $p_1 - p_2 \neq 0$

where:
$p_1$ is a proportion of interest for years prior to 2000
$p_2$ is a proportion of interest for years since 2000

Inference Conditions

Statistical inferences were derived from parametric statistics and the central limit theorem. The former assumes that the sample data come from a population distribution with a fixed set of parameters. The latter implies that the sampling distribution of the sample proportion $\hat{p}$ is approximately normal, $N(\mu, \sigma^2)$, where $\mu$ is the population proportion $p$ and $\sigma^2 = p(1-p)/n$ is the variance of the sample proportion. This normal approximation can be assumed if the following conditions hold:

Independence: The sample observations must be independent. Since all samples were obtained through random sampling, independence was assumed for all statistical tests.
Success/Failure: For one-proportion inference, there must be a minimum of ten successes and ten failures for dichotomous variables. For comparison of two proportions, there must be a minimum of five counts in each cell of the expected frequency contingency table. Concretely, there must be at least five counts for each response/explanatory/controlling variable combination.
Sample Size: The sample size must be less than 10% of the population if drawn without replacement. According to the 2010 census, the population of adults aged 18 and older exceeded 200 million. Since the GSS data contain fewer than 2 million observations, well under the 10% threshold of 20 million, this requirement was assumed to be met for all statistical tests.
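The success/failure check for two proportions can be sketched as follows. The observed counts are invented for illustration, and the expected counts are computed under independence as row total times column total divided by the grand total.

```r
# Hypothetical 2 x 2 table of observed counts (values invented for illustration)
observed <- matrix(c(620, 180,
                     540, 260),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Period  = c("Prior to 2000", "Since 2000"),
                                   Opinion = c("Traditional", "Non-Traditional")))

# Expected count for each cell under independence: row total * column total / n
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

all(expected >= 5)   # TRUE when every expected cell count is at least five
```

The same expected counts are returned in the `expected` component of `chisq.test(observed)`.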

Statistical Methods

Two types of statistical methods were used to evaluate the hypotheses:
1. Two-proportion z-tests for hypotheses w.r.t. the difference in two proportions.
2. Two-proportion z confidence intervals for the difference in two proportions.

Two proportion z-test for hypothesis tests

To evaluate hypotheses w.r.t. the difference in two proportions, z-tests were conducted using the pooled proportions for the standard error calculation. Pooled proportions, $\hat{p}$, were calculated as follows:

$$\hat{p} = \frac{\hat{p_1}n_1 + \hat{p_2}n_2}{n_1 + n_2}$$
where:
$n_1$ is the number of responses (sample size) for the group of interest prior to the year 2000
$n_2$ is the number of responses (sample size) for the group of interest since the year 2000
$\hat{p_1}$ is the proportion of responses of interest prior to the year 2000
$\hat{p_2}$ is the proportion of responses of interest since the year 2000

Given the pooled proportion, the pooled standard error, $SE_{pooled}$, for the difference in proportions was calculated as follows:
$$SE_{pooled} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n_1} + \frac{\hat{p}(1-\hat{p})}{n_2}}$$
where:
$\hat{p}$ is the pooled proportion calculated above
$n_1$ is the number of responses (sample size) for the group of interest prior to the year 2000
$n_2$ is the number of responses (sample size) for the group of interest since the year 2000

Lastly, the z-statistic under the null hypothesis of equal proportions was calculated as follows:
$$Z = \frac{(\hat{p_1} - \hat{p_2}) - 0}{SE_{pooled}}$$
The z-statistic was compared to a standard normal distribution, $N(\mu = 0, \sigma^2 = 1)$, to obtain a p-value. The significance level for all statistical tests was $\alpha = 0.05$ (a 95% level of confidence). If the p-value was less than or equal to $\alpha$, the null hypothesis was rejected. Otherwise, the null hypothesis was not rejected. The two-sided p-value is the area under the Gaussian density in both tails beyond $|Z|$ and can be readily obtained from most statistical software packages. The Gaussian density is:
$$f(x;\mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
where:
$x$ = value of the random variable
$\mu$ = mean of the random variable
$\sigma$ = standard deviation of the random variable
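The pooled test can be sketched in base R as follows, using `pnorm()` to obtain the two-sided tail area. The function name `two_prop_ztest` and the counts are hypothetical; this is an illustration under those assumptions, not the `analyze()` function used in this report.

```r
# Pooled two-proportion z-test; x = counts of the response of interest, n = sample sizes
two_prop_ztest <- function(x1, n1, x2, n2, alpha = 0.05) {
  p1 <- x1 / n1
  p2 <- x2 / n2
  p_pool <- (x1 + x2) / (n1 + n2)       # same as (p1 * n1 + p2 * n2) / (n1 + n2)
  se_pool <- sqrt(p_pool * (1 - p_pool) / n1 + p_pool * (1 - p_pool) / n2)
  z <- (p1 - p2) / se_pool
  p_value <- 2 * pnorm(-abs(z))         # area in both tails beyond |z|
  list(z = z, p_value = p_value, reject = p_value <= alpha)
}

# Hypothetical counts, invented for illustration
res <- two_prop_ztest(x1 = 620, n1 = 800, x2 = 540, n2 = 800)
res$z^2   # equals the X-squared statistic of prop.test(..., correct = FALSE)
```

As a cross-check, `prop.test(c(620, 540), c(800, 800), correct = FALSE)` reports a chi-squared statistic equal to the square of this z-statistic and the same p-value.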

Two proportion z-test for confidence intervals

The 95% confidence intervals for the differences in two proportions were computed from the z-distribution as follows:
$$C = (\hat{p_1} - \hat{p_2}) \pm Z^* \cdot SE$$
where:
$\hat{p_1}$ is the proportion of responses of interest prior to the year 2000
$\hat{p_2}$ is the proportion of responses of interest since the year 2000
$Z^*$ is the $z_{\alpha/2}$ critical value of the standard normal distribution for a two-sided 95% confidence level, $\approx 1.96$

The standard error $SE$ is computed as follows:
$$SE= \sqrt{\frac{\hat{p_1}(1-\hat{p_1})}{n_1} + \frac{\hat{p_2}(1-\hat{p_2})}{n_2}}$$
where:
$n_1$ is the number of responses (sample size) for the group of interest prior to the year 2000
$n_2$ is the number of responses (sample size) for the group of interest since the year 2000
$\hat{p_1}$ is the proportion of responses of interest prior to the year 2000
$\hat{p_2}$ is the proportion of responses of interest since the year 2000
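A minimal sketch of the interval calculation follows. Unlike the hypothesis test, it uses the unpooled standard error. The function name `two_prop_ci` and the counts are hypothetical.

```r
# Confidence interval for the difference in two proportions (unpooled SE)
two_prop_ci <- function(x1, n1, x2, n2, conf = 0.95) {
  p1 <- x1 / n1
  p2 <- x2 / n2
  se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
  z_star <- qnorm(1 - (1 - conf) / 2)   # about 1.96 at the 95% level
  (p1 - p2) + c(lower = -1, upper = 1) * z_star * se
}

# Hypothetical counts, invented for illustration
ci <- two_prop_ci(620, 800, 540, 800)
ci
```

An interval that excludes zero leads to the same decision as rejecting the null hypothesis of equal proportions at the corresponding significance level.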

Interpret and report results

Both hypothesis testing (p-values) and confidence intervals were used to interpret the difference in proportions. For the hypothesis tests, the null hypothesis was evaluated as follows:
p-value $\leq \alpha$: the null hypothesis is rejected in favor of the alternative hypothesis.
p-value $> \alpha$: the null hypothesis is not rejected.

Confidence intervals for the difference in proportions were also computed. If the confidence interval did not include zero difference, the null hypothesis was rejected.

Relative risk was also computed to characterize the direction and size of the difference of proportions.
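Relative risk is the ratio of two proportions. The sketch below assumes the ratio is taken as the earlier period over the later period, which is an assumption made for the example since the direction is not stated here; the function name and counts are hypothetical.

```r
# Relative risk: proportion in period 1 divided by proportion in period 2
relative_risk <- function(x1, n1, x2, n2) {
  (x1 / n1) / (x2 / n2)
}

# Hypothetical counts, invented for illustration
rr <- relative_risk(620, 800, 540, 800)
rr   # a value above 1 means the response was more common in period 1
```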

System and Environment

This analysis was implemented using the 64-bit version of the R programming language, version 3.4.1 [@TheRFoundation2015], within the RStudio version 1.1.330 [@RStudioTeam2016] development environment on a Windows x64-based laptop powered by an Intel Core i7-3610QM CPU @ 2.30GHz, 2301 MHz processor with 4 cores, 8 logical processors, and 16.0 GB of installed memory, running the Microsoft Windows 10 Home operating system, version 10.0.14393 Build 14393. Report writing and generation packages included knitr [@Xie2013] and kfigr [@Koohafkan2015]. Data management functionality was provided by the dplyr [@Wickham2015a], reshape2 [@Rcpp2016], xtable [@Dahl2016], and data.table [@Dowle2016] packages. Graphics and data visualization were powered by the ggplot2 [@Wickham2016], gridExtra [@BaptisteAuguie2016], and stargazer [@Hlavac2015] packages. The inference [@VinhNguyen2017] statistical package provided functionality for extracting inferential values from a fitted model.

Results

As a preliminary step, a univariate exploratory data analysis was undertaken for each variable in the study as described above.

Exploratory Data Analysis

Using tables and barplots, each study variable was examined in terms of:
* N: The counts for each level of the categorical variable
* Minimum N: The minimum sample size required to ensure that the sample proportions were within a 5% margin of error of the population proportions
* Proportion: The proportional responses at each level of each categorical variable
* Cumulative Proportions: The cumulative proportions at each level of each categorical variable
* Confidence Interval: The 95% confidence interval for the population proportion at each categorical level

Furthermore, the conditions for inference were checked in order to characterize the degree to which the samples were representative of the population proportions.

Opinion

The following table provides the descriptive statistics for the Opinion variable.

r kfigr::figr(label = "edaOpinionStats", prefix = TRUE, link = TRUE, type="Table"): Descriptive Statistics for Opinion Variable

knitr::kable(eda$opinion$stats, align = c("l", rep("c", 5))) %>%  
  kableExtra::kable_styling(bootstrap_options = c("hover", "condensed", "responsive"), full_width = T, position = "center")

As shown in r kfigr::figr(label = "edaOpinionStats", prefix = TRUE, link = TRUE, type="Table") and graphically depicted in r kfigr::figr(label = "edaOpinionPlot", prefix = TRUE, link = TRUE, type="Figure"), there were a total of r sum(eda$opinion$stats$N) observations, with r eda$opinion$stats$N[1] traditional and r eda$opinion$stats$N[2] non-traditional opinions. Of all respondents interviewed since 1972, r round(eda$opinion$stats$Cumulative[1] * 100, 0)% considered extra-marital conduct to be always wrong or almost always wrong.

eda$opinion$plot

r kfigr::figr(label = "edaOpinionPlot", prefix = TRUE, link = TRUE, type="Figure"): Frequency and Proportions for Opinion Variable

Period

The following table provides the descriptive statistics for the period variable.

r kfigr::figr(label = "edaPeriodStats", prefix = TRUE, link = TRUE, type="Table"): Descriptive Statistics for Period Variable

knitr::kable(eda$period$stats, align = c("l", rep("c", 5))) %>%  
  kableExtra::kable_styling(bootstrap_options = c("hover", "condensed", "responsive"), full_width = T, position = "center")

As shown in r kfigr::figr(label = "edaPeriodStats", prefix = TRUE, link = TRUE, type="Table") and graphically depicted in r kfigr::figr(label = "edaPeriodPlot", prefix = TRUE, link = TRUE, type="Figure"), there were a total of r sum(eda$period$stats$N) observations, with r eda$period$stats$N[1] observations for the years prior to 2000 and r eda$period$stats$N[2] for the years since 2000. Over r round(eda$period$stats$Proportion[1] * 100, -1)% of responses were obtained prior to the year 2000.

eda$period$plot

r kfigr::figr(label = "edaPeriodPlot", prefix = TRUE, link = TRUE, type="Figure"): Frequency and Proportions for Period Variable

Age Group

The following table provides the descriptive statistics for the Age Group variable.

r kfigr::figr(label = "edaAgeStats", prefix = TRUE, link = TRUE, type="Table"): Descriptive Statistics for Age Group Variable

knitr::kable(eda$age$stats, align = c("l", rep("c", 5))) %>%  
  kableExtra::kable_styling(bootstrap_options = c("hover", "condensed", "responsive"), full_width = T, position = "center")

As shown in r kfigr::figr(label = "edaAgeStats", prefix = TRUE, link = TRUE, type="Table") and graphically depicted in r kfigr::figr(label = "edaAgePlot", prefix = TRUE, link = TRUE, type="Figure"), there were a total of r sum(eda$age$stats$N) observations. Over r round(eda$age$stats$Cumulative[2] * 100, -1)% of responses were obtained from respondents below 45 years of age.

eda$age$plot

r kfigr::figr(label = "edaAgePlot", prefix = TRUE, link = TRUE, type="Figure"): Frequency and Proportions for Age Group Variable

Gender

The following table provides the descriptive statistics for the Gender variable.

r kfigr::figr(label = "edaGenderStats", prefix = TRUE, link = TRUE, type="Table"): Descriptive Statistics for Gender Variable

knitr::kable(eda$gender$stats, align = c("l", rep("c", 5))) %>%  
  kableExtra::kable_styling(bootstrap_options = c("hover", "condensed", "responsive"), full_width = T, position = "center")

As shown in r kfigr::figr(label = "edaGenderStats", prefix = TRUE, link = TRUE, type="Table") and graphically depicted in r kfigr::figr(label = "edaGenderPlot", prefix = TRUE, link = TRUE, type="Figure"), there were a total of r sum(eda$gender$stats$N) observations. The percentages of male and female respondents were r round(eda$gender$stats$Proportion[1] * 100, 0)% and r round(eda$gender$stats$Proportion[2] * 100, 0)%, respectively.

eda$gender$plot

r kfigr::figr(label = "edaGenderPlot", prefix = TRUE, link = TRUE, type="Figure"): Frequency and Proportions for Gender Variable

Education

The following table provides the descriptive statistics for the Education variable.

r kfigr::figr(label = "edaEducationStats", prefix = TRUE, link = TRUE, type="Table"): Descriptive Statistics for Education Variable

knitr::kable(eda$educ$stats, align = c("l", rep("c", 5))) %>%  
  kableExtra::kable_styling(bootstrap_options = c("hover", "condensed", "responsive"), full_width = T, position = "center")

As shown in r kfigr::figr(label = "edaEducationStats", prefix = TRUE, link = TRUE, type="Table") and graphically depicted in r kfigr::figr(label = "edaEducationPlot", prefix = TRUE, link = TRUE, type="Figure"), there were a total of r sum(eda$educ$stats$N) observations. Respondents with a high school education or less comprised over r round(eda$educ$stats$Cumulative[1] * 100, -1)% of respondents. Respondents with at most some college made up approximately r round(eda$educ$stats$Cumulative[2] * 100, -1)% of the sample. Those with undergraduate, graduate, and post-graduate degrees represented r round(eda$educ$stats$Proportion[3] * 100, 0)%, r round(eda$educ$stats$Proportion[4] * 100, 0)%, and r round(eda$educ$stats$Proportion[5] * 100, 0)% of the sample, respectively.

eda$educ$plot

r kfigr::figr(label = "edaEducationPlot", prefix = TRUE, link = TRUE, type="Figure"): Frequency and Proportions for Education Variable

Region

The following table provides the descriptive statistics for the Region variable.

r kfigr::figr(label = "edaRegionStats", prefix = TRUE, link = TRUE, type="Table"): Descriptive Statistics for Region Variable

knitr::kable(eda$region$stats, align = c("l", rep("c", 5))) %>%  
  kableExtra::kable_styling(bootstrap_options = c("hover", "condensed", "responsive"), full_width = T, position = "center")

As shown in r kfigr::figr(label = "edaRegionStats", prefix = TRUE, link = TRUE, type="Table") and graphically depicted in r kfigr::figr(label = "edaRegionPlot", prefix = TRUE, link = TRUE, type="Figure"), there were a total of r sum(eda$region$stats$N) observations. Over r round(eda$region$stats$Cumulative[2] * 100, -1)% of respondents resided in eastern and southeastern parts of the country, r round(eda$region$stats$Proportion[3] * 100, 0)% in the Midwest and the rest in the western states.

eda$region$plot

r kfigr::figr(label = "edaRegionPlot", prefix = TRUE, link = TRUE, type="Figure"): Frequency and Proportions for Region Variable

Summary of Exploratory Data Analysis

Over r round(eda$opinion$stats$Cumulative[1] * 100, 0)% of all respondents since 1972 held the traditional view that extra-marital conduct was always wrong or almost always wrong. Granted, over r round(eda$period$stats$Proportion[1] * 100, -1)% of the responses were obtained prior to the year 2000. Over r round(eda$age$stats$Cumulative[2] * 100, -1)% of respondents were less than 45 years old at the time of the interview, and women comprised nearly r round(eda$gender$stats$Proportion[2] * 100, 0)% of all responses. A slight majority (r round(eda$educ$stats$Cumulative[1] * 100, 0)%) of respondents had a high school education or less, and those with undergraduate, graduate, and post-graduate degrees represented r round(eda$educ$stats$Proportion[3] * 100, 0)%, r round(eda$educ$stats$Proportion[4] * 100, 0)%, and r round(eda$educ$stats$Proportion[5] * 100, 0)% of the sample, respectively. Respondents from the Northeast, South, Midwest, Mountain, and Pacific regions comprised r round(eda$region$stats$Proportion[1] * 100, 0)%, r round(eda$region$stats$Proportion[2] * 100, 0)%, r round(eda$region$stats$Proportion[3] * 100, 0)%, r round(eda$region$stats$Proportion[4] * 100, 0)%, and r round(eda$region$stats$Proportion[5] * 100, 0)% of responses, respectively.

All conditions of inference: independence, success/failure, and sample size, were met. Therefore, one can assume that the true population proportions were within a 5% margin of error from their corresponding sample proportion.

Research Question 1: Evolution of opinion by age group

Here, the degree to which opinions w.r.t. extra-marital conduct have changed by age group was examined via the following research question.

To what degree have opinions changed since the year 2000 by age group?

data0 <- xmar$bivariate$ageGroup %>% select(Opinion, Period, AgeGroup) 
data1 <- xmar$bivariate$ageGroup %>% filter(AgeGroup == "15-24") %>% select(Opinion, Period) 
data2 <- xmar$bivariate$ageGroup %>% filter(AgeGroup == "25-44") %>% select(Opinion, Period) 
data3 <- xmar$bivariate$ageGroup %>% filter(AgeGroup == "45-64") %>% select(Opinion, Period) 
data4 <- xmar$bivariate$ageGroup %>% filter(AgeGroup == "65+") %>% select(Opinion, Period) 

age0 <- X23D(data0)
age1 <- analyze(data1, y = "Opinion", x = "Period", success = "Traditional", scope = "Age Group 15-24", title = "Opinions by Period (Age Group 15-24)")
age2 <- analyze(data2, y = "Opinion", x = "Period", success = "Traditional", scope = "Age Group 25-44", title = "Opinions by Period (Age Group 25-44)")
age3 <- analyze(data3, y = "Opinion", x = "Period", success = "Traditional", scope = "Age Group 45-64", title = "Opinions by Period (Age Group 45-64)")
age4 <- analyze(data4, y = "Opinion", x = "Period", success = "Traditional", scope = "Age Group 65+", title = "Opinions by Period (Age Group 65+)")
age <- rbind(age1$test$df, age2$test$df, age3$test$df, age4$test$df)

Hypotheses

The following hypotheses were tested to discover the extent to which opinion has changed by age group, over the periods of the study. Generally stated, they are:
$H_0$: $p_1 - p_2 = 0$
$H_a$: $p_1 - p_2 \neq 0$

where:
$p_1$ is the proportion of traditional opinion within an age group for years prior to 2000
$p_2$ is the proportion of traditional opinion within an age group for years since 2000

The specific hypotheses for each age group are stated below.

Inference Conditions

The inference conditions for two-proportion hypothesis tests were examined as follows:

Independence: Each observation was obtained through probability sampling, and so independence was assumed.
Success/Failure: Expected counts for each response/explanatory/controlling variable combination must be five or greater. As shown in r kfigr::figr(label = "ageExp", prefix = TRUE, link = TRUE, type="Table"), the expected counts exceed this minimum threshold.
Sample Size: As indicated in the methods section, the sample size falls well below 10% of the U.S. adult population.

r kfigr::figr(label = "ageExp", prefix = TRUE, link = TRUE, type="Table"): Contingency table of expected counts of opinions by age group, controlling for period

stargazer::stargazer(format(age0$expected, quote = FALSE), type = 'html')

As such, the conditions for two-proportion z-tests were met.

Test Results

Age Group 15-24

The following hypotheses were devised to ascertain the difference in proportion of traditional opinion w.r.t. extra-marital conduct for the age group 15-24 over the two periods of the study.
$H_0$: $p_1 - p_2 = 0$
$H_a$: $p_1 - p_2 \neq 0$
where:
$p_1$ is the proportion of traditional opinion in age group 15-24 for years prior to 2000
$p_2$ is the proportion of traditional opinion in age group 15-24 for years since 2000

r age1$test$stmt$type

age1$plots$observed

r kfigr::figr(label = "age1Plot", prefix = TRUE, link = TRUE, type="Figure"): Observed opinions for age group 15-24 prior to and since the year 2000

As shown in r kfigr::figr(label = "age1Plot", prefix = TRUE, link = TRUE, type="Figure") and r kfigr::figr(label = "age1ZTest", prefix = TRUE, link = TRUE, type="Table"), r age1$test$stmt$detail

r kfigr::figr(label = "age1ZTest", prefix = TRUE, link = TRUE, type="Table"): Two-proportion z-test of traditional opinion by period for age group 15-24

knitr::kable(age1$test$df) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

r age1$test$stmt$conclude

Age Group 25-44

The following hypotheses were devised to ascertain the difference in proportion of traditional opinion w.r.t. extra-marital conduct for the age group 25-44 over the two periods of the study.
$H_0$: $p_1 - p_2 = 0$
$H_a$: $p_1 - p_2 \neq 0$
where:
$p_1$ is the proportion of traditional opinion in age group 25-44 for years prior to 2000
$p_2$ is the proportion of traditional opinion in age group 25-44 for years since 2000

r age2$test$stmt$type

age2$plots$observed

r kfigr::figr(label = "age2Plot", prefix = TRUE, link = TRUE, type="Figure"): Observed opinions for age group 25-44 prior to and since the year 2000

As shown in r kfigr::figr(label = "age2Plot", prefix = TRUE, link = TRUE, type="Figure") and r kfigr::figr(label = "age2ZTest", prefix = TRUE, link = TRUE, type="Table"), r age2$test$stmt$detail

r kfigr::figr(label = "age2ZTest", prefix = TRUE, link = TRUE, type="Table"): Two-proportion z-test of traditional opinion by period for age group 25-44

knitr::kable(age2$test$df) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

r age2$test$stmt$conclude

Age Group 45-64

The following hypotheses were devised to ascertain the difference in proportion of traditional opinion w.r.t. extra-marital conduct for the age group 45-64 over the two periods of the study.
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$
where:
$p_1$ is the proportion of traditional opinion in age group 45-64 for years prior to 2000
$p_2$ is the proportion of traditional opinion in age group 45-64 for years since 2000

r age3$test$stmt$type

age3$plots$observed

r kfigr::figr(label = "age3Plot", prefix = TRUE, link = TRUE, type="Figure"): Observed opinions for age group 45-64 prior to and since the year 2000

As indicated in r kfigr::figr(label = "age3Plot", prefix = TRUE, link = TRUE, type="Figure") and r kfigr::figr(label = "age3ZTest", prefix = TRUE, link = TRUE, type="Table"), r age3$test$stmt$detail

r kfigr::figr(label = "age3ZTest", prefix = TRUE, link = TRUE, type="Table"): Two-proportion z-test of traditional opinion by period for age group 45-64

knitr::kable(age3$test$df) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

r age3$test$stmt$conclude

Age Group 65+

The following hypotheses were devised to ascertain the difference in proportion of traditional opinion w.r.t. extra-marital conduct for the age group 65+ over the two periods of the study.
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$
where:
$p_1$ is the proportion of traditional opinion in age group 65+ for years prior to 2000
$p_2$ is the proportion of traditional opinion in age group 65+ for years since 2000

r age4$test$stmt$type

age4$plots$observed

r kfigr::figr(label = "age4Plot", prefix = TRUE, link = TRUE, type="Figure"): Observed opinions for age group 65+ prior to and since the year 2000

As indicated in r kfigr::figr(label = "age4Plot", prefix = TRUE, link = TRUE, type="Figure") and r kfigr::figr(label = "age4ZTest", prefix = TRUE, link = TRUE, type="Table"), r age4$test$stmt$detail

r kfigr::figr(label = "age4ZTest", prefix = TRUE, link = TRUE, type="Table"): Two-proportion z-test of traditional opinion by period for age group 65+

knitr::kable(age4$test$df) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

r age4$test$stmt$conclude

Research Question 2: Evolution of opinion by gender

Here, the degree to which opinions w.r.t. extra-marital conduct have changed by gender was examined via the following research question.

To what degree have opinions changed since the year 2000 by gender?

The Data

data0 <- xmar$bivariate$gender %>% select(Opinion, Period, Gender) 
data1 <- xmar$bivariate$gender %>% filter(Gender == "Male") %>% select(Opinion, Period) 
data2 <- xmar$bivariate$gender %>% filter(Gender == "Female") %>% select(Opinion, Period) 

gender0 <- X23D(data0)
gender1 <- analyze(data1, y = "Opinion", x = "Period", success = "Traditional", scope = "Male",   title = "Opinions by Period (Male)")
gender2 <- analyze(data2, y = "Opinion", x = "Period", success = "Traditional", scope = "Female", title = "Opinions by Period (Female)")
gender <- rbind(gender1$test$df, gender2$test$df)

Hypotheses

The degree to which opinions have changed by gender was examined via the following hypotheses:
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$

where:
$p_1$ is the proportion of traditional opinion among those of a specific gender for years prior to 2000
$p_2$ is the proportion of traditional opinion among those of a specific gender for years since 2000

The specific hypotheses by gender are stated below.
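The package's analyze() function is not shown in this section, so the following is only a minimal base-R sketch of a two-proportion z-test of the kind described above. The counts are made up for illustration and are not GSS results; the variable names are hypothetical.

```r
# Illustrative counts only (NOT GSS results):
# number of "Traditional" responses and number of respondents in each period
x <- c(prior = 780, since = 820)
n <- c(prior = 1000, since = 1000)

p_hat  <- x / n                    # sample proportions p1 and p2
p_pool <- sum(x) / sum(n)          # pooled proportion under H0: p1 - p2 = 0
se     <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))

z     <- unname((p_hat["prior"] - p_hat["since"]) / se)
p_val <- 2 * pnorm(-abs(z))        # two-sided p-value

# Cross-check with stats::prop.test; without the continuity correction
# its X-squared statistic equals z^2
pt <- prop.test(x, n, correct = FALSE)
```

Comparing z^2 with pt$statistic is a quick sanity check that the hand computation and the built-in test agree.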

Inference Conditions

The inference conditions for a two-proportion test for differences in the proportion of traditional opinion by gender were examined as follows:

Independence: Each observation was subject to a stratified random sampling and so independence was assumed.
Success/Failure: Expected counts for each response/explanatory/controlling variable combination must be five or greater. As shown in r kfigr::figr(label = "genderExp", prefix = TRUE, link = TRUE, type="Table"), the expected counts exceed this minimum threshold.
Sample Size: As indicated in the methods section, the sample size falls well below 10% of the U.S. adult population.

r kfigr::figr(label = "genderExp", prefix = TRUE, link = TRUE, type="Table"): Contingency table of expected counts of opinions by gender, controlling for period

stargazer::stargazer(format(gender0$expected, quote = FALSE), type = 'html')

As such, the conditions for two-proportion z-tests were met.
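The expected counts in the table above come from the package's X23D() function, which is not shown here. As a rough sketch of the success/failure check for a single stratum, assuming a plain 2 x 2 table of period by opinion, the expected counts can be derived in base R. The counts below are made up for illustration.

```r
# Illustrative 2 x 2 table for one stratum (counts are made up, NOT GSS results)
obs <- matrix(c(780, 220,
                820, 180),
              nrow = 2, byrow = TRUE,
              dimnames = list(Period  = c("Prior to 2000", "Since 2000"),
                              Opinion = c("Traditional", "Non-Traditional")))

# Expected count in each cell under independence:
# row total * column total / grand total
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Success/failure condition: every expected count is at least five
conditions_met <- all(expected >= 5)
```

The same expected counts are returned by chisq.test(obs)$expected, which can serve as a cross-check.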

Test Results

Male Population

The following hypotheses were tested to discover the extent to which opinion has changed among men over the periods of the study.
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$

where:
$p_1$ is the proportion of traditional opinion among men for years prior to 2000
$p_2$ is the proportion of traditional opinion among men for years since 2000

r gender1$test$stmt$type

gender1$plots$observed

r kfigr::figr(label = "gender1Plot", prefix = TRUE, link = TRUE, type="Figure"): Observed opinions among men prior to and since the year 2000

As indicated in r kfigr::figr(label = "gender1Plot", prefix = TRUE, link = TRUE, type="Figure") and r kfigr::figr(label = "gender1ZTest", prefix = TRUE, link = TRUE, type="Table"), r gender1$test$stmt$detail

r kfigr::figr(label = "gender1ZTest", prefix = TRUE, link = TRUE, type="Table"): Two-proportion z-test of traditional opinion by period for male population

knitr::kable(gender1$test$df) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

r gender1$test$stmt$conclude

Female Population

The following hypotheses were tested to discover the extent to which opinion has changed among women over the periods of the study.
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$

where:
$p_1$ is the proportion of traditional opinion among women for years prior to 2000
$p_2$ is the proportion of traditional opinion among women for years since 2000

r gender2$test$stmt$type

gender2$plots$observed

r kfigr::figr(label = "gender2Plot", prefix = TRUE, link = TRUE, type="Figure"): Observed opinions among women prior to and since the year 2000

As indicated in r kfigr::figr(label = "gender2Plot", prefix = TRUE, link = TRUE, type="Figure") and r kfigr::figr(label = "gender2ZTest", prefix = TRUE, link = TRUE, type="Table"), r gender2$test$stmt$detail

r kfigr::figr(label = "gender2ZTest", prefix = TRUE, link = TRUE, type="Table"): Two-proportion z-test of traditional opinion by period for female population

knitr::kable(gender2$test$df) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

r gender2$test$stmt$conclude

Research Question 3: Evolution of opinion by education

Here, the degree to which opinions w.r.t. extra-marital conduct have changed by education was examined via the following research question.

To what degree have opinions changed since the year 2000 by highest education level achieved?

data0 <- xmar$bivariate$educ %>% select(Opinion, Period, Educ) 
data1 <- xmar$bivariate$educ %>% filter(Educ == "High School") %>% select(Opinion, Period) 
data2 <- xmar$bivariate$educ %>% filter(Educ == "Community College") %>% select(Opinion, Period) 
data3 <- xmar$bivariate$educ %>% filter(Educ == "UnderGraduate") %>% select(Opinion, Period) 
data4 <- xmar$bivariate$educ %>% filter(Educ == "Graduate") %>% select(Opinion, Period) 
data5 <- xmar$bivariate$educ %>% filter(Educ == "Post-Graduate") %>% select(Opinion, Period) 

educ0 <- X23D(data0)
educ1 <- analyze(data1, y = "Opinion", x = "Period", success = "Traditional", scope = "High School Graduate", title = "Opinions by Period (High School)")
educ2 <- analyze(data2, y = "Opinion", x = "Period", success = "Traditional", scope = "Community College Graduate", title = "Opinions by Period (Community College)")
educ3 <- analyze(data3, y = "Opinion", x = "Period", success = "Traditional", scope = "UnderGraduate", title = "Opinions by Period (UnderGraduate)")
educ4 <- analyze(data4, y = "Opinion", x = "Period", success = "Traditional", scope = "Graduate", title = "Opinions by Period (Graduate)")
educ5 <- analyze(data5, y = "Opinion", x = "Period", success = "Traditional", scope = "Post-Graduate", title = "Opinions by Period (Post-Graduate)")
educ <- rbind(educ1$test$df, educ2$test$df, educ3$test$df, educ4$test$df, educ5$test$df)

Hypotheses

The following hypotheses were tested to discover the extent to which opinion has changed by education level over the periods of the study. Generally stated, they are:
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$

where:
$p_1$ is the proportion of traditional opinion at a level of education for years prior to 2000
$p_2$ is the proportion of traditional opinion at a level of education for years since 2000

The specific hypotheses for each education level are stated below.
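For reference, the conventional test statistic for these hypotheses can be written as follows. This is a sketch under the usual pooled-proportion assumption; the symbols $x_i$ and $n_i$ (the number of traditional responses and the number of respondents in period $i$) are introduced here for illustration and may differ from the notation used in the methods section.

$$\hat{p}_1 = \frac{x_1}{n_1}, \qquad \hat{p}_2 = \frac{x_2}{n_2}, \qquad \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Under $H_0$, $z$ is approximately standard normal, so the two-sided p-value is $2\,P(Z \geq |z|)$.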

Inference Conditions

The inference conditions for a two-proportion test for differences in the proportion of traditional opinion at various education levels were examined as follows:

Independence: Each observation was subject to a stratified random sampling and so independence was assumed.
Success/Failure: Expected counts for each response/explanatory/controlling variable combination must be five or greater. As shown in r kfigr::figr(label = "educExp", prefix = TRUE, link = TRUE, type="Table"), the expected counts exceed this minimum threshold.
Sample Size: As indicated in the methods section, the sample size falls well below 10% of the U.S. adult population.

r kfigr::figr(label = "educExp", prefix = TRUE, link = TRUE, type="Table"): Contingency table of expected counts of opinions by education level, controlling for period

stargazer::stargazer(format(educ0$expected, quote = FALSE), type = 'html')

As such, the conditions for two-proportion z-tests were met.

Test Results

High School Graduates

The following hypotheses were devised to ascertain the difference in proportion of traditional opinion w.r.t. extra-marital conduct for high school graduates for the two periods of the study.
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$
where:
$p_1$ is the proportion of traditional opinion among high school graduates for years prior to 2000
$p_2$ is the proportion of traditional opinion among high school graduates for years since 2000

r educ1$test$stmt$type

educ1$plots$observed

r kfigr::figr(label = "educ1Plot", prefix = TRUE, link = TRUE, type="Figure"): Observed opinions of high school graduates prior to and since the year 2000

As indicated in r kfigr::figr(label = "educ1Plot", prefix = TRUE, link = TRUE, type="Figure") and r kfigr::figr(label = "educ1ZTest", prefix = TRUE, link = TRUE, type="Table"), r educ1$test$stmt$detail

r kfigr::figr(label = "educ1ZTest", prefix = TRUE, link = TRUE, type="Table"): Two-proportion z-test of traditional opinion by period for high school graduates

knitr::kable(educ1$test$df) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

r educ1$test$stmt$conclude

Community College

The following hypotheses were devised to ascertain the difference in proportion of traditional opinion w.r.t. extra-marital conduct among those who have attended some college or have earned an associates or technical degree, for the two periods of the study.
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$
where:
$p_1$ is the proportion of traditional opinion among those who have attended some college or have earned an associates or technical degree, for years prior to 2000
$p_2$ is the proportion of traditional opinion among those who have attended some college or have earned an associates or technical degree, for years since 2000

r educ2$test$stmt$type

educ2$plots$observed

r kfigr::figr(label = "educ2Plot", prefix = TRUE, link = TRUE, type="Figure"): Observed opinions of community college graduates prior to and since the year 2000

As indicated in r kfigr::figr(label = "educ2Plot", prefix = TRUE, link = TRUE, type="Figure") and r kfigr::figr(label = "educ2ZTest", prefix = TRUE, link = TRUE, type="Table"), r educ2$test$stmt$detail

r kfigr::figr(label = "educ2ZTest", prefix = TRUE, link = TRUE, type="Table"): Two-proportion z-test of traditional opinion among those who have attended some college or have earned an associates or technical degree

knitr::kable(educ2$test$df) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

r educ2$test$stmt$conclude

UnderGraduate

The following hypotheses were devised to ascertain the difference in proportion of traditional opinion w.r.t. extra-marital conduct among undergraduates for the two periods of the study.
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$
where:
$p_1$ is the proportion of traditional opinion among undergraduates for years prior to 2000
$p_2$ is the proportion of traditional opinion among undergraduates for years since 2000

r educ3$test$stmt$type

educ3$plots$observed

r kfigr::figr(label = "educ3Plot", prefix = TRUE, link = TRUE, type="Figure"): Observed opinions of undergraduates prior to and since the year 2000

As indicated in r kfigr::figr(label = "educ3Plot", prefix = TRUE, link = TRUE, type="Figure") and r kfigr::figr(label = "educ3ZTest", prefix = TRUE, link = TRUE, type="Table"), r educ3$test$stmt$detail

r kfigr::figr(label = "educ3ZTest", prefix = TRUE, link = TRUE, type="Table"): Two-proportion z-test of traditional opinion among undergraduates for the two periods of the study.

knitr::kable(educ3$test$df) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

r educ3$test$stmt$conclude

Graduate

The following hypotheses were devised to ascertain the difference in proportion of traditional opinion w.r.t. extra-marital conduct among those holding graduate degrees, for the two periods of the study.
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$
where:
$p_1$ is the proportion of traditional opinion among those holding graduate degrees, for years prior to 2000
$p_2$ is the proportion of traditional opinion among those holding graduate degrees, for years since 2000

r educ4$test$stmt$type

educ4$plots$observed

r kfigr::figr(label = "educ4Plot", prefix = TRUE, link = TRUE, type="Figure"): Observed opinions of graduates prior to and since the year 2000

As indicated in r kfigr::figr(label = "educ4Plot", prefix = TRUE, link = TRUE, type="Figure") and r kfigr::figr(label = "educ4ZTest", prefix = TRUE, link = TRUE, type="Table"), r educ4$test$stmt$detail

r kfigr::figr(label = "educ4ZTest", prefix = TRUE, link = TRUE, type="Table"): Two-proportion z-test of traditional opinion by period among those holding graduate degrees

knitr::kable(educ4$test$df) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

r educ4$test$stmt$conclude

Post-Graduate

The following hypotheses were devised to ascertain the difference in proportion of traditional opinion w.r.t. extra-marital conduct among those holding post-graduate degrees, for the two periods of the study.
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$
where:
$p_1$ is the proportion of traditional opinion among those holding post-graduate degrees, for years prior to 2000
$p_2$ is the proportion of traditional opinion among those holding post-graduate degrees, for years since 2000

r educ5$test$stmt$type

educ5$plots$observed

r kfigr::figr(label = "educ5Plot", prefix = TRUE, link = TRUE, type="Figure"): Observed opinions of post-graduates prior to and since the year 2000

As indicated in r kfigr::figr(label = "educ5Plot", prefix = TRUE, link = TRUE, type="Figure") and r kfigr::figr(label = "educ5ZTest", prefix = TRUE, link = TRUE, type="Table"), r educ5$test$stmt$detail

r kfigr::figr(label = "educ5ZTest", prefix = TRUE, link = TRUE, type="Table"): Two-proportion z-test of traditional opinion by period among those holding post-graduate degrees

knitr::kable(educ5$test$df) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

r educ5$test$stmt$conclude

Research Question 4: Evolution of opinion by region

Here, the degree to which opinions w.r.t. extra-marital conduct have changed by region was examined via the following research question.

To what degree have opinions changed since the year 2000 by geographic region?

The Data

data0 <- xmar$bivariate$region %>% select(Opinion, Period, Region) 
data1 <- xmar$bivariate$region %>% filter(Region == "Northeast") %>% select(Opinion, Period) 
data2 <- xmar$bivariate$region %>% filter(Region == "South") %>% select(Opinion, Period) 
data3 <- xmar$bivariate$region %>% filter(Region == "Midwest") %>% select(Opinion, Period) 
data4 <- xmar$bivariate$region %>% filter(Region == "Mountain") %>% select(Opinion, Period) 
data5 <- xmar$bivariate$region %>% filter(Region == "Pacific") %>% select(Opinion, Period) 

# Numbering follows the order of the Test Results subsections:
# Northeast, South, Midwest, Mountain, Pacific
region0 <- X23D(data0)
region1 <- analyze(data1, y = "Opinion", x = "Period", success = "Traditional", scope = "Northeast Region", title = "Opinions by Period (Northeast)")
region2 <- analyze(data2, y = "Opinion", x = "Period", success = "Traditional", scope = "South Region", title = "Opinions by Period (South)")
region3 <- analyze(data3, y = "Opinion", x = "Period", success = "Traditional", scope = "Midwest Region", title = "Opinions by Period (Midwest)")
region4 <- analyze(data4, y = "Opinion", x = "Period", success = "Traditional", scope = "Mountain Region", title = "Opinions by Period (Mountain)")
region5 <- analyze(data5, y = "Opinion", x = "Period", success = "Traditional", scope = "Pacific Region", title = "Opinions by Period (Pacific)")
region <- rbind(region1$test$df, region2$test$df, region3$test$df, region4$test$df, region5$test$df)
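Because each regional subset is built by a near-identical call, one alternative is to split the data by region and apply the same analysis to each piece, which keeps every subset paired with its own label. The sketch below is only an illustration: the toy data frame stands in for xmar$bivariate$region, and summarise_stratum() is a hypothetical stand-in for the package's analyze() function, whose definition is not shown here.

```r
# Toy stand-in for xmar$bivariate$region (values are randomly generated, NOT GSS data)
set.seed(1)
toy <- data.frame(
  Opinion = sample(c("Traditional", "Non-Traditional"), 200, replace = TRUE),
  Period  = sample(c("Prior to 2000", "Since 2000"), 200, replace = TRUE),
  Region  = sample(c("Northeast", "Midwest", "South", "Mountain", "Pacific"),
                   200, replace = TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for analyze(): proportion of `success` in y for each level of x
summarise_stratum <- function(data, y, x, success) {
  tapply(data[[y]] == success, data[[x]], mean)
}

# One data frame per region, named by region
by_region <- split(toy[, c("Opinion", "Period")], toy$Region)

# Apply the same analysis to every region; results keep the region names
results <- lapply(by_region, summarise_stratum,
                  y = "Opinion", x = "Period", success = "Traditional")
```

Each element of results can then be looked up by region name, for example results[["Pacific"]].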

Hypotheses

The following hypotheses were tested to discover the extent to which opinion has changed by region over the periods of the study. Generally stated, they are:
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$

where:
$p_1$ is the proportion of traditional opinion by region for years prior to 2000
$p_2$ is the proportion of traditional opinion by region for years since 2000

The specific hypotheses for each region are stated below.
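A confidence interval for $p_1 - p_2$ complements the hypothesis test by showing the plausible size of the change. The following base-R sketch uses the conventional unpooled standard error for the interval; the counts are made up for illustration and are not GSS results, and the variable names are hypothetical.

```r
# Illustrative counts only (NOT GSS results)
x <- c(prior = 780, since = 820)   # "Traditional" responses per period
n <- c(1000, 1000)                 # respondents per period

p_hat <- x / n
diff  <- unname(p_hat[1] - p_hat[2])            # point estimate of p1 - p2

# Unpooled standard error, as used for the interval (not for the test)
se_unpooled <- sqrt(sum(p_hat * (1 - p_hat) / n))

# 95% confidence interval for p1 - p2
ci <- diff + c(-1, 1) * qnorm(0.975) * se_unpooled

# Cross-check against stats::prop.test without continuity correction
pt <- prop.test(x, n, correct = FALSE)
```

An interval that excludes zero points the same way as rejecting $H_0$ at the 5% level, although the two can disagree in borderline cases because the test uses the pooled standard error.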

Inference Conditions

The inference conditions for a two-proportion test for differences in the proportion of traditional opinion in the various regions were examined as follows:

Independence: Each observation was subject to a stratified random sampling and so independence was assumed.
Success/Failure: Expected counts for each response/explanatory/controlling variable combination must be five or greater. As shown in r kfigr::figr(label = "regionExp", prefix = TRUE, link = TRUE, type="Table"), the expected counts exceed this minimum threshold.
Sample Size: As indicated in the methods section, the sample size falls well below 10% of the U.S. adult population.

r kfigr::figr(label = "regionExp", prefix = TRUE, link = TRUE, type="Table"): Contingency table of expected counts of opinions by region, controlling for period

stargazer::stargazer(format(region0$expected, quote = FALSE), type = 'html')

As such, the conditions for two-proportion z-tests were met.

Test Results

Northeast Region

The following hypotheses were devised to ascertain the difference in proportion of traditional opinion w.r.t. extra-marital conduct in the Northeast region, for the two periods of the study.
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$
where:
$p_1$ is the proportion of traditional opinion in the Northeast region, for years prior to 2000
$p_2$ is the proportion of traditional opinion in the Northeast region, for years since 2000

r region1$test$stmt$type

region1$plots$observed

r kfigr::figr(label = "region1Plot", prefix = TRUE, link = TRUE, type="Figure"): Observed opinions from the Northeast region for periods prior to and since 2000

As indicated in r kfigr::figr(label = "region1Plot", prefix = TRUE, link = TRUE, type="Figure") and r kfigr::figr(label = "region1ZTest", prefix = TRUE, link = TRUE, type="Table"), r region1$test$stmt$detail

r kfigr::figr(label = "region1ZTest", prefix = TRUE, link = TRUE, type="Table"): Two-proportion z-test of traditional opinion in the Northeast region, for the two periods of the study.

knitr::kable(region1$test$df) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

r region1$test$stmt$conclude

South Region

The following hypotheses were devised to ascertain the difference in proportion of traditional opinion w.r.t. extra-marital conduct in the South region, for the two periods of the study.
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$
where:
$p_1$ is the proportion of traditional opinion in the South region, for years prior to 2000
$p_2$ is the proportion of traditional opinion in the South region, for years since 2000

r region2$test$stmt$type

region2$plots$observed

r kfigr::figr(label = "region2Plot", prefix = TRUE, link = TRUE, type="Figure"): Observed opinions from the South region for periods prior to and since 2000

As indicated in r kfigr::figr(label = "region2Plot", prefix = TRUE, link = TRUE, type="Figure") and r kfigr::figr(label = "region2ZTest", prefix = TRUE, link = TRUE, type="Table"), r region2$test$stmt$detail

r kfigr::figr(label = "region2ZTest", prefix = TRUE, link = TRUE, type="Table"): Two-proportion z-test of traditional opinion among residents in the South region, for the two periods of the study.

knitr::kable(region2$test$df) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

r region2$test$stmt$conclude

Midwest Region

The following hypotheses were devised to ascertain the difference in proportion of traditional opinion w.r.t. extra-marital conduct in the Midwest region, for the two periods of the study.
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$
where:
$p_1$ is the proportion of traditional opinion in the Midwest region, for years prior to 2000
$p_2$ is the proportion of traditional opinion in the Midwest region, for years since 2000

r region3$test$stmt$type

region3$plots$observed

r kfigr::figr(label = "region3Plot", prefix = TRUE, link = TRUE, type="Figure"): Observed opinions from the Midwest region for periods prior to and since 2000

As indicated in r kfigr::figr(label = "region3Plot", prefix = TRUE, link = TRUE, type="Figure") and r kfigr::figr(label = "region3ZTest", prefix = TRUE, link = TRUE, type="Table"), r region3$test$stmt$detail

r kfigr::figr(label = "region3ZTest", prefix = TRUE, link = TRUE, type="Table"): Two-proportion z-test of traditional opinion in the Midwest region, for the two periods of the study.

knitr::kable(region3$test$df) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

r region3$test$stmt$conclude

Mountain Region

The following hypotheses were devised to ascertain the difference in proportion of traditional opinion w.r.t. extra-marital conduct in the Mountain region, for the two periods of the study.
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$
where:
$p_1$ is the proportion of traditional opinion in the Mountain region, for years prior to 2000
$p_2$ is the proportion of traditional opinion in the Mountain region, for years since 2000

r region4$test$stmt$type

region4$plots$observed

r kfigr::figr(label = "region4Plot", prefix = TRUE, link = TRUE, type="Figure"): Observed opinions from the Mountain region for periods prior to and since 2000

As indicated in r kfigr::figr(label = "region4Plot", prefix = TRUE, link = TRUE, type="Figure") and r kfigr::figr(label = "region4ZTest", prefix = TRUE, link = TRUE, type="Table"), r region4$test$stmt$detail

r kfigr::figr(label = "region4ZTest", prefix = TRUE, link = TRUE, type="Table"): Two-proportion z-test of traditional opinion by period in the Mountain region

knitr::kable(region4$test$df) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

r region4$test$stmt$conclude

Pacific Region

The following hypotheses were devised to ascertain the difference in proportion of traditional opinion w.r.t. extra-marital conduct in the Pacific region, for the two periods of the study.
$H_0$ $p_1 - p_2 = 0$
$H_a$ $p_1 - p_2 \neq 0$
where:
$p_1$ is the proportion of traditional opinion in the Pacific region, for years prior to 2000
$p_2$ is the proportion of traditional opinion in the Pacific region, for years since 2000

r region5$test$stmt$type

region5$plots$observed

r kfigr::figr(label = "region5Plot", prefix = TRUE, link = TRUE, type="Figure"): Observed opinions from the Pacific region for periods prior to and since 2000

As indicated in r kfigr::figr(label = "region5Plot", prefix = TRUE, link = TRUE, type="Figure") and r kfigr::figr(label = "region5ZTest", prefix = TRUE, link = TRUE, type="Table"), r region5$test$stmt$detail

r kfigr::figr(label = "region5ZTest", prefix = TRUE, link = TRUE, type="Table"): Two-proportion z-test of traditional opinion by period in the Pacific region

knitr::kable(region5$test$df) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

r region5$test$stmt$conclude

Discussion

ageGroupDiscussion <- discussion(age)
genderDiscussion <- discussion(gender)
educDiscussion <- discussion(educ)

Evolution of Opinion by Age Group

There has been a slight increase in the proportion of people holding traditional opinions w.r.t. extra-marital conduct between the periods before and since the year 2000 across most age groups. r kfigr::figr(label = "agePlot", prefix = TRUE, link = TRUE, type="Figure") shows that the degree of change in opinion appears to correlate negatively with age: the greatest change (r round(ageGroupDiscussion$data %>% summarize(max(PctChange)), 2)%) was associated with the 15-24 age group. The r round(ageGroupDiscussion$data %>% summarize(min(PctChange)), 2)% change within the 65+ age group was not significant.

r kfigr::figr(label = "ageDisc", prefix = TRUE, link = TRUE, type="Table"): Summary of analysis by age group

knitr::kable(ageGroupDiscussion$data, align = c("l", rep("c", 4))) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

ageGroupDiscussion$plot

r kfigr::figr(label = "agePlot", prefix = TRUE, link = TRUE, type="Figure"): Summary of analysis by age group
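The PctChange column is produced by the package's discussion() function, whose definition is not shown here. As a rough sketch only, assuming it measures the change in the proportion of traditional opinion between the two periods, the difference between a percentage-point change and a relative change can be illustrated with made-up proportions:

```r
# Made-up proportions of traditional opinion (NOT GSS results)
p_prior <- 0.78   # years prior to 2000
p_since <- 0.82   # years since 2000

# Change in percentage points
point_change <- (p_since - p_prior) * 100

# Change relative to the earlier level, in percent
relative_change <- (p_since - p_prior) / p_prior * 100
```

The two measures can differ noticeably when the baseline proportion is far from 1, so it is worth confirming which one a summary table reports before comparing groups.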

Evolution of Opinion by Gender

Similarly, there was a trend towards more traditional opinion for both males and females, with r round(genderDiscussion$data %>% summarize(max(PctChange)), 2)% and r round(genderDiscussion$data %>% summarize(min(PctChange)), 2)% increases respectively. As indicated in r kfigr::figr(label = "genderPlot", prefix = TRUE, link = TRUE, type="Figure"), the migration towards more traditional views was most pronounced among men.

r kfigr::figr(label = "genderDisc", prefix = TRUE, link = TRUE, type="Table"): Summary of analysis by gender

knitr::kable(genderDiscussion$data, align = c("l", rep("c", 4))) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

genderDiscussion$plot

r kfigr::figr(label = "genderPlot", prefix = TRUE, link = TRUE, type="Figure"): Summary of analysis by gender

Evolution of Opinion by Education Level

Similarly, the change in the proportion of traditional opinion varied by education level, ranging from r round(educDiscussion$data %>% summarize(min(PctChange)), 2)% to r round(educDiscussion$data %>% summarize(max(PctChange)), 2)%. r kfigr::figr(label = "educPlot", prefix = TRUE, link = TRUE, type="Figure") summarizes the change at each education level.

r kfigr::figr(label = "educDisc", prefix = TRUE, link = TRUE, type="Table"): Summary of analysis by education level

knitr::kable(educDiscussion$data, align = c("l", rep("c", 4))) %>%  
  kableExtra::kable_styling(bootstrap_options = c("striped", "condensed"), full_width = T, position = "center")

educDiscussion$plot

r kfigr::figr(label = "educPlot", prefix = TRUE, link = TRUE, type="Figure"): Summary of analysis by education level

Conclusion

Amid claims of an increasing lack of deference to and respect for certain institutions, especially on the part of millennials, this study indicates that the belief in the sanctity and exclusivity of marriage has not only remained stable for nearly half a century, but has slightly increased. Most notable is that the trend towards more conservatively held views is led largely by the millennial generation. But this reveals only part of the story. Future studies might examine questions such as the rates of marriage and divorce, the number of marriages per capita, marrying age, and the average length of marriage. Whereas marriage was once an expected milestone, are there fewer of us making the decision than before? Are people approaching the institution with greater consideration? What can data tell us about ourselves, how we live, and how we choose to live?

Appendix

Source Code

Preprocess Data

This function filters out non-responses and creates the study variables.

Exploratory Data Analysis

The following function performs the exploratory univariate data analysis on the study variables.

References

DataScienceSalon/xmar documentation built on May 28, 2019, 12:24 p.m.
