library(learnr) library(learnSTATS) library(faux) library(cocor) data("correlation") correlation <- na.omit(correlation) correlation <- sim_df(correlation, #data frame sample(100:150, 1), #how many of each group between = "gender") correlation <- correlation[ , -1] correlation <- messy(correlation, prop = .01, 2:5, replace = NA) correlation$temporality[ correlation$temporality < 1 ] <- NA correlation$temporality[ correlation$temporality > 7 ] <- NA correlation$relativity[ correlation$relativity < 1 ] <- NA correlation$relativity[ correlation$relativity > 7 ] <- NA correlation$positive[ correlation$positive < 1 ] <- NA correlation$positive[ correlation$positive > 7 ] <- NA library(ggplot2) cleanup <- theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.background = element_blank(), axis.line.x = element_line(color = "black"), axis.line.y = element_line(color = "black"), legend.key = element_rect(fill = "white"), text = element_text(size = 15)) correlation$gender <- as.factor(correlation$gender) knitr::opts_chunk$set(echo = FALSE)
In our first statistical analysis section, we will begin by creating the simplest of regression models: correlation. Correlation includes two variables, usually continuous, to examine their co-relation. The learning objectives include:
The following videos are provided as a lecture for the class material. The lectures are provided here as part of the flow for the course. You can view the lecture notes within R using vignette("Correlation", "learnSTATS")
. You can skip these pages if you are in class to go on to the assignment.
In this next section, you will answer questions using the R code blocks provided. Be sure to use the solution
option to see the answer if you need it!
Please enter your name for submission. If you do not need to submit, just type anything you'd like in this box.
question_text( "Student Name:", answer("Your Name", correct = TRUE), incorrect = "Thanks!", try_again_button = "Modify your answer", allow_retry = TRUE )
Title: Big Data Analytics Services for Enhancing Business Intelligence
Abstract: This article examines how to use big data analytics services to enhance business intelligence (BI). More specifically, this article proposes an ontology of big data analytics and presents a big data analytics service-oriented architecture (BASOA), and then applies BASOA to BI, where our surveyed data analysis shows that the proposed BASOA is viable for enhancing BI and enterprise information systems. This article also explores temporality, expectability, and relativity as the characteristics of intelligence in BI. These characteristics are what customers and decision makers expect from BI in terms of systems, products, and services of organizations. The proposed approach in this article might facilitate the research and development of business analytics, big data analytics, and BI as well as big data science and big data computing.
library(learnSTATS) data("correlation") head(correlation)
head(correlation)
Go back to your graphing notes to remember how to make these plots! The cleanup
code has been loaded for you. Create a scatter plot of temporality and relativity.
scatter <- ggplot(correlation, aes(temporality, relativity)) scatter + cleanup + geom_point() + xlab("Temporality Rating") + ylab("Relativity Rating") + coord_cartesian(ylim = c(0.5,7.5), xlim = c(0.5, 7.5))
question_text( "Why type of relationship do these two variables have?", answer("Positive relationship.", correct = TRUE), incorrect = "They appear to have a positive relationship.", try_again_button = "Modify your answer", allow_retry = TRUE )
Create a scatter plot of expectability and relativity, grouping by gender. Include the linear line on the graph.
scatter <- ggplot(correlation, aes(expectability, relativity, color = gender, fill = gender)) scatter + cleanup + geom_point() + geom_smooth(method = "lm") + xlab("Expectability Rating") + ylab("Relativity Rating") + coord_cartesian(ylim = c(1,7), xlim = c(1,7)) + scale_fill_discrete(name = "Gender", labels = c("Men", "Women")) + scale_color_discrete(name = "Gender", labels = c("Men", "Women"))
question_text( "What type of relationship do these two variables appear to have for each group?", answer("No relationship for women, small positive for men.", correct = TRUE), incorrect = "There appears to be no relationship for women, but small and positive for men.", try_again_button = "Modify your answer", allow_retry = TRUE )
Using Hmisc
create a correlation table for all the variables using Pearson's correlation. Do not forget that you will need to restructure gender to create a point biserial correlation. Also, do not forget that Hmisc
does not like missing data.
library(Hmisc, quietly = T)
library(Hmisc, quietly = T) correlation$gender2 <- as.numeric(correlation$gender) rcorr(as.matrix(na.omit(correlation [ , -1])))
question_text( "Which correlation was the strongest?", answer("Temporality and gender", correct = TRUE), incorrect = "Temporality and gender - be sure to look at the first output for r.", try_again_button = "Modify your answer", allow_retry = TRUE )
Calculate correlation and confidence interval for temporality and relativity.
with(correlation, cor.test(relativity, temporality, method = "pearson")) ##OR cor.test(correlation$relativity, correlation$temporality, method = "pearson")
question_text( "Does that confidence interval include zero?", answer("Negative", correct = TRUE), incorrect = "No, which indicates it is unlikely to be zero in the population.", try_again_button = "Modify your answer", allow_retry = TRUE )
Calculate the difference in correlations for 1) temporality and expectability and 2) temporality and positive emotion.
library(cocor)
library(cocor) cocor(~expectability + temporality | temporality + positive, data = correlation)
question_text( "Is there a significant difference in their correlations?", answer("Yes there is.", correct = TRUE), incorrect = "Yes, temporality is positively related to expectability, while negatively related to positive emotion.", try_again_button = "Modify your answer", allow_retry = TRUE )
Calculate the difference in correlations for gender on temporality and relativity.
men <- subset(correlation, gender == "men") women <- subset(correlation, gender == "women") inddata <- list(men, women) cocor(~temporality + relativity | temporality + relativity, data = inddata)
question_text( "Is there a significant difference in their correlations?", answer("Depends on your data.", correct = TRUE), incorrect = "In my case, yes as men have a small positive relationship while women have a medium negative relationship. Remember that the data is simulated differently for everyone!", try_again_button = "Modify your answer", allow_retry = TRUE )
On this page, you will create the submission for your instructor (if necessary). Please copy this report and submit using a Word document or paste into the text window of your submission page. Click "Generate Submission" to get your work!
encoder_logic()
encoder_ui()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.