Basic Syntax: Input And Output
In drhur: Learning R with Dr. Hu

chooseCRANmirror(graphics = FALSE, ind = 1)

knitr::opts_chunk$set(echo = TRUE,
                      message = FALSE,
                      warning = FALSE,
                      out.width="100%")

if (!require(pacman)) install.packages("pacman")
library(pacman) # make sure p_load() is available after a fresh install

p_load(
  lubridate,
  here,
  rio,
  tidyverse,
  drhur
)

Key Points

Research Question: We want to study the impact of perceived inequality on social and political behavior. What useful information can we get from WVS7?
Does a person's family economic status affect their level of education?
How do citizens of different countries differ in their level of trust in government?
Does perceived inequality have an impact on a person's social and political behavior?
Basic Concepts
Data Input
Data Types
Data Attributes
Data Output

Basic Concepts

Object-Oriented Programming Language

Object-Oriented Programming (OOP)
- C++ (first applied)
- Python (first from the ground up)
- R
Classes: User-definable object structures, each with its own properties and methods.
Objects: Concrete instances of a class, such as "China" for "country" or "R" for "letter". Each instance has the properties of the class it belongs to.
Methods: Commands defined in the class that express the characteristic behaviors of its objects. For example, people eat, drink, and sleep, so "eating, drinking, and sleeping" are "methods" of people.
Properties: Descriptions of some aspect of an object, such as a person's hair color, skin color, height, and weight. All objects in the same class share the same properties, but the specific values may differ. Properties can be thought of as the "parameters" of an object.

Encapsulation: Bundling an object's properties and methods together so that other objects cannot see or change them directly from the outside. For example, you cannot turn a cold-blooded animal into a warm-blooded animal, but you can affect the body temperature of a cold-blooded animal by adjusting the outdoor temperature.
Polymorphism: Using one command to perform different actions on different objects. The most typical example is summary().

summary(wvs7)
summary(wvs7$age)
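To see the dispatch at work without loading wvs7, here is a minimal sketch on two small made-up vectors (ages and income are hypothetical stand-ins for wvs7 columns, not real data). The same command, summary(), returns quartiles and a mean for numbers but a frequency count for a factor.

```r
# Hypothetical toy data standing in for columns of wvs7
ages   <- c(23, 35, 41, 58)
income <- factor(c("low", "high", "low", "middle"))

# Same command, different behavior depending on the class of the object
sum_ages   <- summary(ages)    # minimum, quartiles, mean, maximum
sum_income <- summary(income)  # number of observations per level

class(ages)
class(income)
sum_ages
sum_income
```

R picks the version of summary() to run by looking at the class of its first argument, which is why the two results look so different.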

Learning by Doing

R's Polymorphism allows us to use common methods or functions to deal with object types that have not yet been defined. In addition to summary(), plot() is another example of Polymorphism. Please use plot() to process two different objects.

plot(wvs7$age)
plot(wvs7$age, wvs7$incomeLevel)

Inheritance: All sub-classes of the same parent class automatically have the characteristics of the parent class. As an analogy, if the class "human" has legs, then every individual human being has this property.
Safety: When a command acts on an object, it checks the class of the object. If the object is not one the command is defined to process, the command stops and gives an error message.
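Inheritance and safety can be sketched with R's S3 system. Everything in this example is hypothetical and only for illustration: the generic speak(), the classes "animal" and "dog", and the objects tom and rex are not part of drhur or wvs7.

```r
# Hypothetical S3 generic and classes: "animal" (parent) and "dog" (child)
speak <- function(x, ...) UseMethod("speak")

# Method defined for the parent class
speak.animal <- function(x, ...) paste(x$name, "makes a sound")

# The child class reuses the parent's method, then adds its own behavior
speak.dog <- function(x, ...) {
  inherited <- NextMethod()
  paste(inherited, "and barks")
}

tom <- structure(list(name = "Tom"), class = "animal")
rex <- structure(list(name = "Rex"), class = c("dog", "animal"))

speak_tom <- speak(tom)
speak_rex <- speak(rex)

# Safety: a plain number has no speak() method, so R stops with an error;
# tryCatch() turns that error into a value so the script keeps running
speak_number <- tryCatch(speak(42), error = function(e) "error")
```

Because rex carries both classes, it gets the parent's behavior for free, while speak(42) is rejected instead of silently producing nonsense.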

Function

R = Function + Object
Function: Command

The principle of OOP: don't change data manually; let the commands do it. Syntax: <command name>(<target data>, <condition 1>, <condition 2>, ...)


light <- function(finger){
  shadow <- finger + 5
  shadow # return the result
}
handShadow <- light(finger = 3)
handShadow
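The syntax <command name>(<target data>, <condition 1>, ...) can be sketched with a variant of the function above. The name light2 and the argument distance are hypothetical additions for illustration; they show how a "condition" can have a default value and how arguments can be passed by name.

```r
# Hypothetical variant of light() with a second, optional argument
light2 <- function(finger, distance = 5) {
  shadow <- finger + distance
  shadow  # the last evaluated value is returned
}

shadow_default <- light2(3)                          # uses distance = 5
shadow_far     <- light2(finger = 3, distance = 10)  # overrides the default
shadow_swapped <- light2(distance = 10, finger = 3)  # named arguments may come in any order

shadow_default
shadow_far
shadow_swapped
```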

Data Packages

Command Collection
App
Number of packages on CRAN: nrow(available.packages()) (more on GitHub)
install.packages("drhur")
devtools::install_github("sammo3182/drhur")


`<-`

Assignment operator, the shorthand for the assign() command

Syntax: <variable name> <- <object>

aValidObject <- 1:5
aValidObject

`->`, `=`, `<<-`

Five ways to assign in R:

assign()
<-
->
<<-
=

Why `<-`

Intuitive

a <- 12
25 -> b

Will not be confused with =
Shortcut input
- PC: Alt + -
- Mac: option + -

When to Use Each Operator?

=: When you want to pass a value to an argument without creating an object.

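median(y <- 1:10); y # `<-` creates y, so y can be printed
median(x = 1:10); x # `=` only names the argument; x is not created, so printing x fails unless x already exists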
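The difference can be checked without triggering an error. The sketch below wraps both calls in a hypothetical helper, compare_assignment(), so that the check runs in a clean environment and does not depend on objects that already exist in your session.

```r
# Hypothetical helper: runs both calls inside its own environment
compare_assignment <- function() {
  m_arrow <- median(y <- 1:10)  # `<-` creates y here, then passes it on
  m_equal <- median(x = 1:10)   # `=` only names the argument x of median()
  c(
    y_exists = exists("y", envir = environment(), inherits = FALSE),
    x_exists = exists("x", envir = environment(), inherits = FALSE)
  )
}

created <- compare_assignment()
created
```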

<<-: Modifying a variable in the parent (enclosing) environment

new_counter <- function() {
  i <- 0
  function() {
    # do something useful, then ...
    i <<- i + 1
    i
  }
}
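A short usage sketch of the counter defined above (the definition is repeated so the block runs on its own; the names counter_a and counter_b are made up for illustration). Each call to new_counter() creates its own i, and `<<-` updates that i rather than creating a new local one.

```r
new_counter <- function() {
  i <- 0
  function() {
    i <<- i + 1  # update the i that lives in the enclosing function
    i
  }
}

counter_a <- new_counter()
counter_b <- new_counter()

first  <- counter_a()  # counter_a's i goes from 0 to 1
second <- counter_a()  # counter_a's i goes from 1 to 2
other  <- counter_b()  # counter_b has its own i, starting again from 0

c(first, second, other)
```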

Naming Rules

Don't start with a number (Error: 1stday).
No special symbols except . and _ (Error: M&M).
Case sensitive (X != x). ! means "not"; != means "not equal to".
Don't override built-in commands unless necessary (avoid: list <- c(1:5)).
Meaningful: the name should convey what the object holds.

Please create a compliant and a non-compliant object:

# Create a non-compliant object

# 5var_name <- data_frame(wvs7$education)


# Create a compliant object

var_name5 <- data_frame(wvs7$education)
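One way to test a candidate name before using it is base R's make.names(), which rewrites any string into a syntactically valid name. If the string comes back unchanged, it was already valid. The helper is_valid_name() below is a hypothetical convenience wrapper, not a built-in function.

```r
# Hypothetical helper: a name is valid if make.names() leaves it unchanged
is_valid_name <- function(x) make.names(x) == x

candidates <- c("var_name5", "5var_name", "M&M", "my.data", "if")
validity <- is_valid_name(candidates)
names(validity) <- candidates
validity

# What R would turn the invalid names into
make.names(c("5var_name", "M&M"))
```

This only checks syntax. A name such as list passes the check even though overriding a built-in command is discouraged.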

Learning by Doing

question("Which variables are compliant? Please select all valid variable names.",
  answer("my_data_frame <- data_frame(wvs7$education)", correct = TRUE),
  answer("mydata&frame <- data_frame(wvs7$education)"),
  answer("MyDataFrame <- data_frame(wvs7$education)", correct = TRUE),
  answer("1data_frame <- data_frame(wvs7$education)"),
  incorrect = "Incorrect")

Data Input

Built-in Data

data()

Learning by Doing

Select a dataset from the list returned by data(), load it, and check its variables with summary():

# example

data(uspop)
summary(uspop)

Data Types That Can Be Read Directly

.RDS (single object)
.RData (multiple objects)
.txt
.csv

Syntax: <name> <- <read command>(<data path>)

df_rds <- readRDS("aDataset.rds")
df_txt <- read.table("D:/aDataset.txt")
df_csv <- read.csv("./aDataset.csv")

Data Types Need To Call The Package To Read

Call the package through library or require, and then use the commands in it.

# SPSS, Stata, SAS
library(haven)
df_spss <- read_spss("<FileName>.sav")
df_stata <- read_dta("<FileName>.dta")
df_sas <- read_sas("<FileName>.sas7bdat")  

# Quick Import of Tables
library(readr)
df_csv <- read_csv("<FileName>.csv")
df_table <- read_table("<FileName>.txt")

# Excel
library(readxl)
df_excel <- read_excel("<FileName>.xls")
df_excel2 <- read_excel("<FileName>.xlsx")

# JSON (JavaScript Object Notation)
library(rjson)
df_json <- fromJSON(file = "<FileName>.json" )

# XML/HTML
library(XML)
df_xml <- xmlTreeParse("<url>")
df_html <- readHTMLTable("<url>", which = 3)

rio: the Swiss Army Knife of data reading.

library(rio)
df_anything <- import("<AnyTypeOfData>")

Data Type

Vector
Matrix
Data frame
List
Array

Vector

The command c(), which combines values, can be used to create a vector.

numeric vector

vec_integer <- c(1L, -2L, NA) # the L suffix stores the values as integers
vec_double <- c(1.5, -2.34, 1/3)

Notes: 1. NA means not available. 2. The data in a single vector must have the same type (numeric, character, or logical).
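The second note has a practical consequence: when values of different types are combined, R silently converts them all to the most flexible type. A small sketch (the object names are made up for illustration):

```r
# Mixing types: everything becomes character
mixed <- c(1, "a", TRUE)
class(mixed)
mixed

# Mixing numbers and logicals: TRUE becomes 1, NA stays missing
num_lgl <- c(1.5, TRUE, NA)
class(num_lgl)
is.na(num_lgl)
mean(num_lgl, na.rm = TRUE)  # drop the NA before averaging
```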

Learning by Doing

Generate a vector containing all even numbers from 1 to 100:

# hint: help(seq)

x <- seq(2,100,by=2)

character vector

vec_chr <- c("牛", "^_^", "R is hard，but I can nail it.")

Learning by Doing

Generate a sequence of letters a-z:

vec_letters <- c("a", "b", "c", "d", "e")

letters[1:26]

logical vector

vec_tf <- c(TRUE, TRUE, FALSE)
vec_tf
# c(TRUE, TRUE, FALSE) == c(1, 1, 0)

Learning by Doing

Assuming x is a vector containing (1,1,0), convert it to a logical vector:

x <- c(1, 1, 0)
x <- as.logical(x)

factor vector
- ordinal vector

vec_fac <- factor(c(1, 2, 2, 3))

vec_ord <- ordered(c(1, 2, 2, 3))
vec_fac2 <- factor(c(1, 2, 2, 3), 
                   levels = c(3, 2, 1), 
                   labels = c("Apple", "Pear", "Orange"))
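It helps to look inside a factor to see how levels and labels line up. The sketch below rebuilds vec_fac2 and adds a hypothetical ordered factor, vec_ord2, whose levels are spelled out explicitly.

```r
vec_fac2 <- factor(c(1, 2, 2, 3),
                   levels = c(3, 2, 1),
                   labels = c("Apple", "Pear", "Orange"))

levels(vec_fac2)        # labels, in the order given by `levels`
as.character(vec_fac2)  # the label attached to each observation
as.integer(vec_fac2)    # the internal integer codes (positions in levels)

# Hypothetical ordered factor: the levels have a rank, so comparisons work
vec_ord2 <- ordered(c("low", "high", "middle"),
                    levels = c("low", "middle", "high"))
below_high <- vec_ord2 < "high"
below_high
```

Note that the value 1 ends up labelled "Orange" because 1 is the third entry of levels and "Orange" is the third label.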

Learning by Doing

After getting a data set, you must first have a general understanding of the data.

check the types of variables in the wvs7 data:

str(wvs7)

check the properties of the incomeLevel variable in the data set:

class(wvs7$incomeLevel)

view the value of incomeLevel and the frequency of each value:

table(wvs7$incomeLevel)

POSIXct/POSIXlt vector
- as.POSIXct: stores a time as a single number ("count time")
- as.POSIXlt: stores a time as a list of components ("list time")
- as.POSIXct records the time as the number of seconds elapsed since the Unix epoch (1970-01-01 00:00:00 UTC).
- as.POSIXlt expresses the time as a list in which each part of the time (second, minute, hour, day, month, year, ...) is an element.

#`as.POSIXct` and `as.POSIXlt`
ct <- as.POSIXct("2023-03-20 10:11:12")
lt <- as.POSIXlt("2023-03-20 10:11:12")

unlist(ct)
unlist(lt)

Sys.time() # get the current date and time (base R)
today() # get the current date (lubridate)
now() # get the current date and time with the time zone (lubridate)
# The time zone shown (e.g., CST) comes from the computer's system settings

# the full pack
time1 <- Sys.time()
time2 <- as.POSIXlt(Sys.time())
time2$wday # day of the week (0 = Sunday)

## What if we only care about the date?

Sys.Date()
date1 <- as.Date("2019-01-02")
class(date1)  # check type of data
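To make the contrast between "count time" (POSIXct) and "list time" (POSIXlt) concrete, here is a minimal sketch using only base R. The time zone is fixed to "UTC" as an assumption so that the result does not depend on the computer's settings.

```r
ct <- as.POSIXct("2023-03-20 10:11:12", tz = "UTC")
lt <- as.POSIXlt("2023-03-20 10:11:12", tz = "UTC")

# POSIXct: one number, the seconds since 1970-01-01 00:00:00 UTC
seconds_since_epoch <- as.numeric(ct)
seconds_since_epoch

# POSIXlt: separate components that can be picked out by name
lt$hour
lt$min
lt$year + 1900  # years are stored as an offset from 1900
lt$mon + 1      # months are stored as 0-11
lt$wday         # day of the week, 0 = Sunday

# Arithmetic on POSIXct works in seconds
one_minute_later <- ct + 60
format(one_minute_later, "%H:%M:%S", tz = "UTC")
```

POSIXct is usually the more convenient choice for storage and arithmetic, while POSIXlt is handy when a single component such as the hour or weekday is needed.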

lubridate: the swiss army knife of time data

library(lubridate)

ymd("20221016")
mdy("10-16-2022")
dmy("16/10/2022")
ymd_hms("2022-10-16 09:00:00", tz = "Etc/GMT+8") # note: "Etc/GMT+8" is 8 hours behind UTC (UTC-8)

OlsonNames()

Learning by Doing

When facing vectors with different orders, such as:

x <- c("20190101", "01012019", "021901")

How should we identify the time?

#help(parse_date_time)

parse_date_time(x,orders = c("ymd","dmy","dym"))

Matrix

See drhur("algebra") for matrix.

Array

Array: A generalization of the matrix that can record data of more than two dimensions. It can be created with the array() command.

# create two vectors of different lengths
vector1 <- c(5, 9, 3)
vector2 <- c(10, 11, 12, 13, 14, 15)

# enter these vectors into an array
result <- array(c(vector1, vector2), dim = c(3, 3, 2))
result
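Elements of an array are addressed with one index per dimension, here [row, column, layer]. The sketch below rebuilds the same array so that it runs on its own.

```r
result <- array(c(5, 9, 3, 10, 11, 12, 13, 14, 15), dim = c(3, 3, 2))

dim(result)       # 3 rows, 3 columns, 2 layers
result[1, 2, 1]   # row 1, column 2, layer 1
result[, , 2]     # the whole second layer (a 3 x 3 matrix)

# Only 9 values were supplied for 18 cells, so R recycled them:
# the second layer is a copy of the first
layers_equal <- identical(result[, , 1], result[, , 2])
layers_equal
```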

List

List: A "list" that can contain many different types of objects.

ls_monks <- list(name = c("Wukong Sun", "Sanzang Tang", "Wuneng Zhu", "Wujing Sha"),
                 power = c(100, 20, 90, 40),
                 buddha = c(TRUE, TRUE, FALSE, FALSE))

ls_monks
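Elements of a list can be pulled out by name or by position. A short sketch on the same list (the object strongest is a made-up name for illustration):

```r
ls_monks <- list(name = c("Wukong Sun", "Sanzang Tang", "Wuneng Zhu", "Wujing Sha"),
                 power = c(100, 20, 90, 40),
                 buddha = c(TRUE, TRUE, FALSE, FALSE))

ls_monks$name         # extract one element by name
ls_monks[["power"]]   # double brackets also return the element itself
ls_monks["power"]     # single brackets return a smaller list

# Each element keeps its own type
sapply(ls_monks, class)

# Elements can be combined: who has the highest power?
strongest <- ls_monks$name[which.max(ls_monks$power)]
strongest
```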

Data Frame

Data Frame: A special kind of list/matrix

Columns: "Variables"; all columns are of equal length
Rows: "Observations"

In Excel:

In R:

df_toy <- data.frame(female = c(0,1,1,0),
           age = c(29, 39, 38, 12),
           name = c("Iron Man", "Black Widow", "Captain Marvel", "Captain America"))

df_toy
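Because a data frame is both list-like and matrix-like, it can be indexed either way. A minimal sketch on the same toy data (stringsAsFactors = FALSE is added here as an assumption so that name stays a character column on older versions of R as well):

```r
df_toy <- data.frame(female = c(0, 1, 1, 0),
                     age = c(29, 39, 38, 12),
                     name = c("Iron Man", "Black Widow", "Captain Marvel", "Captain America"),
                     stringsAsFactors = FALSE)

df_toy$age           # list-style: one column (variable) by name
df_toy[2, "name"]    # matrix-style: row 2, column "name"

# Keep only the rows (observations) that satisfy a condition
women <- df_toy[df_toy$female == 1, ]
women

mean_age <- mean(df_toy$age)
mean_age
```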

In RStudio:

Data Attributes

class, typeof: query the class and the storage type of an object
nchar: get the number of characters in a string
levels: get or set the levels of a factor
length: get the number of elements of a vector or list (for a data frame, the number of columns)
nrow: return the number of rows of a matrix or data frame
ncol: return the number of columns of a matrix or data frame
dim: return the dimensions (rows, columns) of a matrix or data frame

vec_integer <- c(1L, -2L, NA) # the L suffix stores the values as integers

vec_double <- c(1.5, -2.34, 1/3)

vec_chr <- c("牛", "^_^", "R is hard，but I can nail it.")

vec_fac <- factor(c(1, 2, 2, 3))

ls_monks <- list(name = c("Wukong Sun", "Sanzang Tang", "Wuneng Zhu", "Wujing Sha"),
                 power = c(100, 20, 90, 40),
                 buddha = c(TRUE, TRUE, FALSE, FALSE))

df_toy <- data.frame(female = c(0,1,1,0),
           age = c(29, 39, 38, 12),
           name = c("Iron Man", "Black Widow", "Captain Marvel", "Captain America"))

class(vec_double)
typeof(vec_integer)

nchar(vec_chr)
levels(vec_fac)

length(vec_double)
length(ls_monks)
length(df_toy)

nrow(df_toy)
ncol(df_toy)
dim(df_toy)

Learning by Doing

Convert the following vector to numeric type:

c(FALSE, TRUE)

# help(as.numeric)

as.numeric(c(FALSE, TRUE))

The gender variable female in wvs7 takes the values TRUE and FALSE.

In some analyses a logical variable is inconvenient to work with, and it can be converted into a numeric variable.

as.numeric(wvs7$female) # TRUE becomes 1, FALSE becomes 0; subtract 1 only if female is stored as a factor

Data Output

Syntax: <command>(<data to be saved>, file = <storage path>)

Store as R data

saveRDS(df_toy, file = "df_toy.rds")
save(df_toy, ls_monks, file = "test.rdata")

Save as csv file

write.csv(df_toy, file = "toy.csv")
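A round trip is a quick way to check that saved data can be read back. The sketch below writes to temporary files (tempfile() is used so nothing is left in your working directory) and uses a shortened, two-row version of df_toy. Setting row.names = FALSE is a choice made here, not part of the example above; it keeps write.csv() from adding an extra column of row numbers.

```r
df_small <- data.frame(female = c(0, 1),
                       age = c(29, 39),
                       name = c("Iron Man", "Black Widow"),
                       stringsAsFactors = FALSE)

rds_path <- tempfile(fileext = ".rds")
csv_path <- tempfile(fileext = ".csv")

# Write
saveRDS(df_small, file = rds_path)
write.csv(df_small, file = csv_path, row.names = FALSE)

# Read back
from_rds <- readRDS(rds_path)
from_csv <- read.csv(csv_path, stringsAsFactors = FALSE)

# .rds preserves the object exactly; .csv keeps the values but
# column types are guessed again when the file is read
rds_identical <- identical(df_small, from_rds)
rds_identical
str(from_csv)
```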

Hint: If your data is in Chinese, there may be garbled characters in the stored csv.

You can store data in STATA, SPSS, SAS, Excel, JSON, Matlab, HTML, and other formats through special packages or the "Swiss Army Knife" (rio::export), but do you really want to do this?

STATA (.dta, <14): 3.16 G = R (.rds): 0.05 G

| Method | Average Time | Minimum | Maximum |
|:-----------------|:----------------:|:-----------:|:-----------:|
| base::readRDS | 19.65 | 18.64 | 21.01 |
| fst::read_fst | 1.39 | 0.56 | 3.41 |
| haven::read_sav | 104.78 | 101.00 | 111.85 |
| qs::qread | 3.33 | 3.00 | 4.24 |

: Average time (in seconds) taken by the four ways of reading GSS data in R

| Method | Average Time | Minimum | Maximum | File Size |
|:----------------|:----------------:|:-----------:|:-----------:|:-------------:|
| base::saveRDS | 98.36 | 93.09 | 103.24 | 30.9 MB |
| fst::write_fst | 2.70 | 1.86 | 4.05 | 122.1 MB |
| qs::qsave | 5.03 | 4.35 | 6.62 | 44.6 MB |

: Average time taken to write GSS data (and file size) in R

Summary

Basic Concepts
- OOP
- Function
- Data Packages
- Assignment
Data Input
- Built-in/csv
- With the help of packages
Data Type
- Vector
- Matrix
- Array
- List
- Data Frame
Data Attributes
Data Output
- Store as R data
- Store as other data types

drhur documentation built on May 31, 2023, 6:03 p.m.
