profiles_revised: Cleaned OkCupid profile data


Description

Cleaned profile data of 59,946 OkCupid users who were living within 25 miles of San Francisco, had active profiles during a period in the 2010s, and had at least one picture in their profile. The original data and codebook can be found at https://github.com/rudeboybert/JSE_OkCupid.

Usage

profiles_revised

Format

A data.frame with 59946 rows and 22 variables:

age

Age

body_type

Body type

diet

Dietary habits

drinks

Drinking habits

drugs

Drug usage habits

education

Education level

ethnicity

Ethnicity

height

Height in inches, with random noise added (each value shifted by -1, 0, or 1, drawn uniformly at random)

income

Income

job

Job

offspring

Number of offspring

orientation

Sexual orientation

pets

Number of pets

religion

Religious affiliation

sex

Sex. Note that at the time, OkCupid only allowed a male/female binary. This has since been relaxed.

sign

Astrological sign

smokes

Smoking habits

speaks

Languages spoken

status

Relationship status
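The noise described for height can be illustrated with a minimal sketch. The source does not show how the noise was actually generated, so this is only one plausible reading: each value is shifted by -1, 0, or 1 with equal probability. The heights, the seed, and the variable names are made up for illustration.

```r
# Hypothetical illustration; not the package's actual cleaning code.
set.seed(1)                        # arbitrary seed, for reproducibility only
true_height <- c(60, 65, 70, 75)   # made-up heights in inches

# Draw one shift per height, uniformly from -1, 0, 1
noise <- sample(c(-1, 0, 1), size = length(true_height), replace = TRUE)
noisy_height <- true_height + noise
```

Under this reading, each reported height differs from the true value by at most one inch.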

Details

The differences between the cleaned and original versions of the profiles data are:

Essay Responses

Due to file size restrictions, only the first 140 characters of each user's first essay response ("My self summary") are included

Missing income values

Previously coded as -1, they are now coded as NA

All other missing values

Previously coded as "", they are now coded as NA

offspring and sign

String instances of "&rsquo;" are replaced with apostrophes
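The recoding steps listed above can be sketched on a toy data frame. This is an illustration under stated assumptions, not the package's actual cleaning script: the function name clean_profiles, the column name essay0, the sample values, and the choice of "&rsquo;" as the token replaced in offspring and sign are all assumptions made for the example.

```r
# Toy data mimicking the original coding; all values are made up.
raw <- data.frame(
  income    = c(-1, 20000, 50000),
  diet      = c("", "vegetarian", "anything"),
  offspring = c("doesn&rsquo;t have kids", "", "has a kid"),
  sign      = c("leo but it doesn&rsquo;t matter", "virgo", ""),
  essay0    = c(strrep("a", 200), "", "short summary"),  # hypothetical column name
  stringsAsFactors = FALSE
)

# Hypothetical helper applying the four documented differences.
clean_profiles <- function(df, token = "&rsquo;") {
  # Missing income values: -1 becomes NA
  df$income[df$income %in% -1] <- NA

  # All other missing values: "" becomes NA in every character column
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], function(x) {
    x[x %in% ""] <- NA
    x
  })

  # offspring and sign: replace the token with an apostrophe
  for (col in c("offspring", "sign")) {
    df[[col]] <- gsub(token, "'", df[[col]], fixed = TRUE)
  }

  # Essay response: keep only the first 140 characters
  df$essay0 <- substr(df$essay0, 1, 140)
  df
}

cleaned <- clean_profiles(raw)
```

After cleaning, missing values can be counted uniformly with colSums(is.na(cleaned)), since -1 and "" no longer need separate handling.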

Source

https://github.com/rudeboybert/JSE_OkCupid

Examples

library(tibble)
library(okcupiddata)  # package that provides profiles_revised
profiles_revised

# If using RStudio:
# View(profiles_revised)

rudeboybert/okcupiddata documentation built on May 20, 2021, 11:14 a.m.