create_profiles_cluster: Create profiles of observed variables using two-step cluster...


View source: R/create_profiles_cluster.R

Description

Create profiles of observed variables using two-step cluster analysis

Usage

create_profiles_cluster(
  df,
  ...,
  n_profiles,
  to_center = FALSE,
  to_scale = FALSE,
  distance_metric = "squared_euclidean",
  linkage = "complete"
)

Arguments

df

A data.frame with two or more columns of continuous variables

...

Unquoted names of the variables in df to cluster on, separated by commas

n_profiles

The specified number of profiles to be found for the clustering solution

to_center

Logical (TRUE or FALSE): whether to center the raw data so that each variable has a mean of 0

to_scale

Logical (TRUE or FALSE): whether to scale the raw data so that each variable has a standard deviation of 1

distance_metric

Distance metric to use for hierarchical clustering; "squared_euclidean" is the default, but more options are available (see ?dist)

linkage

Linkage method to use for hierarchical clustering; "complete" is the default, but more options are available (see ?hclust)
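To make the to_center, to_scale, and distance_metric arguments concrete, the following base R sketch shows what centering, scaling, and a squared Euclidean distance amount to. It is an illustration only and not the package's internal code; the toy data frame and the object names x, x_std, and d_sq are assumptions made for this example.

```r
# Toy data: two continuous variables (made up for illustration)
x <- data.frame(a = c(1, 2, 3, 4), b = c(10, 20, 30, 40))

# Center each column to mean 0 and scale it to SD 1,
# analogous to to_center = TRUE and to_scale = TRUE
x_std <- scale(x, center = TRUE, scale = TRUE)
colMeans(x_std)      # approximately 0 for both columns
apply(x_std, 2, sd)  # 1 for both columns

# Squared Euclidean distance: square the ordinary Euclidean distances
d_sq <- dist(x_std, method = "euclidean")^2
```

Squaring the distances leaves their rank order unchanged but gives more weight to cases that are far apart.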

Details

Creates a specified number of profiles of observed variables using a two-step cluster analysis: hierarchical clustering followed by k-means clustering.
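A common way to carry out a two-step cluster analysis is to use the hierarchical solution to supply starting centers for k-means. The sketch below does this with base R functions (dist, hclust, cutree, kmeans) on simulated data. It assumes this seeding approach and is not taken from the package source; the simulated data and the names start_groups and start_centers are hypothetical.

```r
set.seed(1)

# Simulated data: three groups of 20 cases on two variables
x <- rbind(
  matrix(rnorm(40, mean = 0), ncol = 2),
  matrix(rnorm(40, mean = 4), ncol = 2),
  matrix(rnorm(40, mean = 8), ncol = 2)
)
colnames(x) <- c("v1", "v2")
n_profiles <- 3

# Step 1: hierarchical clustering on squared Euclidean distances
# with complete linkage, cut into n_profiles groups
hc <- hclust(dist(x)^2, method = "complete")
start_groups <- cutree(hc, k = n_profiles)

# Centroids of the hierarchical groups (one row per group)
start_centers <- apply(x, 2, function(col) tapply(col, start_groups, mean))

# Step 2: k-means, started from the hierarchical centroids
km <- kmeans(x, centers = start_centers)
table(km$cluster)
```

Starting k-means from the hierarchical centroids rather than from random centers makes the result reproducible for a given data set.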

Value

A list containing the prepared data, the output from the hierarchical and k-means cluster analyses, the R-squared value, the raw clustered data, the processed clustered data of cluster centroids, and a ggplot object.
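For a k-means solution, an R-squared value is often computed as the share of the total sum of squares that lies between clusters. The sketch below shows that calculation on a standard kmeans() result using the built-in iris data. Whether the package computes its R-squared in exactly this way is an assumption here, and the object names are illustrative.

```r
set.seed(1)

# k-means on four standardized continuous variables from the built-in iris data
km <- kmeans(scale(iris[, 1:4]), centers = 3, nstart = 10)

# Proportion of total variability explained by cluster membership
r_squared <- km$betweenss / km$totss

# Equivalent form: one minus the within-cluster share
r_squared_alt <- 1 - km$tot.withinss / km$totss
```

Values closer to 1 indicate that cluster membership accounts for more of the variability in the clustering variables.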

Examples

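library(prcr)

# pisaUSA15 is an example dataset included with prcr
d <- pisaUSA15

# Request a three-profile solution based on four variables
m3 <- create_profiles_cluster(d,
                              broad_interest, enjoyment, instrumental_mot, self_efficacy,
                              n_profiles = 3)
summary(m3)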

Example output

Prepared data: Removed 354 incomplete cases
Hierarchical clustering carried out on: 5358 cases
K-means algorithm converged: 5 iterations
Clustered data: Using a 3 cluster solution
Calculated statistics: R-squared = 0.424
 broad_interest    enjoyment     instrumental_mot self_efficacy  
 Min.   :1.000   Min.   :1.000   Min.   :1.000    Min.   :1.000  
 1st Qu.:2.200   1st Qu.:2.400   1st Qu.:1.500    1st Qu.:1.625  
 Median :2.800   Median :3.000   Median :2.000    Median :2.000  
 Mean   :2.655   Mean   :2.782   Mean   :2.072    Mean   :2.134  
 3rd Qu.:3.200   3rd Qu.:3.000   3rd Qu.:2.500    3rd Qu.:2.500  
 Max.   :5.000   Max.   :4.000   Max.   :4.000    Max.   :4.000  
    cluster     
 Min.   :1.000  
 1st Qu.:1.000  
 Median :2.000  
 Mean   :1.784  
 3rd Qu.:2.000  
 Max.   :3.000  

prcr documentation built on Feb. 9, 2020, 5:08 p.m.