clusterEval: Summarise the differences between the clusters


Description

Calculates per-cluster summaries of each feature (mean, standard deviation and fraction of non-zero values) and returns them as a list.

Usage

clusterEval(clusterResult)

Arguments

clusterResult

A list of class 'clusterResult' returned by running clusterPeople()

Details

This function takes a single input: the clusterResult object obtained by applying clusterPeople().

Value

A list containing:

clusterMeans

A data frame containing the mean of each feature per cluster

clusterSds

A data frame containing the standard deviation of each feature value per cluster

clusterFrac

A data frame containing the fraction of each cluster with non-zero values for the feature
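
As a rough illustration of what these three summaries represent, the sketch below computes them with base R on a small toy data set. It is not the package's implementation: the features data frame, the cluster vector and the feature names drugA and condB are hypothetical and chosen only for this example.

```r
# Toy data: hypothetical feature values and cluster assignments
# (illustrative only, not produced by the package)
features <- data.frame(
  drugA = c(0, 2, 4, 0, 0, 6),
  condB = c(1, 1, 0, 0, 3, 3)
)
cluster <- c(1, 1, 1, 2, 2, 2)

# Mean of each feature per cluster
clusterMeans <- aggregate(features, by = list(cluster = cluster), FUN = mean)

# Standard deviation of each feature per cluster
clusterSds <- aggregate(features, by = list(cluster = cluster), FUN = sd)

# Fraction of each cluster with a non-zero value for the feature
clusterFrac <- aggregate(features, by = list(cluster = cluster),
                         FUN = function(x) mean(x != 0))

# Collect the three data frames in a list, mirroring the structure described above
summaries <- list(clusterMeans = clusterMeans,
                  clusterSds = clusterSds,
                  clusterFrac = clusterFrac)
print(summaries)
```

Each data frame has one row per cluster and one column per feature, so clusters can be compared by reading down a column.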

Examples

library(patientCluster)
library(h2o)

# set the database connection
# (dbms, server, user, pw, port and cdmDatabaseSchema must be defined beforehand)
dbconnection <- DatabaseConnector::createConnectionDetails(dbms = dbms, server = server,
                                                           user = user, password = pw,
                                                           port = port, schema = cdmDatabaseSchema)

# then extract the data - in this example using default groups
clusterData <- dataExtract(dbconnection, cdmDatabaseSchema,
                           cohortDatabaseSchema = cdmDatabaseSchema,
                           workDatabaseSchema = 'scratch.dbo',
                           cohortid = 2000006292, agegroup = NULL, gender = NULL,
                           type = 'group', groupDef = 'default',
                           historyStart = 1, historyEnd = 365, loc = getwd())

# initialise the h2o cluster
h2o.init(nthreads = -1, max_mem_size = '50g')

# cluster the males aged between 30 and 50 into 15 clusters
clusterResult <- clusterRun(clusterData, ageSpan = c(30, 50), gender = 8507,
                            method = 'kmeans', clusterSize = 15,
                            normalise = FALSE, binary = FALSE, fraction = TRUE)

# get the summary details of each cluster
clusterSum <- clusterEval(clusterResult)

jreps/patientCluster documentation built on May 20, 2019, 10:46 a.m.