federateHdbscan: Federated hdbscan

View source: R/client_func.R

federateHdbscan    R Documentation

Federated hdbscan

Description

Performs federated hdbscan clustering on a virtual cohort that combines multiple cohorts.

Usage

federateHdbscan(loginFD,
                logins,
                func,
                symbol,
                metric = 'euclidean',
                minPts = 10,
                chunk = 500L,
                mc.cores = 1,
                TOL = .Machine$double.eps,
                width.cutoff = 500L,
                ...)

Arguments

loginFD

Login information of the FD server

logins

Login information of data repositories

func

Encoded definition of a function that prepares the raw data matrices. The function must take two arguments: conns (a list of DSConnection-class objects) and symbol (the name of the R symbol); see datashield.assign.

symbol

Encoded vector of names of the R symbols to assign in the DataSHIELD R session on each server in logins. The assigned R variables will be used as the input raw data. Other assigned R variables in func are ignored.

metric

Distance metric between samples, either 'euclidean' or 'correlation'. For Euclidean distance, the data from each cohort are centered (not scaled) for each variable. For correlation-based distance, the data from each cohort are centered and scaled for each sample.
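The two preprocessing modes described for metric can be illustrated locally in base R. This is only a sketch on made-up data, not the package's federated implementation: the helper names centerColumns and standardizeRows are hypothetical, and deriving a correlation-based distance as 1 - correlation is an assumption, since this page does not state the exact formula.

```r
# Two toy cohorts (rows = samples, columns = variables); values are made up.
set.seed(1)
cohortA <- matrix(rnorm(20, mean = 5), nrow = 4)
cohortB <- matrix(rnorm(30, mean = -2), nrow = 6)

# Euclidean metric: center each variable (column) within its cohort, no scaling.
# Hypothetical helper, not part of the package.
centerColumns <- function(x) scale(x, center = TRUE, scale = FALSE)

# Correlation metric: center and scale each sample (row) within its cohort.
# Hypothetical helper, not part of the package.
standardizeRows <- function(x) t(scale(t(x), center = TRUE, scale = TRUE))

# Stack the preprocessed cohorts into one "virtual cohort" matrix.
euclInput <- rbind(centerColumns(cohortA), centerColumns(cohortB))
corInput  <- rbind(standardizeRows(cohortA), standardizeRows(cohortB))

# Local (non-federated) distances between all samples.
euclDist <- dist(euclInput)
# Assumption: correlation-based distance taken as 1 - Pearson correlation.
corDist  <- as.dist(1 - cor(t(corInput)))
```

Because centering is done within each cohort, a constant offset between cohorts (here, means of 5 and -2) does not contribute to the distances between samples of different cohorts.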

minPts

Minimum size of clusters, see dbscan::hdbscan. Default, 10.

chunk

Size of the chunks into which the resulting matrix is partitioned. Default, 500L.
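As a rough illustration of what partitioning a matrix into chunks of a given size can look like, the following base R sketch splits the rows of a matrix into blocks of at most chunk rows. The helper chunkIndices is hypothetical, and whether the package partitions by rows, by columns, or by both is not stated here, so row-wise splitting is an assumption.

```r
# Hypothetical helper: group the indices 1..n into blocks of at most `chunk`.
chunkIndices <- function(n, chunk = 500L) {
  split(seq_len(n), ceiling(seq_len(n) / chunk))
}

# Toy matrix with 1200 rows and 3 columns.
m <- matrix(seq_len(1200 * 3), nrow = 1200)

# Extract the row blocks; drop = FALSE keeps each block a matrix.
blocks <- lapply(chunkIndices(nrow(m), chunk = 500L),
                 function(idx) m[idx, , drop = FALSE])

# Number of rows per block.
blockSizes <- vapply(blocks, nrow, integer(1))
```

With 1200 rows and chunk = 500L, this yields three blocks, the last one smaller than the others.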

mc.cores

Number of cores for parallel computing. Default, 1.

TOL

Numerical tolerance for treating a value as 0. Default, .Machine$double.eps.
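A zero tolerance of this kind is typically used because floating-point results that should be exactly 0 often are not. The sketch below shows the general idea in base R; how TOL is applied inside federateHdbscan is not documented on this page, so the comparison shown is an assumption.

```r
TOL <- .Machine$double.eps

# A value that is 0 in exact arithmetic but not in floating point.
x <- 0.1 + 0.2 - 0.3

exactZero   <- x == 0        # strict comparison with 0
withinTol   <- abs(x) < TOL  # comparison with 0 up to the tolerance
```

Here the strict comparison fails while the tolerance-based one succeeds.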

width.cutoff

Line width cutoff used when deparsing, see deparse1. Default, 500L.

...

Further arguments, see dbscan::hdbscan.

Value

An object of class hdbscan.


vanduttran/dsMOdual documentation built on Jan. 19, 2025, 6:36 a.m.