bdm.ptsne: Parallelized t-SNE (ptSNE)
In jgarriga65/bigMap: Big Data Mapping

bdm.ptsne

R Documentation

Description

Starts the parallelized t-SNE algorithm (ptSNE). This is the first step of the mapping protocol.

Usage

bdm.ptsne(
  data,
  bdm,
  theta = 0.5,
  Y.init = NULL,
  mpi.cl = NULL,
  threads = 4,
  layers = 2,
  info = 0
)

Arguments

`data`	Input data (a matrix, a big.matrix or a .csv file name).
`bdm`	A `bdm` data mapping instance.
`theta`	Accuracy/speed trade-off factor, a value between 0.33 and 0.8. (Default value is `theta = 0.5`.) If `theta < 0.33`, the algorithm uses the exact computation of the gradient. The closer this value is to 1, the faster the computation but the coarser the approximation of the gradient.
`Y.init`	An `n * 2 * layers` matrix with initial mapping positions. (By default, `Y.init = NULL` uses random initial positions.)
`mpi.cl`	MPI (inter-node parallelization) cluster as generated by `bdm.mpi.start()`. By default (`mpi.cl = NULL`), a 'SOCK' (intra-node parallelization) cluster is generated.
`threads`	Number of parallel threads, chosen according to data size and hardware resources (i.e. number of cores and available memory). Default value is `threads = 4`.
`layers`	Number of layers (minimum 2, maximum the number of threads). Default value is `layers = 2`.
`info`	Output information: 1 yields inter-round results, 0 disables intermediate results. Default value is `info = 0`.
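
The ranges documented for `theta`, `threads` and `layers` can be checked before launching a long run. The following is a minimal sketch in base R; `check_ptsne_args()` is a hypothetical helper, not part of bigMap, and it encodes only the constraints stated on this page.

```r
# Hypothetical helper (not part of bigMap): checks the documented ranges
# for theta, threads and layers before calling bdm.ptsne().
check_ptsne_args <- function(theta = 0.5, threads = 4, layers = 2) {
  stopifnot(is.numeric(theta), length(theta) == 1,
            is.numeric(threads), length(threads) == 1,
            is.numeric(layers), length(layers) == 1)
  # layers: minimum 2, maximum the number of threads
  if (layers < 2 || layers > threads) {
    stop("layers must be between 2 and the number of threads")
  }
  # theta < 0.33 selects the exact computation of the gradient
  exact <- theta < 0.33
  if (!exact && theta > 0.8) {
    warning("theta is above the documented range (0.33 to 0.8)")
  }
  list(theta = theta, threads = threads, layers = layers, exact = exact)
}

cfg <- check_ptsne_args(theta = 0.5, threads = 4, layers = 2)
cfg$exact  # FALSE: the approximate gradient is used
```

The returned list only echoes the validated values; passing them on to `bdm.ptsne()` is left to the caller.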

Value

A bdm data mapping instance.

Examples


# --- load example dataset
bdm.example()
# --- perform ptSNE
## Not run: 
# --- run ptSNE
m <- bdm.ptsne(ex$data, ex$map, threads = 10, layers = 2)
# --- plot the Cost function
bdm.cost(m)
# --- plot ptSNE output
bdm.ptsne.plot(m, class.lbls = ex$labels)

## End(Not run)
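
The example above hard-codes `threads = 10`. As a hedged alternative, the thread count could be derived from the machine instead. In the sketch below, `pick_threads()` is a hypothetical helper (not part of bigMap) that uses `parallel::detectCores()` and leaves one core free; whether that is a good choice also depends on available memory, which this sketch does not inspect.

```r
# Hypothetical helper (not part of bigMap): derive a thread count from the
# cores reported by the parallel package, leaving one core free.
pick_threads <- function(layers = 2, cores = parallel::detectCores()) {
  # detectCores() may return NA on some platforms; fall back to `layers`
  if (is.na(cores)) cores <- layers
  # never go below the number of layers (layers cannot exceed threads)
  max(layers, cores - 1)
}

threads <- pick_threads(layers = 2)
# Not run: requires bigMap and the example objects used above
# m <- bdm.ptsne(ex$data, ex$map, threads = threads, layers = 2)
```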

jgarriga65/bigMap documentation built on June 10, 2024, 7:05 a.m.