findPathF1: Find best subset of points for follow-up experiments, using...
In NITPicker: Finds the Best Subset of Points to Sample


View source: R/Pathfinder.R

findPathF1 finds the best subset of points to sample from a time course (or from positions along a single spatial axis), based on a set of example curves. Specifically, it selects subsets of points from which the shape of the curve can be estimated effectively.

findPathF1(tp, training, numSubSamples, spline = 1,
  resampleTraining = T, iter = 20, knots = 100, numPerts = 1000,
  fast = T, mult = F, weights = c())

`tp`	A numerical vector of time points (or spatial coordinates along a single axis)
`training`	A numerical matrix of training data, where the rows represent different samples, the columns represent different time points (or points on a single spatial axis), and the values correspond to measurements. (If `mult` is TRUE, this is instead a list of training matrices.)
`numSubSamples`	An integer giving the number of time points to subsample.
`spline`	A positive integer representing the spline used to interpolate between knots when generating perturbations. Note that this does NOT designate the spline used when calculating the L2-error.
`resampleTraining`	A boolean designating whether the exact training data should be used (FALSE) or whether a probability distribution of curves should be generated and training curves resampled from it (TRUE).
`iter`	A positive integer giving the maximum number of iterations used during time warping (see time_warping in the fdasrvf library).
`knots`	A positive integer. For time warping to work optimally, the points must be evenly sampled; this argument sets how many evenly spaced points are sampled before time warping is carried out.
`numPerts`	A positive integer giving the number of sampled curves to output.
`fast`	A boolean that determines whether the algorithm runs in fast mode, in which the sum of the perturbations is calculated prior to integration.
`mult`	A boolean. If TRUE, `training` must be a list of training matrices, as is the case when multiple genes are considered at the same time. Training sets will be normalised by the size of the L2-error.
`weights`	A numeric vector with one entry per training curve, describing the relative importance of each curve.
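
The `knots` argument refers to sampling the curves at evenly spaced points before time warping. As a rough illustration of that idea only, the base-R sketch below resamples one curve onto an evenly spaced grid using linear interpolation. The data values are made up, and the use of `approx` is an assumption for illustration; the package may interpolate differently internally.

```r
# Hypothetical, unevenly spaced time points and measurements (illustration only)
tp <- c(0, 1, 2, 4, 8, 12)
y  <- c(0.0, 0.8, 1.5, 2.1, 1.2, 0.3)

# Plays the same role as the `knots` argument: number of evenly spaced points
knots <- 100

# Evenly spaced grid spanning the observed range
grid <- seq(min(tp), max(tp), length.out = knots)

# Linear interpolation onto the grid (base R; not the package's own code)
resampled <- approx(x = tp, y = y, xout = grid)

# The resampled curve has `knots` points with constant spacing
print(length(resampled$y))
print(range(diff(resampled$x)))
```

A larger `knots` value gives a finer grid at the cost of more computation in any later step that works point by point.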

An integer vector of the indices of the time points selected to be subsampled. The actual time points can be found by tp[output]. The length of this vector should be numSubSamples.
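
Because the return value holds indices rather than time points, it is typically used to subset `tp`. The sketch below shows that step in base R; the index vector `selected` is a hypothetical stand-in for the output of findPathF1, and the time points are made up.

```r
# Hypothetical time points (e.g. hours) passed as `tp`
tp <- c(0, 2, 4, 8, 12, 24, 48)

# Hypothetical indices, standing in for the vector returned by findPathF1
selected <- c(1L, 3L, 4L, 6L)
numSubSamples <- 4

# The returned vector should have one index per requested subsample
stopifnot(length(selected) == numSubSamples)

# Map indices back to the actual time points
chosen_times <- tp[selected]
print(chosen_times)
```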

 
# Load the packages (the CanadianWeather data set ships with the fda package):
library(NITPicker)
library(fda)
# Load data:
# matrix with 12 rows, representing months (time),
# and 35 columns, representing cities (experiments)
mat <- CanadianWeather$monthlyTemp
# Find a set of points that help predict the shape of the curve:
a <- findPathF1(c(1:12), mat, 5, numPerts = 3) # make numPerts >= 20 for real data
print(a) # indices of months to select for follow-up experiments
print(rownames(CanadianWeather$monthlyTemp)[a]) # month names selected