findPathF3: Find best subset of points for follow-up experiments, using...
In NITPicker: Finds the Best Subset of Points to Sample


View source: R/Pathfinder.R

findPathF3 finds the best subset of points to sample from a time course (or from points along a single spatial axis), based on a set of example curves. Specifically, it finds subsets of points that estimate the shape of the curve, normalised by the variance.

findPathF3(tp, training1, training2, numSubSamples, spline = 1,
  resampleTraining = F, iter = 20, knots = 100, numPerts = 1000,
  fast = T)

`tp`	A numeric vector of time points (or spatial coordinates along a single axis).
`training1`	A numeric matrix of training data for experimental condition 1. Rows represent samples, columns represent time points (or points along a single spatial axis), and values are the measurements.
`training2`	A numeric matrix of training data for experimental condition 2. Rows represent samples, columns represent time points (or points along a single spatial axis), and values are the measurements.
`numSubSamples`	An integer giving the number of time points to subsample.
`spline`	A positive integer representing the spline used to interpolate between knots when generating perturbations. Note that this does NOT designate the spline used when calculating the L2-error.
`resampleTraining`	A boolean designating whether the exact training data should be used (FALSE) or whether a probability distribution of curves should be generated and training curves resampled from it (TRUE).
`iter`	A positive integer giving the maximum number of iterations used during time warping (see time_warping in the fdasrvf library).
`knots`	A positive integer. For time warping to work optimally, the points must be evenly sampled; this sets how many evenly spaced points are sampled before time warping is conducted.
`numPerts`	A positive integer giving the number of sampled curves to output.
`fast`	A boolean that determines whether the algorithm runs in fast mode, in which the sum of the perturbations is calculated prior to integration.
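
As a minimal sketch of the input layout described above, the following base R snippet builds two small training matrices from made-up values and checks that each has one column per entry of tp. The data, the matrix sizes, and the helper name checkTraining are illustrative assumptions and are not part of NITPicker.

```r
# Hypothetical data: 12 time points, 4 samples for condition 1, 3 for condition 2
tp <- 1:12
set.seed(1)
training1 <- matrix(rnorm(4 * length(tp)), nrow = 4, ncol = length(tp))
training2 <- matrix(rnorm(3 * length(tp), mean = 1), nrow = 3, ncol = length(tp))

# checkTraining is an illustrative helper (not a NITPicker function): it
# confirms the input is a numeric matrix with one column per time point
checkTraining <- function(tp, training) {
  is.numeric(training) && is.matrix(training) && ncol(training) == length(tp)
}

ok1 <- checkTraining(tp, training1)
ok2 <- checkTraining(tp, training2)

# A matrix stored the other way round (time points in rows) fails the check;
# transposing it with t() restores the expected layout
okTransposed <- checkTraining(tp, t(training1))
print(c(ok1, ok2, okTransposed))
```

A check like this can catch a transposed matrix before it is passed as training1 or training2.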

An integer vector of indices into tp, identifying the time points selected for subsampling. The actual time points can be recovered with tp[output]. The length of this vector should be numSubSamples.
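
To illustrate how the returned indices relate to tp, here is a small sketch that uses a made-up result vector in place of a real findPathF3 call. The values of tp and selected are assumptions chosen only for illustration.

```r
# Hypothetical, unevenly spaced time points (e.g. hours after treatment)
tp <- c(0, 0.5, 1, 2, 4, 8, 12, 24)
numSubSamples <- 3

# Stand-in for the value returned by findPathF3; a real call would compute this
selected <- c(2L, 5L, 8L)

# The return value holds positions in tp, so index tp to get the time points
selectedTimes <- tp[selected]
print(selectedTimes)

# The result should contain numSubSamples indices
stopifnot(length(selected) == numSubSamples)
```

Because the indices refer to positions rather than values, the same lookup works whether tp is evenly or unevenly spaced.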

library(fda)        #provides the CanadianWeather dataset
library(NITPicker)  #provides findPathF3

#Set up data:
#monthlyTemp has months in rows and weather stations in columns
namAtlantic=CanadianWeather$region[as.character(colnames(CanadianWeather$monthlyTemp))]
atlanticCities=which(namAtlantic=="Atlantic")
matAtlantic=CanadianWeather$monthlyTemp[, names(atlanticCities)]

namContinental=CanadianWeather$region[as.character(colnames(CanadianWeather$monthlyTemp))]
continentalCities=which(namContinental=="Continental")
matContinental=CanadianWeather$monthlyTemp[, names(continentalCities)]

#find a set of points that helps capture the difference 
#between Atlantic and Continental cities, normalised by the variance
#transpose so that rows are samples (cities) and columns are time points (months),
#as the training1 and training2 arguments require
#make numPerts >=20 for real data
a=findPathF3(c(1:12), t(matAtlantic), t(matContinental), 5, numPerts=3)
print(a) #indices of months to select for follow-up experiments
print(rownames(CanadianWeather$monthlyTemp)[a]) #month names selected