PULS function for functional data (use it only when you know the data should not be converted into functional data because they are already smooth, e.g., when your data are step functions)

PULS(
  toclust.fd,
  method = c("pam", "ward"),
  intervals = c(0, 1),
  spliton = NULL,
  distmethod = c("usc", "manual"),
  labels = toclust.fd$fdnames[2]$reps,
  nclusters = length(toclust.fd$fdnames[2]$reps),
  minbucket = 2,
  minsplit = 4
)

Arguments

toclust.fd

A functional data object (i.e., an object of class fd) created with the fda package. See fda::fd().

method

The clustering method to run within each subregion. Either "pam" or "ward".

intervals

A data set (or matrix) in which each row is an interval and the two columns give the beginning and ending indexes of that interval.
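As a minimal sketch of the expected layout, the matrix below defines three subregions, one per row, with the beginning and ending index in the two columns. The region names and break points are made up for illustration, and only base R is used.

```r
# Hypothetical subregions over an index range of 1..90 (illustrative values).
early <- c(1, 30)
mid   <- c(30, 60)
late  <- c(60, 90)

# One row per interval; column 1 = beginning index, column 2 = ending index.
# rbind() takes the row names from the variable names.
intervals <- rbind(early, mid, late)
intervals

# Sanity check: every interval must begin before it ends.
stopifnot(all(intervals[, 1] < intervals[, 2]))
```

The row names produced by rbind() are what appear as split names in the printed tree of the Examples section (e.g. Jul, Aug).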

spliton

Restricts the partitioning to a specific set of subregions.

distmethod

The method for calculating the distance matrix. Choose between "usc" and "manual". "usc" uses the fda.usc::metric.lp() function, while "manual" uses the squared distance between functions. See Details.

labels

The names of the entities.

nclusters

The number of clusters.

minbucket

The minimum number of data points allowed in a cluster.

minsplit

The minimum size of a cluster that can still be considered to be a split candidate.
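To illustrate how the two size limits interact, here is a minimal sketch based only on the argument descriptions above. split_allowed() is a hypothetical helper written for this illustration, not a function of the package, and the package's internal checks may differ.

```r
# Hypothetical helper, not part of the package: checks whether splitting a
# cluster of size n_parent into two children respects both size limits.
split_allowed <- function(n_parent, n_left, n_right,
                          minbucket = 2, minsplit = 4) {
  stopifnot(n_left + n_right == n_parent)
  n_parent >= minsplit &&                # parent is large enough to be split
    min(n_left, n_right) >= minbucket    # neither child is too small
}

split_allowed(15, 8, 7)  # TRUE
split_allowed(3, 2, 1)   # FALSE: parent is below minsplit
split_allowed(5, 4, 1)   # FALSE: one child is below minbucket
```

With the defaults minbucket = 2 and minsplit = 4, a cluster of 15 curves split into 8 and 7 (as in the Examples section) passes both checks.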

Value

A PULS object. See PULS.object for details.

Details

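If distmethod = "manual" is chosen, the L2 distance between each pair of functions \(y_i(t)\) and \(y_j(t)\) is given by:

$$d_R(y_i, y_j) = \sqrt{\int_{a_r}^{b_r} [y_i(t) - y_j(t)]^2 dt},$$

where \(a_r\) and \(b_r\) are the beginning and end of the subregion over which the distance is computed.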
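The formula can be checked numerically with a short base R sketch. l2_dist() is a hypothetical helper that integrates the squared difference of two ordinary R functions over one subregion from a_r to b_r; how the package itself evaluates the integral is not stated here, so treat this only as an illustration of the formula.

```r
# Hypothetical helper: L2 distance between two vectorised functions f and g
# over a single subregion with endpoints a_r and b_r.
l2_dist <- function(f, g, a_r, b_r) {
  sq_diff <- function(t) (f(t) - g(t))^2
  sqrt(integrate(sq_diff, lower = a_r, upper = b_r)$value)
}

y_i <- function(t) sin(2 * pi * t)
y_j <- function(t) 0 * t   # the zero function, kept vectorised

# The integral of sin^2(2*pi*t) over [0, 1] is 1/2, so the distance
# should be close to sqrt(1/2).
d <- l2_dist(y_i, y_j, a_r = 0, b_r = 1)
all.equal(d, sqrt(0.5), tolerance = 1e-6)
```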

Examples

library(fda)
#> Loading required package: splines
#> Loading required package: Matrix
#> Loading required package: fds
#> Loading required package: rainbow
#> Loading required package: MASS
#> Loading required package: pcaPP
#> Loading required package: RCurl
#>
#> Attaching package: ‘fda’
#> The following object is masked from ‘package:graphics’:
#>
#>     matplot

# Build a simple fd object from already smoothed smoothed_arctic
data(smoothed_arctic)
NBASIS <- 300
NORDER <- 4
y <- t(as.matrix(smoothed_arctic[, -1]))
splinebasis <- create.bspline.basis(rangeval = c(1, 365),
                                    nbasis = NBASIS,
                                    norder = NORDER)
fdParobj <- fdPar(fdobj = splinebasis,
                  Lfdobj = 2,
                  # No need for any more smoothing
                  lambda = .000001)
yfd <- smooth.basis(argvals = 1:365, y = y, fdParobj = fdParobj)

Jan <- c(1, 31); Feb <- c(31, 59); Mar <- c(59, 90)
Apr <- c(90, 120); May <- c(120, 151); Jun <- c(151, 181)
Jul <- c(181, 212); Aug <- c(212, 243); Sep <- c(243, 273)
Oct <- c(273, 304); Nov <- c(304, 334); Dec <- c(334, 365)

intervals <-
  rbind(Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec)

PULS4_pam <- PULS(toclust.fd = yfd$fd, intervals = intervals,
                  nclusters = 4, method = "pam")
PULS4_pam
#> n = 39
#>
#> Node) Split, N, Cluster Inertia, Proportion Inertia Explained,
#>       * denotes terminal node
#>
#> 1) root 39 8453.2190 0.7072663
#>   2) Jul 15 885.3640 0.8431711
#>     4) Aug 8 311.7792 *
#>     5) Aug 7 178.8687 *
#>   3) Jul 24 1589.1780 0.7964770
#>     6) Jul 13 463.8466 *
#>     7) Jul 11 371.2143 *
#>
#> Note: One or more of the splits chosen had an alternative split that reduced inertia by the same amount. See "alt" column of "frame" object for details.