Creates a MonoClust object after partitioning the data set using Monothetic Clustering.
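
To illustrate the idea behind a monothetic split, here is a minimal base R sketch, not the package's implementation. It assumes Euclidean distance on raw coordinates and searches every variable for the single binary cut that minimizes the total within-cluster inertia. The helper names `inertia` and `best_split` and the `toy` data are hypothetical and exist only for this illustration.

```r
# Inertia of a cluster: sum of squared Euclidean distances to its centroid.
inertia <- function(x) {
  x <- as.matrix(x)
  sum(sweep(x, 2, colMeans(x))^2)
}

# Try every variable and every midpoint between consecutive sorted unique
# values; keep the split "variable < cut" with the smallest total inertia.
best_split <- function(data) {
  best <- list(variable = NA_character_, cut = NA_real_, inertia = Inf)
  for (v in names(data)) {
    vals <- sort(unique(data[[v]]))
    if (length(vals) < 2) next
    cut_points <- (head(vals, -1) + tail(vals, -1)) / 2
    for (cut_point in cut_points) {
      left <- data[[v]] < cut_point
      total <- inertia(data[left, , drop = FALSE]) +
        inertia(data[!left, , drop = FALSE])
      if (total < best$inertia) {
        best <- list(variable = v, cut = cut_point, inertia = total)
      }
    }
  }
  best
}

# Hypothetical toy data with two well-separated groups along x.
toy <- data.frame(x = c(1, 2, 3, 10, 11, 12), y = c(5, 4, 6, 5, 6, 4))
result <- best_split(toy)
print(result)
```

Applying the same search recursively to the resulting clusters gives a tree of single-variable rules like the ones printed by a MonoClust object.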
MonoClust(
  toclust,
  cir.var = NULL,
  variables = NULL,
  distmethod = NULL,
  digits = getOption("digits"),
  nclusters = 2L,
  minsplit = 5L,
  minbucket = round(minsplit/3),
  ncores = 1L
)
Argument | Description |
---|---|
toclust | Data set as a data frame. |
cir.var | Index or name of the circular variable in the data set. |
variables | Variables selected for the clustering procedure, given as a vector of variable indexes or a vector of variable names. |
distmethod | Distance method to use with the data set. Can be chosen from "euclidean" (for Euclidean distance), "manhattan" (for Manhattan distance), or "gower" (for Gower distance). If not set, Euclidean distance is used unless |
digits | Number of significant digits printed in the output. |
nclusters | Number of clusters created. Default is 2. |
minsplit | The minimum number of observations that must exist in a node in order for a split to be attempted. Default is 5. |
minbucket | The minimum number of observations in any terminal leaf node. Default is round(minsplit/3). |
ncores | Number of CPU cores on the current host. If greater than 1, parallel processing with |
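
As a rough illustration of the three `distmethod` options, the sketch below computes each distance on a small made-up data frame using base R only. It is not how the package computes distances internally. The `gower_numeric` helper is hypothetical and assumes all variables are numeric and non-constant; the Gower distance for mixed data types involves more cases than shown here. The last line evaluates the default `minbucket` expression from the usage for the default `minsplit = 5L`.

```r
# Hypothetical toy data: three observations, two numeric variables.
toy <- data.frame(a = c(0, 3, 10), b = c(0, 4, 20))

# Euclidean and Manhattan distances via stats::dist().
d_euclid <- as.matrix(dist(toy, method = "euclidean"))
d_manhat <- as.matrix(dist(toy, method = "manhattan"))

# Gower distance for numeric variables only: the mean over variables of
# absolute differences scaled by each variable's range.
# Assumes every column has a non-zero range.
gower_numeric <- function(data) {
  x <- as.matrix(data)
  ranges <- apply(x, 2, function(col) diff(range(col)))
  scaled <- sweep(x, 2, ranges, "/")
  as.matrix(dist(scaled, method = "manhattan")) / ncol(x)
}
d_gower <- gower_numeric(toy)

print(d_euclid[1, 2])  # distance between observations 1 and 2
print(d_manhat[1, 2])
print(d_gower[1, 2])

# Default minbucket when minsplit keeps its default of 5L.
minbucket_default <- round(5L / 3)
print(minbucket_default)
```

Because Gower distance rescales each variable by its range, it always falls between 0 and 1 for numeric data, which matters when variables are measured on very different scales.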
A MonoClust object. See MonoClust.object.
Chavent, M. (1998). A monothetic clustering method. Pattern Recognition Letters, 19(11), 989-996. doi: 10.1016/S0167-8655(98)00087-7.
Tran, T. V. (2019). Monothetic Cluster Analysis with Extensions to Circular and Functional Data. Montana State University - Bozeman.
# Very simple data set
library(cluster)
data(ruspini)
ruspini4sol <- MonoClust(ruspini, nclusters = 4)
ruspini4sol
#> n = 75
#>
#> Node) Split, N, Cluster Inertia, Proportion Inertia Explained,
#>       * denotes terminal node
#>
#> 1) root 75 244373.900 0.6344215
#>   2) y < 91 35 43328.460 0.9472896
#>     4) x < 37 20 3689.500 *
#>     5) x >= 37 15 1456.533 *
#>   3) y >= 91 40 46009.380 0.7910436
#>     6) x < 63.5 23 3176.783 *
#>     7) x >= 63.5 17 4558.235 *
#>
#> Note: One or more of the splits chosen had an alternative split that reduced inertia by the same amount. See "alt" column of "frame" object for details.

# data with circular variable
library(monoClust)
data(wind_sensit_2007)
# Use a small data set
set.seed(12345)
wind_reduced <- wind_sensit_2007[sample.int(nrow(wind_sensit_2007), 10), ]
circular_wind <- MonoClust(wind_reduced, cir.var = 3, nclusters = 2)
#> Warning: binary variable(s) 1 treated as interval scaled
circular_wind
#> n = 10
#>
#> Node) Split, N, Cluster Inertia, Proportion Inertia Explained,
#>       * denotes terminal node
#>
#> 1) root 10 0.475149600 0.4561506
#>   2) WS < 5.8 3 0.006581825 *
#>   3) WS >= 5.8 7 0.251828000 *
#>
#> Circular variable's first cut
#> WDIR : 168.2
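
The second example marks a wind direction column as circular. As a hedged illustration of why such a variable needs separate handling, the base R sketch below shows that directions near 0 and 360 degrees are close together, and that dividing a circle into two groups takes two cut points rather than one. The helpers `circ_diff` and `in_arc` and the cut values 300 and 60 are hypothetical; they are not taken from the package or from the output above.

```r
# Shortest angular separation, in degrees, between two directions.
circ_diff <- function(a, b) {
  d <- abs(a - b) %% 360
  pmin(d, 360 - d)
}

# TRUE when theta lies on the arc that runs from `start` up to (but not
# including) `end`, moving in the direction of increasing angle.
in_arc <- function(theta, start, end) {
  ((theta - start) %% 360) < ((end - start) %% 360)
}

# 350 and 10 degrees differ by 340 on a linear scale but only 20 on a circle.
print(circ_diff(350, 10))

# Hypothetical pair of cuts at 300 and 60 degrees: one arc crosses north.
directions <- c(10, 200, 330, 90)
print(in_arc(directions, start = 300, end = 60))
```

A linear rule such as "direction < 168.2" cannot express the arc that crosses north, which is why a split on a circular variable is described by a pair of cuts.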