mapclust

Introduction

Hierarchical classification can also incorporate constraints so that, for example, geographical proximity is taken into account.

Spatialpatches

The spatial distribution of data can be heterogeneous and show local aggregations, or spatial patches. P. Petitgas proposed an algorithm to identify such patches in the context of fish population density (WOILLEZ et al., 2009). The algorithm takes as input the geographical coordinates (X, Y) and a variable of interest (var).

The position of a patch is given by its centre of gravity. The algorithm considers the observations in decreasing order of var. The observation with the highest value initiates the first patch. Each subsequent observation is assigned to the nearest patch, provided that its distance from that patch's centre of gravity is smaller than the threshold distance dlim. Otherwise, the observation forms a new patch. The resulting spatial patches therefore depend on the choice of the threshold dlim and on the location of the highest values of var.
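The assignment rule described above can be sketched in a few lines. This is a minimal illustration in Python, not the package's R implementation. The function name spatial_patches, the Euclidean distance and the weighting of each centre of gravity by var are assumptions made for the example, and var is assumed to be strictly positive.

```python
import math

def spatial_patches(x, y, var, dlim):
    """Illustrative sketch: assign each observation to a patch.

    Assumptions (not taken from the package): Euclidean distance,
    centre of gravity weighted by var, and var strictly positive.
    Returns (labels, centres).
    """
    if any(v <= 0 for v in var):
        raise ValueError("this sketch assumes strictly positive var")
    # Observations are processed in decreasing order of var.
    order = sorted(range(len(var)), key=lambda i: var[i], reverse=True)
    labels = [None] * len(var)
    patches = []  # per patch: [sum(var*x), sum(var*y), sum(var)]
    for i in order:
        # Find the nearest existing patch centre, if any.
        best, best_d = None, math.inf
        for k, (sx, sy, w) in enumerate(patches):
            d = math.hypot(x[i] - sx / w, y[i] - sy / w)
            if d < best_d:
                best, best_d = k, d
        # Start a new patch when no centre lies within dlim.
        if best is None or best_d >= dlim:
            patches.append([0.0, 0.0, 0.0])
            best = len(patches) - 1
        # Update the running weighted sums of the chosen patch.
        patches[best][0] += var[i] * x[i]
        patches[best][1] += var[i] * y[i]
        patches[best][2] += var[i]
        labels[i] = best
    centres = [(sx / w, sy / w) for sx, sy, w in patches]
    return labels, centres

# Two well-separated groups of three observations each.
xs = [0, 1, 0, 10, 11, 10]
ys = [0, 0, 1, 10, 10, 11]
vs = [5, 3, 2, 4, 1, 1]
labels, centres = spatial_patches(xs, ys, vs, dlim=3.0)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

With a much smaller threshold (for instance dlim = 0.5 on the same data), every observation starts its own patch, which illustrates how strongly the result depends on dlim.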

mapclust

To better understand the spatial distribution of individuals, spatial patch construction is applied hierarchically, top-down, by varying the maximum acceptable distance (dlim) between observation points and the centre of gravity of a patch. This approach leads to a Hierarchical Top-down Classification (EVERITT et al., 2001) based at each step on the Spatialpatches algorithm: each node (parent group) of the hierarchy is divided into two patches (child nodes). Spatialpatches was developed to describe the spatial distribution patterns of a fish population from density data, so the variable of interest is assumed to be positive. var can therefore be a count, a frequency, or a positive real-valued variable.
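One possible reading of this top-down scheme is sketched below in Python. It is an illustration only, not the package's R implementation. The helper names (patch_labels, split_in_two, divisive), the grid of candidate dlim values, the var-weighted centre of gravity and the rule "take the largest candidate dlim that yields exactly two patches" are all assumptions made for the example.

```python
import math

def patch_labels(points, dlim):
    """points: list of (x, y, var) with var > 0. Returns patch labels."""
    order = sorted(range(len(points)), key=lambda i: points[i][2], reverse=True)
    labels = [0] * len(points)
    sums = []  # per patch: [sum(var*x), sum(var*y), sum(var)]
    for i in order:
        x, y, v = points[i]
        dists = [math.hypot(x - s[0] / s[2], y - s[1] / s[2]) for s in sums]
        k = min(range(len(sums)), key=dists.__getitem__, default=None)
        if k is None or dists[k] >= dlim:
            sums.append([0.0, 0.0, 0.0])  # start a new patch
            k = len(sums) - 1
        sums[k][0] += v * x
        sums[k][1] += v * y
        sums[k][2] += v
        labels[i] = k
    return labels

def split_in_two(points, candidates):
    """Assumed rule: largest candidate dlim giving exactly two patches."""
    for dlim in sorted(candidates, reverse=True):
        labels = patch_labels(points, dlim)
        if len(set(labels)) == 2:
            return dlim, labels
    return None, None

def divisive(points, candidates):
    """Recursively split nodes; leaves are lists of observation indices."""
    def recurse(idx):
        dlim, labels = split_in_two([points[i] for i in idx], candidates)
        if dlim is None:
            return idx  # no candidate splits this node in two: leaf
        left = [i for i, lab in zip(idx, labels) if lab == 0]
        right = [i for i, lab in zip(idx, labels) if lab == 1]
        return {"dlim": dlim, "children": [recurse(left), recurse(right)]}
    return recurse(list(range(len(points))))

pts = [(0, 0, 5), (1, 0, 3), (0, 1, 2), (10, 10, 4), (11, 10, 1), (10, 11, 1)]
tree = divisive(pts, candidates=[20, 10, 5, 3, 1.2, 0.5])
print(tree)
```

On this toy data the root is split at dlim = 10 into the two visible groups, and neither child can be split in two with the remaining candidates, so both become leaves. How the actual algorithm chooses dlim at each node is not specified in this text.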

When the variable of interest var is real-valued, meaning that it can take negative values, the mapclust classification algorithm had to be adapted. It then works not with the values of var but with the values of the probability density f associated with var. In the same way, the method was adapted to the case where var is multidimensional by working on a kernel density estimate.
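To illustrate why a density makes the method applicable to real-valued or multidimensional var, the sketch below evaluates a Gaussian kernel density estimate at each observation. It is written in Python for illustration and is not the package's R code. The function name kde_at_observations, the Gaussian kernel and the single hand-picked bandwidth are assumptions; the text does not say which estimator or bandwidth mapclust uses.

```python
import math

def kde_at_observations(points, bandwidth):
    """Gaussian kernel density estimate evaluated at each observation.

    points: list of equal-length tuples (length 1 for univariate var).
    bandwidth: a single smoothing parameter shared by all dimensions
    (an assumption of this sketch).
    """
    n = len(points)
    d = len(points[0])
    norm = 1.0 / (n * (bandwidth * math.sqrt(2.0 * math.pi)) ** d)
    densities = []
    for p in points:
        total = 0.0
        for q in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        densities.append(norm * total)
    return densities

# Univariate var containing negative values.
var = [(-2.0,), (-1.9,), (-2.1,), (5.0,)]
f = kde_at_observations(var, bandwidth=1.0)
print(all(v > 0 for v in f))  # True: density values are always positive
```

Because the density values are strictly positive, they can play the role of the positive variable of interest that Spatialpatches expects. Observations in dense regions of var receive high values and isolated ones receive low values.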

mapclust: Divisive Hierarchical Clustering using Spatial Patches

Authors:

L. Bellanger

mail: <lise.bellanger@univ-nantes.fr>

P. Husi

mail: <philippe.husi@univ-tours.fr>

A. Coulon

Maintainer:

A. Coulon

mail: <arthur.coulon@univ-tours.fr>

Contributors:

B. Desachy

B. Martineau

Get started with the mapclust application

Table of contents

Introduction

Spatialpatches

mapclust

The App

Run (first time)

Print label option

Import your data

CSV Format and write.table

Label

Header

Separator

Quote

Decimal

Univariate data

Multivariate data

Evaluation Plot

Within Sum of Squares Plot (WSSPlot)

Example:

Average Silhouette Plot

Example:

Output

To select a partition, click on the dendrogram at the desired height.

Map

Silhouette

Additional information

References