## Hands-on Tutorial

# Introduction to the Fuzzy c-means Clustering Algorithm

## A basic introduction to the Fuzzy c-means clustering algorithm and its implementation in Python


There are many clustering algorithms for numerical data. k-means is one of the most basic, and it is commonly used by researchers and analysts. But have you ever heard of Fuzzy c-means for clustering? If you haven’t, this article is for you.

In this short article, you will explore Fuzzy c-means, starting from the basic structure of a fuzzy partition, moving on to the formula and a manual calculation, and ending with an implementation in Python using dummy data.

Okay, without further ado, let’s jump in!

## Hard partition vs. fuzzy partition

Before discussing the basic theory of Fuzzy c-means, it helps to look at how data points are allocated to clusters. There are two basic approaches: hard partition and fuzzy partition.

**Hard partition**: each data point is assigned to exactly one cluster and is not a member of any other cluster, assuming that the number of clusters is known. k-means is one of the algorithms that use a hard partition.

For instance, suppose there are ten data points, *X = {x1, x2, …, x10}*, to be assigned to two clusters, let’s say cluster 1 and cluster 2. However, *x6* and *x7* unfortunately lie in a grey area between the two clusters.

Let’s say *U* is the partition matrix for *X*. The rows of *U* represent the clusters and the columns represent the data points, so each element of *U* records whether a given data point belongs to a given cluster.

Remember that in a hard partition, the elements of *U* can only take the binary values 0 or 1, so every data point must be assigned to exactly one cluster, even when it lies in the grey area. In this case, *x6* is in cluster 1 while *x7* is in cluster 2.
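To make the idea concrete, here is a minimal sketch of how such a hard partition matrix could be built in plain Python. Only the assignments of *x6* (cluster 1) and *x7* (cluster 2) come from the example above. The labels for the other eight points, and the names `labels`, `U`, `column_sums`, and `is_hard_partition`, are assumptions made purely for illustration.

```python
# Hypothetical cluster labels for x1..x10.
# Only x6 -> cluster 1 and x7 -> cluster 2 come from the text;
# the remaining labels are assumed for illustration.
labels = [1, 1, 1, 1, 1, 1, 2, 2, 2, 2]
n_clusters = 2

# Build the partition matrix U: rows are clusters, columns are data points.
# An entry is 1 if the data point belongs to that cluster, otherwise 0.
U = [
    [1 if label == cluster else 0 for label in labels]
    for cluster in range(1, n_clusters + 1)
]

for row in U:
    print(row)

# Check the two properties of a hard partition:
# every entry is 0 or 1, and every column sums to exactly 1.
column_sums = [sum(column) for column in zip(*U)]
is_hard_partition = (
    all(value in (0, 1) for row in U for value in row)
    and all(total == 1 for total in column_sums)
)
print(is_hard_partition)
```

Because every column contains a single 1, *x6* and *x7* are each forced into one cluster even though they sit in the grey area, and the matrix keeps no record of how close they were to the other cluster.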