K Means Clustering Algorithm: Complete Guide in Simple Words

Are you looking for a complete guide to the K Means Clustering Algorithm? If yes, then you are in the right place. Here I will discuss all the details of the K Means Clustering Algorithm, so give this article a few minutes of your time.

Hello, & Welcome!

In this blog, I am going to cover-

K Means Clustering Algorithm

Before moving into K Means Clustering, you should have a brief idea about Clustering in Machine Learning.

That’s why we will start with Clustering and then move into the K Means Clustering Algorithm.

Clustering in Machine Learning

Clustering means dividing items into groups. Items in one group are similar to each other, and items in different groups are dissimilar to each other.

In Machine Learning, clustering is used to divide data items into separate clusters. Similar items are put into one cluster.

In that image, cluster 1 contains all the red items, which are similar to each other, and cluster 2 contains all the green items.

Clustering is a form of Unsupervised Learning, because the data items carry no labels telling us which group they belong to.

Now you may be wondering where clustering is used.

You can see clustering in a supermarket, where similar items are kept in one place. For example, one variety of mangoes is kept in one place, while other varieties of mangoes are kept in another place.

Now let’s see the different types of Clustering-

Types of Clustering Algorithms in Machine Learning

There are 3 types of Clustering-

Exclusive Clustering.
Overlapping Clustering.
Hierarchical Clustering.

1. Exclusive Clustering

It is also known as Hard Clustering. That means each data item belongs exclusively to one cluster, and no item is shared between two clusters. You saw this in the previous image, where every item is either in the red cluster or in the green cluster.

An example of Exclusive Clustering is K Means Clustering.

2. Overlapping Clustering

Overlapping clustering is also known as Soft Clustering. That means a data item may belong to more than one cluster.

As you can see in that image, the two clusters overlap, because some data points do not belong to just one cluster.

An example of Overlapping Clustering is Fuzzy C Means Clustering.
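
To make the difference between hard and soft clustering concrete, here is a minimal Python sketch. The function names, the two centroids, and the test point are all made up for illustration, and the membership rule is the usual Fuzzy C Means weighting with fuzziness m=2, which is an assumption and not something defined in this article.

```python
import math

def hard_label(point, centroids):
    """Hard clustering: the point gets exactly one cluster index."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))

def fuzzy_memberships(point, centroids, m=2.0):
    """Soft clustering: one membership weight per cluster, summing to 1.

    Assumes the point does not sit exactly on a centroid (distance > 0).
    """
    dists = [math.dist(point, c) for c in centroids]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_k) ** power for d_k in dists)
            for d_j in dists]

# Hypothetical centroids and data point
centroids = [(0.0, 0.0), (6.0, 0.0)]
point = (2.0, 0.0)

print(hard_label(point, centroids))         # index of the single nearest cluster
print(fuzzy_memberships(point, centroids))  # share of the point in each cluster
```

With hard clustering the point belongs only to the nearer cluster. With soft clustering it belongs mostly to the nearer cluster and partly to the other one.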

3. Hierarchical clustering

Hierarchical Clustering groups similar objects into clusters and then merges those clusters step by step. The final cluster at the top of the hierarchy combines all clusters into one.

An example of Hierarchical Clustering is Agglomerative Clustering. Its result is usually drawn as a tree diagram called a Dendrogram.

Now you have gained a brief knowledge of Clustering and its types. Our main focus is K means Clustering, so let’s move into it.

What is K Means Clustering Algorithm in Machine Learning?

The objective of the K Means Clustering algorithm is to find groups or clusters in data. Here “K” represents the number of clusters.

Let’s understand K means Clustering with the help of an example-

Suppose we have two variables in our dataset, and we decide to plot those two variables on the X and Y-axis.

After plotting them on the X and Y-axis, the data looks something like this-

So here the question is- can we identify certain groups among all the data points?

And can we identify the number of groups?

The answer is yes, with the help of the K Means Clustering Algorithm.

The K Means Clustering Algorithm allows us to identify those clusters or groups. After running it, every data point is assigned to a cluster. In this example, K means clustering separates the data points into two different clusters, which look something like this-

So, how does K Means Clustering separate data points into different clusters?

Let’s see in the next section-

How Does the K-Means Clustering Algorithm Work?

Suppose we have some data points, something like this-

And our task is to separate these data points into 3 different clusters, which look something like this-

I will explain the whole working of K Means Clustering step by step.

So, Are you Excited?

Yes.

Let’s start-

Step 1

We start with this dataset, where we don’t have any idea which item belongs to which cluster.

The first step in K means clustering is to choose the number of clusters. That is K.

Now you may be wondering how to choose the optimal K value.

Don’t worry. I will discuss how to choose the K value in a later section.

For now, let’s imagine we choose the number of clusters as 3.

So, k=3.

Step 2

The next step is to randomly select 3 distinct data items. These data items are the initial centroids of the 3 clusters.

Suppose we have chosen these 3 data items.
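
As a small sketch of this step in Python, the standard library's random.sample can pick K distinct items from a list. The data points and the seed below are made up for illustration.

```python
import random

# Hypothetical 2-D data points; any list of (x, y) tuples would do.
points = [(1.0, 1.0), (1.5, 2.0), (3.0, 4.0), (5.0, 7.0),
          (3.5, 5.0), (4.5, 5.0), (3.5, 4.5)]

k = 3
rng = random.Random(42)            # fixed seed so the run is repeatable
centroids = rng.sample(points, k)  # k items from distinct positions in the list

print(centroids)
```

A different seed picks different starting centroids, which matters later when we look at why results can differ between runs.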

Now the next step is-

Step 3

Calculate the distance between the first point and the three selected centroids.

That means we take the first point in our dataset and calculate its distance from each of the three randomly chosen centroids.

For calculating the distance, you can use the Euclidean distance formula.

So, we calculate the distance from the first point to all three centroids.
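
Here is a minimal sketch of the Euclidean distance calculation in Python. The first point and the three centroids are hypothetical values chosen for illustration.

```python
import math

def euclidean(p, q):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical first data point and three centroids
first_point = (1.0, 1.0)
centroids = [(1.5, 2.0), (3.0, 4.0), (5.0, 7.0)]

# One distance per centroid, in the same order as the centroids list
distances = [euclidean(first_point, c) for c in centroids]
print(distances)
```

In this made-up example the first centroid is clearly the closest one to the first point.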

Step 4

After calculating the distances, assign the first point to the nearest cluster. In other words, assign it to the cluster whose centroid is at the minimum distance from that point.

In this example, the first point is nearest to the Red Cluster. That’s why the first point goes into the Red Cluster, as you can see in that image-
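
The assignment step can be sketched in Python as picking the centroid with the smallest distance. The colour names, centroid positions, and the helper name nearest_centroid are assumptions made for this example.

```python
import math

def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))

# Hypothetical centroids, one per colour
names = ["red", "green", "blue"]
centroids = [(1.5, 2.0), (3.0, 4.0), (5.0, 7.0)]

first_point = (1.0, 1.0)
idx = nearest_centroid(first_point, centroids)
print(names[idx])  # red
```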


Step 5

After assigning the first point to the Red Cluster, we need to calculate the new centroid of the Red Cluster.

The new centroid is simply the mean of all the points currently in the cluster. I will show the calculation in a later section with a solved example of K means clustering.

Suppose the new centroid of the red cluster lies somewhere here-

So, our first point is assigned to the Red cluster. Now we take the second data point and repeat the same procedure as we did for the first data point.

Calculate its distance from all three centroids and assign it to the cluster whose centroid is nearest.

After performing the same steps for all data points, we get our result, which looks something like this-
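
As a quick sketch of the centroid update in Python, the new centroid is the coordinate-wise mean of the points in the cluster. The two points in the red cluster below are hypothetical.

```python
def mean_point(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

# Hypothetical red cluster: its starting centroid plus the newly assigned point
red_cluster = [(1.5, 2.0), (1.0, 1.0)]

new_centroid = mean_point(red_cluster)
print(new_centroid)  # (1.25, 1.5)
```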

This result is not the same as the one we expected in the beginning.

Why didn’t we get the same result?

The answer is the random selection of points. At step 2, we randomly selected three data points and made them the centroids of the three clusters.

So, what will happen if we select some other random data points as centroids?

The answer is that we may get a different result. Some data points that are in the green cluster now may end up in the red cluster. That means the cluster a data point lands in depends on the randomly selected centroids.

So, the K means clustering algorithm iterates again and again until the data points within each cluster stop changing.

That means it goes back to step 3 with the updated centroids, repeats steps 3 to 5 for every data point, and checks the result. When no data point changes its cluster, K means clustering stops.

Now you may be wondering how many iterations it performs.

That depends on you. If you tell the machine to perform 30 iterations, it will perform at most 30 iterations. It doesn’t mean that you will get a different result in each of the 30 iterations. After some iterations, you will get the same result again and again. That means K means clustering has converged: the clusters are stable for the chosen starting centroids, although a different random start can still give different clusters.
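
Putting the steps together, here is a minimal sketch of the whole loop in Python. It is one common way to write K means, not the only one: it updates the centroids once per pass over all the points, rather than after every single point as in the walkthrough above. The function names, the data, the seed, and the default of 30 iterations are assumptions for illustration.

```python
import math
import random

def mean_point(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def k_means(points, k, max_iterations=30, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # step 2: random starting centroids
    labels = None
    for _ in range(max_iterations):
        # steps 3 and 4: assign every point to its nearest centroid
        new_labels = [min(range(k), key=lambda i: math.dist(p, centroids[i]))
                      for p in points]
        if new_labels == labels:        # no point changed cluster, so stop
            break
        labels = new_labels
        # step 5: move each centroid to the mean of its points
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:                 # keep the old centroid if a cluster is empty
                centroids[i] = mean_point(members)
    return labels, centroids

# Hypothetical data: three loose groups of points
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0),
          (1.0, 8.0), (1.0, 9.0), (2.0, 9.0)]

labels, centroids = k_means(points, k=3)
print(labels)
print(centroids)
```

Changing the seed changes the starting centroids, so it is worth trying a few seeds to see whether the final clusters stay the same.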

I hope you now understand how the K means clustering algorithm works.

But you may be wondering how to choose the K value. Right?

So, let’s see how you can choose the K value in K means clustering.

How to Choose K Value in K Means Clustering Algorithm?

This is the most common question that comes to mind when someone learns K means Clustering.

How do we choose the K value? Is there any formula for choosing it?

Yes, there is a formula that helps, known as WCSS (Within Cluster Sum of Squares).
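
Written out for three clusters, under the usual definition of WCSS, it looks like this. The symbols are assumptions for illustration: P_i is a data point, C_1, C_2 and C_3 are the centroids of the three clusters, and d is the Euclidean distance.

```latex
\mathrm{WCSS} =
    \sum_{P_i \in \text{Cluster 1}} d(P_i, C_1)^2
  + \sum_{P_i \in \text{Cluster 2}} d(P_i, C_2)^2
  + \sum_{P_i \in \text{Cluster 3}} d(P_i, C_3)^2
```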

This formula is for 3 clusters. Here, we take all the points in each cluster and calculate the squared distance from each point to the centroid of its cluster.

For cluster 1, for example, we take all the points in cluster 1, calculate the squared distance from each point to the centroid of cluster 1, and sum them up. We do the same for clusters 2 and 3 and add the three sums together.
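
The calculation can be sketched in Python as follows. It uses squared distances, as the "Sum of Square" in the name suggests, and the three small clusters are made up for illustration.

```python
def wcss(clusters):
    """Sum of squared distances from each point to its cluster's centroid."""
    total = 0.0
    for members in clusters:
        # centroid = coordinate-wise mean of the cluster's points
        centroid = tuple(sum(c) / len(members) for c in zip(*members))
        for p in members:
            total += sum((a - b) ** 2 for a, b in zip(p, centroid))
    return total

# Hypothetical clustering with 3 clusters
clusters = [
    [(1.0, 1.0), (1.0, 3.0)],   # centroid (1, 2), contributes 1 + 1
    [(5.0, 5.0), (7.0, 5.0)],   # centroid (6, 5), contributes 1 + 1
    [(9.0, 1.0)],               # single point, contributes 0
]

print(wcss(clusters))  # 4.0
```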

So for k=1, the value of WCSS is the largest. As we increase the number of clusters or K, the value of WCSS decreases.

So, how many clusters or K can we have?

The answer is that we can have as many clusters as we have data points. That means if we have 30 data points, we can have up to 30 clusters. In that case, every single point has its own cluster.

Suppose you have 30 data points and 30 clusters, so each data point has its own cluster. In that case, what is the value of WCSS?

The WCSS becomes 0, because each point is its own centroid, so the distance between the point and its centroid is 0.
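
A tiny Python check of these two extremes, using four made-up points in place of 30: one cluster holding everything, and one cluster per point.

```python
def wcss(clusters):
    """Sum of squared distances from each point to its cluster's centroid."""
    total = 0.0
    for members in clusters:
        centroid = tuple(sum(c) / len(members) for c in zip(*members))
        for p in members:
            total += sum((a - b) ** 2 for a, b in zip(p, centroid))
    return total

# Hypothetical data points
points = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0)]

one_cluster = [points]                # k = 1: everything in one cluster
own_clusters = [[p] for p in points]  # k = number of points

print(wcss(one_cluster))   # 20.0
print(wcss(own_clusters))  # 0.0
```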

So, the higher the number of clusters, the lower the WCSS and the tighter each cluster fits its points. But one cluster per point tells us nothing about the data, so the lowest WCSS is not the goal.

But how do we find the optimal number of clusters?

The answer is with the help of the Elbow Method.

The Elbow Method is a visual method, where you decide the optimal number of clusters with the help of a chart of WCSS against K.

As you can see in that image, there is a huge change from K=1 to K=2, because the WCSS value decreases from 8000 to 3000.

From K=2 to K=3, there is also a huge change. The WCSS value drops from 3000 to 1000.

But after K=3, there is no huge change in the WCSS value.

So, you can say that K=3 is the optimal number of clusters for those data points. This bend in the chart looks like an elbow, which gives the method its name.

So, that’s all about how to choose the optimal K value.

I hope you understood the method.

Now let’s see some manually solved examples of K means clustering.
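
The elbow is normally read off the chart by eye, but the same reasoning can be sketched in Python. The WCSS values for K=1 to K=3 come from the chart described above. The values for K=4 and K=5 are made-up placeholders, and the 10% threshold and the helper name pick_elbow are a hypothetical rule of thumb, not a standard formula.

```python
# K=1..3 from the chart; K=4 and K=5 are hypothetical placeholder values
wcss_by_k = {1: 8000, 2: 3000, 3: 1000, 4: 800, 5: 700}

def pick_elbow(wcss_by_k, threshold=0.10):
    """Return the first K after which WCSS stops dropping by much.

    "Much" here means at least `threshold` times the WCSS at the smallest K.
    """
    ks = sorted(wcss_by_k)
    baseline = wcss_by_k[ks[0]]
    for k, next_k in zip(ks, ks[1:]):
        drop = wcss_by_k[k] - wcss_by_k[next_k]
        if drop < threshold * baseline:
            return k
    return ks[-1]

print(pick_elbow(wcss_by_k))  # 3
```

The drops are 5000, then 2000, then only 200, so the sketch stops at K=3, which matches what the chart shows.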