How do you __find k for k-means__ using the __elbow method__?

The KMeans algorithm can cluster observed data. But how many clusters (k) are there?

The elbow method finds the __optimal value for k (#clusters)__.

## Determine optimal k

The technique to determine __K, the number of clusters__, is called __the elbow method__.

With a bit of imagination, you can see an elbow in the chart below.

We’ll plot:

- the values for K on the horizontal axis
- the distortion on the vertical axis (the values calculated with the cost function).
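To make "distortion" concrete, here is a minimal sketch of one common cost function: the sum of squared distances from each point to its nearest centroid. This definition is an assumption (other write-ups use the mean Euclidean distance instead), and the function name `distortion` and the toy points are made up for illustration.

```python
def distortion(points, centroids):
    """Sum of squared Euclidean distances from each point to its nearest centroid."""
    total = 0.0
    for p in points:
        # Squared distance to every centroid; keep the smallest one.
        total += min(
            sum((a - b) ** 2 for a, b in zip(p, c))
            for c in centroids
        )
    return total


# Toy data (illustrative only): two tight pairs of points, far apart.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

one = distortion(points, [(5.0, 1.0)])               # a single centroid in the middle
two = distortion(points, [(0.0, 1.0), (10.0, 1.0)])  # one centroid per pair

print(one, two)  # 104.0 4.0
```

Going from one centroid to two cuts the distortion sharply, which is the kind of drop the chart shows before the elbow.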

This results in:

As K increases, each data point lies closer to the centroid of its cluster, so the distortion decreases.

The improvement from each additional cluster shrinks, and at some point it drops off sharply, creating the elbow shape.

That point is the optimal value for K. In the image above, K=3.
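Reading the elbow off a chart is a judgment call. As a rough sketch, the bend can also be located numerically by looking for the K where the curve changes direction the most (the largest second difference). This heuristic is an assumption rather than part of the elbow method itself, the function name `find_elbow` is hypothetical, and the distortion values below are made-up numbers. It also assumes the K values are evenly spaced.

```python
def find_elbow(ks, distortions):
    """Return the k with the largest second difference in the distortion curve."""
    best_k, best_bend = None, float("-inf")
    # The first and last k have no neighbour on one side, so skip them.
    for i in range(1, len(ks) - 1):
        bend = distortions[i - 1] - 2 * distortions[i] + distortions[i + 1]
        if bend > best_bend:
            best_k, best_bend = ks[i], bend
    return best_k


ks = [1, 2, 3, 4, 5, 6]
distortions = [1000.0, 600.0, 150.0, 120.0, 100.0, 90.0]  # made-up values

print(find_elbow(ks, distortions))  # 3
```

On noisy curves this can pick a different point than the eye would, so treat the result as a suggestion and still look at the plot.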

## Elbow method example

The example code below finds the optimal value for k.
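A minimal sketch of such an example, assuming scikit-learn's `KMeans` and NumPy. The synthetic dataset, the range of k, and the helper name `plot_elbow` are illustrative assumptions. Distortion is taken here to be `inertia_`, the sum of squared distances to the nearest centroid, which is only one of several definitions in use.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic clustering dataset (an assumption for illustration):
# three 2-D blobs of 50 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 9.0]])
X = np.vstack([c + rng.normal(scale=1.0, size=(50, 2)) for c in centers])

# Fit k-means for each candidate k and record the distortion.
K = range(1, 10)
distortions = []
for k in K:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    distortions.append(model.inertia_)


def plot_elbow(ks, values):
    """Plot distortion against k; the bend in the curve is the elbow."""
    import matplotlib.pyplot as plt  # imported here so the fit above runs without matplotlib

    plt.plot(list(ks), values, "bx-")
    plt.xlabel("k")
    plt.ylabel("Distortion")
    plt.title("The elbow method showing the optimal k")
    plt.show()
```

Calling `plot_elbow(K, distortions)` draws the chart. With three well-separated blobs, the curve should flatten out after k=3.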



How is the calculated distortion different from kmeanModel.inertia_ ? Thank you!