BASICS OF BIG DATA

Question
Which method is used to choose the value of ‘k’ in the k-means algorithm?
A. Dimensionality reduction
B. Elbow method
C. Both Dimensionality reduction and Elbow method
D. Data partitioning
Explanation: The correct choice is B, the elbow method.

Detailed explanation-1: -To find the best k, we use an elbow plot, which plots the WCSS (within-cluster sum of squares) against the number of clusters k. It is called an elbow plot because the optimal k sits at the “elbow” of the curve, the point beyond which adding more clusters no longer reduces the WCSS sharply. In the example this explanation refers to, the elbow is at k = 3.
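
As a rough illustration of how the WCSS-versus-k curve behind an elbow plot can be produced, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy data (three tight groups of five points), the helper names (dist2, farthest_first, kmeans_wcss, elbow_k), the deterministic farthest-first initialisation, and the "largest second difference" rule used to pick the elbow, which is only one simple heuristic among several.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_first(points, k):
    """Deterministic initialisation (an assumption of this sketch):
    start at the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids))
        )
    return centroids

def kmeans_wcss(points, k, max_iter=100):
    """Run Lloyd's k-means iterations and return the final WCSS."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return sum(dist2(p, centroids[lab]) for p, lab in zip(points, labels))

def elbow_k(wcss):
    """Heuristic: the k where the drop in WCSS slows down the most."""
    ks = sorted(wcss)
    return max(
        ks[1:-1],
        key=lambda k: (wcss[k - 1] - wcss[k]) - (wcss[k] - wcss[k + 1]),
    )

# Hypothetical toy data: three well-separated groups of five points each.
centers = [(0, 0), (10, 10), (20, 0)]
offsets = [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

wcss = {k: kmeans_wcss(points, k) for k in range(1, 7)}
for k in sorted(wcss):
    print(k, round(wcss[k], 2))
print("elbow at k =", elbow_k(wcss))
```

On this toy data the WCSS falls steeply up to k = 3 and barely changes afterwards, so the heuristic reports k = 3. On real data the bend is often less pronounced, and the curve is usually inspected visually rather than trusted to a single rule.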

Detailed explanation-2: -The elbow method uses the sum of squared errors (SSE), computed from the distances between the data points and the centroids of their assigned clusters, to choose a suitable value of k. We choose the value of k at which the SSE curve begins to flatten out, forming a visible bend (the elbow).

Detailed explanation-3: -The elbow method is a graphical technique for finding the optimal k in k-means clustering. It computes the WCSS (within-cluster sum of squares), i.e. the sum of the squared distances between each point in a cluster and that cluster's centroid, for a range of k values.
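
The WCSS described above can be written compactly as a formula. The notation is an assumption introduced here for illustration: C_j denotes the j-th cluster, \mu_j its centroid, and x_i a data point.

```latex
\[
\mathrm{WCSS}(k) \;=\; \sum_{j=1}^{k} \; \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^{2},
\qquad
\mu_j \;=\; \frac{1}{\lvert C_j \rvert} \sum_{x_i \in C_j} x_i .
\]
```

For example, a single cluster containing the one-dimensional points 1, 2 and 3 has centroid 2, so its contribution to the WCSS is (1-2)^2 + (2-2)^2 + (3-2)^2 = 2.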

Detailed explanation-4: -The silhouette method: the average silhouette method computes the average silhouette of the observations for different values of k. The optimal number of clusters k is the one that maximizes the average silhouette over a range of candidate values for k. In the example this explanation refers to, it suggests an optimum of 2 clusters.
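
To make the average silhouette concrete, here is a small sketch in plain Python using the standard definition: for each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the silhouette is (b - a) / max(a, b). The function name average_silhouette, the four toy points, and the two hand-assigned labelings are all assumptions made for illustration; in practice the labels would come from running k-means for each candidate k.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

def average_silhouette(points, labels):
    """Mean silhouette over all points; expects at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, so divide by len(own) - 1).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical toy data: two tight pairs of points, far apart.
points = [(0, 0), (1, 0), (10, 0), (11, 0)]
labels_k2 = [0, 0, 1, 1]  # hand-assigned labeling with 2 clusters
labels_k3 = [0, 0, 1, 2]  # hand-assigned labeling with 3 clusters

s2 = average_silhouette(points, labels_k2)
s3 = average_silhouette(points, labels_k3)
print(round(s2, 3), round(s3, 3))
```

Here the two-cluster labeling scores noticeably higher than the three-cluster one, so the silhouette criterion would prefer k = 2 for this toy data.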
