COMPUTER SCIENCE AND ENGINEERING
MACHINE LEARNING
Question
Choices: 0, 1, -1, 0.5
Detailed explanation-1: -A dataset with a 50/50 split of samples for the two classes would have a maximum entropy (maximum surprise) of 1 bit, whereas an imbalanced dataset with a split of 10/90 would have a smaller entropy as there would be less surprise for a randomly drawn example from the dataset.
Detailed explanation-2: -If we have a set with k different values in it, we can calculate the entropy as $H = -\sum_{i=1}^{k} P(\mathrm{value}_i)\,\log_2 P(\mathrm{value}_i)$, where $P(\mathrm{value}_i)$ is the probability of getting the i-th value when randomly selecting one from the set. For example, with 16 instances (9 positive, 7 negative) the split is almost 50/50, so the entropy should be close to 1 (it works out to about 0.99).
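As a quick check of the formula above, here is a minimal Python sketch (the entropy helper is our own illustration, not part of the original question) that computes the entropy for the 50/50 split, the imbalanced 10/90 split, and the 9-positive / 7-negative example:

import math

def entropy(probabilities):
    # Shannon entropy in bits: H = -sum(p * log2(p)), skipping zero probabilities
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))        # 50/50 split: maximum for two classes, 1.0 bit
print(entropy([0.1, 0.9]))        # 10/90 split: less surprise, roughly 0.469 bits
print(entropy([9 / 16, 7 / 16]))  # 9 positive, 7 negative: roughly 0.989 bits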
Detailed explanation-3: -This is considered high entropy, i.e. a high level of disorder (meaning a low level of purity). For a two-class problem, entropy ranges between 0 and 1. (With more than two classes, entropy can be greater than 1, but it means the same thing: a very high level of disorder.)
Detailed explanation-4: -The entropy is 0 if all samples of a node belong to the same class, and the entropy is maximal if we have a uniform class distribution. In other words, the entropy of a node consisting of a single class is zero because the probability of that class is 1 and log2(1) = 0.
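The same kind of sketch (again a hypothetical helper, assuming only the standard-library math module) illustrates the two extremes described above: a pure node has entropy 0, a uniform two-class split reaches 1 bit, and a uniform split over k classes reaches log2(k), which is how entropy can exceed 1:

import math

def entropy(probabilities):
    # Shannon entropy in bits over non-zero class probabilities
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy([1.0]))        # pure node: P = 1 and log2(1) = 0, so entropy is 0.0
print(entropy([0.5, 0.5]))   # uniform two-class distribution: maximal, 1.0 bit
print(entropy([0.25] * 4))   # uniform four-class distribution: log2(4) = 2.0 bits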