MACHINE LEARNING

APPLICATION OF SUPERVISED LEARNING

NEURAL NETWORK

Question
If we want the output of a binary classifier to be expressed as a probability, which activation function is the best choice for the output layer?
A
tanh
B
ReLU
C
Sigmoid
D
Linear
Explanation: 

Detailed explanation-1: -The main reason we use the sigmoid function is that its output always lies between 0 and 1. It is therefore especially suited to models that must predict a probability as output. Since a probability can only take values in the range 0 to 1, sigmoid is the right choice.
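The range claim above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `sigmoid` and the sample logit values are illustrative assumptions, not part of the original question.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, exp(z) is small, so this form cannot overflow.
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical raw scores (logits) from the last layer of a network.
logits = [-6.0, -1.0, 0.0, 1.0, 6.0]
probs = [sigmoid(z) for z in logits]

for z, p in zip(logits, probs):
    print(f"z = {z:+.1f} -> p = {p:.4f}")
```

Every printed value falls strictly between 0 and 1, and a logit of 0 maps to exactly 0.5, which is why the output can be read as the probability of the positive class.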

Detailed explanation-2: -In a binary classifier, we use the sigmoid activation function with one node. In a multiclass classification problem, we use the softmax activation function with one node per class. In a multilabel classification problem, we use the sigmoid activation function with one node per class.
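The three output-layer setups described above can be contrasted in a short sketch. This uses plain Python rather than any particular deep learning library, and the logit values and variable names are made up for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid for a single logit (moderate inputs assumed)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs: list) -> list:
    """Softmax over a list of logits, shifted by the max for stability."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


# Binary classification: one output node with a sigmoid.
binary_p = sigmoid(0.8)

# Multiclass classification: one node per class with a softmax.
# The classes are mutually exclusive, so the outputs sum to 1.
multiclass_p = softmax([2.0, 1.0, 0.1])

# Multilabel classification: one node per class, each with its own sigmoid.
# Each label is scored independently, so the outputs need not sum to 1.
multilabel_p = [sigmoid(z) for z in [2.0, 1.0, 0.1]]

print("binary:    ", round(binary_p, 4))
print("multiclass:", [round(p, 4) for p in multiclass_p], "sum =", round(sum(multiclass_p), 4))
print("multilabel:", [round(p, 4) for p in multilabel_p], "sum =", round(sum(multilabel_p), 4))
```

The same three logits give different results under softmax and per-node sigmoid: softmax forces the classes to compete for a total of 1, while independent sigmoids allow several labels to be likely at once.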

Detailed explanation-3: -The sigmoid activation is well suited to a binary classification problem in which the output is interpreted as the parameter of a Bernoulli (single-trial binomial) probability distribution. It can also be used in the output layer for classification problems in which the classes are not mutually exclusive (multilabel problems), with one sigmoid per class.

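Detailed explanation-4: -For binary classification, the logistic function (a sigmoid) and a two-class softmax will perform equally well, because a two-class softmax reduces to a sigmoid applied to the difference between the two logits. The logistic function is mathematically simpler and hence the natural choice.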
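The relationship between the two functions in the binary case can be verified directly: the softmax probability of the first class over two logits equals the sigmoid of their difference. The sketch below assumes hypothetical logit pairs and helper names chosen only for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid for a single logit (moderate inputs assumed)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs: list) -> list:
    """Softmax over a list of logits, shifted by the max for stability."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical (positive-class logit, negative-class logit) pairs.
logit_pairs = [(2.0, -1.0), (0.3, 0.3), (-4.0, 1.5)]

results = []
for z_pos, z_neg in logit_pairs:
    p_softmax = softmax([z_pos, z_neg])[0]  # two output nodes
    p_sigmoid = sigmoid(z_pos - z_neg)      # one output node
    results.append((p_softmax, p_sigmoid))
    print(f"softmax: {p_softmax:.6f}   sigmoid of difference: {p_sigmoid:.6f}")
```

Both columns agree up to floating-point rounding, so a single sigmoid node carries the same information as a two-node softmax while using fewer parameters in the output layer.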

Detailed explanation-5: -A related question: which function can be used as the activation function in the output layer if we wish to predict the probabilities of k classes (p1, p2, ..., pk) such that the sum of p over all k classes equals 1? The answer is softmax, whose form guarantees that the probabilities over all k classes sum to 1.
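For reference, the softmax form and the reason its outputs sum to 1 can be written out as follows. The symbols z_1, ..., z_k for the output-layer logits are notation assumed here; the original text names only the probabilities p1, ..., pk.

```latex
% Softmax over k logits z_1, ..., z_k (logit notation assumed)
p_i = \frac{e^{z_i}}{\sum_{j=1}^{k} e^{z_j}}, \qquad i = 1, \dots, k

% Each p_i is positive, and the shared denominator makes them sum to 1:
\sum_{i=1}^{k} p_i
  = \frac{\sum_{i=1}^{k} e^{z_i}}{\sum_{j=1}^{k} e^{z_j}}
  = 1
```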
