MACHINE LEARNING

APPLICATION OF SUPERVISED LEARNING

DEEP LEARNING

Question
Which of the following activation functions helps with the vanishing gradients problem?
A. Sigmoid
B. Tanh
C. ReLU
D. SELU
E. Softmax
Explanation: 

Detailed explanation-1: The rectified linear activation function overcomes the vanishing gradient problem, allowing models to learn faster and perform better. Rectified linear activation is the default choice when developing multilayer perceptrons and convolutional neural networks.
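
A minimal Python sketch (using NumPy, which the quiz itself does not mention) illustrates why ReLU counters vanishing gradients: its derivative is exactly 1 for every positive input, so repeated multiplication through many layers does not shrink the gradient along the active path.

```python
import numpy as np

def relu(x):
    """Rectified linear unit: max(0, x) applied element-wise."""
    return np.maximum(0.0, x)

def relu_grad(x):
    # Derivative is 1 for positive inputs and 0 otherwise, so the
    # positive path never attenuates the backpropagated gradient.
    return (x > 0).astype(float)

x = np.array([-2.0, -0.1, 0.3, 4.0])
print(relu(x))       # [0.  0.  0.3 4. ]
print(relu_grad(x))  # [0. 0. 1. 1.]
```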

Detailed explanation-2: Certain activation functions, like the sigmoid function, squash a large input space into a small output space between 0 and 1. A large change in the sigmoid's input therefore causes only a small change in its output, so the derivative becomes small.
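
A short numeric sketch (again assuming NumPy) makes this concrete: the sigmoid derivative never exceeds 0.25, so multiplying it across many layers via the chain rule drives the gradient toward zero.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # maximum value 0.25, reached at x = 0

print(sigmoid_grad(np.array([0.0, 2.0, 5.0])))  # ~[0.25, 0.105, 0.0066]

# Even in the best case, ten stacked sigmoid layers scale the gradient
# by at most 0.25 ** 10, i.e. roughly 1e-6: the vanishing gradient.
print(0.25 ** 10)
```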

Detailed explanation-3: One of the most effective ways to resolve the vanishing gradient problem is the residual neural network, or ResNet (not to be confused with a recurrent neural network).
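
Below is a minimal residual block sketched in PyTorch (an assumption; the quiz names no framework, and the layer sizes are illustrative). The skip connection adds the block's input straight to its output, giving gradients an identity path that bypasses the transformed branch.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Minimal fully connected residual block: output = ReLU(F(x) + x)."""
    def __init__(self, dim):
        super().__init__()
        self.fc1 = nn.Linear(dim, dim)
        self.fc2 = nn.Linear(dim, dim)

    def forward(self, x):
        out = torch.relu(self.fc1(x))
        out = self.fc2(out)
        # The "+ x" skip connection lets gradients flow directly back
        # through the identity path during backpropagation.
        return torch.relu(out + x)

block = ResidualBlock(16)
print(block(torch.randn(4, 16)).shape)  # torch.Size([4, 16])
```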

Detailed explanation-4: Leaky or parametric ReLU activation functions provide a robust remedy for vanishing gradients by ensuring the derivative stays sufficiently greater than zero.
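
A hedged NumPy sketch of leaky ReLU follows (the slope 0.01 is an illustrative default, not prescribed by the quiz; parametric ReLU would learn this slope instead): negative inputs keep a small nonzero slope, so the derivative never collapses to zero the way plain ReLU's does for negative inputs.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Negative inputs are scaled by alpha instead of being zeroed out.
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.01):
    # Derivative is 1 for positive inputs and alpha > 0 otherwise,
    # so it stays strictly greater than zero everywhere.
    return np.where(x > 0, 1.0, alpha)

x = np.array([-3.0, -0.5, 0.0, 2.0])
print(leaky_relu(x))       # [-0.03  -0.005  0.     2.   ]
print(leaky_relu_grad(x))  # [0.01 0.01 0.01 1.  ]
```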
