COMPUTER SCIENCE AND ENGINEERING
MACHINE LEARNING
Question
SGD
BGD
MGD
MBGD
Detailed explanation-1: -In batch Gradient Descent (GD), we perform the forward pass using ALL of the training data before starting the backpropagation pass to adjust the weights; one such pass over the full training set is called one epoch. In Stochastic Gradient Descent (SGD), we perform the forward pass using a SINGLE training example (or, in the mini-batch variant, a small subset of the training set), followed immediately by backpropagation to adjust the weights.
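The difference can be made concrete with a small sketch. This is a minimal illustration only, assuming a plain linear model trained with a mean-squared-error loss on synthetic data; the names used here (X, y, w, lr, true_w) are chosen for the example and do not come from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                 # 100 synthetic samples, 3 features
true_w = np.array([1.5, -2.0, 0.5])           # weights the data is generated from
y = X @ true_w + rng.normal(scale=0.1, size=100)

# Batch Gradient Descent: ONE update per epoch, gradient computed over ALL samples.
w = np.zeros(3)
for epoch in range(50):
    grad = 2.0 / len(X) * X.T @ (X @ w - y)   # gradient of the MSE over the full set
    w -= 0.1 * grad

# Stochastic Gradient Descent: one update per SAMPLE, gradient of a single example.
w_sgd = np.zeros(3)
for epoch in range(50):
    for i in rng.permutation(len(X)):         # shuffle the order each epoch
        xi, yi = X[i], y[i]
        grad_i = 2.0 * xi * (xi @ w_sgd - yi) # gradient of one example's squared error
        w_sgd -= 0.02 * grad_i

print(w, w_sgd)  # both should end up near true_w; SGD hovers in a noisy neighbourhood
```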
Detailed explanation-2: -The problem with plain gradient descent is that the weight update at a given moment (t) is governed only by the learning rate and the gradient at that moment. It does not take into account the past steps taken while traversing the cost surface.
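That memorylessness is visible directly in the update rule. The sketch below is illustrative only: the vanilla step uses nothing but the current gradient, while the momentum-style step (one common remedy, shown here as an assumption since the explanation does not name it) carries a velocity term that accumulates past gradients.

```python
import numpy as np

def vanilla_gd_step(w, grad, lr=0.1):
    # Uses only the learning rate and the gradient at the current moment t;
    # nothing about the path taken so far enters the update.
    return w - lr * grad

def momentum_step(w, velocity, grad, lr=0.1, beta=0.9):
    # Momentum-style variant (illustrative assumption): the velocity term
    # accumulates past gradients, so earlier steps still influence this one.
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity

w = np.array([1.0, -1.0])
velocity = np.zeros_like(w)
grad = np.array([0.5, 0.2])                   # gradient at the current point (made up)
w_plain = vanilla_gd_step(w, grad)
w_mom, velocity = momentum_step(w, velocity, grad)
```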
Detailed explanation-3: -Stochastic Gradient Descent updates the weights from one example at a time, so the cost bounces around instead of decreasing smoothly. When the cost function is very irregular (as in Figure 4-6 of the source text), this randomness can actually help the algorithm jump out of local minima, so Stochastic Gradient Descent has a better chance of finding the global minimum than Batch Gradient Descent does.
Detailed explanation-4: -Mean Squared Error (also called Quadratic Loss or L2 Loss) is the most commonly used regression loss function.
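For reference, the usual form of the MSE is sketched below; the symbols n (number of samples), y_i (true value), and ŷ_i (predicted value) are assumed here, since the explanation itself does not define them.

```latex
\mathrm{MSE}(y, \hat{y}) = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```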