APPLICATION OF SUPERVISED LEARNING
ARTIFICIAL INTELLIGENCE
Question: Stochastic, mini-batch, and batch gradient descent differ primarily in which of the following?
Gradient size
Gradient direction
Learning rate
Number of samples used
Detailed explanation-1: SGD is well suited to large datasets. Batch gradient descent converges more directly toward the minimum, but SGD tends to converge faster on large datasets. However, because SGD processes only one example at a time, its updates cannot take advantage of a vectorized implementation.
Detailed explanation-2: When the batch size is one sample, the learning algorithm is called stochastic gradient descent. When the batch size is more than one sample but less than the size of the training dataset, the learning algorithm is called mini-batch gradient descent. When the batch size equals the size of the training dataset, it is called batch gradient descent.
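A minimal sketch of that naming rule (the helper name gd_variant and the sample counts are illustrative assumptions, not from the source):

```python
def gd_variant(batch_size: int, n_samples: int) -> str:
    """Name the gradient descent variant implied by the batch size."""
    if batch_size == 1:
        return "stochastic gradient descent"
    if batch_size < n_samples:
        return "mini-batch gradient descent"
    return "batch gradient descent"

print(gd_variant(1, 1000))     # stochastic gradient descent
print(gd_variant(32, 1000))    # mini-batch gradient descent
print(gd_variant(1000, 1000))  # batch gradient descent
```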
Detailed explanation-3: Batch gradient descent vs. stochastic gradient descent: stochastic gradient descent (SGD, or "on-line" gradient descent) typically reaches convergence much faster than batch (or "standard") gradient descent because it updates the weights more frequently.
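To make the comparison concrete, here is a small sketch assuming a plain linear-regression loss and NumPy; the function name, learning rate, and toy data are illustrative, not from the source. The same loop implements all three variants, and the only knob that changes is the number of samples used per update, which is also why SGD performs far more weight updates per epoch:

```python
import numpy as np

def gradient_descent(X, y, batch_size, lr=0.01, epochs=5, seed=0):
    """Mean-squared-error gradient descent on a linear model.
    batch_size=1 -> SGD, batch_size=len(X) -> batch GD, otherwise mini-batch."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    updates = 0
    for _ in range(epochs):
        order = rng.permutation(n)                 # reshuffle each epoch
        for start in range(0, n, batch_size):
            batch = order[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            grad = 2.0 / len(batch) * Xb.T @ (Xb @ w - yb)  # MSE gradient on this batch only
            w -= lr * grad                         # one weight update per batch
            updates += 1
    return w, updates

# Toy data: y = 3*x plus a little noise (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 1))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=1000)

for bs in (1, 32, 1000):       # SGD, mini-batch, full batch
    w, updates = gradient_descent(X, y, batch_size=bs)
    print(f"batch_size={bs:4d}  updates={updates:5d}  learned w={w[0]:.3f}")
```

With these illustrative settings, the one-sample and 32-sample runs should end up close to the true weight of 3 after five passes over the data, while the full-batch run has made only five updates in the same number of passes and remains far from it.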