MCQ IN COMPUTER SCIENCE & ENGINEERING

MACHINE LEARNING

Question
A tech startup is building an image classification model. During the process, they copied some of their validation data into their training examples, creating duplicates across the training and validation subsets. Which are possible results of taking this approach? (Select TWO.)
A
The model may perform worse with the test dataset than with the validation dataset
B
This is a common practice in machine learning and will improve the overall performance of the model
C
This could lead to overfitting the model
D
This is a good way to increase the training dataset size and therefore strengthen the model’s ability to generalize
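Explanation: Options A and C are the possible results. When the same examples appear in both the training and validation subsets, the validation score no longer measures performance on unseen data, so it can look better than the score on a truly held-out test set (A). An inflated validation score also hides overfitting, because a model that memorizes the duplicated examples is rewarded for it (C). Options B and D are incorrect: copying validation data into training is data leakage, not a standard practice, and it does not improve generalization.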

Detailed explanation-1: Step 2: Data Cleaning. Next, this data flows to the cleaning step. To make sure the data paints a consistent picture that your pipeline can learn from, Cortex automatically detects and scrubs away outliers, missing values, duplicates, and other errors.

Detailed explanation-2: What Amazon SageMaker option should the company use to train their ML models that reduces the management and automates the pipeline for future retraining? Create and train your XGBoost algorithm on your local laptop and then use an Amazon SageMaker endpoint to host the ML model.

Detailed explanation-3: The machine learning algorithm to use depends largely on the type of data in a given dataset.

Detailed explanation-4: Machine learning algorithms build a model based on sample data, known as “training data,” in order to make predictions or decisions without being explicitly programmed to do so.
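
The duplicate problem described in the question can be checked for before training. The following is a minimal sketch, not a definitive implementation: it assumes each example is available as raw bytes (for instance, the contents of an image file), and the helper names `fingerprint`, `find_leaked`, and `drop_leaked` are hypothetical, chosen only for illustration.

```python
import hashlib


def fingerprint(example: bytes) -> str:
    """Return a content hash so byte-identical examples compare equal."""
    return hashlib.sha256(example).hexdigest()


def find_leaked(train, validation):
    """Return indices of training examples that also appear in validation."""
    val_hashes = {fingerprint(x) for x in validation}
    return [i for i, x in enumerate(train) if fingerprint(x) in val_hashes]


def drop_leaked(train, validation):
    """Return the training set with validation duplicates removed."""
    leaked = set(find_leaked(train, validation))
    return [x for i, x in enumerate(train) if i not in leaked]


# Toy byte strings stand in for image files (an assumption for illustration).
train = [b"cat-1", b"dog-1", b"cat-2", b"dog-2"]
validation = [b"cat-2", b"bird-1"]

leaked = find_leaked(train, validation)       # training indices that leak
clean_train = drop_leaked(train, validation)  # training set without leaks
```

An exact hash only catches byte-identical copies. Images that were resized or re-encoded would need a different comparison, such as a perceptual hash, which this sketch does not attempt.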
