MCQ IN COMPUTER SCIENCE & ENGINEERING

MACHINE LEARNING

Question
A Data Scientist at a credit card company trained a classification model to predict fraud at the time of a transaction. The Data Scientist used a confusion matrix to evaluate the performance of the model. Using the confusion matrix below, determine the percent of positive records that were classified correctly. Choose the answer that also labels this evaluation metric correctly.

                     True Positive   True Negative
Predicted Positive   100             90
Predicted Negative   25              250
A. 80%; Recall
B. 52.6%; Recall
C. 80%; Precision
D. 52.6%; Precision
Explanation: 
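The arithmetic behind the choices can be checked with a short sketch. It assumes the columns labelled "True Positive" and "True Negative" in the question's matrix denote the actual classes, so the four cells are TP = 100, FP = 90, FN = 25, and TN = 250. The function names are illustrative and do not come from any library.

```python
# Cells read from the question's confusion matrix, assuming the columns
# are the actual classes and the rows are the predicted classes.
tp = 100  # predicted positive, actually positive
fp = 90   # predicted positive, actually negative
fn = 25   # predicted negative, actually positive
tn = 250  # predicted negative, actually negative


def recall(tp: int, fn: int) -> float:
    """Share of actual positives that the model classified as positive."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are actually positive."""
    return tp / (tp + fp)


print(f"Recall:    {recall(tp, fn):.1%}")     # 100 / 125
print(f"Precision: {precision(tp, fp):.1%}")  # 100 / 190
```

Under that reading, "the percent of positive records that were classified correctly" is recall, 100 / (100 + 25) = 80%, which matches choice A. The 52.6% figure in choices B and D is the precision, 100 / (100 + 90).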

Detailed explanation-1: Step 2: Data Cleaning. Next, this data flows to the cleaning step. To make sure the data paints a consistent picture that your pipeline can learn from, Cortex automatically detects and scrubs away outliers, missing values, duplicates, and other errors.

Detailed explanation-2: What Amazon SageMaker option should the company use to train their ML models that reduces management overhead and automates the pipeline for future retraining? Create and train your XGBoost algorithm on your local laptop and then use an Amazon SageMaker endpoint to host the ML model.

Detailed explanation-3: Which of the following methods DOES NOT prevent a model from overfitting to the training set? Early stopping is a regularization technique and can help reduce overfitting. Dropout is a regularization technique and can help reduce overfitting. Data augmentation can help reduce overfitting by creating a larger dataset.

Detailed explanation-4: The most widely used metrics and tools to assess a classification model are the confusion matrix, cost-sensitive accuracy, and the area under the ROC curve.