Table of Contents

  • What are evaluation metrics?
  • Confusion Matrix
  • Accuracy, Precision, Recall, F1 Score

What are Evaluation Metrics?

  • You have built multiple models for your classification task, but how do you decide which one is better?
  • How do you know how your model is performing? How good is it at predicting the outcome?
  • All these questions can be answered with the help of evaluation metrics.

Confusion Matrix

  • A confusion matrix is an N x N matrix used for evaluating the performance of a classification model, where N is the number of target classes. The matrix compares the actual target values with those predicted by the machine learning model.
  • For a binary classification problem, we would have a 2 x 2 matrix, as below, with 4 values:

                          Predicted Positive     Predicted Negative
        Actual Positive   True Positive (TP)     False Negative (FN)
        Actual Negative   False Positive (FP)    True Negative (TN)

  • A good model is one with high TP and TN rates and low FP and FN rates; the sketch after this list shows how to read these four values off a set of predictions.
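
As a minimal sketch (assuming scikit-learn is available and a binary 0/1 labelling; the labels and predictions here are hypothetical), the four cells can be read off like this:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical ground-truth labels and model predictions (1 = positive class).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

# For binary labels, ravel() flattens the 2 x 2 matrix into TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")  # TP=3, TN=4, FP=1, FN=2
```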

Accuracy

It is a measure of overall correctness: the proportion of all predictions, positive and negative, that the model got right. In simple words, Accuracy = (TP + TN) / (TP + TN + FP + FN).

Accuracy is a valid choice of evaluation metric for classification problems that are well balanced, i.e. where there is no significant class imbalance.
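
Continuing the hypothetical counts from the confusion-matrix sketch above, accuracy is just the fraction of correct predictions; the comment also notes why it misleads on skewed data:

```python
tp, tn, fp, fn = 3, 4, 1, 2

# Accuracy = correct predictions / all predictions.
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 0.7

# Caveat: on a 95%-negative dataset, a model that always predicts
# "negative" scores 0.95 accuracy while never finding a single positive.
```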

Precision

It is a measure of how correct the positive predictions are. In simple words, it tells us how many of the observations predicted as positive are actually positive: Precision = TP / (TP + FP).

Precision is a valid choice of evaluation metric when we want to be very sure of our prediction.
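
A minimal sketch, reusing the hypothetical counts from above:

```python
tp, fp = 3, 1

# Precision = TP / (TP + FP): of everything flagged positive, the share that really was.
precision = tp / (tp + fp)
print(precision)  # 0.75
```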

Recall

It is a measure of how many of the actual positive observations are predicted correctly, i.e. how many observations of the positive class the model labels as positive: Recall = TP / (TP + FN).

It is also known as Sensitivity. Recall is a valid choice of evaluation metric when we want to capture as many positives as possible.
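
And the matching sketch for recall, again with the hypothetical counts from above:

```python
tp, fn = 3, 2

# Recall = TP / (TP + FN): of all actual positives, the share we caught.
recall = tp / (tp + fn)
print(recall)  # 0.6
```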

F1 Score

  • The F1 score is a number between 0 and 1 and is the harmonic mean of precision and recall: F1 = 2 x (Precision x Recall) / (Precision + Recall).
  • The F1 score maintains a balance between precision and recall for your classifier: if either precision or recall is low, the F1 score is low as well.
  • We use the harmonic mean because, unlike a simple average, it cannot be inflated by a single large value; it is dominated by the smaller of precision and recall, as the sketch below illustrates.
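
A minimal sketch, using the hypothetical precision and recall computed above and contrasting the harmonic mean with a simple average:

```python
precision, recall = 0.75, 0.6

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # 0.667

# For a lopsided pair, the arithmetic mean still looks healthy
# while the harmonic mean is dragged toward the small value.
p, r = 1.0, 0.1
print((p + r) / 2)                    # 0.55
print(round(2 * p * r / (p + r), 3))  # 0.182
```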
