Simple guide to Confusion Matrix

Confusion matrix

In a few words, a confusion matrix is a summary of the performance of a classification algorithm: a table that describes how a classifier's predictions compare against the known true labels.

The confusion matrix is widely used in Machine Learning because it not only indicates the errors made by the model but also describes the types of errors being made.

Image 1: Example of a confusion matrix

Let’s have a look at what each cell of the table refers to (a short code sketch follows the list):

TP – True Positives: the model predicted Positive and the actual class is also Positive.
FP – False Positives: the model predicted Positive, but the actual class is Negative (aka Type I error).
FN – False Negatives: the model predicted Negative, but the actual class is Positive (aka Type II error).
TN – True Negatives: the model predicted Negative and the actual class is also Negative.
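
To make these four counts concrete, here is a minimal Python sketch that tallies them from lists of true and predicted labels. The label lists and the choice of `1` as the Positive class are illustrative assumptions, not part of the original example.

```python
# Hypothetical binary labels, where 1 marks the Positive class.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # True Positives
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # False Positives
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # False Negatives
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))  # True Negatives

print(tp, fp, fn, tn)  # 4 1 1 4
```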

As you can see, the confusion matrix holds the count of every type of prediction outcome the model made, correct and incorrect. With this information, we can calculate the following metrics:

Accuracy: The percentage of correct classifications the model made out of all observations.

\begin{equation} Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \end{equation}
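
Continuing the hypothetical counts from the sketch above:

```python
accuracy = (tp + tn) / (tp + tn + fp + fn)  # (4 + 4) / 10 = 0.8
```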

Precision: The percentage of correctly predicted Positive observations among all observations predicted as Positive. In other words, it is the fraction of positive predictions that are correct.

\begin{equation} Precision = \frac{TP}{TP + FP} \end{equation}
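
With the same hypothetical counts:

```python
precision = tp / (tp + fp)  # 4 / (4 + 1) = 0.8
```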

Recall or Sensitivity: The percentage of actual Positive observations that the model correctly predicted as Positive.

\begin{equation} Recall = \frac{TP}{TP + FN} \end{equation}
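
Again with the same hypothetical counts:

```python
recall = tp / (tp + fn)  # 4 / (4 + 1) = 0.8
```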

F1 score: Combines precision and recall. It is the harmonic mean of precision and recall, on a scale from 0 to 1, where 1 means perfect precision and recall.

\begin{equation} F_1 = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall} \end{equation}
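
And combining the two values computed above:

```python
f1 = 2 * (precision * recall) / (precision + recall)  # 2 * 0.64 / 1.6 = 0.8
```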
