Confusion Matrix — A matrix that compares the actual labels with the labels predicted by the model.
Fig: Confusion matrix layout (for reference).
Definitions of all the terms:
- True Positive (TP): the actual value is Positive (1) and our model also predicted it as Positive (1).
- True Negative (TN): the actual value is Negative (0) and our model also predicted it as Negative (0).
- False Positive (FP): the actual value is Negative (0) but our model predicted it as Positive (1).
- False Negative (FN): the actual value is Positive (1) but our model predicted it as Negative (0).
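These four counts can be read straight off a confusion matrix. Below is a minimal sketch, assuming scikit-learn is available; the labels are purely hypothetical.

```python
from sklearn.metrics import confusion_matrix

y_actual    = [1, 0, 1, 1, 0, 0, 1, 0]   # actual labels (hypothetical)
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0]   # model predictions (hypothetical)

# For binary labels, ravel() returns the counts in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_actual, y_predicted).ravel()
print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn)   # TP: 3 TN: 3 FP: 1 FN: 1
```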
In machine learning we have different types of metrics to evaluate a model, and the right choice depends on the problem statement.
Metrics to evaluate the model:
- Accuracy Score = (TP + TN) / (TP + TN + FP + FN)
- Recall Score / Sensitivity / True Positive Rate (TPR) = TP / (TP + FN)
- Precision Score = TP / (TP + FP)
- Specificity = TN / (TN + FP)
- False Positive Rate = 1 - Specificity = FP / (FP + TN)
- F-beta Score = (1 + β²) × (Precision × Recall) / (β² × Precision + Recall); with β = 1 this becomes the F1 Score = 2 × (Precision × Recall) / (Precision + Recall)
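As a quick check, the sketch below computes each of these metrics directly from the four counts; the count values are hypothetical and the variable names are just for illustration.

```python
# Hypothetical counts, e.g. taken from a confusion matrix as shown earlier.
tp, tn, fp, fn = 3, 3, 1, 1

accuracy    = (tp + tn) / (tp + tn + fp + fn)
recall      = tp / (tp + fn)              # Sensitivity / TPR
precision   = tp / (tp + fp)
specificity = tn / (tn + fp)
fpr         = fp / (fp + tn)              # = 1 - specificity

beta = 1                                  # beta = 1 gives the usual F1 score
f_beta = (1 + beta**2) * (precision * recall) / (beta**2 * precision + recall)

print(accuracy, recall, precision, specificity, fpr, f_beta)
```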
Definitions and concepts of the scoring metrics:
1. Accuracy — Accuracy tells how many labels are predicted correctly by our model out of the total number of labels.
For example:
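A minimal sketch with hypothetical labels, using scikit-learn's accuracy_score: 6 of the 8 predictions are correct, so the accuracy is 6/8 = 0.75.

```python
from sklearn.metrics import accuracy_score

y_actual    = [1, 0, 1, 1, 0, 0, 1, 0]   # hypothetical labels
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0]

print(accuracy_score(y_actual, y_predicted))   # 0.75
```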
2. Recall / TPR / Sensitivity — Recall tells how many of all the actual Positive (1) labels our model predicted correctly as Positive.
Some important points to remember for conceptual understanding:
- When we increase the threshold value, the count of False Negative (FN) labels will probably increase, which decreases the Recall score.
- And when we decrease the threshold value, some labels that were predicted as Negative (0) will now be predicted as Positive (1); this reduces the False Negative (FN) count and may increase the Recall score (see the sketch below).
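The sketch below illustrates this effect, assuming the model outputs a probability per sample; the scores and labels are hypothetical.

```python
from sklearn.metrics import recall_score

y_actual = [1, 1, 1, 0, 0, 1, 0, 1]                      # hypothetical labels
y_proba  = [0.9, 0.65, 0.4, 0.35, 0.55, 0.75, 0.2, 0.8]  # hypothetical model scores

# Raising the threshold turns some true Positives into False Negatives,
# so recall drops as the threshold grows.
for threshold in (0.3, 0.5, 0.7):
    y_pred = [1 if p >= threshold else 0 for p in y_proba]
    print(threshold, recall_score(y_actual, y_pred))      # 1.0, 0.8, 0.6
```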
When to use Recall:
- In medical situations we give priority to the Recall score.
- Suppose a cancer patient comes into the hospital and the doctor tells him he does not have cancer. That is a very dangerous situation for the patient, because he will simply relax and go back home.
- So, to reduce the FN count, we use the Recall score.
As an example, let us calculate the recall score from a small set of actual and predicted labels:
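A minimal sketch, again with hypothetical labels and scikit-learn's recall_score; with TP = 3 and FN = 1 the recall works out to 3 / (3 + 1) = 0.75.

```python
from sklearn.metrics import recall_score

y_actual    = [1, 0, 1, 1, 0, 0, 1, 0]   # hypothetical labels
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0]

print(recall_score(y_actual, y_predicted))   # 0.75
```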
3. Precision Score — Precision tells, out of all the labels predicted as Positive (1), how many are actually Positive (1).
Some important points to remember for conceptual understanding:
- When we increase the threshold value, the count of False Positive (FP) labels will probably decrease. In such cases the model may also predict some actual Positive (1) labels as Negative (0). Here our Precision score may increase or stay constant.
- But when we decrease the threshold value, the count of False Positive labels will probably increase, because the model will predict some Negative (0) labels as Positive (1). In this case our Precision score may decrease (see the sketch below).
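The same hypothetical scores from the recall sketch show the opposite trend for precision: a higher threshold lets fewer False Positives through.

```python
from sklearn.metrics import precision_score

y_actual = [1, 1, 1, 0, 0, 1, 0, 1]                      # hypothetical labels
y_proba  = [0.9, 0.65, 0.4, 0.35, 0.55, 0.75, 0.2, 0.8]  # hypothetical model scores

# Raising the threshold removes False Positives, so precision tends to rise.
for threshold in (0.3, 0.5, 0.7):
    y_pred = [1 if p >= threshold else 0 for p in y_proba]
    print(threshold, precision_score(y_actual, y_pred))   # ~0.71, 0.8, 1.0
```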
When to use Precision:
- Let us take an example: suppose we are training a model to predict whether a mail is SPAM or NOT.
- Here we have to keep an important point in mind: if we receive an important mail and our model predicts it as SPAM, that is very bad for the client, because he may miss the important mail.
- So, in this case we use the Precision metric to reduce the False Positive label count.