Binary classification problems are those where we try to predict whether a particular observation belongs to one of two classes, e.g., true or false.
Classic examples of binary classification include:
So, since we are trying to predict whether an observation is true or false, each prediction falls into one of the four categories below.
The table above is called a confusion matrix, which helps us easily understand where each ‘prediction’ falls in the context of the ‘actual’ true or false state.
Each cell in the confusion matrix feeds into an important accuracy metric for our predictions; which metric we need to focus on most depends on the application.
This metric represents how accurately we have predicted the results: true observations as true, and false observations as false.
Accuracy = (true positives + true negatives)/total observations
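As a minimal sketch of the formula above (the counts here are hypothetical, chosen only for illustration):

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = (true positives + true negatives) / total observations."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical confusion-matrix counts for illustration
print(accuracy(tp=40, tn=50, fp=5, fn=5))  # 0.9
```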
While accuracy is intuitive and a commonly used measure of prediction performance, it is not a sufficient parameter when our observation sample is highly imbalanced.
e.g., transaction fraud prediction:
Fraudulent transactions generally represent less than 2% of total transactions.
Suppose we develop a model to predict fraudulent transactions, and we get a confusion matrix like the one below for 100 tested samples.
While the model's accuracy remains at 98% [(1 + 97)/100], we can clearly see that out of the 2 actual fraudulent transactions, the model misses one.
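The fraud example can be reproduced with a few lines, assuming the remaining error besides the missed fraud is one false alarm (so the four cells sum to 100):

```python
# Confusion-matrix cells from the fraud example above
tp, fn = 1, 1   # 2 actual frauds: 1 caught, 1 missed
tn, fp = 97, 1  # 98 genuine transactions: 97 passed, 1 wrongly flagged

acc = (tp + tn) / (tp + tn + fp + fn)
recall = tp / (tp + fn)
print(acc)     # 0.98 — looks excellent
print(recall)  # 0.5  — yet half the frauds are missed
```

High accuracy here is driven almost entirely by the 97 true negatives, which is exactly why accuracy alone misleads on imbalanced samples.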
This is when we refine our focus toward how many observations are predicted accurately for the outcome we actually care about.
True positive rate (or Model Recall) — higher the better
True positive rate measures how accurately the model captures the actual ‘True’ results: true positives / (true positives + false negatives).
If our focus is primarily on capturing ‘True’ results, the true positive rate/ model recall is a great indicator.
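A minimal sketch of the true positive rate (recall), again with hypothetical counts:

```python
def true_positive_rate(tp, fn):
    """Recall: the fraction of actual 'True' cases the model captures."""
    return tp / (tp + fn)

# Hypothetical: model catches 8 of 10 actual 'True' cases
print(true_positive_rate(tp=8, fn=2))  # 0.8
```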
However, a common phenomenon is that when we try to increase model recall, the false positive rate also increases.
False-positive rate (or false alarms) — lower the better
If the false alarms are costly, this is a metric we need to focus on.
Taking the same example of fraudulent transactions: if we use our predictions to determine and block fraudulent transactions, and our model has a high false-positive rate, it means we are also blocking a high number of genuine transactions.
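A sketch of the false positive rate, using the fraud example's counts (one genuine transaction wrongly flagged out of 98):

```python
def false_positive_rate(fp, tn):
    """Fraction of actual 'False' cases wrongly flagged as 'True' (false alarms)."""
    return fp / (fp + tn)

# Fraud example: 1 genuine transaction blocked out of 98
print(round(false_positive_rate(fp=1, tn=97), 4))  # 0.0102
```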
This damages the user experience. In applications like this, false positives can be among the costliest outcomes.
False-negative rate (misses) — lower the better
This evaluates how many actual ‘True’ cases we missed: false negatives / (true positives + false negatives).
We need to try to bring this metric as close to zero as possible.
This can be done by improving model recall (a good model) or by tolerating more false alarms (a bad trade-off).
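A sketch of the false negative rate; note it is simply the complement of recall, which is why improving recall drives it toward zero:

```python
def false_negative_rate(tp, fn):
    """Fraction of actual 'True' cases the model misses; equals 1 - recall."""
    return fn / (tp + fn)

# Fraud example above: 1 of 2 actual frauds missed
print(false_negative_rate(tp=1, fn=1))  # 0.5
```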
Which metric to focus on when evaluating model performance depends heavily on the business application.
While accuracy is a good general measure, ‘true negatives’ is usually the cell no one really cares about, and yet accuracy can be made up mostly of true negatives, as the fraud example shows.