Follow the Data

A data driven blog

Cost-sensitive learning

I have been looking at a binary classification problem (basically a problem where you are supposed to predict “yes” or “no” based on a set of features) where misclassifying a “yes” as a “no” is much more costly than misclassifying a “no” as a “yes”.

Searching the web for hints about how to approach this kind of scenario, I discovered that there are some methods explicitly designed for it, such as MetaCost by Domingos and cost-sensitive decision trees, but I also learned that there are a couple of very general relationships that apply to a scenario like mine.

In a paper by Ling and Sheng, it is shown that if your (binary) classifier can produce a posterior probability estimate for predicting (e.g.) “yes” on a test set, then one can make that classifier cost-sensitive simply by choosing the classification threshold (which is often taken as 0.5 in non-cost-sensitive classifiers) according to p_threshold = FP / (FP + FN), where FP is the cost of a false positive and FN is the cost of a false negative (correct predictions are assumed to cost nothing). This follows from comparing expected costs: predicting “yes” is the cheaper choice whenever (1 - p) * FP <= p * FN, that is, whenever p >= FP / (FP + FN). Equivalently, one can “rebalance” the original samples by sampling “yes” and “no” examples in proportions chosen so that the cost-optimal threshold on the resampled data becomes 0.5. That is, the prior probabilities of the “yes” and “no” classes and the costs are interchangeable.
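As a minimal sketch of the threshold rule, treating FP and FN as the costs of a false positive and a false negative and assuming correct predictions cost nothing: the function names and the example costs and probabilities below are made up for illustration and do not come from the paper.

```python
def cost_threshold(fp_cost, fn_cost):
    """Return p_threshold = FP / (FP + FN) for the given misclassification costs."""
    if fp_cost < 0 or fn_cost < 0 or fp_cost + fn_cost == 0:
        raise ValueError("costs must be non-negative and not both zero")
    return fp_cost / (fp_cost + fn_cost)


def predict(p_yes, fp_cost, fn_cost):
    """Label each posterior estimate P("yes") using the cost-based threshold."""
    threshold = cost_threshold(fp_cost, fn_cost)
    return ["yes" if p >= threshold else "no" for p in p_yes]


# Hypothetical costs: missing a "yes" is nine times as costly as a false alarm.
FP_COST = 1.0
FN_COST = 9.0

# Hypothetical posterior estimates from some probabilistic classifier.
posteriors = [0.02, 0.08, 0.3, 0.7]

print(cost_threshold(FP_COST, FN_COST))       # 0.1
print(predict(posteriors, FP_COST, FN_COST))  # ['no', 'no', 'yes', 'yes']
```

With these costs the threshold drops from 0.5 to 0.1, so an example with only a 30% estimated chance of being a “yes” is still labelled “yes”.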

So in principle, one could either manipulate the classification threshold or the training set proportions to get a cost-sensitive classifier. Of course, further adjustment may be needed if the classifier you are using does not produce explicit probability estimates.
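The interchangeability of the two approaches can be checked numerically. The sketch below assumes an idealised classifier whose probability estimates are exact, and weights each “yes” example by the false negative cost and each “no” example by the false positive cost; the helper names and the cost values are hypothetical. Under that assumption, thresholding the reweighted posterior at 0.5 gives the same decisions as thresholding the original posterior at FP / (FP + FN).

```python
def class_weights(fp_cost, fn_cost):
    """Weight each class by the cost of misclassifying one of its examples."""
    return {"yes": fn_cost, "no": fp_cost}


def rebalanced_posterior(p_yes, fp_cost, fn_cost):
    """Posterior P("yes") after reweighting the two classes by their costs."""
    weights = class_weights(fp_cost, fn_cost)
    yes_mass = p_yes * weights["yes"]
    no_mass = (1.0 - p_yes) * weights["no"]
    return yes_mass / (yes_mass + no_mass)


# Hypothetical costs, chosen only for illustration.
FP_COST = 1.0
FN_COST = 9.0
threshold = FP_COST / (FP_COST + FN_COST)

for p in [0.05, 0.2, 0.5, 0.9]:
    by_threshold = p >= threshold
    by_rebalancing = rebalanced_posterior(p, FP_COST, FN_COST) >= 0.5
    print(p, by_threshold, by_rebalancing)
```

In practice the two routes will not agree exactly, since resampling or reweighting changes what the classifier learns and real probability estimates are imperfect.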

The paper is worth reading in full as it shows clearly why these relationships hold and how they are used in various cost-sensitive classification methods.

P.S. This is of course very much related to the imbalanced class problem that I wrote about in an earlier post, but at that time I was not thinking that much about the classification-cost aspect yet.



3 thoughts on “Cost-sensitive learning”

  1. Juan Lopez on said:

    Nice post, Mikael. I recently understood the connection between “class imbalance” and “cost-sensitive learning”. In the absence of asymmetric costs, class imbalance is not a problem at all (Provost made this perfectly clear to me in “Machine Learning from Imbalanced Data Sets 101”): it is difficult to beat a classifier that predicts that every observation belongs to the “abundant” class.
    When costs are asymmetric, imbalance becomes a problem, and the different solutions to it (weighting, upsampling, undersampling) start to make sense.
