Friday, February 10, 2006

When it comes to prediction, Machine Learning kicks CHAID and CART's Tails

Conventional wisdom in direct marketing analytics regards CHAID and CART as cutting-edge techniques for discovering correlations (see "Optimal Database Marketing" by Drozdenko/Drake, Chapter 8, 2002).

However, it is not commonly known that these techniques are more than 20 years old, and recent academic work has shown that analyses like CHAID and CART are better suited to hypothesis testing than to predictive analysis. And it's the latter that matters more to direct marketers. So, what's the cutting-edge technology for predictive analytics, you say? Machine Learning!

In our own experience conducting predictive analysis for direct marketing optimization, we have found that machine learning techniques such as Random Forests(tm) are far superior to single-tree analyses like CHAID and CART. They predict far more accurately, mainly because instead of trying to come up with a single best tree, they develop many different simple trees (many trees = forest!) and combine their predictions into a "forest model".
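To make the "many simple trees" idea concrete, here is a minimal sketch in Python. It is not the Random Forests(tm) algorithm itself and not the method we use in practice: it is a simplified stand-in that trains one-split trees ("stumps") on bootstrap samples, each on one randomly chosen feature, and lets them vote. The customer data (days since last purchase, number of past purchases, responded or not) and all function names are made up for illustration.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature):
    """Fit a one-split tree: pick the threshold on `feature` with the fewest errors."""
    best = None  # (errors, threshold, left_label, right_label)
    for threshold in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label

def predict_stump(stump, row):
    """Classify one row with a single stump."""
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train many stumps, each on a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap: draw as many rows as we have, with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        feature = rng.randrange(len(rows[0]))
        forest.append(fit_stump(sample_rows, sample_labels, feature))
    return forest

def predict_forest(forest, row):
    """Combine the stumps by majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (days since last purchase, number of past purchases).
rows = [(5, 6), (10, 7), (15, 8), (20, 9),     # responded to the mailing
        (60, 0), (70, 1), (80, 2), (90, 3)]    # did not respond
labels = [1, 1, 1, 1, 0, 0, 0, 0]

forest = fit_forest(rows, labels)
print(predict_forest(forest, (3, 12)))  # a recent, frequent buyer
print(predict_forest(forest, (95, 0)))  # a lapsed, infrequent buyer
```

Each stump on its own is a crude model that sees only part of the data and one variable. The vote across all of them is what makes the combined prediction stable. A real forest grows deeper trees and considers several candidate variables at each split, so treat this only as an illustration of the combining step.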