Predictive Analytics with Data Mining:
How It Works
by Eric Siegel, Ph.D.
Published in DM Review's DM Direct, February 2005.
Although you've probably heard many times that predictive analytics will optimize your marketing campaigns, it's hard to envision, in more concrete terms, what it will do. This makes it tough to select and direct analytics technology. How can you get a handle on its functional value for marketing, sales and product directions without necessarily becoming an expert?
The answer is, in order to know precisely how predictive analytics may benefit current marketing operations, you do need to learn a few specifics about how it works. This short article covers just enough of the inside mechanics to eliminate predictive analytics' "voodoo" status. Here you will learn what a predictive model is, and how, by actively guiding marketing campaigns, it constitutes a key form of business intelligence. To this end, we'll take a look inside to see how a model works and how it is created.
1. Predictors Rank Your Customers to Guide Your Marketing
Predictive analytics' central building block is the predictor, a single value measured for each customer. For example, recency, which is based on the number of weeks since the customer's last purchase, has higher values for more recent customers. Recency is usually a reliable predictor of campaign response: customers ranked higher by recency respond at a higher rate. That means that if you contact your customers in order of recency -- first, call the most-recent customer; next, call the next-most-recent customer; and so on -- you will improve your response rate for any campaign that stops short of contacting everyone.
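To make the idea of ranking by a single predictor concrete, here is a minimal sketch in Python. The customer names, the `Customer` class and the choice to compute recency as the negated number of weeks are all illustrative assumptions, not part of any particular product.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical record layout, for illustration only.
    name: str
    weeks_since_last_purchase: int


def recency(customer: Customer) -> int:
    # Assumption: negate the week count so that more recent
    # customers receive higher predictor values.
    return -customer.weeks_since_last_purchase


def rank_by_recency(customers):
    # Highest recency first: this is the order in which to contact customers.
    return sorted(customers, key=recency, reverse=True)


customers = [
    Customer("Ana", 12),
    Customer("Ben", 2),
    Customer("Cai", 30),
]

call_order = [c.name for c in rank_by_recency(customers)]
print(call_order)  # ['Ben', 'Ana', 'Cai']
```

A campaign with budget for only two calls would contact Ben and Ana and skip Cai.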
For each prediction goal, there is an abundance of predictors that will help rank your customer database. For example, consider a customer's online behavior: Customers who spend less time logged on may be less likely to renew their annual subscription. In this case, retention campaigns can be cost-effectively targeted to customers with a low monthly usage predictor value.
2. Combined Predictors Mean Smarter Rankings
It turns out you can do even better by using more than one predictor at a time, combining them with a model. Creating this model is the very purpose of predictive analytics.
One way to combine two predictors is with a formula, such as simply adding them together. If both recency and personal income influence the chance that a customer will respond to a mailing, a good predictor may be:
recency + personal income
Voilà, a new, improved predictor. If recency is twice as important, give it twice the weight:
2 x recency + personal income
A scheme such as this that combines predictors is called a model -- in the case of the summation above, a linear model. For this reason, predictive analytics is also called predictive modeling.
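The weighted sum above can be sketched in a few lines of Python. This is only an illustration: the customer values are invented, and it assumes both predictors have already been rescaled to a common 0-to-1 range so that adding them is meaningful.

```python
def linear_score(predictors, weights):
    # Weighted sum of predictor values: the linear model from the text.
    return sum(weight * predictors[name] for name, weight in weights.items())


def rank(customers, weights):
    # Return customer names ordered from highest score to lowest.
    return sorted(
        customers,
        key=lambda name: linear_score(customers[name], weights),
        reverse=True,
    )


# Hypothetical customers; both predictors assumed rescaled to 0..1.
customers = {
    "Ana": {"recency": 0.9, "personal_income": 0.2},
    "Ben": {"recency": 0.4, "personal_income": 0.9},
    "Cai": {"recency": 0.1, "personal_income": 0.5},
}

equal_weights = {"recency": 1.0, "personal_income": 1.0}
recency_doubled = {"recency": 2.0, "personal_income": 1.0}

print(rank(customers, equal_weights))    # ['Ben', 'Ana', 'Cai']
print(rank(customers, recency_doubled))  # ['Ana', 'Ben', 'Cai']
```

Doubling the weight on recency moves Ana ahead of Ben, which shows how the choice of weights changes the ranking.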
Other predictive models are business intelligence rules, such as:
If the customer is rural, and her monthly usage is high,
then the customer will probably renew.
If you discover that urban customers who spend more time exploring new service features are at greater risk of canceling, expand this rule-based model with a second rule:
If the customer is urban, and new feature exploration is high,
then the customer will probably not renew.
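The two rules above translate directly into code. The sketch below is one possible encoding; the field names (`region`, `monthly_usage`, `new_feature_exploration`) and the "high" labels are assumptions made for illustration.

```python
def predict_renewal(customer):
    # Rule 1: rural customers with high monthly usage.
    if customer.get("region") == "rural" and customer.get("monthly_usage") == "high":
        return "will probably renew"
    # Rule 2: urban customers who explore new features heavily.
    if (customer.get("region") == "urban"
            and customer.get("new_feature_exploration") == "high"):
        return "will probably not renew"
    # Neither rule fires, so this two-rule model makes no prediction.
    return "no rule applies"


rural_heavy_user = {"region": "rural", "monthly_usage": "high"}
urban_explorer = {"region": "urban", "new_feature_exploration": "high"}
urban_light_user = {"region": "urban", "monthly_usage": "low"}

print(predict_renewal(rural_heavy_user))  # will probably renew
print(predict_renewal(urban_explorer))    # will probably not renew
print(predict_renewal(urban_light_user))  # no rule applies
```

The third customer shows the limit of a two-rule model: many customers match no rule at all, which is one reason real models need more rules or more predictors.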
The right combination of predictors predicts better by considering multiple aspects of the customer and her behavior. To match the complexity of customer decisions, a predictive model must usually be much richer and more complex than the above examples, combining dozens of predictors.
3. The Computer Makes Your Model from Your Customer Data
The real trick is to find the best predictive model. This is a difficult problem, since there are so many options. There are many kinds of models, such as linear formulas and business rules. And, for each kind of model, there are all the weights or rules or other mechanics that determine precisely how the predictors are combined. In fact, there are so many choices, it is literally impossible for a person to try them all and find the best one.
Predictive analytics is data mining technology that uses your customer data to automatically build a predictive model specialized for your business. This process learns from your organization's collective experience by leveraging your existing logs of customer purchases, behavior and demographics. The wisdom gained is encoded as the predictive model itself. Predictive modeling software has computer science at its core, undertaking a mixture of number crunching, trial, and error.
[Figure: Wisdom Gained: A Predictive Model is Built from Customer Data]
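As a toy illustration of the "trial and error" idea, the sketch below tries a handful of candidate weightings for a two-predictor linear model and keeps the one that best orders past customers. Everything here is an assumption made for illustration: the history is invented, the candidate weights are a tiny grid, and the quality measure (how many responder/non-responder pairs are ordered correctly) is just one reasonable choice. Real predictive modeling software uses far more sophisticated search and evaluates models on data held out from training.

```python
# Hypothetical past campaign: (recency, personal_income, responded),
# with both predictors assumed rescaled to 0..1.
history = [
    (0.9, 0.3, 1),
    (0.5, 0.8, 1),
    (0.7, 0.6, 1),
    (0.6, 0.2, 0),
    (0.2, 0.7, 0),
    (0.3, 0.4, 0),
]


def correctly_ordered_pairs(w_recency, w_income):
    # Count (responder, non-responder) pairs in which the responder
    # receives the strictly higher score under these weights.
    scored = [(w_recency * r + w_income * i, resp) for r, i, resp in history]
    responders = [s for s, resp in scored if resp == 1]
    others = [s for s, resp in scored if resp == 0]
    return sum(1 for a in responders for b in others if a > b)


best_weights, best_score = None, -1
for w_recency in (0, 1, 2):
    for w_income in (0, 1, 2):
        score = correctly_ordered_pairs(w_recency, w_income)
        if score > best_score:  # keep the first weighting that improves
            best_weights, best_score = (w_recency, w_income), score

print(correctly_ordered_pairs(1, 0))  # recency alone: 8 of 9 pairs
print(correctly_ordered_pairs(0, 1))  # income alone: 6 of 9 pairs
print(best_weights, best_score)       # (1, 1) 9
```

On this invented history, neither predictor alone orders every pair correctly, but their sum does. With dozens of predictors and many kinds of models, the number of combinations to try grows far beyond what a person could check by hand.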
Predictive analytics can generate business rules that may make clear sense, or you could end up with a complex formula that is hard to decipher. The choice is up to you, keeping in mind that a simpler, more intuitive model may not predict as well.
4. A Simple Curve Shows How Well Your Model Works
Either way, you need solid proof that the model is a good one. A profit curve (shown below) estimates the profit you'll receive from a campaign guided by predictive analytics, depending on how many prospects you contact. The profit this curve predicts depends on the ranking of your customers given by a predictive model, the cost per contact (e.g., printing and mailing costs), and the average profit per respondent.
[Figure: A Typical Profit Curve]
As shown by the upper line, the more customers you contact, the greater your profit, up to a point. This predicted profit line rises initially, since you contact the customers most likely to respond first. After you exhaust those highly ranked customers, though, contacting the remaining customers only decreases your profit. You'll probably want to stop your campaign at the profit peak, although that choice may depend on your longer-term marketing strategies.
The effectiveness of the predictive ranking is clear. With no predictive model and no means to rank customers, there may be loss rather than profit, as shown by the lower, diagonal line. This takes place when too few customers respond to make the campaign costs pay off. In this case, the more customers you contact, the more money you lose. Given this state of affairs, the rise (and eventual fall) of the upper profit line is a testimony to how well the model predicts.
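The arithmetic behind a profit curve can be sketched as follows. The numbers are invented for illustration: ten customers listed in the order a hypothetical model ranks them, a cost of 2 per contact and a profit of 5 per respondent. The sketch also assumes the response outcomes are already known, as they would be for a test set of past customers.

```python
def profit_curve(responses_in_ranked_order, cost_per_contact, profit_per_response):
    # Cumulative profit after contacting the top 1, 2, 3, ... customers.
    curve, profit = [], 0
    for responded in responses_in_ranked_order:
        profit += profit_per_response * responded - cost_per_contact
        curve.append(profit)
    return curve


# 1 = responded, 0 = did not; ordered from highest model rank to lowest.
responses = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

curve = profit_curve(responses, cost_per_contact=2, profit_per_response=5)
best_index = max(range(len(curve)), key=curve.__getitem__)
contacts_at_peak = best_index + 1

# Without a ranking, each contact earns the overall average on expectation,
# which gives the straight diagonal line.
average_per_contact = curve[-1] / len(responses)
baseline = [average_per_contact * n for n in range(1, len(responses) + 1)]

print(curve)             # [3, 6, 4, 7, 5, 3, 1, -1, -3, -5]
print(contacts_at_peak)  # 4
print(baseline[3])       # -2.0
```

With the ranking, stopping after four contacts yields a profit of 7. With no ranking, four contacts are expected to lose 2, and both lines meet at a loss of 5 once everyone has been contacted.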
A careful combination of predictors yields better predictions by considering multiple aspects of your customers and their behaviors. Predictive analytics finds the right way to combine predictors by building a model optimized according to your customer data.
Predictive analytics builds models automatically, but the overall business process to direct and integrate predictive analytics is by no means automatic -- it truly needs your marketing expertise. To learn about this process and why it requires your direct involvement, check out the article, "Driven with Business Expertise, Analytics Produces Actionable Predictions."
About the Author
Eric Siegel, Ph.D., is President of San Francisco-based Prediction Impact, providing analytics training and services to mid-size through Fortune 100 companies. He is an expert in data mining and predictive analytics and an award-winning teacher of graduate-level courses in those areas. Siegel served as a computer science professor at Columbia University, where he advanced data mining technology in the realms of machine learning performance optimization, text mining and data visualization. He cofounded two New York City-based software companies for customer/user profiling and data mining. With data mining, Eric has also solved problems in computer security, fraud detection, computational linguistics and information retrieval. You can reach him at firstname.lastname@example.org or (415) 683-1146.