This blog post describes the basic concepts, benefits and challenges of implementing net lift models in direct marketing campaigns. Net lift models predict which customer segments are likely to make a purchase ONLY if prompted by a marketing undertaking. The modeling work was conducted using stepwise logistic regression in SAS Enterprise Miner ®.
The post provides examples of how net lift probability decomposition models leverage differences between purchasers in the test group and the control group to predict which customer segments need a marketing contact and which customer segments are likely to make a purchasing decision without a nudge.
TRADITIONAL APPROACH TO DIRECT MARKETING LIST MODELING
The majority of direct marketing campaigns are based on purchase propensity models, which select customer email, paper mail or other marketing contact lists based on customers’ probability of making a purchase.
Scoring Rank | Response Rate | Lift
1 | 28.1% | 3.41
2 | 17.3% | 2.10
3 | 9.6% | 1.17
4 | 8.4% | 1.02
5 | 4.8% | 0.58
6 | 3.9% | 0.47
7 | 3.3% | 0.40
8 | 3.4% | 0.41
9 | 3.5% | 0.42
10 | 0.1% | 0.01
Total | 8.2% |
Table 1. Example of standard purchase propensity model output used to
generate direct campaign mailing list at 1800Flowers.com
This purchase propensity model had a ‘nice’ lift (the rank’s response rate divided by the total response rate) for the top 4 ranks on the validation data set. Consequently, we would contact the customers included in the top 4 ranks. After the catalog campaign had been completed, we conducted a post analysis of mailing list performance vs. a control group. The control group consisted of customers who were not contacted, grouped by the same purchase probability scoring ranks.
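As a small illustration of the lift definition above, the following Python sketch recomputes the lift column from the response rates in Table 1. The variable names are ours, and because the published rates are rounded, the recomputed values can differ from the table in the second decimal.

```python
# Response rates (%) by scoring rank, copied from Table 1.
response_rate = [28.1, 17.3, 9.6, 8.4, 4.8, 3.9, 3.3, 3.4, 3.5, 0.1]
total_rate = 8.2  # overall response rate (%) from the Total row

# Lift = rank response rate divided by total response rate.
lift = [rate / total_rate for rate in response_rate]

# Ranks whose response rate is above the overall average (lift > 1).
ranks_above_average = [rank for rank, value in enumerate(lift, start=1) if value > 1]

for rank, (rate, value) in enumerate(zip(response_rate, lift), start=1):
    print(f"rank {rank:2d}: response {rate:5.1f}%  lift {value:4.2f}")
```

Only the top four ranks come out above a lift of 1, which is why they were the ones selected for mailing.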
Sample campaign post analysis results:
Scoring Rank | Mailing Group Response Rate | Control Group Response Rate | Incremental Response Rate
1 | 27.0% | 27.9% | -0.91%
2 | 20.3% | 20.9% | -0.56%
3 | 10.7% | 10.0% | 0.66%
4 | 8.9% | 7.5% | 1.38%
Total | 16.7% | 16.5% | 0.15%
Table 2. Campaign Post analysis
As shown in Table 2, the top four customer ranks selected by the propensity model performed well in both the mailing group and the control group. However, even though the mailing/test group response rate was at a decent level, the incremental response rate (mailing group net of control group) for the combined top 4 ranks was low. With such a low incremental response rate, our undertaking would likely be generating a negative ROI.
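The incremental response rate is simply the mailing group rate minus the control group rate. A minimal Python sketch using the rounded rates from Table 2 is shown below; the dictionary names are ours, and the results differ slightly from the table's two-decimal figures, which were presumably computed from unrounded rates.

```python
# Response rates (%) for the top four scoring ranks, copied from Table 2.
mailing = {"1": 27.0, "2": 20.3, "3": 10.7, "4": 8.9, "Total": 16.7}
control = {"1": 27.9, "2": 20.9, "3": 10.0, "4": 7.5, "Total": 16.5}

# Incremental response = mailing group rate net of control group rate.
incremental = {rank: round(mailing[rank] - control[rank], 1) for rank in mailing}

for rank, value in incremental.items():
    print(f"rank {rank:>5}: incremental response {value:+.1f} points")
```

The two highest propensity ranks come out negative, and the combined lift over control is a fraction of a percentage point.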
What was the reason that our campaign showed such poor incremental results? The purchase propensity model did its job well, and we did send an offer to people who were likely to make a purchase. Apparently, modeling based on expected purchase propensity is not always the right solution for a successful direct marketing campaign. Since there was virtually no increase in response rate over the control group, we could have been contacting customers who would have bought our product without the promotional direct mail. Customers in the top ranks of the purchase propensity model may not need a nudge, or they may be buying in response to a contact via other channels. If that is the case, the customers in the lower purchase propensity ranks would be more ‘responsive’ to a marketing contact.
We should be
predicting incremental impact – additional purchases generated by a campaign,
not purchases that would be made without the contact. Our marketing mailing can
be substantially more cost efficient if we don’t mail customers who are going
to buy anyway.
Since customers very rarely use promo codes from catalogs or click on web display ads, it is difficult to identify undecided, swing customers based on promotion codes or web display clickthroughs.
Net lift models
predict which customer segments are likely to make a purchase ONLY if prompted
by a marketing undertaking.
Purchasers from the mailing group include customers who needed a nudge. However, none of the purchasers in the holdout/control group needed our catalog to make their purchasing decision, so all purchasers in the control group can be classified as ‘need no contact’. Since we need a model that separates ‘need contact’ purchasers from ‘need no contact’ purchasers, net lift models look at differences between purchasers in the mailing (contact) group and purchasers in the control group.
In order to classify our customers into these groups, we need mailing group and control group purchase results from similar prior campaigns. If there are no comparable historic undertakings, we have to run a small scale trial before the main rollout.
All models
described in this project used stepwise logistic regression on data partitioned
into test and validation sets. All data prep work was done in base SAS ® and
all modeling was done in SAS Enterprise Miner ®.
NET LIFT MODELS
There have been recent mentions of a target selection (i.e., case selection) technique referred to as net lift, uplift, incremental response, differential response, and possibly other names. When posed as a return maximization problem, net lift and the usual target selection practice coincide. Net lift applies to target selection in situations with a binary treatment; return maximization provides direction on how to handle problems in situations with more than one treatment.
Definition of uplift modeling: analytical modeling to predict the influence on a customer's buying behavior that results from choosing one marketing treatment (customer-facing action) over another. The secondary treatment is often passive (make no contact), as evaluated over a control group. The uplift model answers the question, “How much more likely is this treatment to generate the desired outcome than the alternative treatment?” For each customer, the model's prediction drives the decision of which treatment to apply [3].
Problem statement
Given the following data [2]:
• cases $P = \{1, \dots, n\}$,
• treatments $J = \{1, \dots, U\}$,
• expected return $R_{ij}$ for each case $i \in P$ and treatment $j \in J$,
• non-negative integers $n_1, \dots, n_U$ such that $n_1 + \dots + n_U = n$,
find a treatment assignment $f: P \to J$ so that the total return
$$\sum_{i=1}^{n} R_{i f(i)}$$
is maximized, subject to the constraints that the number of cases assigned to treatment $j$ does not exceed $n_j$ ($j = 1, \dots, U$) [2].
Example 1: Mailing campaign
• P: a group of customers,
• two treatments:
1. treatment 1: send a promotional coupon; $R_{i1}$ is the expected return if a coupon is sent to customer $i$,
2. treatment 2: no coupon is sent; the expected return is zero: $R_{i2} = 0$.
Solution to the maximization problem:
• assign treatment 1 to the customers with the $n_1$ largest values of $R_{i1}$,
• assign treatment 2 to the remaining customers.
This solution can also be derived from the Neyman-Pearson lemma.
Example 2: Marketing action case
• P: a group of customers,
• two treatments:
1. treatment 1: exercise some marketing action; $R_{i1}$ is the expected return if treatment 1 is given to customer $i$,
2. treatment 2: exercise no marketing action; $R_{i2}$ is the expected return if treatment 2 is given to customer $i$.
Solution to the maximization problem:
As with the solution to Example 1, to attain the maximum return:
• assign treatment 1 to the customers with the $n_1$ largest values of $R_{i1} - R_{i2}$,
• assign treatment 2 to the remaining customers.
The difference $R_{i1} - R_{i2}$ is called net lift, uplift, incremental response, differential response, etc.
If one considers only the response to treatment 1 and bases targeting on a model built from responses to previous marketing actions, one is proceeding as if the situation were as in Example 1. One would mistakenly maximize
$$\sum_{i:\, f(i) = 1} R_{i1}.$$
Such maximization would not yield the maximum return. One needs to consider the return from cases subjected to no marketing action.
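To make the difference between the two targeting rules concrete, here is a minimal Python sketch. The four customers and their expected returns are hypothetical numbers chosen for illustration, not data from any campaign.

```python
# Hypothetical expected returns per customer (illustrative numbers only):
# r1 = return if contacted (treatment 1), r2 = return if not contacted (treatment 2).
customers = {
    "loyal": {"r1": 9.0, "r2": 8.5},        # buys anyway
    "persuadable": {"r1": 6.0, "r2": 1.0},  # needs the nudge
    "occasional": {"r1": 4.0, "r2": 2.5},
    "unlikely": {"r1": 1.0, "r2": 0.8},
}
n1 = 2  # capacity: at most two customers can receive treatment 1


def total_return(contacted):
    """Total expected return when `contacted` get treatment 1 and the rest get treatment 2."""
    return sum(c["r1"] if name in contacted else c["r2"] for name, c in customers.items())


# Propensity-style targeting: rank by R_i1 only (as in Example 1).
by_propensity = sorted(customers, key=lambda k: customers[k]["r1"], reverse=True)[:n1]

# Net lift targeting: rank by R_i1 - R_i2 (as in Example 2).
by_net_lift = sorted(
    customers, key=lambda k: customers[k]["r1"] - customers[k]["r2"], reverse=True
)[:n1]

print(by_propensity, total_return(by_propensity))
print(by_net_lift, total_return(by_net_lift))
```

With these numbers, propensity targeting spends a contact on the "loyal" customer, whose return barely changes, while net lift targeting reaches the two customers whose return changes the most and ends up with the higher total.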
Example 3: A toy example
Consider the following toy example with a population of n = 3 cases, and U = 3 treatments, $n_1 = n_2 = n_3 = 1$ and returns:
Note that neither case 2 nor case 3 was assigned the treatment that maximizes its return.
Although the possibility of a return of 18 exists, this possibility is not realized, since case 2 is not assigned treatment 2.
(In a case like this, one would probably advise that more resources be allocated to treatment 2, so that $n_2 > 1$.)
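Example 4: General case
The problem can be cast as a standard integer linear programming problem. If we let
$$x_{ij} = \begin{cases} 1 & \text{if case } i \text{ is assigned treatment } j, \\ 0 & \text{otherwise,} \end{cases}$$
then the problem can be written as:
$$\max \sum_{i=1}^{n} \sum_{j=1}^{U} R_{ij}\, x_{ij}$$
subject to the constraints:
$$\sum_{j=1}^{U} x_{ij} = 1 \quad (i = 1, \dots, n), \qquad \sum_{i=1}^{n} x_{ij} \le n_j \quad (j = 1, \dots, U), \qquad x_{ij} \in \{0, 1\}.$$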
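For very small problems, the assignment problem can be solved by plain enumeration, which is a convenient way to check one's understanding of the constraints. The Python sketch below is such a brute-force search, not an integer programming solver; the function name `best_assignment` and the returns matrix are hypothetical and are not the values from the toy example.

```python
from itertools import permutations

# Hypothetical returns R[i][j] for n = 3 cases and U = 3 treatments
# (illustrative values, not the ones from the toy example).
R = [
    [10, 4, 1],
    [9, 12, 2],
    [8, 11, 3],
]
capacity = [1, 1, 1]  # n_j: at most one case per treatment


def best_assignment(R, capacity):
    """Exhaustively search assignments f: cases -> treatments that respect capacity."""
    # Expand each treatment j into capacity[j] slots, then try every way of
    # handing one slot to each case. Only feasible for tiny problems.
    slots = [j for j, n_j in enumerate(capacity) for _ in range(n_j)]
    best_total, best_f = float("-inf"), None
    for f in sorted(set(permutations(slots, len(R)))):
        total = sum(R[i][j] for i, j in enumerate(f))
        if total > best_total:
            best_total, best_f = total, f
    return best_total, best_f


total, f = best_assignment(R, capacity)
print(total, f)  # treatments are 0-indexed in the output
```

In this hypothetical matrix, the third case would individually prefer the second treatment, but the capacity constraint gives that slot to the second case, so the best total does not give every case its own best treatment.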
Note:
In general, the best assignment that solves the linear programming problem does not vary continuously with the coefficients:
• small changes in the returns $R_{ij}$ result in only small changes in the best total return,
• but the assignment that yields the best return may vary considerably.
Example 5: An (almost real) example and variation
Each week, a call center is responsible for contacting a group of customers. The length n of the list is not fixed, but it does not vary much from week to week.
Based on what is known of the customers, and on historical observations, it is possible to estimate the probability of successfully contacting each customer at different combinations of time of the day and call type (“home” or “other”).
Problem: make a (calling time, weekday) assignment so that the expected total number of contacts is maximized, subject to the constraint that the call center capacity is limited.
Un-adjusted probabilities of successful contact are not constant in time…
Remarks:
• which suggests that insisting on solving the full maximization problem is overkill
• in practice, proper call optimization is carried out dynamically
A solution sketch:
• segment customers, including the probabilities of successful contact at different times as segmentation variables, so that the probability of contact is approximately constant within each segment
• solve the optimization problem for the fraction of each segment that has to be contacted at each time
NET LIFT MODELING APPROACH – PROBABILITY DECOMPOSITION MODELS
Segments used in probability decomposition models:
Segment | Contacted Group | Control Group
Purchasers prompted by contact | A | D
Purchasers not needing contact | B | E
NonPurchasers | C | F
Figure 2. Segments in probability decomposition models
Standard purchase propensity models are only capable of predicting all purchasers (segments A and B combined). The probability decomposition model predicts the purchaser segment that needs to be contacted (segment A) by leveraging two logistic regression models, as shown in the formula below [1].
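$$P(A \mid A \cup B \cup C) = P(A \cup B \mid A \cup B \cup C) \times \left(2 - \frac{1}{P(A \cup B \mid A \cup B \cup E)}\right)$$
where
• $P(A \mid A \cup B \cup C)$ is the probability of purchase prompted by contact,
• $P(A \cup B \mid A \cup B \cup C)$ is the probability of purchase out of the contact group,
• $P(A \cup B \mid A \cup B \cup E)$ is the probability of a purchaser being in the contact group out of all purchasers.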
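One way to see why the decomposition works is to check it on counts. The Python sketch below assumes, as our own simplification, that the contact and control groups are randomized and of equal size, so that the number of control purchasers E matches the number of 'would buy anyway' purchasers B in the contact group. Under that assumption, 2 - 1/P(A∪B | A∪B∪E) = (A + B - E)/(A + B) = A/(A + B). All counts are hypothetical.

```python
# Hypothetical counts for equally sized, randomized contact and control groups.
A, B, C = 60, 100, 840  # contact group: prompted purchasers, buy-anyway purchasers, non-purchasers
E = 100                 # control group purchasers; assumed equal to B (same group size, randomized)

p_purchase = (A + B) / (A + B + C)      # P(A∪B | A∪B∪C): purchase rate in the contact group
p_from_contact = (A + B) / (A + B + E)  # P(A∪B | A∪B∪E): share of purchasers who were contacted

net = p_purchase * (2 - 1 / p_from_contact)  # probability decomposition formula
direct = A / (A + B + C)                     # P(A | A∪B∪C) computed straight from the counts

print(f"decomposition: {net:.4f}, direct: {direct:.4f}")
```

Both routes give the same probability of a purchase prompted by contact. In practice A and B cannot be observed separately, which is why the two probabilities on the right-hand side are estimated with models.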
Summary of the probability decomposition modeling process:
1. Build a stepwise logistic regression purchase propensity model (M1) and record the model score for every customer in the modeled population.
2. Use past campaign results or small scale trial campaign results to create a dataset with two equal size sections of purchasers from the contact group and the control group. Build a stepwise logistic regression model predicting which purchasers are from the contact group. The main task of this model is to penalize the score of the model built in step 1 when a purchaser is not likely to need contact.
3. Calculate the net purchaser score based on the probability decomposition formula.
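The scoring step of the process above can be sketched in a few lines of Python. This is only an illustration of the formula, not the SAS implementation used in the project; the function name `net_score`, the customer ids and both sets of scores are hypothetical.

```python
def net_score(p_purchase, p_contact):
    """Combine the two model scores with the probability decomposition formula.

    p_purchase: score of the purchase propensity model (M1).
    p_contact:  score of the second model, the estimated probability that a
                purchaser came from the contact group (must be > 0).
    """
    return p_purchase * (2.0 - 1.0 / p_contact)


# Hypothetical scored customers: (customer id, M1 score, second model score).
scored = [
    ("c1", 0.30, 0.50),  # likely buyer who looks like control purchasers: net score 0
    ("c2", 0.20, 0.65),
    ("c3", 0.10, 0.80),
    ("c4", 0.25, 0.45),  # second score below 0.5: negative net score
]

# Rank customers by net score, highest first.
ranked = sorted(scored, key=lambda row: net_score(row[1], row[2]), reverse=True)

for customer_id, p1, p2 in ranked:
    print(f"{customer_id}: net score {net_score(p1, p2):+.4f}")
```

The customer with the highest propensity score (c1) drops to third place because the second model gives no indication that contact is needed, which is the penalizing effect described in step 2.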
Results of the probability decomposition modeling process for
marketing offer mailing.
Scoring Rank | Contact Group Response % | Control Group Response % | Incremental Response Rate
1 | 18.8% | 12.9% | 5.9%
2 | 7.8% | 5.4% | 2.4%
3 | 6.9% | 4.5% | 2.5%
4 | 4.3% | 3.6% | 0.7%
5 | 3.9% | 3.5% | 0.4%
6 | 4.1% | 4.1% | 0.0%
7 | 3.7% | 4.0% | -0.2%
8 | 4.7% | 4.1% | 0.6%
9 | 5.0% | 6.7% | -1.7%
10 | 11.0% | 15.7% | -4.7%
Table 3. Post analysis of campaign leveraging probability
decomposition model
Scoring ranks 1 through 5 show positive incremental response rates, and rank 6 breaks even at 0.0%. The scoring ranks are ordered by the model's net score, so the incremental response rate is highest in the top ranks.
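A positive incremental response rate does not by itself mean a rank is worth mailing; the incremental orders also have to cover the contact cost. The Python sketch below applies a simple break-even rule to the incremental rates in Table 3. The cost per contact and margin per order are hypothetical assumptions, not figures from the campaign.

```python
# Incremental response rates (%) by scoring rank, copied from Table 3.
incremental_pct = {1: 5.9, 2: 2.4, 3: 2.5, 4: 0.7, 5: 0.4,
                   6: 0.0, 7: -0.2, 8: 0.6, 9: -1.7, 10: -4.7}

# Hypothetical economics (assumptions, not figures from the campaign).
cost_per_contact = 0.50   # cost of mailing one customer
margin_per_order = 30.00  # contribution margin of one incremental order

# A rank pays for itself when incremental rate * margin exceeds the contact cost.
break_even_pct = 100 * cost_per_contact / margin_per_order

ranks_to_mail = [rank for rank, pct in incremental_pct.items() if pct > break_even_pct]

print(f"break-even incremental rate: {break_even_pct:.2f}%")
print("ranks worth mailing:", ranks_to_mail)
```

Under these assumed economics only the top three ranks clear the break-even rate; a cheaper contact channel or a higher margin would extend the list further down.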
CONCLUSION
The probability decomposition model is just one in a group of methods known as net lift models. Net lift models help maximize the ROI of marketing campaigns because they let us avoid contacting customers or prospects who are highly likely to buy a product or service anyway. The traditional purchase propensity model may do a good job of ranking customers by their probability of making a purchase, but it cannot select the true responders: the customers who will make a purchase only if contacted. The probability decomposition model has its challenges; it is relatively difficult to interpret because it combines the scores of two separate models. The following conditions are required for a net lift model:
• presence of a randomized control group
• the analyzed marketing contact is not the only communication leading to purchase
• purchase rate is not correlated with lift, so a purchase propensity model is not sufficient
• presence of similar/repetitive marketing campaigns or small scale tests
• variation in average lift across scoring ranks
References
1. Zhong, Jun (VP Targeting and Analytics, Card Services Customer Marketing, Wells Fargo), “Predictive Modeling & Today’s Growing Data Challenges”, presentation at Predictive Analytics World, San Francisco, CA, 2009.
2. Lo, Victor S.Y., “The True Lift Model - A Novel Data Mining Approach to Response Modeling in Database Marketing”, SIGKDD Explorations, Volume 4 (2002), Issue 2, pp. 78-86.
3. Siegel, Eric, “Uplift Modeling: Predictive Analytics Can’t Optimize Marketing Decisions Without It”, Predictive Impact, Inc., 2011.