Scoring a File After Building a Model
Friday, May 10th, 2013
Two difficult issues come up frequently in response modeling: how to take different offers into account when building the model from a promotion that included multiple offers, and how to go about scoring the promotion file for future use when you may want to use different offers at different times – including new offers that you haven’t tested or even thought of yet.
To begin addressing these questions, let’s assume we’re building a direct mail ZIP-code-level customer-acquisition model for a continuity media club (this means the only things we know about the persons to whom we are mailing are the census data associated with their ZIP code) and that the mailing on which the model will be based went to 100,000 persons.
As noted, the total mailing pulled 3 percent, with the $2.95 offer pulling 5 percent, the $4.95 offer pulling 3 percent and the $6.95 offer pulling 1 percent. The first decision we have to make is whether to build three models, one for each offer, or one model that in some way captures the effect of the different offers.
The statisticians will tell you that in general it’s better to build one model with a larger sample size than to build three individual models each based on a smaller sample size. What’s more, in this case the sample size for the $6.95 offer is so small that a separate model couldn’t be built even if we wanted to build one. So, the model that we finally come up with will have to “capture” the effect of offer. We’ll show you how that is done later.
Let’s further assume that we decide to start off the modeling process by asking our statistician to bear with us and select only variables we personally think are important. We decide to start with the variable AVERAGE HOUSEHOLD INCOME IN THE ZIP CODE (AVGHHINC). We run our model and discover that the variable is “significant” and that the model looks like this:
Response = a - b(AVGHHINC)
You read this equation as follows: a person’s expected response is equal to some number “a” minus some other number “b” times the average income that exists within the ZIP code in which the person resides. This indicates that response to this promotion is negatively related to income. Said another way, the higher the average income in the ZIP code into which we are mailing, the less likely there will be response to our continuity offer. A not uncommon finding.
Let’s try another variable – percentage of married women in the ZIP code (%MAR). We run this simple model and find the following:
Response = a - b(%MAR)
In this new model we have different values for a and b, but again the negative sign indicates that as the percentage of married women in the ZIP code increases, the response rate to our continuity promotion goes down. In our model it now looks pretty clear that we are doing better, at least in terms of response, in downscale markets where income is low and the percentage of unmarried women is high.
Next, let’s assume that we have exhausted our look at variables one at a time and that these two demographic variables are the only ones found to be significantly related to response. Now, we ask our statistician (who by now is going crazy because this isn’t really the way you would proceed) to build a multivariate model that includes both the income and marital-status variables. The result is:
Response = a - b1(AVGHHINC) - b2(%MAR)
In this two-variable model, the values of the “a” and the “b’s” would again be different from our earlier models, but the negative signs associated with the b’s would continue to indicate that as income goes up and as the percentage of married women in the ZIP code goes up, response goes down.
Up to this point, we have said nothing about offer, but our traditional response analysis tells us that response is very significantly affected by offer. To build offer into the model, we have to create something called a dummy variable. Dummy variables can take on only two conditions: a yes represented by a 1 if the condition exists, and a no represented by a 0 if the condition does not exist.
In this case, any one person could have received any one of three offers. In situations such as these, where there are more than two categories, the rule is that you create one fewer dummy variable than the number of possible categories. With three offers, that means two dummy variables.
Let’s call these dummy variables D1 and D2. In the model-building process, each person will now have two variables added to their record: D1 and D2. One more step is required. We have to decide on a convention or rule for assigning the 1’s and 0’s.
Here we can be arbitrary. Let’s decide that a person will receive a 1 on the dummy variable D1 if they received the $2.95 offer. This means that anyone who received the $2.95 offer will receive a 1 on D1 and will automatically receive a 0 on the dummy variable D2.
Let’s also agree that anyone who received the $4.95 offer will receive a 1 on the dummy variable D2 and will therefore receive a 0 on the dummy variable D1.
Persons who received the $6.95 offer will receive a 0 on both dummy variables D1 and D2.
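The coding rule for the three offers can be sketched in a few lines of Python. This is only an illustration: the offer labels, the function name encode_offer and the three sample records are made up for the example, not taken from the actual mailing.

```python
# Dummy-coding convention for the three offers.
# The $6.95 offer is the reference category: 0 on both D1 and D2.
OFFER_DUMMIES = {
    "$2.95": (1, 0),  # D1 = 1, D2 = 0
    "$4.95": (0, 1),  # D1 = 0, D2 = 1
    "$6.95": (0, 0),  # D1 = 0, D2 = 0
}


def encode_offer(offer):
    """Return the (D1, D2) pair for the offer a person received."""
    if offer not in OFFER_DUMMIES:
        raise ValueError(f"unknown offer: {offer!r}")
    return OFFER_DUMMIES[offer]


# Hypothetical mail-file records: (person id, offer received).
mailed = [("A", "$2.95"), ("B", "$4.95"), ("C", "$6.95")]
encoded = {person: encode_offer(offer) for person, offer in mailed}
print(encoded)
```

Note that no person can ever have a 1 on both dummy variables, since each person received exactly one offer.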
Now we ask our statistician to run the model again and we get the following:
Response = a - b1(AVGHHINC) - b2(%MAR) + b3(D1) + b4(D2)
What do we have? Since both D1 and D2 appear in the final equation, we can assume that the modeling process found them to be “statistically significant” – as we knew they would be by looking at the response analysis. Additionally, if we were using real numbers instead of letters, we would also see that the value of b3 (the coefficient of D1) would be larger than the value of b4 (the coefficient of D2). We know this because the lift from the $2.95 offer is larger than the lift from the $4.95 offer, with both lifts measured against the $6.95 offer, which has a 0 on both dummy variables.
We also know that this new model, which takes offer into account, will “fit” the data better because we know that response is affected by offer and that a model that failed to take offer into account, when offer was important, would have to do relatively poorly.
So, what about scoring the file? While the modeling procedure has quantified the effect of offer, what needs to be done now is to score everyone on the database under the assumption that one or more, or perhaps none of the offers, will be mailed to this entire file.
Let’s suppose we decide that we might want to promote either the $2.95 offer or the $4.95 offer to some segment of the file, and we agree that we will not promote the $6.95 offer because of its low response rate. If both scores were computed for everyone on the file, and the file were sorted and ranked twice in descending order of each score, the rankings under each scoring system would be identical. This follows from the additive nature of this particular model: switching from one offer to the other changes every person’s score by the same constant amount.
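A small sketch makes the point about identical rankings concrete. All of the numbers below (the constant, the coefficients, the offer effects and the four ZIP-code records) are hypothetical values chosen for illustration; only the additive structure comes from the discussion above.

```python
# Hypothetical coefficients, for illustration only.
A = 0.06          # constant
B_INC = 0.0000004  # coefficient on AVGHHINC (subtracted)
B_MAR = 0.0005     # coefficient on %MAR (subtracted)

# Hypothetical constant shift that each offer adds to the score.
OFFER_EFFECT = {"$2.95": 0.04, "$4.95": 0.02}

# Hypothetical file: ZIP id -> (average household income, % married women).
zips = {
    "zip_1": (32000, 41.0),
    "zip_2": (58000, 55.0),
    "zip_3": (45000, 38.0),
    "zip_4": (76000, 62.0),
}


def score(avghhinc, pct_mar, offer):
    """Additive score: demographic part plus a constant for the offer."""
    return A - B_INC * avghhinc - B_MAR * pct_mar + OFFER_EFFECT[offer]


def rank(offer):
    """ZIP ids sorted in descending order of score under one offer."""
    return sorted(zips, key=lambda z: score(*zips[z], offer), reverse=True)


rank_295 = rank("$2.95")
rank_495 = rank("$4.95")
print(rank_295)
print(rank_495)
print("Same ranking:", rank_295 == rank_495)
```

The scores themselves differ between the two offers, but because the offer only adds a constant, the order of the file does not change.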
The estimates of each person’s probability of response to each of the offers should be fairly accurate, assuming a logistic regression model was used to create the final scoring equation, and could be stored on the database for future use when promoting these offers.
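As a rough sketch of what such stored estimates might look like, the snippet below turns a logistic scoring equation into a probability of response for each offer. The coefficients and the sample ZIP code are invented for the example and are on the logit scale; they are not results from the mailing described here.

```python
import math

# Hypothetical logit-scale coefficients, for illustration only.
INTERCEPT = -3.0
B_INC = 0.00001   # coefficient on AVGHHINC (subtracted)
B_MAR = 0.01      # coefficient on %MAR (subtracted)

# Hypothetical offer effects; $6.95 is the reference offer (both dummies 0).
OFFER_EFFECT = {"$2.95": 1.6, "$4.95": 1.1, "$6.95": 0.0}


def response_probability(avghhinc, pct_mar, offer):
    """Estimated probability of response under a logistic model."""
    logit = INTERCEPT - B_INC * avghhinc - B_MAR * pct_mar + OFFER_EFFECT[offer]
    return 1.0 / (1.0 + math.exp(-logit))


# One hypothetical ZIP code scored under each offer.
probabilities = {
    offer: response_probability(40000, 50.0, offer) for offer in OFFER_EFFECT
}
for offer, p in probabilities.items():
    print(f"{offer}: {p:.4f}")
```

In practice one such probability per offer could be stored on each record, so that the expected response can be looked up whichever offer is eventually mailed.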
However, if all we want to do is rank the file in descending order of response, without regard to the absolute accuracy of the expected response rate, it would be sufficient to score the file leaving out the effect of offer (set D1 and D2, and even a, equal to 0). Then sort the file, create 10, or 20 or 100 segments, and store only the segment number on the database.
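Here is a minimal sketch of that ranking-only approach, with the offer terms and the constant dropped from the score. The coefficients, the 20 sample ZIP codes and the helper name assign_segments are all assumptions made for the example.

```python
# Hypothetical coefficients, for illustration only.
B_INC = 0.00001  # coefficient on AVGHHINC (subtracted)
B_MAR = 0.01     # coefficient on %MAR (subtracted)


def rank_score(avghhinc, pct_mar):
    """Score with the offer dummies and the constant left out."""
    return -B_INC * avghhinc - B_MAR * pct_mar


def assign_segments(scores, n_segments=10):
    """Map each id to a segment number, 1 = highest-scoring segment."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    return {rid: (i * n_segments) // n + 1 for i, rid in enumerate(ranked)}


# Hypothetical file of 20 ZIP codes: id -> (income, % married women).
zip_file = {f"zip_{i:02d}": (30000 + 2000 * i, 40.0 + i) for i in range(20)}

scores = {zid: rank_score(inc, mar) for zid, (inc, mar) in zip_file.items()}
segments = assign_segments(scores, n_segments=10)

for zid in sorted(segments, key=segments.get):
    print(zid, segments[zid])
```

Only the segment number needs to be kept on each record; the raw score is not an estimate of response rate once the constant and the offer terms are removed.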
This segment number can be used in the future when deciding how deeply you want to promote into the file. Unfortunately, this decision requires still another analysis. You begin this analysis by asking what the response rate would be if you promoted the entire file and, given the power of the model(s) built (i.e., how much better or worse than average the response rate for each segment is), how deeply into the file you can reasonably expect to promote… assuming, of course, no difference in expected backend performance.
But that’s a subject for another blog article.