Do You Really Want To Wait For A Data Warehouse?
Tuesday, December 6th, 2011
We were recently asked to critique a plan to have a database system developed that would support database marketing programs. The recommended solution included upgrading a corporate data warehouse that would be the source of data required to support a marketing decision support system. The plan called for a three-tiered relational architecture. The time frame: 18 to 24 months.
The essential requirements of the system included the ability to: profile, model, segment and score customers, plan, design and execute promotions, track results and provide management reports.
In order to accomplish these limited objectives what’s required is: the identification, collection and storage of relevant marketing data from a variety of internal operational systems and external data sources, including mail houses, telemarketing suppliers, and data providers, and the development of the required modeling, promotion and reporting systems.
Given all of the direct marketing systems available today (both relational and non-relational or proprietary) there’s no need (our opinion) to spend years to build a direct marketing database and decision support system from the ground up that would meet the requirements outlined above, unless of course you enjoy doing that kind of thing.
So, why did the plan we reviewed suggest that it would take one to two years to implement?
First of all, it could take a year or more just to upgrade the corporate data warehouse, let alone the associated data marts and end user software tools.
Why it takes so long has to do with the amount of change required in the operational systems providing data to the data warehouse. For example, if a number of source systems contain name and address data, then it will be the job of the warehouse to first identify which fields are considered attributes of a customer, then identify which customers can be defined as duplicates, and then based on some agreed upon logic determine the “correct” name and address for that customer.
Whether this “correct” name is feed back to each input or source system, or whether the logic in the source system is changed to access the data warehouse for name and address data is an open question. The central fact is that operational systems have to be modified to take advantage of the benefits of the consolidation. The extent that common data is found in more than one input file determines the work that has to be done in the data warehouse and in modifications to the operational systems.
Hence, the decision to integrate with a data warehouse as opposed to building a simple marketing database (defined here as a one way consolidation of data from multiple sources, without a feedback mechanism to the source operational systems) caused what was initially a relatively simple request to explode into a complicated, time consuming and expensive project.
What’s more, because the repository was large and could be used for other activities in addition to marketing, a secondary repository, just containing the data needed for marketing was envisioned — a data mart or marketing database.
Finally, in order to meet performance demands for decision support and management reporting, a third data level, containing summary data and the tools required to access summary data was required. In other words, because the volumes were large, a database with more than 5 million customers is considered large, and because database management system was relational, using the detailed data available in the marketing database would not be practical, i.e. it could take hours to answer even relatively simple queries.
Looking at it objectively, what had started as a request to support marketing, grew inadvertently or otherwise into a much larger project. Couple this factor with the choice of relational technology and you will wind up with a two or a three-year project.
So, how to get around this problem? Assuming you are forced or choose to use a relational database management system as your base platform, as opposed to one the many proprietary systems designed for direct marketing applications, you can eliminate the need for a data warehouse/data mart configuration by limiting the data going into the warehouse to just data needed for marketing purposes.
Just make sure that the interfaces are one way feeds from the legacy systems to the database. Thus the data warehouse and the data mart become one in the same thing. Another approach might be to have a relational warehouse and use a proprietary system as the platform for the second tier, eliminating the need for summary data and the third tier tools necessary to access data within acceptable performance standards.
The third choice is to use to one of the proprietary platforms as your marketing database and significantly reduce your development time and benefit from their high performance characteristics. Of course, in order to go proprietary you will probably have to overcome the objections of your IT department which may insist on using a specific relational product regardless of its performance characteristics, because they have settled on that relational standard for the their operational systems and don’t wish to support another piece of technology.