Customer Success Stories

Fleet Uses CART® Data Mining Technology to Understand Customer Characteristics and Habits

Hybrid Analysis Methods Target Q3 Home Equity Product Promotion Mailing


Fleet Financial Group, a Boston-based financial services company with assets of more than $97 billion, is currently redesigning its customer service infrastructure, including a $38 million investment in a data warehouse and marketing automation software. To profit from this repository of valuable information on more than 15 million customers, Fleet's analysts are using data mining software, including Salford Systems' (San Diego, Calif.) CART®, to learn about their customers and to better target product promotions, such as home equity lines of credit.

"The real key is implementing a disciplined business plan that enables us to sell the right product to the right customer," says Randall Grossman, senior VP and manager of Fleet's Customer Data Management and Analysis (CDMA) group. To do that, Fleet needed to learn about customers' financial characteristics and buying habits so as to target the mailing list for the company's third quarter home equity product promotion. Victor Lo, a Fleet lead analytic consultant and VP, and his team, were tasked with developing a model to estimate each prospect's probability of responding to the mailing, as well as to estimate the expected profitability of respondents. Based on this expected profitability, the database would be segmented by scores that identify which prospects should receive one of several home equity marketing pieces and which should not receive a mailing at all.

Previous home equity product modeling had been conducted through third party consultants who used a matrix, or a two-dimensional table, to determine which prospects should be mailed which promotional package. The mailings had been profitable, but Fleet's analysts knew that there was more to be learned about customers and prospects. During the first-quarter 1998 home equity product promotion, Fleet became more involved in the modeling process by assigning prospect response scores and further targeting the mailing. The subsequent third quarter home equity product mailing list was handled completely in-house by combining CART and other data mining and statistical techniques.


Building the Foundation

"The goal of the mailing project for our home equity line of credit was to identify characteristics of would-be customers and to create a predictive model that could score new prospects", says Lo. "We chose to employ CART because it is an advanced, non-parametric data analysis technique that can efficiently handle missing data values. By hybridizing CART and logistic regression techniques, we were able to use each methodology's strength to complement the other's. CART, in particular, brings with it the unique advantage of helping analysts understand people's behavioral patterns, and it provides excellent predictive accuracy with a proven methodological track record."

The first step in the modeling process was to gather the historical data on which to create the model. The team selected a sample of approximately 20,000 customers for which Fleet had a record of responses; included were 100 percent of past profitable respondents, as well as two percent of past non-respondents. The customer records were "massaged" into a data set and output as a text file that could be fed into different modeling tools.
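The sampling scheme above (all past respondents, two percent of past non-respondents) can be sketched as follows. The record layout, function names and the "weight" field are assumptions for illustration. The article does not say whether Fleet reweighted the sample; the weights are shown here only because a sample drawn this way overstates the response rate unless each retained non-respondent stands in for the 50 it represents.

```python
import random


def draw_modeling_sample(records, nonresponder_rate=0.02, seed=0):
    """Keep every respondent and a random fraction of non-respondents.

    Each kept record gets a 'weight' equal to the inverse of its
    sampling rate, so weighted totals approximate the full population.
    """
    rng = random.Random(seed)
    sample = []
    for rec in records:
        if rec["responded"]:
            sample.append({**rec, "weight": 1.0})
        elif rng.random() < nonresponder_rate:
            sample.append({**rec, "weight": 1.0 / nonresponder_rate})
    return sample


def response_rate(sample, weighted=True):
    """Share of respondents, with or without the sampling weights."""
    total = sum(r["weight"] if weighted else 1.0 for r in sample)
    hits = sum((r["weight"] if weighted else 1.0)
               for r in sample if r["responded"])
    return hits / total


if __name__ == "__main__":
    # Made-up population: 100,000 records with a 1 percent response rate.
    population = [{"id": i, "responded": i < 1000} for i in range(100_000)]
    sample = draw_modeling_sample(population)
    print(len(sample), response_rate(sample, weighted=False),
          response_rate(sample, weighted=True))
```

The unweighted rate in the sample is far above the population rate, while the weighted rate comes back close to it.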


Mating Methods

The data set was then transferred into CART to display the interaction of the data. The resulting effects were subsequently incorporated into a logistic regression model that illustrated the overall and local landscape of the data. When the data were fed into CART, the software automatically generated a decision tree whose branches, or nodes, showed the hierarchy of binary data splits and displayed the data set's myriad variables and their interactions. This hierarchy distilled nearly 100 predictor variables into a more manageable amount of approximately 25. In addition, the CART nodes provided probability ratios that were used to understand why one segment would be more responsive than another.
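To make the idea of a binary split concrete, here is a minimal sketch of how one split on a single numeric predictor can be chosen. It uses the Gini impurity, the classic splitting criterion for classification trees; it is not Salford Systems' implementation, and the function names and toy data are hypothetical. A full tree repeats this search over every predictor and then recursively within each resulting node.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_binary_split(values, labels):
    """Find the threshold on one predictor that minimizes weighted Gini.

    Returns (threshold, impurity); threshold is None if no split
    improves on the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_impurity = None, gini(labels)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


if __name__ == "__main__":
    # Toy data: a made-up predictor and a 0/1 response flag.
    balances = [10, 20, 30, 40, 50, 60]
    responded = [0, 0, 0, 1, 1, 1]
    print(best_binary_split(balances, responded))
```

Records falling at or below the threshold go to one child node and the rest to the other, which is why each node can be read as a plain-language customer segment.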

"The CART analysis enables an intuitive understanding about the variables and the interactions among them", says Lo. "In Fleet's marketing of this product promotion, this is an essential piece of the puzzle - CART helps provide a human understanding of why certain segments respond better than others, as well as what their needs are and what types of offers will provoke a response."


Understanding Customers

The CART model illustrated certain characteristics of "best" respondents by predicting the expected balance they would carry on the credit line, as well as how much they might transfer from another line. In addition, the CART results painted a portrait of the principal characteristics of the least responsive customers. These prospects would either not likely respond to a Fleet product offer because they do not have a need for a large line of credit, or - equally of concern - they would respond but their subsequent credit line usage and/or likely losses would not be profitable for the bank. "Within a predictive model, accuracy is very important, but it is also important to obtain a true understanding of one's customers," says Lo. "The more we can learn about why a product does or doesn't suit a certain segment, the better we can manage the business and our profitability in the long term; for this project, CART gave us that understanding."

This home equity product mailing might bring an increase in revenues compared with past product mailings. More importantly, it is expected to have a much higher profitability due to more efficient targeting and lower marketing costs, which would give Fleet a higher return on investment. Lo's team is cautiously taking into consideration other factors, such as the mailing's time of year and the number of other financial product offerings customers have received recently.

"Test and control groups are needed to validate the efficiency of our targeting with this predictive mode,l"says Lo. "We are, however, very confident that Fleet will achieve a high response rate with this mailing. Our customers have many more dimensions than the previous mailing model could encapsulate for predictions. Creating a hybrid model using CART and our other data mining and statistical tools was a more sophisticated approach that painted a very descriptive portrait of our prospects, enabling us to increase the probability of their response."


Broadening Applications

The third quarter home equity promotion is Fleet's first CART modeling application. The company is currently evaluating and applying many more tools to manage its customer information and data warehouses; in the meantime, future plans for CART include applying it to various neural network algorithms and other projects. Once Fleet has received results from this home equity mailing, for example, the responses will be analyzed again in CART to validate the robustness of the original model and to determine other successes, such as whether or not the probability scores are accurately reflected in the response rate. Then CART will be used to construct a new model for the next home equity product mailing based on this mailing's response rates. Says Lo, "This test and learn approach is on-going, and the demand for better models and sophisticated techniques will continually grow."

Fleet will continue to use CART to gain a deeper understanding of its customers so that the information can be applied to classification and segmentation applications among Fleet's other product lines. "Fleet's product managers are anxious to have us determine customer characteristics, identify cross selling opportunities for products - such as certificates of deposit, money markets and mutual funds - and build predictive models for their promotions," says Lo. "CART's insight into our customers will help us better support our marketing departments, and it will help Fleet harvest an impressive return on our data warehouse and customer information investments."






Pfizer Enlists CART® to Score Male Erectile Dysfunction Diagnostic Test

Pfizer Inc., a major research-based, global health care company, is at the forefront of research on therapies for male erectile dysfunction (ED). Recently, the company received FDA approval for a new treatment called Viagra, the first oral treatment for this condition.


Multi-Faceted Marketing Package

Pfizer plans to launch an ambitious marketing campaign that is designed to create awareness about male erectile dysfunction (ED), as well as to promote Viagra. To enhance the success of the product launch, the company is taking a multi-faceted approach because ED is a sensitive topic that many experts believe is under-diagnosed and under-treated.

For part of the marketing campaign, Pfizer is developing Outcomes Research Tools for primary care physicians, urologists and other professionals. One of these tools is the Sexual Health Inventory for Men (SHIM:IIEF-5), which is a five-question, self-administered diagnostic test that can help indicate the presence or absence of ED. The SHIM:IIEF-5 can serve as a clinical aid to prompt further investigation of ED, as well as a discussion about available treatment options. This diagnostic test was developed using a combination of Pfizer's exhaustive professional research efforts and sophisticated data analysis tools, such as Salford Systems' CART®.


Creating the Diagnostic Test

Pfizer awarded an ED research grant to a team led by Dr. Raymond C. Rosen, an internationally recognized ED expert. The research resulted in the development of a multi-dimensional scale for assessing ED. This scale, the International Index of Erectile Function (IIEF), which became the primary efficacy measure in the Viagra phase trials, was published in 1997 in UROLOGY 49:822-830 [1]. For use in research and clinical settings, this questionnaire is a self-administered, 15-item measure that is cross-culturally valid and psychometrically sound, with the ability to detect treatment-related changes in patients with ED. The IIEF addresses five relevant domains of male sexual function - erectile function, orgasmic function, sexual desire, intercourse satisfaction and overall satisfaction. The SHIM:IIEF-5 is based on this IIEF measure.


Preserving Accuracy While Trimming Time

After the clinical research, Pfizer led a worldwide market research effort that interviewed primary care physicians and urologists to determine the IIEF's usability in those commercial settings. The overall findings indicated that an abbreviated version of the IIEF would further increase acceptance by doctors and patients, making it a more valuable diagnostic tool to help identify patients with ED. Pfizer then tasked its researchers to use proven statistical methods to reduce the 15-item IIEF to five questions that would conform to the National Institutes of Health (NIH) definition of ED [2], while best distinguishing between the presence and absence of ED.

"We used our rich data set and a combination of statistical techniques to determine a diagnostically optimal set of five questions that would conform to the NIH's definition of ED," says Dr. Joseph C. Cappelleri, associate director of biometrics in the statistics group at Pfizer Central Research in Groton, Conn. "In addition, we needed an objective way to identify the cut-off point that best distinguished between the presence and absence of ED. Salford Systems' CART classification and regression tree software was ideally suited for accurately providing this optimal cut-off point."

The SHIM:IIEF-5 was developed using data from four major studies of men diagnosed with ED and two control samples of men without a history of ED. The data for the diagnostic evaluation of the SHIM:IIEF-5 included 1,152 men: 1,036 with diagnosed ED and a control group of 116 men without ED. For the trial data, men met inclusionary criteria, such as being 18 years or older, being in a stable sexual relationship for the last six months, and having a clinical diagnosis of ED for at least six months. Men in the control group - those not clinically diagnosed with ED - were volunteers recruited from an outpatient community health center.


Methods for Success

The data were analyzed using CART and logistic regression methodologies in concert. CART was used to rate the relative importance of each of the IIEF's 15 items in terms of their ability to discriminate between the presence and absence of ED.

Salford Systems' CART, an integral part of Dr. Cappelleri's data analysis, is a highly accurate tool with a mature theoretical foundation that is based on the original CART code developed by world-renowned statisticians from Stanford University and the University of California at Berkeley. The software automatically considers all variables at the same time and categorizes the data by binary (two-way) splits. This series of splits is displayed as an easy-to-interpret decision tree, which CART automatically optimizes by choosing the tree structure with the lowest "misclassification cost," or probability that values are placed in the wrong categories. "In ranking the relative importance of the IIEF's 15 items, CART was an indispensable tool because it evaluated all items simultaneously from a multivariate framework," says Dr. Cappelleri. "Within seconds, CART ranked the items according to how well they partitioned the outcome measure - ED or no ED."


Selecting the Questions

Once Cappelleri was confident with his model, the next step was to evaluate if the top-ranking questions conformed to the NIH's definition of ED. Dr. Cappelleri and his Pfizer colleagues found firm agreement between the CART results and the NIH definition of ED. Such corroboration extended quantifiable support to the NIH definition and credence to the effectiveness of the CART results. The five specific items selected and their diagnostic evaluation are expected to be highlighted and discussed in an upcoming professional publication.


Making the Score

The next step after selecting the questions was to develop a scoring system that would be easy to administer. In this case, Dr. Cappelleri wanted to determine a cut-off point at which men scoring at that point or lower on the SHIM:IIEF-5 could be classified as having ED, while men scoring higher could be classified as having normal erectile functionality. In addition to CART, logistic regression was applied to help generate a receiver operating characteristic curve that further supported the CART results. However, the curve generated a series of possibilities, rather than a definitive cut-off point. Dr. Cappelleri then used CART to develop a scoring system to determine an objective SHIM:IIEF-5 score that gave a high level of sensitivity (high probability of correctly identifying ED) and specificity (high probability of correctly identifying men without ED). This scoring system is expected to be easy to administer and quick to calculate in a clinical setting.
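The trade-off between sensitivity and specificity at each candidate cut-off can be illustrated with a short sketch. The scores below are invented and are not from the Pfizer studies, and the selection rule shown (maximizing sensitivity plus specificity minus one, often called Youden's index) is an assumption made for illustration; the article says only that CART supplied the cut-off, not which criterion it optimized.

```python
def sensitivity_specificity(scores, has_ed, cutoff):
    """Classify 'score <= cutoff' as ED and compare with the diagnosis."""
    tp = sum(1 for s, d in zip(scores, has_ed) if d and s <= cutoff)
    fn = sum(1 for s, d in zip(scores, has_ed) if d and s > cutoff)
    tn = sum(1 for s, d in zip(scores, has_ed) if not d and s > cutoff)
    fp = sum(1 for s, d in zip(scores, has_ed) if not d and s <= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, has_ed):
    """Pick the observed score that maximizes sensitivity + specificity - 1.

    This rule is an assumption for the sketch, not the study's method.
    """
    def youden(cutoff):
        sens, spec = sensitivity_specificity(scores, has_ed, cutoff)
        return sens + spec - 1.0

    return max(sorted(set(scores)), key=youden)


if __name__ == "__main__":
    # Invented questionnaire totals: six men with ED, six without.
    scores = [6, 9, 12, 15, 17, 20, 18, 19, 21, 22, 24, 25]
    has_ed = [True] * 6 + [False] * 6
    cutoff = best_cutoff(scores, has_ed)
    print(cutoff, sensitivity_specificity(scores, has_ed, cutoff))
```

A receiver operating characteristic curve is the set of these sensitivity and specificity pairs across all cut-offs, which is why the curve by itself offers a series of possibilities and some additional rule is needed to settle on one point.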


Validating Results

When the models were complete, Dr. Cappelleri used CART's unique cross-validation feature to ensure that the SHIM:IIEF-5 model would stand up to new, fresh data. For the cross-validation procedure, CART automatically withholds a randomly chosen 10 percent of the data as a "test sample." The remaining 90 percent of the data, or the "learning sample," is used to generate a model, and the test sample is then dropped through the learning model to determine if the results are still valid. Users can change the number of times this validation process is completed, but, after CART's default of 10 models, the results were so robust that Dr. Cappelleri was confident that "it convincingly corroborated the original results from the data." "We compared the misclassification table of the cross-validation results with that of the original results," recalls Dr. Cappelleri. "These two misclassification tables were very similar, giving almost identical values for both sensitivity and specificity."
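The learning/test rotation described above can be sketched as a generic 10-fold split. This is a plain illustration of the technique, not CART's internal procedure; the function name and the seed are hypothetical. Each record lands in exactly one test sample, so over the 10 rounds every record is scored once by a model that never saw it.

```python
import random


def ten_fold_splits(n_records, n_folds=10, seed=0):
    """Yield (learning, test) index lists, one pair per fold.

    Indices are shuffled once and dealt into n_folds groups; each
    group serves as the test sample exactly once.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        learning = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield learning, test


if __name__ == "__main__":
    # 1,152 is the number of men in the diagnostic evaluation data.
    for learning, test in ten_fold_splits(1152):
        print(len(learning), len(test))
```

Fitting a model on each learning sample and pooling the predictions on the test samples gives a misclassification table that can be compared with the one from the model built on all of the data.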


To Your Health

Research on a diagnostic tool for ED was spurred by Pfizer's commitment to improving patients' quality of life, helping them live longer, healthier and more fulfilling lives. As a tool in the identification of such an under-diagnosed condition, the SHIM:IIEF-5 is a crucial part of the Outcomes Research Tools and marketing programs for Viagra.

"Male erectile dysfunction is a very sensitive topic, and our research shows that an easy to use, robust and accurate instrument is necessary to aid in diagnosing the condition," says Dr. Cappelleri. "When patients' scores indicate the presence of erectile dysfunction, doctors can further investigate the situation and then discuss treatment options that can lead to improving the patients' health, self-esteem, quality of life and interpersonal relationships. At Pfizer, we recognize that one way to improve patient care is by developing and applying scientifically sound methods to help diagnose and treat conditions. Proven analytical tools, such as CART, enhance our insight into data and help us make efficient use of our resources."


REFERENCES
  1. UROLOGY 49:822-830 article: "The International Index of Erectile Function (IIEF): A Multidimensional Scale for Assessment of Erectile Dysfunction" by Raymond C. Rosen, Alan Riley, Gorm Wagner, Ian H. Osterloh, John Kirkpatrick and Avanish Mishra.
  2. NIH Consensus Development Panel on Impotence: "Impotence," Journal of the American Medical Association 270:83-90, 1993.







© Copyright 2003-2004 Salford Systems