CART Variable Importance
CART automatically produces a predictor ranking (variable importance) based on the contribution each predictor makes to the construction of the tree. Predictor rankings are strictly relative to a specific tree; change the tree and you might get very different rankings. A predictor earns importance by playing a role in the tree, either as a main splitter or as a surrogate. CART users have the option of fine-tuning the variable importance algorithm.
Variable importance, for a particular predictor, is the sum across all nodes in the tree of the improvement scores that the predictor has when it acts as a primary or surrogate (but not competitor) splitter. Specifically, for node i, if the predictor appears as the primary splitter, then its contribution toward the importance is:
importance_contribution_node_i = improvement
If the predictor instead appears as the n'th surrogate rather than as the primary splitter, the expression is:
importance_contribution_node_i = (p ^ n) * improvement
in which p is the "surrogate improvement weight": a user-controlled parameter which is equal to 1.0 by default and can be set anywhere between 0 and 1. Thus, you are able to specify that surrogate splits contribute less towards a predictor's importance than do primary splits, with later surrogates discounted more heavily because the weight is raised to the power n. This parameter is controlled with the BOPTIONS IMPORTANCE option.
Linear combination splits do not contribute in any way to variable importance.
If, in the absence of linear combinations, the surrogate improvement weight is greater than 0 and a variable has importance = 0.0, then that variable does not appear anywhere in the tree as a primary or surrogate splitter, although it may appear as a competitor.
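The rules above can be illustrated with a minimal Python sketch. This is not Salford Systems code: the function name variable_importance, the node layout (a list of dicts with "primary" and "surrogates" entries), and the numbers are all hypothetical, chosen only for illustration. The sketch assumes that each surrogate carries its own improvement score, that a node split on a linear combination is recorded with "primary" set to None, and that raw sums are returned without any rescaling.

```python
def variable_importance(nodes, predictors, p=1.0):
    """Return raw importance sums per predictor for one tree.

    nodes      -- hypothetical layout: a list of dicts with keys
                  "primary": (name, improvement), or None when the node
                             is split on a linear combination
                  "surrogates": [(name, improvement), ...] with the
                             1st surrogate first
    predictors -- all candidate predictor names, so unused ones report 0.0
    p          -- surrogate improvement weight, between 0 and 1
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    scores = {name: 0.0 for name in predictors}
    for node in nodes:
        primary = node.get("primary")
        if primary is not None:  # a linear combination primary adds nothing
            name, improvement = primary
            scores[name] = scores.get(name, 0.0) + improvement
        # The n'th surrogate is discounted by p ** n (n starts at 1).
        for n, (name, improvement) in enumerate(node.get("surrogates", []), start=1):
            scores[name] = scores.get(name, 0.0) + (p ** n) * improvement
    return scores


# Made-up tree with two nodes; the numbers are illustrative only.
example_nodes = [
    {"primary": ("AGE", 0.20), "surrogates": [("INCOME", 0.10), ("TENURE", 0.04)]},
    {"primary": ("INCOME", 0.08), "surrogates": [("AGE", 0.02)]},
]
example_predictors = ["AGE", "INCOME", "TENURE", "REGION"]

full_weight = variable_importance(example_nodes, example_predictors, p=1.0)
half_weight = variable_importance(example_nodes, example_predictors, p=0.5)
for name in example_predictors:
    print(f"{name:8s} p=1.0: {full_weight[name]:.3f}   p=0.5: {half_weight[name]:.3f}")
```

With p = 0.5 in this made-up tree, AGE scores 0.20 + 0.5 * 0.02 = 0.21, INCOME scores 0.08 + 0.5 * 0.10 = 0.13, and TENURE, which appears only as a 2nd surrogate, scores 0.25 * 0.04 = 0.01. REGION never acts as a primary or surrogate splitter, so its importance stays at 0.0 whatever the weight.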
November 2000.
All content above is copyright © 2000, Salford Systems. All rights reserved worldwide. The content of this site is made available to support the licensed use of Salford Systems data mining software products. No part of this site is to be copied, duplicated, modified or redistributed in whole or part without the prior written permission of Salford Systems.

