Insufficient Memory


CART is an extremely memory-intensive program, storing all data used for tree growing in RAM. This means certain large problems may not fit into available memory.


SOLUTION 1: Use A Larger Version of CART

Switch to a larger-memory version of Salford Systems CART. See the platform- and operating-system-specific documentation for the limits of the version you are using now, or call Salford Systems for further information.


SOLUTION 2: Adjust Parameters

CART memory requirements depend on several factors, including the number of levels of your dependent variable, the number of independent variables in the search list, the number of categorical predictors, how deep the tree is allowed to grow, and the size of the learning sample. You can experiment with altering these problem dimensions by using the LIMIT, MEMORY and ADJUST commands. MEMORY and ADJUST display your current memory requirements and show you parameter values (if any) that allow your problem to be accommodated.

One extremely useful parameter to adjust is BOPTIONS SUBSAMPLE, which directs CART to search for optimal splits near the top of the tree on a subsample of the data while still using the full data set to populate the child nodes. This saves work space in the nodes with the largest number of cases without reducing the size of the learning sample, and the smaller nodes keep all of their cases.
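The idea behind searching on a subsample can be illustrated outside of CART. The following Python sketch is not CART's implementation and does not use CART command syntax; the function names (`gini`, `best_split`, `split_with_subsample`) and the choice of Gini impurity as the splitting criterion are assumptions made for illustration. It searches for a split on a random subsample of a large node and then sends every case in the full node to one of the two children.

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(rows, labels):
    """Exhaustive search for the (feature, threshold) pair with the lowest
    weighted Gini impurity. Returns None if no split improves on the parent."""
    n = len(rows)
    best = None
    best_score = gini(labels)  # a split must beat the parent impurity
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between adjacent observed values
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (j, t), score
    return best


def split_with_subsample(rows, labels, subsample_size, seed=0):
    """Search for the split on a subsample, then partition ALL cases with it."""
    rng = random.Random(seed)
    if len(rows) > subsample_size:
        idx = rng.sample(range(len(rows)), subsample_size)
    else:
        idx = range(len(rows))  # small node: search on every case
    split = best_split([rows[i] for i in idx], [labels[i] for i in idx])
    if split is None:
        return None, list(zip(rows, labels)), []
    j, t = split
    # The full data set, not the subsample, populates the child nodes.
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return split, left, right
```

In this sketch the expensive search touches only `subsample_size` cases, while the children together still contain every case from the parent node. Nodes that are already smaller than the subsample size are searched in full.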

Other parameters that can be adjusted to save memory include the linear combinations option and the number of categorical predictors. Small amounts of memory can be saved by reducing the number of surrogates tracked and the number of competitors printed.


SOLUTION 3: Limit The Search List

CART will search over all variables in the learning sample looking for the best splits. The search list can be limited by listing explicit split candidates in the MODEL or KEEP statements or by eliminating variables with the EXCLUDE command. If you have an idea as to which variables are unlikely to be useful predictors, you can eliminate them from your first runs.

Alternatively, you can run several models, each with a different set of model variables, eliminating the least important variables from each. The elimination process can continue in this way until you can pool all the remaining variables in a single model. Note, however, that this runs the risk of missing some important interactions, because variables that are only useful together may never appear in the same model.
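The batch-and-eliminate procedure can be sketched as a simple loop. This is a minimal Python illustration, not a CART feature: `screen_variables`, `batch_size`, and `drop_fraction` are hypothetical names, and `importance_fn` is a placeholder for whatever produces variable importance scores for one batch (for example, a CART run restricted to those variables).

```python
def screen_variables(variables, importance_fn, batch_size, drop_fraction=0.5):
    """Score variables in batches and drop the weakest from each batch,
    repeating until all survivors fit into a single batch.

    importance_fn(batch) must return a dict mapping each variable name in
    the batch to an importance score. It is a hypothetical stand-in for a
    model run limited to that batch of variables.
    """
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2")
    if not 0 < drop_fraction < 1:
        raise ValueError("drop_fraction must be between 0 and 1")

    survivors = list(variables)
    while len(survivors) > batch_size:
        kept = []
        for start in range(0, len(survivors), batch_size):
            batch = survivors[start:start + batch_size]
            scores = importance_fn(batch)
            ranked = sorted(batch, key=lambda v: scores[v], reverse=True)
            # Keep at least one variable, and drop at least one from any
            # batch with two or more, so the loop always makes progress.
            n_keep = int(len(batch) * (1 - drop_fraction))
            n_keep = max(1, min(len(batch) - 1, n_keep))
            kept.extend(ranked[:n_keep])
        survivors = kept
    return survivors
```

Because each batch is ranked in isolation, a strong variable can be dropped simply for sharing a batch with stronger ones, and variables that matter only in combination may never be scored together. This is the risk of missing interactions noted above.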


SOLUTION 4: Subdivide Data

Break the problem into smaller pieces. Start by generating an exploratory tree of DEPTH = 1 to find the optimal split at the root node. Such a small tree should allow you to accommodate a rather large data set. Then, using the optimal or near-optimal split, divide the data set into two subsets and run CART separately on each subset. If one split still leaves a problem that is too large, split each subset again in the same way to yield four subsets.

Once you have determined the truly important variables in this way, go back to the original data, limit the search list to just those variables, and run the model again for all cases.
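The subdivision step can be sketched in Python as follows. This is an illustration under stated assumptions, not CART itself: `root_split` is a simple Gini-based search that stands in for a DEPTH = 1 CART run, and `subdivide`, `max_cases`, and `max_rounds` are hypothetical names. The data are split on the root split, and any piece that is still too large is split again, up to two rounds by default (four subsets).

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def root_split(rows, labels):
    """Stand-in for a depth-1 tree: the single (feature, threshold) split
    with the lowest weighted Gini impurity, or None if nothing improves."""
    best, best_score = None, gini(labels)
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (j, t), score
    return best


def subdivide(rows, labels, max_cases, max_rounds=2):
    """Split the data on its root split until every piece has at most
    max_cases cases or max_rounds rounds of splitting have been used."""
    if len(rows) <= max_cases or max_rounds == 0:
        return [(rows, labels)]
    split = root_split(rows, labels)
    if split is None:
        return [(rows, labels)]
    j, t = split
    pieces = []
    for keep_left in (True, False):
        part = [(r, y) for r, y in zip(rows, labels) if (r[j] <= t) == keep_left]
        part_rows = [r for r, _ in part]
        part_labels = [y for _, y in part]
        pieces.extend(subdivide(part_rows, part_labels, max_cases, max_rounds - 1))
    return pieces
```

Each returned piece is a `(rows, labels)` pair that can be modeled on its own. The variables that turn out to be important across the pieces are then the candidates for the restricted search list in the final run on all cases.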


Steinberg, Dan and Phillip Colla. CART--Classification and Regression Trees. San Diego, CA: Salford Systems, 1997.
© Copyright 2003-2004 Salford Systems