On regression tree models for longitudinal data

Cecília Castro
CMAT-UM

December 3, 2019

A major advantage of decision tree modeling is that the resulting model is simple to interpret and allows “out-of-sample” prediction.
There are several algorithms for constructing a decision tree; they differ in the structure of the tree, the splitting criteria, the rule for when to stop splitting, and how the simple models within the leaf nodes are estimated.
The classification and regression trees (CART) algorithm by Breiman, Friedman, Olshen and Stone [1] is probably the most popular algorithm for tree induction.
Many of the ideas found in the CART algorithm were implemented in the rpart package for R ([5],[6]).
Hajjem et al. ([2],[3]) and Sela & Simonoff ([4]) independently proposed an estimation method that fits a model consisting of the sum of a random effects term and a tree-structured term. The tree-structured term estimates the population-level expected response, the so-called “fixed effects”. In this way, they generalize the linear mixed effects model to tree-based models.
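As a sketch, the model described above can be written as follows. The notation is an assumption made here for illustration and is not taken from this abstract: y_{it} is the response of subject i at occasion t, x_{it} its predictors, f the tree-structured term, Z_{it} the random effects design vector, b_i the subject-level random effects, and \varepsilon_{it} the error.

```latex
y_{it} = Z_{it}\, b_i + f(x_{it}) + \varepsilon_{it},
\qquad b_i \sim N(0, D),
\qquad \varepsilon_i \sim N(0, R_i)
```

Replacing f(x_{it}) with a linear predictor x_{it}^{\top}\beta gives the usual linear mixed effects model, which is the sense in which the tree-based model generalizes it.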
Methods that account for the longitudinal structure of the data are crucial in prediction. In fact, recognizing the “within-subject” correlation should improve the accuracy of predictions.
In this talk, the RE-EM algorithm ([4]) will be discussed, as well as the benefits (for explanation and for prediction) of taking into account the longitudinal structure at the individual level. Some examples using the R package REEMtree will be presented.
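A minimal sketch of how such an example might look is given here. It is not taken from the talk: it assumes the REEMtree package is installed from CRAN and uses the simpleREEMdata example data set distributed with it. The helper rmse is hypothetical and defined only for this illustration.

```r
# Minimal sketch; assumes the REEMtree package is installed from CRAN.
library(REEMtree)

# simpleREEMdata is an example data set shipped with the package:
# response Y, predictors D, t and X, and subject identifier ID.
data(simpleREEMdata)

# Fit a RE-EM tree with a random intercept for each subject.
fit <- REEMtree(Y ~ D + t + X, data = simpleREEMdata, random = ~ 1 | ID)
print(fit)

# Population-level predictions: tree-structured term only.
pred_pop <- predict(fit, simpleREEMdata, EstimateRandomEffects = FALSE)

# Subject-level predictions: tree term plus estimated random intercepts.
pred_subj <- predict(fit, simpleREEMdata, id = simpleREEMdata$ID,
                     EstimateRandomEffects = TRUE)

# Hypothetical helper: in-sample root mean squared error.
rmse <- function(y, yhat) sqrt(mean((y - yhat)^2))
rmse(simpleREEMdata$Y, pred_pop)
rmse(simpleREEMdata$Y, pred_subj)
```

Comparing the two error values gives a rough indication of how much the subject-level random effects contribute beyond the population-level tree; an out-of-sample comparison would be needed to assess predictive accuracy properly.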

References

[1] L. Breiman, J.H. Friedman, R.A. Olshen and C.J. Stone. Classification and Regression Trees. Wadsworth, Belmont, CA, 1984.

[2] A. Hajjem, F. Bellavance and D. Larocque. Mixed-effects regression trees for clustered data. Les Cahiers du GERAD G-2008-57, 2008.

[3] A. Hajjem, F. Bellavance and D. Larocque. Mixed effects regression trees for clustered data. Statistics and Probability Letters, 81, 451–459, 2011.

[4] R.J. Sela and J.S. Simonoff. RE-EM trees: a data mining approach for longitudinal and clustered data. Machine Learning, 86, 169, 2012. https://doi.org/10.1007/s10994-011-5258-3

[5] T.M. Therneau. A short introduction to recursive partitioning. Orion Technical Report 21, Stanford University, Department of Statistics, 1983.

[6] T.M. Therneau and E.J. Atkinson. An introduction to recursive partitioning using the rpart routines. Division of Biostatistics 61, Mayo Clinic, 1997.