
Dipartimento di Matematica Guido Castelnuovo, Università Sapienza Roma
Abstract: In recent years there has been growing attention to machine learning models that provide explanatory insights into the decisions made by the algorithm. Thanks to their interpretability, decision trees have been intensively studied for classification tasks, and, owing to the remarkable advances in Mixed-Integer Programming (MIP), various approaches have been proposed to formulate the Optimal Classification Tree (OCT) problem as a MIP model, starting from the seminal paper [Bertsimas D. and J. Dunn. "Optimal classification trees." Machine Learning 106.7 (2017)]. In this talk, we give an overview of classification trees and random forests and of the optimal classification tree formulation. First, we present a novel OCT formulation for binary classification that exploits the generalization capabilities of another well-known machine learning classifier, the Support Vector Machine [Cortes C. and V. Vapnik. "Support-vector networks." Machine Learning 20.3 (1995)]. In particular, the maximum MARGin Optimal Classification Tree (MARGOT) uses a regularized linear SVM at each node of the decision tree to define maximum-margin linear classifiers within a tree structure. The resulting model is a Mixed-Integer Quadratic Programming (MIQP) problem. The model can also include feature selection constraints, which allow defining a hierarchy on the subsets of features that most affect the outcome. Following a similar modelling approach, we further present an OCT for approximating random forests with a single tree. Numerical results on benchmark datasets are reported.
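
To fix ideas, the building block applied at each node t is a regularized soft-margin linear SVM in the spirit of Cortes and Vapnik (1995). The following is a minimal sketch with illustrative notation (I_t for the index set of samples reaching node t; the symbols w_t, b_t, xi, C, s_j, M, B are ours and not necessarily the notation of the MARGOT formulation presented in the talk):

\[
\min_{w_t,\, b_t,\, \xi}\ \tfrac{1}{2}\|w_t\|^2 + C \sum_{i \in I_t} \xi_i
\quad \text{s.t.} \quad y_i \left( w_t^\top x_i + b_t \right) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i \in I_t.
\]

Embedding such subproblems in a tree, with binary variables encoding how samples are routed through the nodes, yields a MIQP. Feature selection can be modelled, for instance, by binary indicators s_j linked to the hyperplane coefficients through a big-M constraint together with a budget B on the number of selected features:

\[
-M s_j \le w_{t,j} \le M s_j, \qquad s_j \in \{0,1\}, \qquad \sum_j s_j \le B.
\]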