TY - GEN
T1 - Optimal action extraction for random forests and boosted trees
AU - Cui, Zhicheng
AU - Chen, Wenlin
AU - He, Yujie
AU - Chen, Yixin
N1 - Publisher Copyright:
© 2015 ACM.
PY - 2015/8/10
Y1 - 2015/8/10
N2 - Additive tree models (ATMs) are widely used for data mining and machine learning. Important examples of ATMs include random forest, adaboost (with decision trees as weak learners), and gradient boosted trees, and they are often referred to as the best off-the-shelf classifiers. Though capable of attaining high accuracy, ATMs are not well interpretable in the sense that they do not provide actionable knowledge for a given instance. This greatly limits the potential of ATMs on many applications such as medical prediction and business intelligence, where practitioners need suggestions on actions that can lead to desirable outcomes with minimum costs. To address this problem, we present a novel framework to post-process any ATM classifier to extract an optimal actionable plan that can change a given input to a desired class with a minimum cost. In particular, we prove the NP-hardness of the optimal action extraction problem for ATMs and formulate this problem in an integer linear programming formulation which can be efficiently solved by existing packages. We also empirically demonstrate the effectiveness of the proposed framework by conducting comprehensive experiments on challenging real-world datasets.
AB - Additive tree models (ATMs) are widely used for data mining and machine learning. Important examples of ATMs include random forest, adaboost (with decision trees as weak learners), and gradient boosted trees, and they are often referred to as the best off-the-shelf classifiers. Though capable of attaining high accuracy, ATMs are not well interpretable in the sense that they do not provide actionable knowledge for a given instance. This greatly limits the potential of ATMs on many applications such as medical prediction and business intelligence, where practitioners need suggestions on actions that can lead to desirable outcomes with minimum costs. To address this problem, we present a novel framework to post-process any ATM classifier to extract an optimal actionable plan that can change a given input to a desired class with a minimum cost. In particular, we prove the NP-hardness of the optimal action extraction problem for ATMs and formulate this problem in an integer linear programming formulation which can be efficiently solved by existing packages. We also empirically demonstrate the effectiveness of the proposed framework by conducting comprehensive experiments on challenging real-world datasets.
KW - Actionable knowledge
KW - Additive tree model
KW - Decision tree
KW - Integer linear programming
KW - Random forest
UR - https://www.scopus.com/pages/publications/84954145903
U2 - 10.1145/2783258.2783281
DO - 10.1145/2783258.2783281
M3 - Conference contribution
AN - SCOPUS:84954145903
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 179
EP - 188
BT - KDD 2015 - Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
T2 - 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2015
Y2 - 10 August 2015 through 13 August 2015
ER -