Handling Missing Data in Decision Trees: A Probabilistic Approach
by Pasha Khosravi, Antonio Vergari, YooJung Choi, Yitao Liang and Guy Van den Broeck
Abstract:
Decision trees are a popular family of models due to their attractive properties, such as interpretability and the ability to handle heterogeneous data. Concurrently, missing data is a prevalent occurrence that hinders the performance of machine learning models. As such, handling missing data in decision trees is a well-studied problem. In this paper, we tackle this problem by taking a probabilistic approach. At deployment time, we use tractable density estimators to compute the "expected prediction" of our models. At learning time, we fine-tune the parameters of already learned trees by minimizing their "expected prediction loss" with respect to our density estimators. We provide brief experiments showcasing the effectiveness of our methods compared to a few baselines.
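To make the "expected prediction" idea concrete, here is a minimal Python sketch. It is not the authors' implementation: it assumes binary features and swaps the paper's tractable density estimators for a much simpler fully factorized distribution, in which each feature is 1 with an independent probability. The names `Leaf`, `Node`, `expected_prediction`, and `p_true` are hypothetical and chosen only for illustration. The function averages the tree's output over all completions of the missing features, weighted by their probability.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    value: float  # prediction stored at this leaf


@dataclass
class Node:
    feature: str      # name of the binary feature tested at this node
    if_false: "Tree"  # subtree followed when the feature is 0
    if_true: "Tree"   # subtree followed when the feature is 1


Tree = Union[Leaf, Node]


def expected_prediction(tree: Tree,
                        observed: Dict[str, int],
                        p_true: Dict[str, float]) -> float:
    """Expected tree output given the observed features.

    ASSUMPTION: features are independent, and p_true[f] is the
    probability that feature f equals 1. Features absent from
    `observed` (or mapped to None) are treated as missing.
    """
    if isinstance(tree, Leaf):
        return tree.value

    value = observed.get(tree.feature)
    if value is not None:
        # Feature is observed: follow the single matching branch.
        branch = tree.if_true if value else tree.if_false
        return expected_prediction(branch, observed, p_true)

    # Feature is missing: weight both branches by their probability.
    # The chosen value is recorded so that any later test on the same
    # feature along this path stays consistent.
    p = p_true[tree.feature]
    on_true = expected_prediction(
        tree.if_true, {**observed, tree.feature: 1}, p_true)
    on_false = expected_prediction(
        tree.if_false, {**observed, tree.feature: 0}, p_true)
    return p * on_true + (1.0 - p) * on_false


# Hypothetical toy tree and feature marginals, for illustration only.
toy_tree = Node("x1", Leaf(0.0), Node("x2", Leaf(0.2), Leaf(1.0)))
toy_marginals = {"x1": 0.5, "x2": 0.25}

# x1 is observed to be 1, x2 is missing.
partial = expected_prediction(toy_tree, {"x1": 1}, toy_marginals)
# Every feature is missing.
all_missing = expected_prediction(toy_tree, {}, toy_marginals)
```

With `x1 = 1` observed and `x2` missing, the result is 0.25 · 1.0 + 0.75 · 0.2 = 0.4; with both features missing it is 0.5 · 0.4 + 0.5 · 0.0 = 0.2. A richer density estimator would replace the independent marginals with conditional probabilities given the observed features, which this sketch does not attempt.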
Reference:
Pasha Khosravi, Antonio Vergari, YooJung Choi, Yitao Liang and Guy Van den Broeck. Handling Missing Data in Decision Trees: A Probabilistic Approach. In The Art of Learning with Missing Values Workshop at ICML (Artemiss), 2020.
Bibtex Entry:
@inproceedings{KhosraviArtemiss20,
author = {Khosravi, Pasha and Vergari, Antonio and Choi, YooJung and Liang, Yitao and Van den Broeck, Guy},
title = {Handling Missing Data in Decision Trees: A Probabilistic Approach},
booktitle = {The Art of Learning with Missing Values Workshop at ICML (Artemiss)},
month = 7,
year = {2020},
url = {http://starai.cs.ucla.edu/papers/KhosraviArtemiss20.pdf},
keywords = {workshop}
}