Handling Missing Data in Decision Trees: A Probabilistic Approach

by Pasha Khosravi, Antonio Vergari, YooJung Choi, Yitao Liang and Guy Van den Broeck
Abstract:
Decision trees are a popular family of models due to attractive properties such as interpretability and the ability to handle heterogeneous data. Concurrently, missing data is a prevalent occurrence that hinders the performance of machine learning models. As such, handling missing data in decision trees is a well-studied problem. In this paper, we tackle this problem by taking a probabilistic approach. At deployment time, we use tractable density estimators to compute the "expected prediction" of our models. At learning time, we fine-tune parameters of already learned trees by minimizing their "expected prediction loss" w.r.t. our density estimators. We provide brief experiments showcasing the effectiveness of our methods compared to a few baselines.
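As a rough illustration of the "expected prediction" idea, the sketch below averages a small decision tree's output over the missing features. It is not the paper's implementation: it assumes binary features and a fully factorized Bernoulli distribution in place of the tractable density estimators the abstract mentions, and the names `Leaf`, `Node`, `expected_prediction` and `p_one` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    value: float  # prediction stored at this leaf


@dataclass
class Node:
    feature: str   # name of the binary feature tested at this node
    left: "Tree"   # subtree taken when the feature is 0
    right: "Tree"  # subtree taken when the feature is 1


Tree = Union[Leaf, Node]


def expected_prediction(tree: Tree, observed: Dict[str, int],
                        p_one: Dict[str, float]) -> float:
    """Average the tree's output over missing features.

    Assumption (not from the paper): features are independent and
    p_one[f] is the probability that feature f equals 1.
    """
    if isinstance(tree, Leaf):
        return tree.value
    value = observed.get(tree.feature)
    if value is not None:
        # Feature is observed: follow the single matching branch.
        child = tree.right if value == 1 else tree.left
        return expected_prediction(child, observed, p_one)
    # Feature is missing: weight both branches by their probability,
    # fixing the feature's value below so repeated tests stay consistent.
    p = p_one[tree.feature]
    left = expected_prediction(tree.left, {**observed, tree.feature: 0}, p_one)
    right = expected_prediction(tree.right, {**observed, tree.feature: 1}, p_one)
    return (1 - p) * left + p * right


# Toy tree: test "a" first, then "b" on the right branch.
tree = Node("a", Leaf(0.0), Node("b", Leaf(1.0), Leaf(3.0)))
p_one = {"a": 0.5, "b": 0.25}

print(expected_prediction(tree, {}, p_one))        # both features missing
print(expected_prediction(tree, {"a": 1}, p_one))  # only "b" missing
```

With a richer density estimator, the branch weights would be conditional probabilities given the observed features rather than fixed marginals.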
Reference:
Pasha Khosravi, Antonio Vergari, YooJung Choi, Yitao Liang and Guy Van den Broeck. Handling Missing Data in Decision Trees: A Probabilistic Approach, In The Art of Learning with Missing Values Workshop at ICML (Artemiss), 2020.
Bibtex Entry:
@inproceedings{KhosraviArtemiss20,
  author    = {Khosravi, Pasha and Vergari, Antonio and Choi, YooJung and Liang, Yitao and Van den Broeck, Guy},
  title     = {Handling Missing Data in Decision Trees: A Probabilistic Approach},
  booktitle = {The Art of Learning with Missing Values Workshop at ICML (Artemiss)},
  month     = 7,
  year      = {2020},
  url       = {http://starai.cs.ucla.edu/papers/KhosraviArtemiss20.pdf},
  keywords  = {workshop}
}