Lifted Probabilistic Inference in Relational Models

Slides

Download the Complete Tutorial as a single file, or select individual parts:

  • Part 1: Motivation
  • Part 2: Probabilistic Databases
  • Part 3: Weighted Model Counting
  • Part 4: Lifted Inference for WFOMC
  • Part 5: Completeness
  • Part 6: Query Compilation
  • Part 7: Symmetric Complexity
  • Part 8: Open-World Probabilistic Databases
  • Part 9: Conclusions and References

Description

The tutorial will cover probabilistic inference in statistical relational models (SRMs) and probabilistic databases (PDBs). Both are popular relational representations of uncertainty. Both fields have realized that efficient inference is an enormous challenge, but also an immense opportunity for a new kind of probabilistic reasoning. This is referred to in the AI literature as lifted inference, and in the PDB literature as extensional query evaluation. The tutorial will focus on the big ideas that set probabilistic inference with relations apart from more classical inference problems.

There are several recent breakthrough results that for the first time allow us to talk about SRM and PDB inference in a single coherent framework. Within AI, we have

  • the realization that inference in probabilistic graphical models can be reduced to a weighted model counting task, and solved by DPLL search, knowledge compilation to d-DNNF, etc. (a minimal sketch follows this list), and
  • the development of lifted inference algorithms, and our deeper understanding of their theoretical properties, strengths, and limitations. This includes connections to statistical exchangeability, (fractional) automorphisms, symmetry, and weighted model counting of first-order sentences.
Within PDBs, we have
  • a classification of probabilistic inference algorithms into intensional and extensional, corresponding to grounded and lifted inference respectively;
  • a study of the data complexity of probabilistic inference (weighted model counting), where the FO formula is fixed, and one asks for the complexity of the problem as a function of the size of the database (i.e., the set of ground facts);
  • a dichotomy theorem for the data complexity stating that, for any positive, existential FO formula, the weighted model counting problem is either in PTIME or provably #P-hard in the size of the database; and
  • a more refined analysis of the distinction between grounded inference and lifted inference in terms of data complexity, proving that the latter can be exponentially faster.
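
Here is the minimal sketch referred to above: a brute-force weighted model counter in Python, added purely for intuition and not taken from the tutorial material (the function name wmc and the example weights are our own illustrative choices). It enumerates every truth assignment, which is exactly the exponential enumeration that DPLL-style search and knowledge compilation to d-DNNF are designed to avoid.

    from itertools import product

    def wmc(variables, weight, formula):
        """Brute-force weighted model counting: sum the weights of all
        truth assignments (possible worlds) that satisfy the formula,
        where a world's weight is the product of its literal weights."""
        total = 0.0
        for values in product([False, True], repeat=len(variables)):
            world = dict(zip(variables, values))
            if formula(world):
                w = 1.0
                for v in variables:
                    w *= weight[v] if world[v] else 1.0 - weight[v]
                total += w
        return total

    # P(A or B) for independent events with P(A) = 0.3 and P(B) = 0.5
    print(wmc(["A", "B"], {"A": 0.3, "B": 0.5}, lambda w: w["A"] or w["B"]))  # ~0.65
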
Moreover, these two fields have very recently started connecting through the common language of relational logic. We now understand the commonalities and differences between SRMs and PDBs. Typical inference tasks are rather different in nature, yet can be captured in the same weighted model counting framework. Theoretical complexity bounds from one field translate to the other.

The focus of this tutorial is not on any specific lifted inference algorithm; many have been proposed. Instead, we explain why probabilistic inference with relations is different, and why it should be of interest to many IJCAI attendees. We focus on the fundamental ideas and insights that show why lifted inference algorithms work, which properties they exploit, how they can reduce complexity, but also why inference is still hard in the worst case. A second goal of the tutorial is to explain the connections between the concepts used in probabilistic AI and those used in probabilistic databases. As an application of these general ideas, we also briefly discuss approximate lifted inference and lifted machine learning algorithms.
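
To give a flavor of how a lifted (extensional) plan can reduce complexity, the following small Python sketch is our own illustration rather than material from the slides: it evaluates the safe query Q = exists x, y: R(x) and S(x, y) over a tuple-independent probabilistic database by manipulating tuple probabilities directly. A grounded approach would sum over exponentially many possible worlds; this plan runs in time polynomial in the number of tuples. The function name, relation names, and probabilities are made up for the example.

    def prob_exists_RS(R, S):
        """Lifted (extensional) evaluation of Q = exists x, y: R(x) and S(x, y)
        over a tuple-independent probabilistic database.
        R maps a constant a to P(R(a)); S maps a pair (a, b) to P(S(a, b))."""
        p_no_witness = 1.0
        for a, p_r in R.items():
            # Probability that no tuple S(a, b) is present for this a
            p_no_s = 1.0
            for (x, _b), p_s in S.items():
                if x == a:
                    p_no_s *= 1.0 - p_s
            # The events "a witnesses Q" are independent across constants a,
            # because they depend on disjoint sets of tuples
            p_no_witness *= 1.0 - p_r * (1.0 - p_no_s)
        return 1.0 - p_no_witness

    R = {"a1": 0.9, "a2": 0.5}
    S = {("a1", "b1"): 0.8, ("a1", "b2"): 0.4, ("a2", "b1"): 0.7}
    print(prob_exists_RS(R, S))  # ~0.8648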