Wolfgang Gatterbauer, Stephan Günnemann, Danai Koutra, Christos Faloutsos
PVLDB 8(5):581-592, 2015.
selection [paper (VLDB)], [slides (2MB)], [narrated slides (32MB), annotations only work on Windows], [video (21min)], [Python code on Github], [SQL code on Github], [bib]
Full 18 page version with all proofs (arXiv:1406.7288): [paper (arXiv:1406.7288)], (Version Oct 2014)
Project page: SSL-H
How can we tell when accounts are fake or real in a social network? And how can we tell which accounts belong to liberal, conservative or centrist users? Often, we can answer such questions and label nodes in a network based on the labels of their neighbors and appropriate assumptions of homophily (“birds of a feather flock together”) or heterophily (“opposites attract”). One of the most widely used methods for this kind of inference is Belief Propagation (BP) which iteratively propagates the information from a few nodes with explicit labels throughout a network until convergence. A well-known problem with BP, however, is that there are no known exact guarantees of convergence in graphs with loops. This paper introduces Linearized Belief Propagation (LinBP), a linearization of BP that allows a closed-form solution via intuitive matrix equations and, thus, comes with exact convergence guarantees. It handles homophily, heterophily, and more general cases that arise in multi-class settings. Plus, it allows a compact implementation in SQL. The paper also introduces Single-pass Belief Propagation (SBP), a localized (or “myopic”) version of LinBP that propagates information across every edge at most once and for which the final class assignments depend only on the nearest labeled neighbors. In addition, SBP allows fast incremental updates in dynamic networks. Our runtime experiments show that LinBP and SBP are orders of magnitude faster than standard BP, while leading to almost identical node labels.