Semi-supervised learning with heterophily

Wolfgang Gatterbauer
new [working paper (arXiv:1412.3100)], (Version Dec 2014)
Project page: SSL-H

We propose a novel linear semi-supervised learning formulation that is derived from a solid probabilistic framework: belief propagation. We show that our formulation generalizes a number of label propagation algorithms described in the literature by allowing them to propagate generalized assumptions about influences between classes of neighboring nodes. We call this formulation Semi-Supervised Learning with Heterophily (SSL-H). We also show how the modularization matrix can be learned from observed data with a simple convex optimization framework that is inspired by locally linear embedding. We call this approach Linear Heterophily Estimation (LHE). Experiments on synthetic data show that both approaches combined can learn heterophily of a graph with 1M nodes and 10M edges in under 1min.