Hardness of Maximum Likelihood Learning of DPPs

Elena Grigorescu, Brendan Juba, Karl Wimmer, Ning Xie

Research output: Contribution to journal › Conference article › peer-review


Abstract

Determinantal Point Processes (DPPs) are a widely used probabilistic model for negatively correlated sets. DPPs have been successfully employed in Machine Learning applications to select a diverse, yet representative subset of data. In these applications, the parameters of the DPP need to be fitted to match the data; typically, we seek a set of parameters that maximize the likelihood of the data. The algorithms used for this task to date either optimize over a limited family of DPPs, or use local improvement heuristics that do not provide theoretical guarantees of optimality. It is natural to ask if there exist efficient algorithms for finding a maximum likelihood DPP model for a given data set. In seminal work on DPPs in Machine Learning, Kulesza conjectured in his PhD Thesis (2012) that the problem is NP-complete. The lack of a formal proof prompted Brunel, Moitra, Rigollet and Urschel (2017a) to conjecture that, in opposition to Kulesza’s conjecture, there exists a polynomial-time algorithm for computing a maximum-likelihood DPP. They also presented some preliminary evidence supporting their conjecture. In this work we prove Kulesza’s conjecture. In fact, we prove the following stronger hardness of approximation result: even computing a (1 − 1/poly(log N))-approximation to the maximum log-likelihood of a DPP on a ground set of N elements is NP-complete. At the same time, we also obtain the first polynomial-time algorithm that achieves a nontrivial worst-case approximation to the optimal log-likelihood: the approximation factor is 1/((1 + o(1)) log m) unconditionally (for data sets that consist of m subsets), and can be improved to 1 − (1 + o(1))/log N if all N elements appear in an O(1/N)-fraction of the subsets. In terms of techniques, we reduce approximating the maximum log-likelihood of DPPs on a data set to solving a gap instance of a “vector coloring” problem on a hypergraph. Such a hypergraph is built on a bounded-degree graph construction of Bogdanov, Obata and Trevisan (2002), and is further enhanced by the strong expanders of Alon and Capalbo (2007) to serve our purposes.
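For context, the objective whose maximization the abstract proves hard can be written down concretely. The sketch below (not from the paper) evaluates the log-likelihood of a data set under the standard L-ensemble parameterization, where Pr[Y] = det(L_Y) / det(L + I) for a symmetric PSD kernel L; the function name and example data are illustrative assumptions.

```python
import numpy as np

def dpp_log_likelihood(L, subsets):
    """Log-likelihood of m observed subsets under an L-ensemble DPP.

    For a symmetric PSD kernel L on a ground set of N elements,
    Pr[Y] = det(L_Y) / det(L + I), so the log-likelihood of subsets
    Y_1, ..., Y_m is  sum_i log det(L_{Y_i}) - m * log det(L + I).
    """
    N = L.shape[0]
    _, logdet_norm = np.linalg.slogdet(L + np.eye(N))  # log det(L + I)
    ll = -len(subsets) * logdet_norm
    for Y in subsets:
        idx = np.asarray(sorted(Y))
        _, logdet_Y = np.linalg.slogdet(L[np.ix_(idx, idx)])  # log det(L_Y)
        ll += logdet_Y
    return ll

# Illustrative usage: a random PSD kernel on N = 4 elements, m = 3 subsets.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
L = A @ A.T  # symmetric PSD by construction
data = [{0, 1}, {2}, {1, 3}]
print(dpp_log_likelihood(L, data))
```

Maximum likelihood learning asks for the PSD kernel L maximizing this quantity; the paper's hardness result shows that even approximating the optimal log-likelihood within a factor 1 − 1/poly(log N) is NP-complete.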

Original language: English
Pages (from-to): 3800-3819
Number of pages: 20
Journal: Proceedings of Machine Learning Research
Volume: 178
State: Published - 2022
Event: 35th Conference on Learning Theory, COLT 2022 - London, United Kingdom
Duration: Jul 2, 2022 – Jul 5, 2022
