Abstract
We consider the following conditional linear regression problem: the task is to identify both (i) a k-DNF condition c and (ii) a linear rule f such that the probability of c is (approximately) at least some given bound µ, and f minimizes the ℓ_p loss of predicting the target z in the distribution of examples conditioned on c. Thus, the task is to identify a portion of the distribution on which a linear rule can provide a good fit. Algorithms for this task are useful in cases where simple, learnable rules only accurately model portions of the distribution. The prior state of the art for such algorithms could only guarantee finding a condition of probability Ω(µ/n^k) when a condition of probability µ exists, and achieved an O(n^k)-approximation to the target loss, where n is the number of Boolean attributes. Here, we give efficient algorithms for solving this task with a condition c that nearly matches the probability of the ideal condition, while also improving the approximation to the target loss. We also give an algorithm for finding a k-DNF reference class for prediction at a given query point; it obtains a sparse regression fit with loss within O(n^k) of optimal among all sparse regression parameters and sufficiently large k-DNF reference classes containing the query point.
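As a concrete illustration of the objective (not of the paper's algorithms, whose contribution is efficiently searching the exponentially large space of k-DNF conditions), the sketch below scores a single candidate condition: it checks the empirical coverage constraint Pr(c) ≥ µ and fits a linear rule on the conditioned examples. The function name `conditional_fit`, the literal encoding, and the use of least squares for the p = 2 case are illustrative assumptions, not the paper's method.

```python
import numpy as np

def conditional_fit(X_bool, X_real, z, condition, mu, p=2):
    """Score one candidate condition for conditional linear regression.

    condition: a k-DNF over the Boolean attributes, encoded (hypothetically)
    as a list of terms, each term a list of (attribute_index, polarity)
    literals with polarity in {0, 1}.
    Returns (coefficients, empirical l_p loss) for a linear rule fit on the
    conditioned examples, or None if the condition covers < mu of the sample.
    """
    # An example satisfies a k-DNF if it satisfies at least one term,
    # i.e. every literal of that term.
    covered = np.zeros(len(X_bool), dtype=bool)
    for term in condition:
        sat = np.ones(len(X_bool), dtype=bool)
        for attr, polarity in term:
            sat &= (X_bool[:, attr] == polarity)
        covered |= sat

    # Coverage constraint: the (empirical) probability of c must be >= mu.
    if covered.mean() < mu:
        return None

    # Fit a linear rule f on the conditional distribution. This sketch uses
    # ordinary least squares (the p = 2 case); the paper treats general l_p.
    A = np.column_stack([X_real[covered], np.ones(covered.sum())])
    coef, *_ = np.linalg.lstsq(A, z[covered], rcond=None)
    residuals = A @ coef - z[covered]
    loss = np.mean(np.abs(residuals) ** p) ** (1.0 / p)
    return coef, loss
```

The guarantees in the abstract concern how close the returned condition's probability and conditional loss come to those of the best (c, f) pair; a scorer like this would only be the inner step of such a search.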
Original language | English |
---|---|
State | Published - 2020 |
Event | 22nd International Conference on Artificial Intelligence and Statistics, AISTATS 2019, Naha, Japan; Duration: Apr 16 2019 → Apr 18 2019 |
Conference
Conference | 22nd International Conference on Artificial Intelligence and Statistics, AISTATS 2019 |
---|---|
Country/Territory | Japan |
City | Naha |
Period | 04/16/19 → 04/18/19 |