Convex and non-convex approaches for statistical inference with class-conditional noisy labels
We study the problem of estimation and testing in logistic regression with class-conditional noise in the observed labels, which has important implications for the Positive-Unlabeled (PU) learning setting. With the key observation that the label-noise problem belongs to a special sub-class of generalized linear models (GLM), we discuss convex and non-convex approaches that address this problem. A non-convex approach based on maximum likelihood estimation produces an estimator with several optimal properties, but a convex approach has a clear advantage in optimization. We demonstrate that in the low-dimensional setting, both estimators are consistent and asymptotically normal, where the asymptotic variance of the non-convex estimator is smaller than that of its convex counterpart. We also quantify the efficiency gap, which provides insight into when the two methods are comparable. In the high-dimensional setting, we show that both estimation procedures achieve ℓ2-consistency at the minimax-optimal √(s log p / n) rate under mild conditions. Finally, we propose an inference procedure using a de-biasing approach. We validate our theoretical findings through simulations and a real-data example.
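As an illustrative aside, the class-conditional noise model described in the abstract can be written down directly: the probability of observing a positive label is a mixture of the clean-label logistic probability and the two flip rates, and maximizing the resulting likelihood in the regression coefficients is non-convex. The sketch below is not the authors' implementation; the known flip probabilities `rho0` and `rho1`, the function names, and the choice of BFGS with random restarts are all assumptions made for illustration.

```python
# Minimal sketch (not the paper's code): non-convex MLE for logistic regression
# with class-conditional label noise, assuming the flip probabilities
#   rho0 = P(y_obs = 1 | y_true = 0)  and  rho1 = P(y_obs = 0 | y_true = 1)
# are known.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit


def noisy_neg_log_lik(beta, X, y_obs, rho0, rho1):
    """Negative log-likelihood of the observed (noisy) labels.

    P(y_obs = 1 | x) = (1 - rho1) * sigmoid(x @ beta) + rho0 * (1 - sigmoid(x @ beta))
    """
    p_clean = expit(X @ beta)                          # P(y_true = 1 | x)
    p_noisy = (1 - rho1) * p_clean + rho0 * (1 - p_clean)
    eps = 1e-12                                        # guard against log(0)
    return -np.mean(y_obs * np.log(p_noisy + eps)
                    + (1 - y_obs) * np.log(1 - p_noisy + eps))


def fit_noisy_logistic(X, y_obs, rho0, rho1, n_restarts=5, seed=0):
    """Fit by direct minimization; random restarts because the objective is non-convex."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_restarts):
        beta0 = rng.normal(scale=0.1, size=X.shape[1])
        res = minimize(noisy_neg_log_lik, beta0,
                       args=(X, y_obs, rho0, rho1), method="BFGS")
        if best is None or res.fun < best.fun:
            best = res
    return best.x
```

Under the same assumptions, the convex alternative discussed in the abstract would replace this objective with a convex surrogate, and the high-dimensional variant would add an ℓ1 penalty to the chosen objective; the fitting template stays the same.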
Metadata
Field | Value |
---|---|
Work Title | Convex and non-convex approaches for statistical inference with class-conditional noisy labels |
Access | |
Creators | |
License | CC BY 4.0 (Attribution) |
Work Type | Article |
Publisher | |
Publication Date | August 1, 2020 |
Publisher Identifier (DOI) | |
Deposited | April 28, 2025 |