Query-Augmented Active Metric Learning

In this article, we propose an active metric learning method for clustering with pairwise constraints. The proposed method actively queries the labels of informative instance pairs while estimating the underlying metric by incorporating unlabeled instance pairs, which leads to a more accurate and efficient clustering process. In particular, we augment the queried constraints by generating additional pairwise labels, which provide more information for learning a metric and thereby enhance clustering performance. Furthermore, we increase the robustness of metric learning by updating the learned metric sequentially and penalizing the irrelevant features adaptively. In addition, we propose a novel active query strategy that evaluates the information gain of instance pairs more accurately by incorporating the neighborhood structure, which improves clustering efficiency without extra labeling cost. In theory, we provide a tighter error bound for the proposed metric learning method using augmented queries compared with methods using existing constraints only. We also investigate the improvement gained by using the active query strategy instead of random selection. Numerical studies on simulation settings and real datasets indicate that the proposed method is especially advantageous when the signal-to-noise ratio between significant features and irrelevant features is low.
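
The abstract does not say how the additional pairwise labels are generated. As a rough illustration only, and not necessarily the authors' procedure, one standard way to obtain more pairwise labels from a few queried ones is transitive closure: instances connected by must-link answers form groups, and a cannot-link answer between two groups extends to every cross-group pair. The function name `augment_constraints` and its arguments are hypothetical and chosen for this sketch.

```python
from itertools import combinations


def augment_constraints(n, must_link, cannot_link):
    """Return (must, cannot) pair sets implied by transitivity.

    Illustrative assumption: this is generic transitive closure,
    not the augmentation scheme of the article itself.

    n: number of instances, indexed 0..n-1
    must_link / cannot_link: iterables of queried (i, j) pairs
    """
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge instances joined by must-link constraints.
    for i, j in must_link:
        parent[find(i)] = find(j)

    # Collect the members of each must-link group (in ascending order).
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)

    # Every pair inside a group is an implied must-link.
    must = {pair for g in groups.values() for pair in combinations(g, 2)}

    # A cannot-link between two groups extends to all cross-group pairs.
    cannot = set()
    for i, j in cannot_link:
        ri, rj = find(i), find(j)
        if ri == rj:
            raise ValueError(f"inconsistent constraints on pair ({i}, {j})")
        for a in groups[ri]:
            for b in groups[rj]:
                cannot.add((min(a, b), max(a, b)))
    return must, cannot


must, cannot = augment_constraints(5, [(0, 1), (1, 2)], [(2, 3)])
```

With the two must-link answers (0, 1) and (1, 2) and the single cannot-link answer (2, 3), the call above infers the extra must-link (0, 2) and the extra cannot-links (0, 3) and (1, 3) without any further queries.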

This is an Accepted Manuscript of an article published by Taylor & Francis in Journal of the American Statistical Association on 2023-07-03, available online: https://www.tandfonline.com/10.1080/01621459.2021.2019045.

Metadata

Work Title: Query-Augmented Active Metric Learning
Access: Open Access
Creators
  1. Yujia Deng
  2. Yubai Yuan
  3. Haoda Fu
  4. Annie Qu
Keyword
  1. Active learning
  2. Metric learning
  3. Selective penalty
  4. Semi-supervised clustering
License: CC BY-NC 4.0 (Attribution-NonCommercial)
Work Type: Article
Publisher
  1. Journal of the American Statistical Association
Publication Date: January 28, 2022
Publisher Identifier (DOI)
  1. https://doi.org/10.1080/01621459.2021.2019045
Deposited: November 22, 2023

Work History

Version 1
published

  • Created
  • Added manuscript_unblinded-1.pdf
  • Added Creator Yujia Deng
  • Added Creator Yubai Yuan
  • Added Creator Haoda Fu
  • Added Creator Annie Qu
  • Published
  • Updated Keyword: Active learning, Metric learning, Selective penalty, Semi-supervised clustering
  • Updated