A Multilabel Classification Framework for Approximate Nearest Neighbor Search

Loading...
Thumbnail Image

Access rights

openAccess
CC BY
publishedVersion

URL

Journal Title

Journal ISSN

Volume Title

A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä

Date

2024-02

Major/Subject

Mcode

Degree programme

Language

en

Pages

51

Series

Journal of Machine Learning Research, Volume 25, issue 46, pp. 1-51

Abstract

To learn partition-based index structures for approximate nearest neighbor (ANN) search, both supervised and unsupervised machine learning algorithms have been used. Existing supervised algorithms select all the points that belong to the same partition element as the query point as nearest neighbor candidates. Consequently, they formulate the learning task as finding a partition in which the nearest neighbors of a query point belong to the same partition element with it as often as possible. In contrast, we formulate the candidate set selection in ANN search directly as a multilabel classification problem where the labels correspond to the nearest neighbors of the query point. In the proposed framework, partition-based index structures are interpreted as partitioning classifiers for solving this classification problem. Empirical results suggest that, when combined with any partitioning strategy, the natural classifier based on the proposed framework leads to a strictly improved performance compared to the earlier candidate set selection methods. We also prove a sufficient condition for the consistency of a partitioning classifier for ANN search, and illustrate the result by verifying this condition for chronological k-d trees and (both dense and sparse) random projection trees.

Description

Keywords

Other note

Citation

Hyvönen, V, Jääsaari, E & Roos, T 2024, ' A Multilabel Classification Framework for Approximate Nearest Neighbor Search ', Journal of Machine Learning Research, vol. 25, no. 46, pp. 1-51 . < https://jmlr.org/papers/v25/23-0286.html >