Fixed Parameter Tractable Algorithm and Coreset for the Ordered k-Median problem

Loading...
Thumbnail Image

URL

Journal Title

Journal ISSN

Volume Title

Perustieteiden korkeakoulu | Master's thesis

Date

2023-03-20

Department

Major/Subject

Computer Science

Mcode

SCI3042

Degree programme

Master’s Programme in Computer, Communication and Information Sciences

Language

en

Pages

66 + 0

Series

Abstract

Clustering problems are among the most central problems that arise in many research areas, such as machine learning, data mining, and algorithms. The goal of such problems is to partition data points into k clusters to optimize certain objectives. Classical clustering objectives (k-Means, k-Median, ℓ-centrum) have received significant attention for decades from researchers and practitioners. More recent applications require us to handle much more complex clustering objectives and constraints (e.g., fairness, diversity, and ordering), which are much less understood. This thesis presents new algorithmic results for the Ordered k-Median problem, a generalization of k-Median and ℓ-centrum. Informally, Ordered k-Median addresses the problem of finding the optimal set of centers minimizing the weighted sum of distances of clients to the closest center. A weight vector is introduced to make the solution balanced. Effectively, clients served with higher connection costs contribute with larger weights. Apart from designing the algorithm for solving the problem, our interest points toward corests. Coreset is a highly-applicable data-reduction technique, transforming points into a small (weighted) points set, such that for every set of potential centers, the objective value is approximated within 1 ± ϵ factor. Algorithm We introduce the algorithm for Ordered k-Median, which yields a constant approximation ratio. Our ratio is significantly better than known in the literature [1, 2]. Also, this thesis proposes a novel approach to analyze the ratio t between the smallest and the most significant weight in the weight vector. We propose min(1 + 2/(1 + t), (e + 2)/(t e)) + ϵ approximation algorithm running in Ąxed-parameters tractable algorithm (ignoring the impact of some variables) for problem restricted to Lp norm. Without considering parameter t, the algorithm yields 3 + ϵ approximation factor. The crucial element of the algorithm is the coreset construction. Coreset The coresets for k-Median (less general problem than Ordered k-Median) are already well-studied. The recent results almost match the size lower bounds for both Euclidean Ω( k/ϵ2 ) and general metric space Ω(k log n/ϵ ) [3], where n is a size of the data set. Notably, the size does not become exponentially dependent on the number of dimensions. However, for Ordered k-median, the size needs to be at least Ω(k/ϵd ) [4]. The only known results in this area introduced coreset of size O(k2 log2 n/ϵd+3 ) for Euclidean space with L2 norm [5]. Extending the previous work, we propose a strong and simultaneous (correct for all vectors) coreset construction for Ordered k-Median. We present results for the Euclidean space with the L2 norm, which matches the previous construction size for L2. Moreover, we analyze the impact of restringing the weights vector with the lower bound for the ratio between the smallest and most significant factors. That relaxed versatility allows for parameterizing the coreset size and smooth interpolation between k-Median and unrestricted Ordered k-Median. Our results serve as a vital part in a fixed-parameter tractable algorithm for the Euclidean Ordered k-Median problem. Eventually, the hardness analysis concludes the thesis. We propose further research direction and highlight the limitations we encounter.

Description

Supervisor

Chalermsook, Parinya

Thesis advisor

Khodamoradi, Kamyar
Spoerhase, Joachim

Keywords

approximation algorithms, ixed-parameter tractability,, clustering, hardness

Other note

Citation