Fixed Parameter Tractable Algorithm and Coreset for the Ordered k-Median problem
Loading...
URL
Journal Title
Journal ISSN
Volume Title
Perustieteiden korkeakoulu |
Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for Your own personal use. Commercial use is prohibited.
Authors
Date
2023-03-20
Department
Major/Subject
Computer Science
Mcode
SCI3042
Degree programme
Master’s Programme in Computer, Communication and Information Sciences
Language
en
Pages
66 + 0
Series
Abstract
Clustering problems are among the most central problems that arise in many research areas, such as machine learning, data mining, and algorithms. The goal of such problems is to partition data points into k clusters to optimize certain objectives. Classical clustering objectives (k-Means, k-Median, ℓ-centrum) have received significant attention for decades from researchers and practitioners. More recent applications require us to handle much more complex clustering objectives and constraints (e.g., fairness, diversity, and ordering), which are much less understood. This thesis presents new algorithmic results for the Ordered k-Median problem, a generalization of k-Median and ℓ-centrum. Informally, Ordered k-Median addresses the problem of finding the optimal set of centers minimizing the weighted sum of distances of clients to the closest center. A weight vector is introduced to make the solution balanced. Effectively, clients served with higher connection costs contribute with larger weights. Apart from designing the algorithm for solving the problem, our interest points toward corests. Coreset is a highly-applicable data-reduction technique, transforming points into a small (weighted) points set, such that for every set of potential centers, the objective value is approximated within 1 ± ϵ factor. Algorithm We introduce the algorithm for Ordered k-Median, which yields a constant approximation ratio. Our ratio is significantly better than known in the literature [1, 2]. Also, this thesis proposes a novel approach to analyze the ratio t between the smallest and the most significant weight in the weight vector. We propose min(1 + 2/(1 + t), (e + 2)/(t e)) + ϵ approximation algorithm running in Ąxed-parameters tractable algorithm (ignoring the impact of some variables) for problem restricted to Lp norm. Without considering parameter t, the algorithm yields 3 + ϵ approximation factor. The crucial element of the algorithm is the coreset construction. Coreset The coresets for k-Median (less general problem than Ordered k-Median) are already well-studied. The recent results almost match the size lower bounds for both Euclidean Ω( k/ϵ2 ) and general metric space Ω(k log n/ϵ ) [3], where n is a size of the data set. Notably, the size does not become exponentially dependent on the number of dimensions. However, for Ordered k-median, the size needs to be at least Ω(k/ϵd ) [4]. The only known results in this area introduced coreset of size O(k2 log2 n/ϵd+3 ) for Euclidean space with L2 norm [5]. Extending the previous work, we propose a strong and simultaneous (correct for all vectors) coreset construction for Ordered k-Median. We present results for the Euclidean space with the L2 norm, which matches the previous construction size for L2. Moreover, we analyze the impact of restringing the weights vector with the lower bound for the ratio between the smallest and most significant factors. That relaxed versatility allows for parameterizing the coreset size and smooth interpolation between k-Median and unrestricted Ordered k-Median. Our results serve as a vital part in a fixed-parameter tractable algorithm for the Euclidean Ordered k-Median problem. Eventually, the hardness analysis concludes the thesis. We propose further research direction and highlight the limitations we encounter.Description
Supervisor
Chalermsook, ParinyaThesis advisor
Khodamoradi, KamyarSpoerhase, Joachim
Keywords
approximation algorithms, ixed-parameter tractability,, clustering, hardness