Engineering Motif Search for Large Motifs
Loading...
Access rights
openAccess
publishedVersion
URL
Journal Title
Journal ISSN
Volume Title
A4 Artikkeli konferenssijulkaisussa
This publication is imported from Aalto University research portal.
View publication in the Research portal (opens in new window)
View/Open full text file from the Research portal (opens in new window)
Other link related to publication (opens in new window)
View publication in the Research portal (opens in new window)
View/Open full text file from the Research portal (opens in new window)
Other link related to publication (opens in new window)
Date
2018
Department
Major/Subject
Mcode
Degree programme
Language
en
Pages
Series
17th Symposium on Experimental Algorithms, SEA 2018, pp. 1-19, Leibniz International Proceedings in Informatics (LIPIcs) ; Volume 103
Abstract
Given a vertex-colored graph H and a multiset M of colors as input, the graph motif problem asks us to decide whether H has a connected induced subgraph whose multiset of colors agrees with M. The graph motif problem is NP-complete but known to admit randomized algorithms based on constrained multilinear sieving over GF(2^b) that run in time O(2^kk^2m {M({2^b})}) and with a false-negative probability of at most k/2^{b-1} for a connected m-edge input and a motif of size k. On modern CPU microarchitectures such algorithms have practical edge-linear scalability to inputs with billions of edges for small motif sizes, as demonstrated by Björklund, Kaski, Kowalik, and Lauri [ALENEX'15]. This scalability to large graphs prompts the dual question whether it is possible to scale to large motif sizes. We present a vertex-localized variant of the constrained multilinear sieve that enables us to obtain, in time O(2^kk^2m{M({2^b})}) and for every vertex simultaneously, whether the vertex participates in at least one match with the motif, with a per-vertex probability of at most k/2^{b-1} for a false negative. Furthermore, the algorithm is easily vector-parallelizable for up to 2^k threads, and parallelizable for up to 2^kn threads, where n is the number of vertices in H. Here {M({2^b})} is the time complexity to multiply in GF(2^b). We demonstrate with an open-source implementation that our variant of constrained multilinear sieving can be engineered for vector-parallel microarchitectures to yield hardware utilization that is bound by the available memory bandwidth. Our main engineering contributions are (a) a version of the recurrence for tightly labeled arborescences that can be executed as a sequence of memory-and-arithmetic coalescent parallel workloads on multiple GPUs, and (b) a bit-sliced low-level implementation for arithmetic in characteristic 2 to support (a).Description
Keywords
algorithm engineering, constrained multilinear sieving, graph motif problem, multi-GPU, vector-parallel, vertex-localization
Other note
Citation
Kaski, P, Lauri, J & Muniyappa, S 2018, Engineering Motif Search for Large Motifs. in G D'Angelo (ed.), 17th Symposium on Experimental Algorithms, SEA 2018., 28, Leibniz International Proceedings in Informatics (LIPIcs), vol. 103, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, pp. 1-19, International Symposium on Experimental Algorithms, L'Aquila, Italy, 27/06/2018. https://doi.org/10.4230/LIPIcs.SEA.2018.28