Engineering Motif Search for Large Motifs
dc.contributor | Aalto-yliopisto | fi |
dc.contributor | Aalto University | en |
dc.contributor.author | Kaski, Petteri | en_US |
dc.contributor.author | Lauri, Juho | en_US |
dc.contributor.author | Muniyappa, Suhas | en_US |
dc.contributor.department | Department of Computer Science | en |
dc.contributor.editor | D'Angelo, Gianlorenzo | en_US |
dc.contributor.groupauthor | Adj. Prof. Gionis Aris group | en |
dc.contributor.groupauthor | Professorship Kaski Petteri | en |
dc.contributor.groupauthor | Helsinki Institute for Information Technology (HIIT) | en |
dc.contributor.organization | Nokia Bell Labs Finland | en_US |
dc.date.accessioned | 2018-12-10T10:32:33Z | |
dc.date.available | 2018-12-10T10:32:33Z | |
dc.date.issued | 2018 | en_US |
dc.description.abstract | Given a vertex-colored graph H and a multiset M of colors as input, the graph motif problem asks us to decide whether H has a connected induced subgraph whose multiset of colors agrees with M. The graph motif problem is NP-complete but known to admit randomized algorithms based on constrained multilinear sieving over GF(2^b) that run in time O(2^kk^2m {M({2^b})}) and with a false-negative probability of at most k/2^{b-1} for a connected m-edge input and a motif of size k. On modern CPU microarchitectures such algorithms have practical edge-linear scalability to inputs with billions of edges for small motif sizes, as demonstrated by Björklund, Kaski, Kowalik, and Lauri [ALENEX'15]. This scalability to large graphs prompts the dual question whether it is possible to scale to large motif sizes. We present a vertex-localized variant of the constrained multilinear sieve that enables us to obtain, in time O(2^kk^2m{M({2^b})}) and for every vertex simultaneously, whether the vertex participates in at least one match with the motif, with a per-vertex probability of at most k/2^{b-1} for a false negative. Furthermore, the algorithm is easily vector-parallelizable for up to 2^k threads, and parallelizable for up to 2^kn threads, where n is the number of vertices in H. Here {M({2^b})} is the time complexity to multiply in GF(2^b). We demonstrate with an open-source implementation that our variant of constrained multilinear sieving can be engineered for vector-parallel microarchitectures to yield hardware utilization that is bound by the available memory bandwidth. Our main engineering contributions are (a) a version of the recurrence for tightly labeled arborescences that can be executed as a sequence of memory-and-arithmetic coalescent parallel workloads on multiple GPUs, and (b) a bit-sliced low-level implementation for arithmetic in characteristic 2 to support (a). | en |
dc.description.version | Peer reviewed | en |
dc.format.mimetype | application/pdf | en_US |
dc.identifier.citation | Kaski, P, Lauri, J & Muniyappa, S 2018, Engineering Motif Search for Large Motifs. in G D'Angelo (ed.), 17th Symposium on Experimental Algorithms, SEA 2018., 28, Leibniz International Proceedings in Informatics (LIPIcs), vol. 103, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, pp. 1-19, International Symposium on Experimental Algorithms, L'Aquila, Italy, 27/06/2018. https://doi.org/10.4230/LIPIcs.SEA.2018.28 | en |
dc.identifier.doi | 10.4230/LIPIcs.SEA.2018.28 | en_US |
dc.identifier.isbn | 978-3-95977-070-5 | |
dc.identifier.issn | 1868-8969 | |
dc.identifier.other | PURE UUID: e198762a-7b4f-4561-a6a5-fa2b3d704f2a | en_US |
dc.identifier.other | PURE ITEMURL: https://research.aalto.fi/en/publications/e198762a-7b4f-4561-a6a5-fa2b3d704f2a | en_US |
dc.identifier.other | PURE LINK: http://drops.dagstuhl.de/opus/volltexte/2018/8963/ | en_US |
dc.identifier.other | PURE FILEURL: https://research.aalto.fi/files/30110608/LIPIcs_SEA_2018_28.pdf | en_US |
dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/35312 | |
dc.identifier.urn | URN:NBN:fi:aalto-201812106327 | |
dc.language.iso | en | en |
dc.relation.ispartof | International Symposium on Experimental Algorithms | en |
dc.relation.ispartofseries | 17th Symposium on Experimental Algorithms, SEA 2018 | en |
dc.relation.ispartofseries | pp. 1-19 | en |
dc.relation.ispartofseries | Leibniz International Proceedings in Informatics (LIPIcs) ; Volume 103 | en |
dc.rights | openAccess | en |
dc.subject.keyword | algorithm engineering | en_US |
dc.subject.keyword | constrained multilinear sieving | en_US |
dc.subject.keyword | graph motif problem | en_US |
dc.subject.keyword | multi-GPU | en_US |
dc.subject.keyword | vector-parallel | en_US |
dc.subject.keyword | vertex-localization | en_US |
dc.title | Engineering Motif Search for Large Motifs | en |
dc.type | A4 Artikkeli konferenssijulkaisussa | fi |
dc.type.version | publishedVersion |