Concise and interpretable multi-label rule sets

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Ciaperoni, Martino
dc.contributor.author: Xiao, Han
dc.contributor.author: Gionis, Aristides
dc.contributor.department: Department of Computer Science
dc.contributor.department: Helsinki Institute for Information Technology (HIIT)
dc.contributor.department: KTH Royal Institute of Technology
dc.contributor.department: Department of Computer Science [en]
dc.date.accessioned: 2023-10-11T09:36:46Z
dc.date.available: 2023-10-11T09:36:46Z
dc.date.issued: 2023-12
dc.description: Funding Information: This research is supported by the Academy of Finland project MLDB (325117), the ERC Advanced Grant REBOUND (834862), the EC H2020 RIA project SoBigData++ (871042), and the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation. | openaire: EC/H2020/871042/EU//SoBigData-PlusPlus
dc.description.abstract: Multi-label classification is becoming increasingly ubiquitous, but not much attention has been paid to interpretability. In this paper, we develop a multi-label classifier that can be represented as a concise set of simple “if-then” rules, and thus, it offers better interpretability compared to black-box models. Notably, our method is able to find a small set of relevant patterns that lead to accurate multi-label classification, while existing rule-based classifiers are myopic and wasteful in searching rules, requiring a large number of rules to achieve high accuracy. In particular, we formulate the problem of choosing multi-label rules to maximize a target function, which considers not only discrimination ability with respect to labels, but also diversity. Accounting for diversity helps to avoid redundancy, and thus, to control the number of rules in the solution set. To tackle the said maximization problem, we propose a 2-approximation algorithm, which circumvents the exponential-size search space of rules using a novel technique to sample highly discriminative and diverse rules. In addition to our theoretical analysis, we provide a thorough experimental evaluation and a case study, which indicate that our approach offers a trade-off between predictive performance and interpretability that is unmatched in previous work. [en]
dc.description.version: Peer reviewed [en]
dc.format.extent: 38
dc.format.extent: 5657-5694
dc.format.mimetype: application/pdf
dc.identifier.citation: Ciaperoni, M, Xiao, H & Gionis, A 2023, 'Concise and interpretable multi-label rule sets', Knowledge and Information Systems, vol. 65, no. 12, pp. 5657-5694. https://doi.org/10.1007/s10115-023-01930-6 [en]
dc.identifier.doi: 10.1007/s10115-023-01930-6
dc.identifier.issn: 0219-1377
dc.identifier.issn: 0219-3116
dc.identifier.other: PURE UUID: c226c10c-93da-4f3e-850c-22a9dbba436d
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/c226c10c-93da-4f3e-850c-22a9dbba436d
dc.identifier.other: PURE LINK: http://www.scopus.com/inward/record.url?scp=85166017990&partnerID=8YFLogxK
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/124159138/Concise_and_interpretable_multi_label_rule_sets.pdf
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/123939
dc.identifier.urn: URN:NBN:fi:aalto-202310116286
dc.language.iso: en [en]
dc.publisher: Springer
dc.relation: info:eu-repo/grantAgreement/EC/H2020/871042/EU//SoBigData-PlusPlus
dc.relation.ispartofseries: Knowledge and Information Systems [en]
dc.relation.ispartofseries: Volume 65, issue 12 [en]
dc.rights: openAccess [en]
dc.subject.keyword: Interpretable machine learning
dc.subject.keyword: Multi-label classification
dc.subject.keyword: Rule sampling
dc.subject.keyword: Rule-based classification
dc.title: Concise and interpretable multi-label rule sets [en]
dc.type: A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä [fi]
dc.type.version: publishedVersion