dc.contributor |
Aalto-yliopisto |
fi |
dc.contributor |
Aalto University |
en |
dc.contributor.author |
Magnusson, Måns |
|
dc.contributor.author |
Jonsson, Leif |
|
dc.contributor.author |
Villani, Mattias |
|
dc.date.accessioned |
2019-07-30T07:15:48Z |
|
dc.date.available |
2019-07-30T07:15:48Z |
|
dc.date.issued |
2020-03-01 |
|
dc.identifier.citation |
Magnusson, M, Jonsson, L & Villani, M 2020, 'DOLDA: a regularized supervised topic model for high-dimensional multi-class regression', Computational Statistics, vol. 35, no. 1, pp. 175-201. https://doi.org/10.1007/s00180-019-00891-1 |
en |
dc.identifier.issn |
0943-4062 |
|
dc.identifier.issn |
1613-9658 |
|
dc.identifier.other |
PURE UUID: 3eb16b6c-aa7b-4183-b09d-cc5016c54f65 |
|
dc.identifier.other |
PURE ITEMURL: https://research.aalto.fi/en/publications/3eb16b6c-aa7b-4183-b09d-cc5016c54f65 |
|
dc.identifier.other |
PURE LINK: http://www.scopus.com/inward/record.url?scp=85067414496&partnerID=8YFLogxK |
|
dc.identifier.other |
PURE FILEURL: https://research.aalto.fi/files/35124979/Magnusson2019_Article_DOLDAARegularizedSupervisedTop.pdf |
|
dc.identifier.uri |
https://aaltodoc.aalto.fi/handle/123456789/39416 |
|
dc.description.abstract |
Generating user interpretable multi-class predictions in data-rich environments with many classes and explanatory covariates is a daunting task. We introduce Diagonal Orthant Latent Dirichlet Allocation (DOLDA), a supervised topic model for multi-class classification that can handle many classes as well as many covariates. To handle many classes we use the recently proposed Diagonal Orthant probit model (Johndrow et al., in: Proceedings of the sixteenth international conference on artificial intelligence and statistics, 2013) together with an efficient Horseshoe prior for variable selection/shrinkage (Carvalho et al. in Biometrika 97:465–480, 2010). We propose a computationally efficient parallel Gibbs sampler for the new model. An important advantage of DOLDA is that learned topics are directly connected to individual classes without the need for a reference class. We evaluate the model’s predictive accuracy and scalability, and demonstrate DOLDA’s advantage in interpreting the generated predictions. |
en |
dc.format.extent |
27 |
|
dc.format.mimetype |
application/pdf |
|
dc.language.iso |
en |
en |
dc.publisher |
Springer Verlag |
|
dc.relation.ispartofseries |
Computational Statistics |
en |
dc.rights |
openAccess |
en |
dc.title |
DOLDA: a regularized supervised topic model for high-dimensional multi-class regression |
en |
dc.type |
A1 Original article in a scientific journal |
en |
dc.description.version |
Peer reviewed |
en |
dc.contributor.department |
Department of Computer Science |
|
dc.contributor.department |
Ericsson AB |
|
dc.contributor.department |
Linköping University |
|
dc.subject.keyword |
Diagonal Orthant probit model |
|
dc.subject.keyword |
Horseshoe prior |
|
dc.subject.keyword |
Interpretable models |
|
dc.subject.keyword |
Latent Dirichlet Allocation |
|
dc.subject.keyword |
Text classification |
|
dc.identifier.urn |
URN:NBN:fi:aalto-201907304471 |
|
dc.identifier.doi |
10.1007/s00180-019-00891-1 |
|
dc.type.version |
publishedVersion |
|