Sparsity-Promoting Bootstrap Method For Large-Scale Data
dc.contributor | Aalto-yliopisto | fi |
dc.contributor | Aalto University | en |
dc.contributor.advisor | Koivunen, Visa | |
dc.contributor.author | Mozafari Majd, Emadaldin | |
dc.contributor.school | Sähkötekniikan korkeakoulu | fi |
dc.contributor.supervisor | Koivunen, Visa | |
dc.date.accessioned | 2017-12-18T11:44:17Z | |
dc.date.available | 2017-12-18T11:44:17Z | |
dc.date.issued | 2017-12-11 | |
dc.description.abstract | Performing statistical inference on massive data sets may not be computationally feasible using conventional statistical inference methodology. In particular, there is a need for methods that scale to the large volume and variability of data. Moreover, the veracity of the inference is crucial, so quantitative information on the statistical correctness of parameter estimates or decisions must be produced. In this thesis, a scalable non-parametric bootstrap method is proposed that operates with a smaller number of distinct data points on multiple disjoint subsets of data. The resampling approach stems from the Bag of Little Bootstraps method and is compatible with distributed storage systems as well as distributed and parallel processing architectures. An iterative reweighted l_1 norm minimization method is used for each bootstrap replica to find a sparse solution for high-dimensional data sets. By exploiting sparseness, the proposed method finds reliable estimates even if the problem is not overdetermined for the distinct subsets of data. The performance of the proposed method is studied in extensive simulations. It is demonstrated that the method gives a smaller root mean squared error (RMSE) and significantly lower bias than a bootstrap employing the widely used sparse estimator Basis Pursuit DeNoising. Moreover, better performance is obtained in terms of classification error rate (CER) and recovery rate (RER) in identifying the sparse parameters. The estimated confidence intervals are also highly concentrated around the true parameter values. | en |
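The abstract outlines an estimation pipeline: Bag of Little Bootstraps (BLB) resampling over disjoint subsets, with an iteratively reweighted l_1 norm minimization solved for each bootstrap replica. The sketch below illustrates that general idea only and is not the thesis implementation: the weighted-lasso reweighting scheme, the regularization strength lam, the subset size n ** 0.7, the function names, and all other parameter choices are illustrative assumptions.

```python
# Minimal sketch of BLB resampling combined with an iteratively reweighted
# l1 (weighted-lasso) estimator per bootstrap replica. All parameter choices
# and helper names below are illustrative assumptions, not the thesis code.
import numpy as np
from sklearn.linear_model import Lasso

def reweighted_l1(X, y, sample_weight, n_iter=4, lam=0.1, eps=1e-3):
    """Iteratively reweighted l1: solve a weighted lasso, then update the
    per-coefficient weights w_j = 1 / (|beta_j| + eps) and repeat."""
    p = X.shape[1]
    w = np.ones(p)
    beta = np.zeros(p)
    for _ in range(n_iter):
        Xs = X / w                      # column scaling implements the weighted l1 penalty
        model = Lasso(alpha=lam, fit_intercept=False, max_iter=5000)
        model.fit(Xs, y, sample_weight=sample_weight)
        beta = model.coef_ / w          # map back to the original parameterization
        w = 1.0 / (np.abs(beta) + eps)  # small coefficients are penalized more next round
    return beta

def blb_confidence_intervals(X, y, n_subsets=5, subset_size=None,
                             n_boot=50, alpha=0.05, rng=None):
    """BLB: split the data into disjoint subsets, resample each subset to the
    full sample size via multinomial counts (used as sample weights), estimate
    per replica, and average the per-subset percentile intervals."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    subset_size = subset_size or int(n ** 0.7)   # b = n^0.7, a common BLB choice
    perm = rng.permutation(n)
    cis = []
    for s in range(n_subsets):
        idx = perm[s * subset_size:(s + 1) * subset_size]
        Xs, ys = X[idx], y[idx]
        betas = np.empty((n_boot, p))
        for r in range(n_boot):
            # multinomial counts emulate resampling n points from the b distinct ones
            counts = rng.multinomial(n, np.full(len(idx), 1.0 / len(idx)))
            betas[r] = reweighted_l1(Xs, ys, sample_weight=counts.astype(float))
        lo = np.percentile(betas, 100 * alpha / 2, axis=0)
        hi = np.percentile(betas, 100 * (1 - alpha / 2), axis=0)
        cis.append((lo, hi))
    lo = np.mean([c[0] for c in cis], axis=0)
    hi = np.mean([c[1] for c in cis], axis=0)
    return lo, hi

if __name__ == "__main__":
    # Synthetic sparse regression example (assumed setup, for illustration only).
    rng = np.random.default_rng(0)
    n, p, k = 2000, 50, 5
    beta_true = np.zeros(p)
    beta_true[:k] = 3.0
    X = rng.standard_normal((n, p))
    y = X @ beta_true + 0.5 * rng.standard_normal(n)
    lo, hi = blb_confidence_intervals(X, y)
    print("CIs cover the true nonzero coefficients:",
          np.all((lo[:k] <= 3.0) & (3.0 <= hi[:k])))
```

Because each replica only touches the distinct points of one small subset (weighted by multinomial counts), the per-subset work can be distributed across machines and the percentile intervals combined afterwards, which is the scalability argument the abstract makes.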
dc.ethesisid | Aalto 9699 | |
dc.format.extent | 57 | |
dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/29131 | |
dc.identifier.urn | URN:NBN:fi:aalto-201712187929 | |
dc.language.iso | en | en |
dc.location | P1 | fi |
dc.programme | CCIS - Master’s Programme in Computer, Communication and Information Sciences (TS2013) | fi |
dc.programme.major | Signal, Speech and Language Processing | fi |
dc.programme.mcode | ELEC3031 | fi |
dc.subject.keyword | bag of little bootstraps | en |
dc.subject.keyword | sparsity | en |
dc.subject.keyword | reweighted l_1 norm minimization | en |
dc.subject.keyword | scalable inference | en |
dc.subject.keyword | underdetermined systems | en |
dc.subject.keyword | parameter estimation | en |
dc.title | Sparsity-Promoting Bootstrap Method For Large-Scale Data | en |
dc.type | G2 Pro gradu, diplomityö | fi |
dc.type.ontasot | Master's thesis | en |
dc.type.ontasot | Diplomityö | fi |