Using visualization, variable selection and feature extraction to learn from industrial data

dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.author Laine, Sampsa
dc.date.accessioned 2012-02-10T08:54:12Z
dc.date.available 2012-02-10T08:54:12Z
dc.date.issued 2003-09-19
dc.identifier.isbn 951-22-6670-9
dc.identifier.issn 1456-2243
dc.description.abstract Although engineers in industry have access to process data, they seldom use advanced statistical tools to solve process control problems. Why this reluctance? I believe the reason lies in the history of statistical tools, which were developed in an era of rigorous mathematical modelling, manual computation and small data sets. That era produced sophisticated tools. Engineers do not understand the requirements of these algorithms, for example those related to pre-processing of data. If algorithms are fed with unsuitable data, or parameterized poorly, they produce unreliable results, which may lead an engineer to reject statistical analysis in general. This thesis looks for algorithms that probably do not impress the champions of statistics, but serve process engineers. This thesis advocates three properties in an algorithm: supervised operation, robustness and understandability. Supervised operation allows and requires the user to make the goal of the analysis explicit, which allows the algorithm to discover results that are relevant to the user. Robust algorithms allow engineers to analyse raw process data collected from the automation system of the plant. The third property is understandability: the user must understand how to parameterize the model and what the principle of the algorithm is, and must know how to interpret the results. These criteria are justified with theories of human learning. The basis is the theory of constructivism, which defines learning as the construction of mental models. Then I discuss the theories of organisational learning, which show how mental models influence the behaviour of groups of persons. The next level discusses statistical methodologies of data analysis, and binds them to the theories of organisational learning. The last level discusses individual statistical algorithms, and introduces the methodology and the algorithms proposed by this thesis.
This methodology uses three types of algorithms: visualization, variable selection and feature extraction. The goal of the proposed methodology is to provide the user, reliably and understandably, with information related to a problem the user has defined as interesting. The methodology is illustrated by an analysis of an industrial case: the concentrator of the Hitura mine. This case illustrates how to define the problem with off-line laboratory data, and how to search the on-line data for solutions. A major advantage of algorithmic study of data is efficiency: the manual approach reported earlier took approximately six man-months; the automated approach of this thesis produced comparable results in a few weeks. en
dc.format.extent 56, [68]
dc.format.mimetype application/pdf
dc.language.iso en en
dc.publisher Helsinki University of Technology en
dc.publisher Teknillinen korkeakoulu fi
dc.relation.ispartofseries Publications in computer and information science. Report A en
dc.relation.ispartofseries 69 en
dc.relation.haspart Laine S., Lappalainen H. and Jämsä-Jounela S.-L., 1995. On-line determination of ore type using cluster analysis and neural networks. Minerals Engineering 8, No. 6, pages 637-648.
dc.relation.haspart Laine S., 1995. Ore type based Expert System for Hitura Concentrator. Barker I. J. (Ed.), Preprints of the 8th IFAC International Symposium on Automation in Mining, Mineral and Metal Processing. Sun City, 1995, pages 321-327.
dc.relation.haspart Laine S., Pulkkinen K. and Jämsä-Jounela S.-L., 2000. On-line determination of the concentrator feed type at Outokumpu Hitura Mine. Minerals Engineering 13, No. 8-9, pages 881-895.
dc.relation.haspart Laine S., 2001. Combining off-line and on-line information in process study using the Self-Organizing Map (SOM). Embrechts J., VanLandingham H. and Ovaska S. (Eds.), Proceedings of IEEE Mountain Workshop of Soft Computing in Industrial Applications. Blacksburg, USA, 2001, pages 71-76.
dc.relation.haspart Laine S., 2001. State based process study using the SOM and a variable selection technique, in evolving solution with neural networks. Baratti R. and Caete J. (Eds.), Proceedings of International Conference on Engineering Applications of Neural Networks. Cagliari, Italy, 2001, pages 15-22.
dc.relation.haspart Laine S., 2002. Finding the variables of interest. Minerals Engineering 15, No. 3, pages 167-176.
dc.relation.haspart Laine S., 2002. Selecting the variables that train a Self-Organizing Map (SOM) which best separates predefined clusters. ICONIP 2002, Singapore, pages 1961-1966.
dc.relation.haspart Laine S., 2003. Automatic extraction of simple features from process data. ICONIP 2003, Istanbul, Turkey, pages 134-137.
dc.subject.other Computer science en
dc.subject.other Automation en
dc.title Using visualization, variable selection and feature extraction to learn from industrial data en
dc.type G5 Artikkeliväitöskirja fi
dc.description.version reviewed en
dc.contributor.department Department of Computer Science and Engineering en
dc.contributor.department Tietotekniikan osasto fi
dc.subject.keyword human learning en
dc.subject.keyword visualization en
dc.subject.keyword variable selection en
dc.subject.keyword feature selection en
dc.subject.keyword feature extraction en
dc.subject.keyword Self-Organizing Map en
dc.subject.keyword data mining en
dc.subject.keyword statistical analysis en
dc.identifier.urn urn:nbn:fi:tkk-000731
dc.type.dcmitype text en
dc.type.ontasot Väitöskirja (artikkeli) fi
dc.type.ontasot Doctoral dissertation (article-based) en
dc.contributor.lab Laboratory of Computer and Information Science en
dc.contributor.lab Informaatiotekniikan laboratorio fi
