A top-down approach for creating and implementing data mining solutions
University of Oulu, Faculty of Technology, Department of Electrical and Information Engineering
Online Access: PDF Full Text (PDF, 1.3 MB)
Persistent link: http://urn.fi/urn:isbn:9514281268
Publish Date: 2006-06-13
Thesis type: Doctoral Dissertation
Defence Note: Academic Dissertation to be presented with the assent of the Faculty of Technology, University of Oulu, for public discussion in the Auditorium TS101, Linnanmaa, on June 22nd, 2006, at 12 noon
Professor Heikki Kälviäinen
Professor Heikki Mannila
The information age is characterized by ever-growing amounts of data surrounding us. By refining this data into usable knowledge we can start moving toward the knowledge age. Data mining is the science of transforming measurable information into usable knowledge. During the data mining process, the measurements pass through a chain of sophisticated transformations in order to acquire knowledge. Furthermore, in some applications the results are implemented as software solutions so that they can be utilized continuously. The quality and amount of the knowledge formed depend heavily on the transformations and the process applied. This thesis presents an application-independent concept that can be used for managing the data mining process and implementing the acquired results as software applications.
The developed concept is divided into two parts: solution formation and solution implementation. The first part presents a systematic way of finding a data mining solution from a set of measurement data. The developed approach makes it easier to apply a variety of algorithms to the data, manages the work chain, and differentiates between the data mining tasks. The method is based on storing the data between the main stages of the data mining process, where the stages are defined by the type of algorithms applied to the data. The efficiency of the process is demonstrated with a case study presenting new solutions for resistance spot welding quality control.
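The idea of storing the data between the main stages can be illustrated with a minimal Python sketch. This is not the implementation described in the thesis: the store, the `run_stage` helper, the stage names, and the three toy algorithms are all hypothetical and chosen only to show how intermediate storage decouples the stages from one another.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical in-memory store standing in for a database or file storage.
Store = Dict[str, List[Optional[float]]]


def run_stage(store: Store, source: str, target: str,
              algorithm: Callable[[list], list]) -> None:
    """Read the data saved under `source`, apply one algorithm, save it under `target`."""
    store[target] = algorithm(store[source])


# Toy algorithms, one per assumed stage (names are illustrative only).
def remove_missing(values: list) -> list:
    """Pre-processing: drop missing measurements."""
    return [v for v in values if v is not None]


def scale_to_unit(values: list) -> list:
    """Pre-processing: rescale the values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    return [(v - lo) / span for v in values]


def mean_feature(values: list) -> list:
    """Feature extraction: summarize the signal with its mean."""
    return [sum(values) / len(values)]


store: Store = {"measurements": [2.0, None, 4.0, 6.0]}
run_stage(store, "measurements", "preprocessed", remove_missing)
run_stage(store, "preprocessed", "scaled", scale_to_unit)
run_stage(store, "scaled", "features", mean_feature)
```

Because every stage reads its input from the store and writes its output back, a different algorithm can be tried at one stage by rerunning that stage alone, without repeating the earlier ones.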
The second part of the concept presents a component-based data mining application framework, called Smart Archive, designed for implementing the solution. The framework provides functionality that is common to most data mining applications and is especially suitable for implementing applications that process continuously acquired measurements. The work also proposes an efficient algorithm for utilizing cumulative measurement data in the history component of the framework. Using the framework, it is possible to build high-quality data mining applications with shorter development times by configuring the framework to process application-specific data. The efficiency of the framework is illustrated using a case study presenting the results and implementation principles of an application developed for predicting steel slab temperatures in a hot strip mill.
In conclusion, this thesis presents a concept that proposes solutions for two fundamental issues of data mining: the creation of a working data mining solution from a set of measurement data, and its implementation as a stand-alone application.
Acta Universitatis Ouluensis. C, Technica
© University of Oulu, 2006. This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.