Master's thesis, 2012
43 pages, grade: 1.00
CHAPTER 1: INTRODUCTION
1.1 Objective of the work
1.2 Introduction to data mining
1.3 Missing values
1.4 Missing value imputation
1.5 Model flow diagram
1.6 Organization of the report
1.7 Summary
CHAPTER 2: LITERATURE REVIEW
2.1 Introduction
2.2 Literature review
2.2.1 Missing values
2.2.2 Missing value imputation
2.2.3 Kernel functions
2.3 Summary
CHAPTER 3: DATASET DESCRIPTION
3.1 Introduction
3.2 Data set description
3.3 Summary
CHAPTER 4: IMPUTATION TECHNIQUES
4.1 Introduction
4.2 K-Nearest neighbor imputation method
4.3 Experimental results for imputation done using K-NN
4.4 Frequency Estimation Method
4.5 Experimental results for frequency estimator
4.6 Kernel Functions
4.7 Imputation using RBF kernel
4.8 Experimental results for RBF kernel
4.9 Imputation using poly kernel
4.10 Experimental results for poly kernel
4.11 Summary
CHAPTER 5: IMPUTATION USING MIXTURE OF KERNELS
5.1 Introduction
5.2 Interpolation and Extrapolation
5.3 Mixture of kernels
5.4 Experimental results for mixture of kernels
5.5 Imputation using spherical kernel with RBF kernel
5.6 Experimental results for imputation using spherical kernel and RBF kernel
5.7 Imputation using spherical kernel and poly kernel
5.8 Experimental results for spherical kernel and poly kernel
5.9 Summary
CHAPTER 6: RESULTS AND DISCUSSION
6.1 Introduction
6.2 Performance evaluation
6.3 Experimental results and discussion
6.4 Discussion of results
6.5 Summary
CHAPTER 7: CONCLUSION AND FUTURE WORK
7.1 Conclusion
7.2 Future work
The primary objective of this study is to develop and evaluate a mixture kernel-based iterative nonparametric estimator for imputing missing values in mixed-attribute datasets. By leveraging both complete and incomplete instances, the proposed approach aims to mitigate the information loss typically associated with converting continuous values to discrete ones during imputation. The performance is rigorously assessed using Root Mean Square Error (RMSE) and correlation coefficients across multiple standard datasets.
1.1 Objective of the work
The main objective of this work is to use an estimator that imputes missing values in mixed-attribute datasets by utilising the information in incomplete instances as well as in complete instances. This approach prevents the loss of information that occurs when continuous values are converted into discrete values, or vice versa, for imputation.
The method is evaluated in extensive experiments and compared with several typical algorithms; performance is measured in terms of root mean square error and correlation coefficients.
This chapter begins with a brief introduction to data mining concepts, missing values, and missing value imputation, and concludes with the organization of the report.
CHAPTER 1: INTRODUCTION: Provides an overview of data mining, the challenge of missing values in mixed-attribute datasets, and outlines the objectives of the research.
CHAPTER 2: LITERATURE REVIEW: Examines previous studies regarding missing value imputation, data mining techniques, and the mathematical foundations of kernel functions.
CHAPTER 3: DATASET DESCRIPTION: Details the five publicly available datasets used for experimental validation and explains the procedure for simulating missing values.
CHAPTER 4: IMPUTATION TECHNIQUES: Discusses standard techniques like K-Nearest Neighbor and individual kernel methods (RBF, Polynomial) for filling in missing data.
CHAPTER 5: IMPUTATION USING MIXTURE OF KERNELS: Introduces hybrid kernel strategies, including the combination of spherical, RBF, and polynomial kernels to enhance model performance.
CHAPTER 6: RESULTS AND DISCUSSION: Compares the experimental performance of all proposed methods based on RMSE and correlation coefficients across the selected datasets.
CHAPTER 7: CONCLUSION AND FUTURE WORK: Summarizes the effectiveness of the proposed mixture kernel estimators and suggests future research directions in advanced kernel modeling.
Data Mining, Missing Value Imputation, Mixed-Attribute Datasets, Kernel Functions, RBF Kernel, Polynomial Kernel, Spherical Kernel, Mixture of Kernels, Root Mean Square Error, Correlation Coefficient, K-Nearest Neighbor, Nonparametric Estimator, Predictive Modeling, Interpolation, Extrapolation
This research focuses on the challenge of imputing missing values specifically in mixed-attribute datasets, where independent attributes consist of both continuous and discrete types.
The study investigates various techniques including K-Nearest Neighbors, frequency estimation, and several kernel-based methods such as RBF, polynomial, and spherical kernels.
The objective is to minimize information loss by utilizing information from both complete and incomplete data instances, thereby creating a more accurate estimator for missing values.
Performance is quantitatively assessed by measuring the Root Mean Square Error (RMSE) and the correlation coefficient between the original and imputed values.
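The two evaluation measures named here can be illustrated with a minimal Python sketch. It assumes that the correlation coefficient is the Pearson coefficient, which this excerpt does not state explicitly; the function names `rmse` and `pearson_correlation` are hypothetical and are not taken from the thesis.

```python
import math


def rmse(original, imputed):
    """Root mean square error between original and imputed values."""
    if len(original) != len(imputed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(original, imputed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_correlation(original, imputed):
    """Pearson correlation (assumed here) between original and imputed values."""
    if len(original) != len(imputed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(original)
    mean_o = sum(original) / n
    mean_p = sum(imputed) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(original, imputed))
    var_o = sum((o - mean_o) ** 2 for o in original)
    var_p = sum((p - mean_p) ** 2 for p in imputed)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_o * var_p)
```

Under this reading, a lower RMSE and a correlation closer to 1 both indicate that the imputed values track the original values more closely.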
The main part of the report covers theoretical foundations, detailed descriptions of the used datasets, the implementation of various imputation techniques, and extensive experimental results.
Key terms include Data Mining, Missing Value Imputation, Kernel Functions, Mixed-Attribute Datasets, and Predictive Performance Evaluation.
Mixture kernels are used because they combine the strengths of different kernels: the interpolation ability of local kernels such as RBF and the extrapolation capability of global kernels such as the polynomial kernel.
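One common way to build such a mixture is a convex combination of a polynomial and an RBF kernel. The Python sketch below illustrates that idea only; the mixing weight `rho`, the parameters `gamma`, `degree` and `coef0`, and all function names are assumptions made for illustration, since this excerpt does not give the exact form used in the thesis.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Local kernel: decays quickly as x and y move apart."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Global kernel: distant points still influence the value."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def mixture_kernel(x, y, rho=0.5, gamma=1.0, degree=2, coef0=1.0):
    """Convex combination of the polynomial and RBF kernels.

    rho is an assumed mixing weight in [0, 1]: rho = 1 gives the pure
    polynomial kernel, rho = 0 gives the pure RBF kernel.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return (rho * poly_kernel(x, y, degree, coef0)
            + (1.0 - rho) * rbf_kernel(x, y, gamma))
```

For two points that are far apart, the RBF term is close to zero while the polynomial term is not, which is the sense in which the polynomial component contributes the extrapolation behaviour.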
The spherical kernel, classified as a higher-order kernel, is shown to provide superior imputation accuracy, yielding lower RMSE values and higher correlation coefficients when mixed with other kernels.

