Doctoral Thesis / Dissertation, 2018
138 pages
1. Introduction
2. Literature Survey
3. OCNN (Prototype Selection)
4. Prototype Selection and Feature Extraction
5. Prototype Selection and Feature Selection
6. Summary, Conclusions and Future Work
The primary objective of this thesis is to reduce the computational requirements of the Nearest Neighbor Classifier (NNC) by addressing the challenges of high storage and processing time through the reduction of training set cardinality and data dimensionality.
3.1 Proposed Methodology
The proposed method, OCNN, is based on the intuitive notion of retaining the patterns that lie close to the decision boundary and discarding the rest. Patterns close to the decision boundary clearly play the most important role in classifying a query pattern. A decision boundary is the surface that partitions the feature space into regions, with the aim that each region contains patterns exhibiting similar behaviour. The true decision boundary is, by definition, unknown; the next best alternative is a piecewise linear decision boundary, as shown in Figure 3.1. The decision boundary need not be linear. Figure 3.1 shows the decision boundary for a two-class problem.
The proposed method is explained using the example training set shown in Figure 3.1, which contains two classes: positive-class samples are represented by the sign '+' and negative-class samples by the sign '-'. If a query pattern '*' lies on the left side of the decision boundary, it is classified as '-'; otherwise it is classified as '+'. Thus the patterns near the decision boundary play the decisive role in classification, and the remaining patterns can be removed from the training set. This is the central theme of the proposed method. In the example of Figure 3.1, the two samples from the '+' class and the two samples from the '-' class that lie near the decision boundary are considered typical patterns when compared with the other training patterns, because they play the crucial role in the classification process. Such patterns can be found using the Other Class Nearest Neighbors (OCNN) algorithm, discussed in Section 3.2.
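The selection rule described above can be sketched as follows. Since this section only outlines the idea, the code is an interpretation of it rather than the exact OCNN procedure of Section 3.2: for every training pattern, its nearest neighbour among the patterns of the other classes is marked, and the union of marked patterns forms the reduced training set.

```python
import numpy as np

def ocnn_select(X, y):
    """Sketch of other-class-nearest-neighbor prototype selection.

    For each pattern, its nearest neighbor among the other classes is
    marked; only marked (boundary-adjacent) patterns are retained.
    This is an illustrative reading of the idea, not the thesis's
    exact pseudocode.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    keep = set()
    for i in range(len(X)):
        other = np.where(y != y[i])[0]                # other-class indices
        d = np.linalg.norm(X[other] - X[i], axis=1)   # Euclidean distances
        keep.add(int(other[np.argmin(d)]))            # mark nearest of them
    idx = sorted(keep)
    return X[idx], y[idx]
```

On a one-dimensional toy set with classes at {0, 1} and {4, 5}, the rule keeps only the two facing samples 1 and 4, which is exactly the behaviour illustrated for the two boundary samples per class in Figure 3.1.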
1. Introduction: This chapter provides an overview of Nearest Neighbor Classification (NNC) and the related fields of Machine Learning, Pattern Recognition, and Data Mining, while outlining the motivation and scope of the research.
2. Literature Survey: This chapter reviews existing literature on NNC, focusing on various prototype selection, prototype generation, and feature reduction techniques.
3. OCNN (Prototype Selection): This chapter introduces the proposed Other Class Nearest Neighbor (OCNN) prototype selection algorithm and presents its experimental validation against existing methods.
4. Prototype Selection and Feature Extraction: This chapter discusses the cascading of the OCNN method with Linear Discriminant Analysis (LDA) to achieve both cardinality and dimensionality reduction.
5. Prototype Selection and Feature Selection: This chapter details the combination of the OCNN method with the Sequential Forward Selection (SFS) algorithm to optimize the feature subset for classification.
6. Summary, Conclusions and Future Work: This chapter concludes the thesis by summarizing the findings and suggesting future research directions for simultaneous data reduction.
Nearest Neighbor Classifier, NNC, Prototype Selection, OCNN, Feature Extraction, Feature Selection, Dimensionality Reduction, Machine Learning, Data Mining, Pattern Recognition, LDA, SFS, Classification Accuracy, Pattern Synthesis, Computational Complexity
The research focuses on overcoming the high computational requirements of the Nearest Neighbor Classifier, specifically addressing the storage and time overheads associated with large training datasets.
The primary challenges include high storage requirements, sensitivity to noise, and high computational time, especially when dealing with large datasets or high-dimensional features.
The objective of the Other Class Nearest Neighbor (OCNN) method is to perform prototype selection by retaining only the patterns that are close to the decision boundary, thus reducing the training set size without significant loss of accuracy.
The work employs prototype selection (OCNN), feature extraction (LDA), and feature selection (SFS) techniques, evaluating them across various benchmark datasets from the UCI Machine Learning Repository.
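A minimal sketch of such an evaluation setup, using scikit-learn's SequentialFeatureSelector as a stand-in for SFS and a 1-NN classifier; the Iris data here merely substitutes for the UCI benchmarks, and none of the thesis's actual parameter choices are reproduced:

```python
# Forward feature selection wrapped around a 1-NN classifier; the
# dataset, split, and n_features_to_select are illustrative choices.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=1)
sfs = SequentialFeatureSelector(knn, n_features_to_select=2,
                                direction="forward")
sfs.fit(X_tr, y_tr)                       # greedily add the best features

knn.fit(sfs.transform(X_tr), y_tr)        # train 1-NN on the reduced space
acc = knn.score(sfs.transform(X_te), y_te)
```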
Computational complexity is improved by reducing both the cardinality of the training set through OCNN and the dimensionality of the feature space through LDA or SFS, thereby decreasing the time required for classification.
Unlike some algorithms that may be order-dependent or require multiple passes over the dataset, OCNN is designed to be order-independent and more efficient in terms of processing time.
The thesis concludes that applying feature extraction methods prior to prototype selection generally yields better results than the reverse order.
Linear Discriminant Analysis (LDA) is used as a dimensionality reduction technique to project high-dimensional data into a lower-dimensional subspace while maximizing class separability.
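A small illustration of this role of LDA, using scikit-learn's LinearDiscriminantAnalysis on toy data (the thesis's own datasets and settings are not reproduced); with C classes, LDA yields at most C-1 components, so two classes project onto a single axis:

```python
# Project two well-separated 2-D classes onto one discriminant axis.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0]])
y = np.array([0, 0, 1, 1])

lda = LinearDiscriminantAnalysis(n_components=1)  # at most C-1 = 1 here
Z = lda.fit_transform(X, y)   # 1-D projection maximizing class separability
```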

