Master's Thesis, 2015
105 pages, grade: 1.3
1 Introduction
2 Fundamentals
2.1 Semantic Image Annotation
2.2 The Semantic Gap Problem
2.3 Object Representations in Semantic Object Recognition Systems
2.4 Image Comparison Methods
2.5 Google’s Reverse Image Search Engine
3 Related Works
4 The Semantic Annotation System
4.1 Overview
4.2 The Online-based Annotation System
4.2.1 Processing of Websites and Tag Extraction
4.2.2 The Score Value
4.2.3 Alternate Tags
4.2.4 Substrings in URL
4.2.5 Plural to Singular Conversion
4.2.6 Words from Similar Semantic Context
4.3 The Offline-based Annotation System
4.3.1 Transformations
4.3.2 Invariant Features
4.3.3 Invariant Feature Histogram
4.3.4 Jensen-Shannon Divergence
4.3.5 The Offline Database
5 Implementation
5.1 Robot Operating System
5.2 Libcurl
5.3 WordNet
5.4 Flexible Image Retrieval Engine
6 Testing and Evaluation
6.1 Testing Environment and Notation
6.2 The Online-based Annotation System
6.3 The Offline-based Annotation System
6.4 Combining the Offline and Online Annotation Engine
7 Summary and Outlook
7.1 Summary
7.2 Outlook
The goal of this thesis is to develop an automatic image annotation system that bridges the semantic gap by combining online resources with local prior knowledge, enabling robust object recognition for autonomous mobile robots. The primary research question investigates to what extent standard online search engines and extracted website text can substitute for large, resource-intensive image databases.
2.2 The Semantic Gap Problem
The Semantic Gap Problem addresses an issue of particular importance in the context of automatic semantic image annotation and information extraction from images. It denotes the distance between low-level representations of an image in terms of raw operational data and high-level representations humans use to describe the content of an image and its semantics [Par08]. Those high-level representations cannot be derived from the raw operational data of images directly [HLES06]. There are two main approaches to address this issue: the first approach is premised on a database containing images visually similar to the image to be annotated [BBBZ13]; the second approach relies on supplementary information available alongside the image to be annotated [SC97].
The first approach assumes a database containing a range of images together with human-provided textual labels describing their semantics, which serves as ground-truth data. Searching this database for images visually similar to the image to be annotated, using methods that estimate image similarity, the manually provided labels of the retrieved images can be combined; these combined labels then describe the queried image. The second approach analyzes information, such as textual descriptions, available alongside the image to be annotated. The following example illustrates it: a website displays an image with surrounding text; this text presumably describes the semantics of the depicted image and is used to generate textual labels for it. Both approaches require information in addition to the image's operational data. The Semantic Gap Problem and how current research addresses it are depicted in figure 2.2.
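The first approach can be sketched as a simple label-voting procedure: retrieve the k database images most similar to the query and merge their manually provided labels. This is a minimal illustration, not the thesis's implementation; the function names, the tuple-based database layout, and the pluggable `similarity` callable are assumptions made for the sketch.

```python
from collections import Counter

def annotate_by_similarity(query_hist, database, similarity, k=5):
    """Approach one: combine the ground-truth labels of the k database
    images most similar to the query image.

    database:   list of (feature_histogram, labels) pairs
    similarity: callable(hist_a, hist_b) -> higher means more similar
    Returns tags ordered by how often they occur among the k neighbours."""
    ranked = sorted(database,
                    key=lambda entry: similarity(query_hist, entry[0]),
                    reverse=True)
    votes = Counter()
    for _hist, labels in ranked[:k]:
        votes.update(labels)
    return [tag for tag, _count in votes.most_common()]
```

In practice the `similarity` callable would be a histogram-comparison measure such as the Jensen-Shannon divergence discussed in Section 4.3.4 (negated, since lower divergence means higher similarity).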
1 Introduction: Provides an overview of object recognition in robotics, introduces the semantic gap problem, and outlines the system architecture.
2 Fundamentals: Establishes core concepts of content-based image retrieval, image annotation, and the mechanisms of Google’s Reverse Image Search Engine.
3 Related Works: Reviews existing research frameworks for automatic image annotation and discusses the limitations of using large local databases.
4 The Semantic Annotation System: Details the functional design and algorithms of the online and offline annotation engines and their parallel integration.
5 Implementation: Describes the technical realization using the Robot Operating System (ROS), Libcurl, WordNet, and the Flexible Image Retrieval Engine (FIRE).
6 Testing and Evaluation: Presents empirical results of system performance, parameter tuning, and validation in simulated and real-world scenarios.
7 Summary and Outlook: Consolidates the findings of the research and suggests future improvements, particularly in background removal and multi-language support.
Semantic Image Annotation, Semantic Gap Problem, Content-based Image Retrieval, Object Recognition, Robot Operating System, Google Reverse Image Search Engine, Invariant Feature Histogram, Jensen-Shannon Divergence, Automated Tag Extraction, Mobile Rescue Robots, Online Database, Offline Database, Computational Vision, Image Similarity, Automated Tagging
The thesis focuses on creating an automatic image annotation algorithm for mobile robots that bridges the semantic gap by leveraging online image search results to label objects without requiring a massive, pre-existing local image database.
The research addresses the semantic gap—the difficulty of mapping raw sensor data to human-understandable textual labels—and the high cost/resource limitations of traditional, database-heavy object recognition systems.
It explores how effectively online image search engines and contextual text found on websites can be exploited as a viable substitute for traditional large-scale image databases in an annotation algorithm.
The system employs content-based image retrieval (CBIR) methods, including invariant feature histograms, Jensen-Shannon divergence for similarity calculation, and text mining techniques to extract semantic data from websites.
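The Jensen-Shannon divergence mentioned above compares two feature histograms, treated as discrete probability distributions. A minimal sketch (using base-2 logarithms so the result is bounded by [0, 1]; the function name is illustrative, not the thesis's API):

```python
import math

def jensen_shannon_divergence(p, q):
    """JSD between two discrete distributions p and q (sequences that
    each sum to 1). Symmetric and bounded: 0 for identical distributions,
    1 (with log base 2) for distributions with disjoint support."""
    def kl(a, b):
        # Kullback-Leibler divergence; terms with a_i == 0 contribute 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]  # mixture distribution
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Unlike the plain Kullback-Leibler divergence, the JSD is symmetric and always finite, which makes it a convenient distance-like measure between invariant feature histograms.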
The system is composed of an online-based annotation engine that mines the web and an offline-based annotation engine that uses a local database for prior environment knowledge, which are integrated via the Robot Operating System (ROS).
Key terms include Semantic Image Annotation, Content-based Image Retrieval, Semantic Gap, Robot Operating System, and Invariant Feature Histograms.
The offline module allows the robot to incorporate prior knowledge of specific environments (like hazard symbols), improving recognition precision in situations where the online system might fail due to perspective distortion.
The tag-refinement steps (score values, alternate tags, URL substrings, plural-to-singular conversion, and semantically related words; Sections 4.2.2-4.2.6) are specific modular enhancements that re-rank extracted tags based on manually provided metadata in HTML or filenames, which significantly boosts the likelihood of retrieving accurate, environment-relevant labels.
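One of these refinements, matching tags against substrings of the image URL (cf. Section 4.2.4), can be sketched as a score boost for tags that reappear in the filename or path. The function name, the dictionary-based score representation, and the boost factor are assumptions for illustration, not the thesis's actual parameters.

```python
def rerank_tags(scored_tags, url, boost=2.0):
    """Boost tags that occur as substrings of the image URL, on the
    assumption that filenames often carry manually chosen object names.

    scored_tags: dict mapping tag -> score
    Returns tags sorted by descending (possibly boosted) score."""
    url_lower = url.lower()
    reranked = {tag: score * boost if tag.lower() in url_lower else score
                for tag, score in scored_tags.items()}
    return sorted(reranked, key=reranked.get, reverse=True)
```

A tag such as "cat" extracted from surrounding website text would thus outrank a higher-scored but URL-absent tag when the image file is named e.g. `cat_01.jpg`.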

