Bachelor's thesis, 2018
78 pages, grade: 1.0
1 Introduction
2 Research Method
2.1 Related Work
2.2 Research Procedure
3 Background
3.1 Reinforcement Learning
3.1.1 Markov Decision Process
3.1.2 Value Functions
3.1.3 Tabular Solution Methods
3.2 Deep Learning
4 Results
4.1 Value-Based Deep Reinforcement Learning
4.1.1 Deep Q-Learning and Deep Q-Networks
4.1.2 Double Q-Learning and Double Q-Network
4.1.3 Prioritized Replay
4.1.4 Dueling Network
4.1.5 Distributional Reinforcement Learning
4.1.6 Rainbow
4.2 Policy-Based Deep Reinforcement Learning
4.2.1 Asynchronous Advantage Actor-Critic
4.2.2 Trust Region Policy Optimization
4.2.3 Deep Deterministic Policy Gradients
4.2.4 Policy Iteration Using Monte Carlo Tree Search
4.2.5 Evolutionary Algorithms
4.3 Performance of the Algorithms
4.3.1 Atari 2600
4.3.2 MuJoCo
4.3.3 Various Measures
5 Discussion
5.1 Exploration vs. Exploitation
5.2 Need for Rewards
5.3 Knowledge Reusability
5.4 Inefficiency
5.5 Multi-Agent Reinforcement Learning
5.6 Model-Based Reinforcement Learning
5.7 Proposed Research Directions
6 Conclusion
This thesis aims to provide a comprehensive review of recent advancements in Deep Reinforcement Learning (DRL) by categorizing the field into distinct research directions and evaluating the performance of key algorithms.
Deep Learning
This section covers the main concepts of DL. For the purposes of this thesis, DL methods can be viewed as non-linear function approximators. They form a subclass of representation learning, which focuses on automatically extracting the features needed for a given task, such as classification or detection (LeCun et al., 2015, p. 436). This section concentrates on supervised learning, the most common setting: labeled training data is fed to a non-linear function approximator, such as a neural network. Here, "labeled" means that each data point (e.g. the pixel values of an image) comes with a target (e.g. "the image shows a dog"). The network learns a function that maps the inputs (the pixels) to the output (the label). The goal is that, given enough training data, the network generalizes to new, unseen data (LeCun et al., 2015, pp. 436-438).
The most widely used model architecture is the feedforward (neural) network (FNN), also called the multilayer perceptron (MLP) (Goodfellow et al., 2016, p. 167).
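The supervised-learning workflow described above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the thesis: the XOR task, the single hidden layer of 8 units, the sigmoid activation, and the learning rate are all illustrative assumptions chosen to show how labeled inputs are mapped to targets by gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Labeled training data: inputs X with targets y (the "labels").
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer makes the network a non-linear function approximator.
W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)

lr, losses = 0.5, []
for _ in range(20000):
    # Forward pass: map the inputs through the layers to a prediction.
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    losses.append(np.mean((p - y) ** 2))

    # Backward pass: gradient descent on the mean squared error.
    grad_p = (p - y) * p * (1 - p)
    grad_h = (grad_p @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ grad_p
    b2 -= lr * grad_p.sum(axis=0)
    W1 -= lr * X.T @ grad_h
    b1 -= lr * grad_h.sum(axis=0)
```

In practice one would evaluate on held-out data to check generalization; the four XOR points here only illustrate the mechanics of fitting a labeled data set.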
1 Introduction: Provides the definition of Reinforcement Learning and describes the motivation for combining it with Deep Learning.
2 Research Method: Details the literature review process and the sources used to identify relevant research in the DRL field.
3 Background: Establishes the theoretical foundations of Reinforcement Learning (including Markov Decision Processes) and Deep Learning concepts.
4 Results: Presents an analysis of value-based and policy-based DRL algorithms and summarizes their performance on common benchmarks.
5 Discussion: Examines open research challenges such as the exploration-exploitation dilemma and multi-agent settings.
6 Conclusion: Summarizes the thesis findings and reflects on the trajectory of Artificial General Intelligence.
Deep Reinforcement Learning, DRL, Artificial Intelligence, Neural Networks, Q-Learning, Policy Gradient, Atari 2600, MuJoCo, Exploration vs. Exploitation, Multi-Agent Reinforcement Learning, Markov Decision Process, Function Approximation, Experience Replay, Reward Shaping, Deep Learning.
The work provides a detailed review of recent advancements in Deep Reinforcement Learning (DRL), summarizing the most important algorithms and their performance in various environments.
The thesis categorizes research into value-based approaches (like DQN) and policy-based approaches (like A3C, TRPO, and DDPG), while also addressing model-based learning and evolutionary strategies.
The objective is to offer a structured overview of the current DRL landscape and to synthesize how different techniques contribute to reaching an agent's goals.
The thesis relies on benchmark testing using the Atari 2600 game suite and the MuJoCo physics simulation environment to compare different algorithmic implementations.
The main part covers the theoretical background of RL and DL, the transition to Deep Q-Networks, improvements like Dueling Networks and Distributional RL, and a variety of policy-based methods.
Key terms include Deep Reinforcement Learning, Neural Networks, Policy Gradients, Atari 2600 benchmarks, and Multi-Agent Reinforcement Learning.
Rainbow is a state-of-the-art agent that combines multiple enhancements, such as Prioritized Replay, Multi-Step Learning, and Noisy Nets, into a single architecture, improving both data efficiency and final performance on the Atari 2600 benchmark.
The work discusses the significant time and sample requirements of current models compared to human learning speed and highlights the need for higher sample efficiency in future applications.
Published by GRIN Verlag, which has specialized in publishing academic e-books and books since 1998.

