Master's Thesis, 2023
131 pages, Grade: 1.3
1 Introduction
1.1 Research Objectives
1.2 Thesis Outline
2 Background
2.1 The Russian-Ukrainian Conflict 2022
2.1.1 Foreign and Security Policy
2.1.2 Energy Crisis
2.2 Twitter
2.3 Stance Detection
2.3.1 The Task of Stance Detection
2.3.2 Types of Stance Detection
2.3.3 Related Work
2.4 The Language Model BERT
2.4.1 Methodology
2.4.2 Transformer Encoder
2.4.3 Pre-Training
2.4.4 Fine-Tuning
3 Dataset
3.1 Data Collection
3.2 Removal of Duplicates
3.3 Development of Balanced Class Distributions
3.4 Data Statistics
3.5 Manually Labeled Test Datasets
3.6 Final Dataset and Data Availability
4 Experiments
4.1 Experimental Setup
4.1.1 Pre-Trained Language Models
4.1.2 Preprocessing Methodology
4.1.3 Evaluation Metrics
4.2 Experiments and Results
4.2.1 Experiment 1: Impact of a Balanced Dataset
4.2.2 Experiment 2: Cross-Target Generalization
4.2.3 Experiment 3: Different BERT Models
4.2.4 Hyperparameter
4.2.5 Discussion
5 Application of Fine-Tuned Model on 2022 Twitter Data
5.1 Twitter Data of 2022
5.2 Statistics of Detected Stances
5.3 Potential Reasons of Target-Specific Stance Groups
5.3.1 Target NOC
5.3.2 Target SLI
5.3.3 Target AD
5.3.4 Target US
5.4 Summary and Evolution of Tweet Volume Over Time
6 Conclusions and Outlook
This master's thesis aims to develop an automatic stance detection model by fine-tuning BERT in a supervised cross-target setting. By applying this model to a large corpus of German Twitter data, the study examines stances on controversial socio-political debates arising from the 2022 Russian-Ukrainian conflict.
2.3 Stance Detection
The automatic extraction and analysis of information from texts has been an important research area in NLP for decades. Along with sentiment analysis, emotion recognition, and textual entailment, stance detection is an important research problem regarding the automatic analysis of content and can be viewed as a subtask of opinion mining.
2.3.1 The Task of Stance Detection
“Stance is a public act by a social actor, achieved dialogically through overt communicative means, of simultaneously evaluating objects, positioning subjects (self and others), and aligning with other subjects, with respect to any salient dimension of sociocultural field.” (Du Bois, 2007, p. 163)
In linguistics, the term stance refers to a social act by which a speaker takes a position within an ongoing communication in terms of evaluation, intentionality, epistemology, or social relations. A person thus takes a stance whenever he or she describes an object (hereafter referred to as a target) in a way that expresses his or her attitude toward it.
Accordingly, in NLP, stance detection refers to the task of automatically detecting the attitude expressed in a natural language text toward a target. Hence, a basic stance detection system should detect whether the author of a text is against or in favor of a given target entity. In many cases, a third class or further classes are defined, such as neutral, for texts from which neither a favorable nor an opposing stance can be inferred. The target entity may be a person (e.g., Olaf Scholz), an organization (e.g., North Atlantic Treaty Organization), a product (e.g., iPhone), a claim or headline (e.g., COVID-19 vaccines affect fertility in women), or any topic such as a political movement or a government policy (e.g., legalization of cannabis).
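The task definition above can be made concrete with a small data structure. The following is a minimal sketch in Python, not the thesis's implementation: the names Stance, StanceExample, and label_distribution are hypothetical, and the example sentences are invented for illustration only. It shows that every instance pairs a text with a target and one of three stance classes.

```python
from dataclasses import dataclass
from enum import Enum


class Stance(Enum):
    """Three-class label set: in favor, against, or neutral (assumed naming)."""
    FAVOR = "favor"
    AGAINST = "against"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class StanceExample:
    text: str      # the natural language text, e.g. a tweet
    target: str    # the entity or topic the stance refers to
    label: Stance  # the author's attitude toward the target


def label_distribution(examples):
    """Count how many examples carry each stance label, grouped by target."""
    counts = {}
    for ex in examples:
        per_target = counts.setdefault(ex.target, {s: 0 for s in Stance})
        per_target[ex.label] += 1
    return counts


# Invented illustrative examples; they are not taken from the thesis dataset.
examples = [
    StanceExample("Cannabis should finally be legal.",
                  "legalization of cannabis", Stance.FAVOR),
    StanceExample("Legalizing cannabis is a mistake.",
                  "legalization of cannabis", Stance.AGAINST),
    StanceExample("The vote on cannabis is scheduled for next week.",
                  "legalization of cannabis", Stance.NEUTRAL),
]

distribution = label_distribution(examples)
```

Keeping the target as an explicit field matters because the same text can express different stances toward different targets.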
1 Introduction: Provides an overview of the conflict setting and the research objectives for developing a stance detection system.
2 Background: Discusses the Russian-Ukrainian conflict 2022, the significance of Twitter as a platform, the theory of stance detection, and the language model BERT.
3 Dataset: Details the collection of tweets, the cleaning process (removal of duplicates), and the development of balanced class distributions using back translation.
4 Experiments: Explains the setup of the experiments, model evaluations across targets, and the assessment of performance regarding balanced datasets, generalization, and different BERT models.
5 Application of Fine-Tuned Model on 2022 Twitter Data: Applies the developed model to a larger 2022 dataset to examine stance distributions and explore the reasons behind target-specific opinions.
6 Conclusions and Outlook: Summarizes the key findings of the thesis and discusses potential future research directions, such as incorporating multi-modal data.
Keywords: Stance Detection, Natural Language Processing, BERT, Cross-Target Stance Detection, Russian-Ukrainian Conflict, German Twitter Data, Transfer Learning, Machine Learning, Data Augmentation, Back Translation, Opinion Mining, Social Media Analysis, Fine-Tuning, Sentiment Analysis, Public Opinion.
This thesis investigates the development of an automatic stance detection system for the German language, specifically applied to Twitter debates surrounding the 2022 Russian-Ukrainian conflict.
The model analyzes four key targets: general support of Ukraine, delivery of heavy weapons to Ukraine, the repeal of the nuclear phase-out, and the implementation of a temporary speed limit on highways.
The goal is to leverage transfer learning by fine-tuning BERT models on multiple domain-related targets to create a system capable of predicting user stances even when the target might be indirectly referenced.
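One common way to probe whether a model trained on several related targets transfers to another target is a leave-one-target-out split. The sketch below assumes this setup purely for illustration; whether the thesis uses exactly this split is not stated here. The function name, the dictionary layout, and the short target names are hypothetical.

```python
def leave_one_target_out(examples, held_out_target):
    """Split examples so that the held-out target never appears in training.

    Each example is a dict with the keys "text", "target", and "label"
    (an assumed layout, not the thesis's data format).
    """
    train = [ex for ex in examples if ex["target"] != held_out_target]
    test = [ex for ex in examples if ex["target"] == held_out_target]
    return train, test


# Invented placeholder examples covering the four targets named in the text.
examples = [
    {"text": "tweet a", "target": "Ukraine support", "label": "favor"},
    {"text": "tweet b", "target": "heavy weapons delivery", "label": "against"},
    {"text": "tweet c", "target": "nuclear phase-out repeal", "label": "favor"},
    {"text": "tweet d", "target": "speed limit", "label": "neutral"},
    {"text": "tweet e", "target": "speed limit", "label": "against"},
]

train_set, test_set = leave_one_target_out(examples, "speed limit")
```

A model fine-tuned on train_set and scored on test_set is then evaluated only on a target it has never seen during training.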
The work employs deep learning, specifically fine-tuning pre-trained BERT language models. It uses a cross-target training approach combined with synthetic data augmentation (back translation) to handle dataset imbalances.
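Back translation creates paraphrases by translating a text into a pivot language and back again. The sketch below shows how such paraphrases could be used to fill up minority classes until all classes are equally large. It makes several assumptions: the function name and the balancing rule (grow every class to the size of the largest one) are hypothetical, and the two translator arguments are placeholders for a real machine translation system, which the source does not name.

```python
import random
from collections import defaultdict


def balance_with_back_translation(examples, to_pivot, from_pivot, seed=0):
    """Augment minority classes with back-translated copies.

    examples   -- list of (text, label) pairs
    to_pivot   -- callable translating a text into a pivot language (assumed)
    from_pivot -- callable translating it back to the source language (assumed)
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append(text)

    # Assumed balancing rule: grow every class to the size of the largest one.
    target_size = max(len(texts) for texts in by_label.values())

    balanced = list(examples)
    for label, texts in by_label.items():
        for _ in range(target_size - len(texts)):
            source = rng.choice(texts)
            paraphrase = from_pivot(to_pivot(source))
            balanced.append((paraphrase, label))
    return balanced


# Stand-in "translators" so the sketch runs without any external model.
# A real setup would call a machine translation system here instead.
def fake_to_pivot(text):
    return text.upper()


def fake_from_pivot(text):
    return text.lower()


data = [
    ("Tweet one", "favor"),
    ("Tweet two", "favor"),
    ("Tweet three", "favor"),
    ("Tweet four", "against"),
]
balanced_data = balance_with_back_translation(data, fake_to_pivot, fake_from_pivot)
```

With a real translation system, a round trip can return the original sentence unchanged, so a practical pipeline would likely also filter out exact duplicates after augmentation.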
The main part covers the theoretical background of stance detection and BERT, detailed methodology for creating a balanced dataset from scraped Twitter posts, various experimental configurations, and an application study on 2022 data.
Effectiveness is evaluated via precision, recall, and F-score on two different test sets: one automatically labeled (Test-1) and one manually annotated by human testers (Test-2).
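For reference, per-class precision, recall, and F-score can be computed from gold and predicted labels as in the following sketch. It assumes the F1 variant of the F-score and macro-averaging over classes; the source does not specify which F-score variant or averaging scheme the thesis reports, and the function names and label values are hypothetical.

```python
def per_class_scores(gold, predicted):
    """Return {label: (precision, recall, f1)} for every label that occurs."""
    labels = sorted(set(gold) | set(predicted))
    pairs = list(zip(gold, predicted))
    scores = {}
    for label in labels:
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-class F1 values (assumed averaging scheme)."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Invented toy labels for illustration only.
gold = ["favor", "favor", "against", "neutral"]
predicted = ["favor", "against", "against", "neutral"]

scores = per_class_scores(gold, predicted)
overall = macro_f1(scores)
```

Macro-averaging weights every class equally, which is one reason it is often preferred when class sizes differ.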
Results show the model achieves reliable performance on known targets like Ukraine support, but struggles with cross-target generalization to completely unknown, unconventional domains.
The thesis highlights that irony and sarcasm pose significant challenges for automated stance detection, as they require deep contextual and background knowledge that approaches based on sentiment-bearing lexicons often miss.

