Machine Learning Approach to Detect Fraudulent Banking Transactions

Master's Thesis, 2022
69 pages, Grade: 3

Excerpt

Table of Contents

Introduction
Objective
Literature Review
- Related Work
- Machine learning approaches
  - Logistic Regression
  - Decision Tree
  - Working of Decision Tree
  - Random Forest
  - Support Vector Machines (SVM)
  - K-Nearest Neighbours (KNN)
  - Gradient Boosted Trees
  - Research Method Data Challenges
- Recent Fraud Cases
- CRISP-DM Model
  - Business Understanding
  - Data Understanding
  - Data Preparation
  - Data Modelling
  - Model evaluation
  - Model Deployment
Methodology and Case Study
- Banking Theory
- Data Description
- Data Preparation
  - Scaling the data
  - Missing values handling
  - Dropping NA
  - Data encoding
- Data Visualisation
  - Univariate Analysis
  - Histograms
  - Boxplot
  - Bivariate analysis
  - Correlation
  - Summary from EDA
- Feature Selection
  - ANOVA
- Model Comparison and Results
  - Logistic Regression
  - Decision Tree
  - Random Forest
  - XGBoost
  - GradientBoosting
  - LGBMClassifier
- Classification Evaluation Metrics
  - Confusion Matrix
  - Precision
  - Recall
  - F1 score
  - AUC-ROC
  - Receiver operating characteristic (ROC) Curve
  - Accuracy
  - Imbalanced Data
- Possible next steps
Summary and Conclusion

Objectives and Key Themes

This master's thesis investigates the application of machine learning techniques to the detection of fraudulent banking transactions. The primary objective is to develop and evaluate a robust fraud detection model capable of accurately identifying suspicious transactions in real time.

- Machine learning algorithms for fraud detection
- Data preprocessing and feature engineering for fraud detection
- Performance evaluation of different machine learning models
- Understanding and addressing challenges related to imbalanced data in fraud detection
- Exploring potential future directions for improving fraud detection systems

Chapter Summaries

The thesis begins with a comprehensive introduction, defining the problem of fraudulent transactions and highlighting the significance of machine learning in this domain. The subsequent chapter delves into the objectives of the study, outlining the research questions and methodologies employed.

Chapter 3 provides a thorough literature review, examining existing research on fraud detection, discussing various machine learning approaches, and highlighting recent fraud cases. It further explores the CRISP-DM model, a structured data mining process that guides the development and deployment of fraud detection systems.

Chapter 4 focuses on the methodology and case study. It discusses banking theory, data description, and detailed data preparation techniques, including data scaling, handling missing values, and encoding categorical features. The chapter further presents a thorough analysis of the data, including univariate and bivariate analysis, feature selection, and model comparison. Finally, it evaluates the performance of various machine learning models using different classification evaluation metrics.
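The preparation and feature-selection steps named above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy records, the column meanings (`amount`, `channel`, `is_fraud`) and the helper `anova_f` are hypothetical and are not taken from the thesis, which may well rely on library routines instead.

```python
from statistics import mean, pstdev

# Hypothetical toy records: (amount, channel, is_fraud); None marks a missing value.
rows = [
    (120.0, "online", 0),
    (80.0, "branch", 0),
    (None, "online", 0),
    (100.0, "atm", 0),
    (950.0, "online", 1),
    (1020.0, "atm", 1),
]

# 1. Missing-value handling: drop every row that contains a None.
clean = [r for r in rows if all(v is not None for v in r)]

# 2. Scaling: standardise the amount to zero mean and unit variance.
amounts = [r[0] for r in clean]
mu, sigma = mean(amounts), pstdev(amounts)
scaled = [(a - mu) / sigma for a in amounts]

# 3. Encoding: one-hot encode the categorical channel column.
categories = sorted({r[1] for r in clean})
one_hot = [[1 if r[1] == c else 0 for c in categories] for r in clean]


def anova_f(values, labels):
    """One-way ANOVA F statistic of a numeric feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand = mean(values)
    k, n = len(groups), len(values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    # Variation of the values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# 4. Feature selection: a large F value suggests the feature separates the classes.
labels = [r[2] for r in clean]
f_amount = anova_f(scaled, labels)
```

A feature whose F statistic is close to zero differs little between fraudulent and legitimate transactions and is a candidate for removal; how many features to keep is a judgment call that this sketch does not settle.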

Keywords

Fraud detection, machine learning, banking transactions, data preprocessing, feature engineering, model evaluation, imbalanced data, CRISP-DM, logistic regression, decision tree, random forest, XGBoost, gradient boosting, LGBM, confusion matrix, precision, recall, F1 score, AUC-ROC, accuracy.

Frequently Asked Questions

How can machine learning detect banking fraud?

Machine learning algorithms analyze millions of transactions to identify patterns, anomalies, and suspicious behaviors that deviate from normal customer activity.
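To make "deviating from normal customer activity" concrete, here is a minimal sketch of a statistical baseline: flagging an amount that lies far from a customer's own history. It is a simple z-score rule rather than a trained model, and it is not presented as the thesis's method; the function name `is_anomalous`, the threshold of 3 and the sample history are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history, amount, threshold=3.0):
    """Flag an amount lying more than `threshold` sample standard
    deviations away from the customer's historical mean."""
    if len(history) < 2:
        return False  # not enough history to estimate the spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return amount != mu  # constant history: any different amount stands out
    return abs(amount - mu) / sigma > threshold


# Hypothetical past transaction amounts of one customer.
history = [42.0, 55.0, 38.0, 61.0, 47.0, 50.0]

typical = is_anomalous(history, 52.0)   # close to the usual spending level
unusual = is_anomalous(history, 900.0)  # far outside the usual range
```

A learned classifier generalises this idea by combining many such signals (amount, time, channel, counterparties) and fitting the decision rule from labelled examples instead of fixing a threshold by hand.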

What is the CRISP-DM model?

CRISP-DM (Cross-Industry Standard Process for Data Mining) is a structured process model for data projects with six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.

Which algorithms are used for fraud detection?

Common algorithms include Logistic Regression, Decision Trees, Random Forest, XGBoost, and Support Vector Machines (SVM).
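As an illustration of the simplest algorithm in that list, the following sketch fits a one-feature logistic regression by batch gradient descent using only the standard library. The data, learning rate and epoch count are hypothetical, and the thesis presumably uses library implementations rather than hand-written training loops.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit weight w and bias b for P(fraud | x) = sigmoid(w * x + b)
    by batch gradient descent on the mean log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical, already standardised transaction amounts and fraud labels (1 = fraud).
xs = [-1.2, -0.8, -0.5, -0.1, 0.3, 0.9, 1.4, 2.0]
ys = [0, 0, 0, 0, 0, 1, 1, 1]

w, b = fit_logistic(xs, ys)
p_low = sigmoid(w * -1.0 + b)   # predicted fraud probability for a small amount
p_high = sigmoid(w * 1.8 + b)   # predicted fraud probability for a large amount
```

Tree-based methods such as Random Forest and XGBoost replace the single linear score with many learned threshold rules, which is one reason they often cope better with non-linear fraud patterns.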

Why is imbalanced data a challenge in fraud detection?

Fraudulent transactions are very rare compared to legitimate ones. This imbalance can lead to models that are biased towards predicting everything as "legitimate."
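A small worked example shows why this matters. The class ratio below (10 frauds in 1,000 transactions) is a hypothetical choice, not a figure from the thesis. The class-weight formula is the common "balanced" heuristic, n_samples / (n_classes * class_count), shown here as one possible remedy among several (resampling being another).

```python
# Hypothetical data set: 1,000 transactions, of which 10 are fraudulent (label 1).
y_true = [1] * 10 + [0] * 990

# A trivial "model" that predicts every transaction as legitimate.
y_pred = [0] * len(y_true)

# Accuracy looks excellent even though no fraud is ever detected.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall: share of actual fraud cases that the model caught.
caught = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
recall = caught / sum(y_true)

# One common remedy: weight each class inversely to its frequency,
# so that errors on the rare fraud class cost more during training.
n, n_classes = len(y_true), 2
class_weight = {c: n / (n_classes * y_true.count(c)) for c in (0, 1)}
```

Here the trivial model reaches 99% accuracy with a recall of zero, and the balanced weights make each fraud example count roughly a hundred times as much as a legitimate one.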

What metrics evaluate a fraud detection model?

Key metrics include Precision, Recall, F1-score, and the Area Under the ROC Curve (AUC-ROC), rather than just simple accuracy.
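These metrics follow directly from the confusion matrix and from the ranking of the model's scores. The sketch below computes them by hand; the labels, scores and the 0.5 decision threshold are hypothetical, and the function names are illustrative rather than taken from any library.

```python
def confusion(y_true, y_pred):
    """Return (tp, fp, fn, tn) with fraud (label 1) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and their harmonic mean (F1); 0.0 when undefined."""
    tp, fp, fn, _ = confusion(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def auc_roc(y_true, scores):
    """Probability that a random fraud case receives a higher score than a
    random legitimate case (ties count half); equals the area under the ROC curve."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores for ten transactions.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
scores = [0.1, 0.2, 0.15, 0.3, 0.05, 0.6, 0.9, 0.4, 0.8, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, fn, tn = confusion(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = auc_roc(y_true, scores)
```

Precision, recall and F1 depend on the chosen threshold, whereas AUC-ROC summarises ranking quality across all thresholds, which is why the two kinds of metric are usually reported together.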

End of excerpt from 69 pages

Details

Title: Machine Learning Approach to Detect Fraudulent Banking Transactions
University: Hochschule für Technik und Wirtschaft Berlin
Course: Project management and Data Science
Grade: 3
Author: Riwaj Kharel (Author)
Year of publication: 2022
Pages: 69
Catalog number: V1275894
ISBN (Book): 9783346728951
Language: English
Keywords: Machine learning, Fraud detection, Banking fraud
Product safety: GRIN Publishing GmbH
Price (Ebook): US$ 34.99
Price (Book): US$ 49.99

Cite this work: Riwaj Kharel (Author), 2022, Machine Learning Approach to Detect Fraudulent Banking Transactions, Munich, GRIN Verlag OHG, https://www.diplomarbeiten24.de/document/1275894