Master's thesis, 2017
40 pages, grade: 10
The dissertation aims to develop an effective data mining technique for both structured and unstructured big data, focusing on privacy preservation during data sharing from distributed databases. The work explores the challenges of anonymizing data while maintaining privacy and examines existing techniques to address this issue.
The dissertation begins by introducing the idea and motivation behind developing a new data mining technique for big data, with a focus on privacy preservation. It then defines the problem and scope of the dissertation, outlining the software context, constraints, and expected outcomes. Chapter 3 details the project plan, timeline, and feasibility study, including economic, technical, operational, and time feasibility aspects. Chapter 4 focuses on the software requirement specification, outlining the purpose, scope of the document, and responsibilities of the developer. It also includes a product overview with block diagrams, functional models with flow diagrams and data flow diagrams, and a detailed analysis of UML diagrams such as sequence diagrams and class diagrams. Finally, Chapter 5 dives into the detailed design, examining the architecture design and algorithms used, as well as interface details, including human and database interfaces.
The primary focus of the dissertation lies in the intersection of big data, data mining, privacy preservation, and data anonymization. It investigates techniques for collaborative data publishing and the role of a trusted third party in ensuring data privacy while facilitating data sharing from distributed databases. Key concepts include privacy-preserving data analysis, insider attacks, and the development of a new algorithm for data anonymization, addressing the challenges of data sharing while maintaining privacy for individuals and sensitive information.
Big data mining is a technique for extracting meaningful patterns and information from large-scale, complex datasets, including structured and unstructured data.
When sharing sensitive data like healthcare records, it's crucial to anonymize information to protect individual identities and comply with regulations.
Hadoop is an open-source framework that stores and processes very large datasets across clusters of computers, which makes it well suited to Big Data workloads.
Structured data follows a fixed schema, as in database tables, while unstructured data has none; examples include social media posts, videos, and patient symptoms recorded as free text.
An insider attack is an attack in which someone within an organization, or a trusted third party, misuses their access to compromise sensitive data.
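To make the idea of anonymizing records before sharing more concrete, the following is a minimal sketch in Python. It does not reproduce the algorithm developed in the dissertation, which is not described here. Instead it assumes a common, well-known approach: generalizing quasi-identifiers and checking k-anonymity. The field names (`age`, `zip`, `diagnosis`), the sample records, and the helper functions `generalize` and `is_k_anonymous` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def generalize(record):
    """Coarsen quasi-identifiers (assumed fields): age becomes a
    10-year band and the ZIP code keeps only its first three digits."""
    low = (record["age"] // 10) * 10
    return {
        "age": f"{low}-{low + 9}",
        "zip": record["zip"][:3] + "**",
        "diagnosis": record["diagnosis"],  # sensitive value, left unchanged
    }


def is_k_anonymous(records, k, quasi_identifiers=("age", "zip")):
    """Return True if every combination of quasi-identifier values
    is shared by at least k records."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


# Hypothetical sample data, not taken from the dissertation.
patients = [
    {"age": 34, "zip": "41101", "diagnosis": "flu"},
    {"age": 37, "zip": "41105", "diagnosis": "asthma"},
    {"age": 52, "zip": "41210", "diagnosis": "diabetes"},
    {"age": 58, "zip": "41233", "diagnosis": "flu"},
]

anonymized = [generalize(p) for p in patients]

print(is_k_anonymous(patients, 2))    # raw records are each unique
print(is_k_anonymous(anonymized, 2))  # generalized records form groups of two
```

In this sketch the raw records fail the check for k = 2 because each age and ZIP combination is unique, while the generalized records pass because each combination is shared by two patients. Generalization alone does not guard against every attack, for example when all records in a group share the same diagnosis, so it should be read as a starting point rather than a complete privacy guarantee.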

