Doctoral Thesis / Dissertation, 2010
103 pages, Grade: summa cum laude
1 Introduction
2 Multiple imputation
3 Significance levels from multiply-imputed data
3.1 Significance levels from multiply-imputed data using moment-based statistics and an improved F-reference-distribution
3.2 Significance levels from multiply-imputed data using parameter estimates and likelihood-ratio statistics
3.3 Significance levels from repeated p-values with multiply-imputed data
4 z-transformation procedure for combining repeated p-values
4.1 The new z-transformation procedure
4.2 z-test
4.3 t-test
4.4 Wald-test
5 How to handle the multi-dimensional test problem
5.1 Idea
5.2 Simulation study
5.3 Further problems
6 Small-sample significance levels from repeated p-values using a componentwise-moment-based method
6.1 Small-sample degrees of freedom with multiple imputation
6.2 Significance levels from multiply imputed data with small sample size based on Sd
7 Comparing the four methods for generating significance levels from multiply-imputed data
7.1 Simulation study
7.2 Results
7.2.1 ANOVA
7.2.2 Combination of method and appropriate degrees of freedom
7.2.3 Rejection rates
7.2.4 Conclusions
8 Summary and practical advice
9 Future tasks and outlook
A Derivation of (3.1)-(3.5) from Section 3.1
B Derivation of the degrees of freedom δ and w in the moment-based procedure described in Section 3.1
The primary objective of this thesis is to develop a robust statistical method for calculating significance levels from multiply-imputed data that overcomes the limitations of existing procedures. The author seeks to provide a practical and accessible approach that is compatible with standard statistical software while retaining high power and calibration, even in cases with small sample sizes or high-dimensional data.
1 Introduction
Missing data are a ubiquitous problem in statistical analyses and have become an important research field in applied statistics, because missing values are frequently encountered in practice, especially in survey data. Many statistical methods have been developed to deal with this issue, and substantial advances in computing power, as well as in theory, over the last 30 years have made these methods accessible to applied researchers. A highly useful technique for handling missing values in many settings is multiple imputation, which was first proposed by Rubin (1977, 1978) and extended in Rubin (1987). The key idea of multiple imputation is to replace the missing values with more than one, say m, sets of plausible values, thereby generating m completed data sets. Each of these completed data sets is then analyzed using standard complete-data methods, and the repeated analyses are combined into one imputation inference that correctly accounts for the uncertainty due to missing data. Multiple imputation thus retains the major advantages of single imputation techniques while overcoming their major disadvantages.
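For a scalar estimand, the combining step described above follows Rubin's (1987) rules: average the m point estimates, and add the between-imputation variance to the average within-imputation variance. A minimal sketch (the per-imputation numbers are invented for illustration):

```python
import numpy as np

# Hypothetical results from analyzing m = 5 completed data sets:
# one point estimate and one variance estimate per data set.
m = 5
estimates = np.array([2.1, 1.9, 2.3, 2.0, 2.2])       # Q_hat_l, l = 1..m
variances = np.array([0.40, 0.38, 0.42, 0.39, 0.41])  # U_hat_l, l = 1..m

Q_bar = estimates.mean()           # combined point estimate
W = variances.mean()               # average within-imputation variance
B = estimates.var(ddof=1)          # between-imputation variance
T = W + (1 + 1/m) * B              # total variance (Rubin, 1987)
```

The extra factor (1 + 1/m) on B corrects for using only a finite number of imputations; inference for the estimand is then based on Q_bar and T.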
Owing to the continued growth in computing power over the last 10 years, multiple imputation has become a well-known and widely used tool in statistical analyses, and multiple imputation routines are now implemented in many statistical software packages. However, obtaining significance levels from multiply-imputed data remains a problem in general, because Rubin's (1978) combining rules require not only the parameter estimates but also their variance-covariance matrices, which standard software output does not always provide.
1 Introduction: Provides an overview of the problem of missing data and the role of multiple imputation in modern statistical analysis.
2 Multiple imputation: Introduces the theoretical foundations, necessary notations, and the established combining rules for multiple imputation.
3 Significance levels from multiply-imputed data: Details existing procedures for generating significance levels, including moment-based and likelihood-ratio-based approaches.
4 z-transformation procedure for combining repeated p-values: Presents a novel z-transformation approach for combining p-values and evaluates its performance on z-tests, t-tests, and Wald-tests.
5 How to handle the multi-dimensional test problem: Discusses the limitations of existing methods in multi-dimensional contexts and explores the challenges related to small sample sizes.
6 Small-sample significance levels from repeated p-values using a componentwise-moment-based method: Proposes an adjusted procedure utilizing componentwise-moment-based calculations for improved inference in small samples.
7 Comparing the four methods for generating significance levels from multiply-imputed data: Conducts an extensive simulation study to evaluate and compare the performance of the four methods discussed.
8 Summary and practical advice: Summarizes the findings and provides actionable recommendations for researchers applying these methods in practice.
9 Future tasks and outlook: Identifies open research problems and suggests potential directions for future statistical development.
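The exact combining rule of the new procedure is developed in Chapter 4. As background, the probit transform that the chapter title refers to can be sketched as follows; the p-values are invented, and the naive Stouffer-type average shown here is valid only for independent tests, an assumption that repeated analyses of the same multiply-imputed data violate, which is why an adjusted combining rule is needed:

```python
from statistics import NormalDist

# Hypothetical p-values from the same test applied to m completed data sets.
p_values = [0.03, 0.08, 0.05, 0.12, 0.04]
nd = NormalDist()

# Probit (z-) transform: each p-value becomes a standard-normal quantile.
z = [nd.inv_cdf(p) for p in p_values]

# Naive Stouffer combination, shown only as the independent-tests baseline;
# the thesis's procedure must instead account for the dependence between
# the m analyses induced by imputing the same incomplete data set.
z_comb = sum(z) / len(z) ** 0.5
p_comb = nd.cdf(z_comb)
```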
Multiple Imputation, Missing Data, Significance Levels, Wald-test, z-transformation, p-values, Simulation Study, Small Sample Size, Multi-dimensional Test, Statistical Inference, Combining Rules, Imputation Model, Degrees of Freedom, Applied Statistics, Hypothesis Testing
This work focuses on solving the challenge of obtaining accurate significance levels from multiply-imputed data, especially when standard combining rules are insufficient or when specific statistical software access is limited.
The research frequently analyzes the Wald-test due to its ubiquity in regression models, alongside evaluations of z-tests, t-tests, and F-tests.
The main goal is to develop a method that retains the advantages of existing procedures while overcoming their limitations—such as the requirement for variance-covariance matrices—by relying on standard output from statistical software.
The author uses theoretical derivations based on Rubin’s rules and conducts extensive factorial simulation studies in the statistical programming language R to validate the performance of different methods.
The main chapters introduce a z-transformation procedure, address multi-dimensional test issues, and propose a componentwise-moment-based method designed specifically for small sample sizes.
Key terms include Multiple Imputation, Wald-test, p-value combination, simulation analysis, and small-sample degrees of freedom.
The componentwise approach is designed to provide better calibration in small-sample settings where the standard moment-based method may break down, by utilizing adjusted degrees of freedom calculated componentwise.
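The componentwise procedure itself is developed in Chapter 6. As background, a standard small-sample degrees-of-freedom adjustment in the multiple-imputation literature is that of Barnard and Rubin (1999); a sketch for a single component, with invented inputs and not the thesis's exact method:

```python
# Barnard-Rubin (1999) small-sample degrees of freedom for one scalar
# estimand. All inputs are hypothetical illustration values.
m = 5           # number of imputations
nu_com = 10     # complete-data degrees of freedom (small sample)
W, B = 0.40, 0.025   # within- / between-imputation variance

T = W + (1 + 1/m) * B
gamma = (1 + 1/m) * B / T             # fraction of missing information
nu_old = (m - 1) / gamma**2           # Rubin's (1987) large-sample df
nu_obs = (nu_com + 1) / (nu_com + 3) * nu_com * (1 - gamma)
nu_adj = 1 / (1/nu_old + 1/nu_obs)    # adjusted df, bounded by nu_com
```

Unlike Rubin's large-sample value, the adjusted degrees of freedom can never exceed the complete-data degrees of freedom, which is what makes the adjustment suitable for small samples.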
Standard Wald-tests can produce invalid significance levels if they do not correctly account for the uncertainty introduced by the imputation process and the specific distribution of the test statistics across multiple data sets.
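One established moment-based repair, presumably the approach underlying Section 3.1, is the pooled Wald statistic of Li, Raghunathan and Rubin (1991), which inflates the within-imputation covariance by the average relative increase in variance due to nonresponse. A sketch with invented numbers for a two-dimensional parameter tested against zero:

```python
import numpy as np

# Hypothetical combined quantities for a k-dimensional parameter.
m, k = 5, 2
Q_bar = np.array([0.8, -0.5])                     # combined estimate
W_bar = np.array([[0.10, 0.02], [0.02, 0.08]])    # avg within-imp. covariance
B = np.array([[0.03, 0.00], [0.00, 0.02]])        # between-imp. covariance

# Average relative increase in variance due to nonresponse:
r1 = (1 + 1/m) * np.trace(B @ np.linalg.inv(W_bar)) / k

# Moment-based Wald statistic (Li, Raghunathan and Rubin, 1991),
# referred to an F distribution with k numerator degrees of freedom:
D1 = float(Q_bar @ np.linalg.inv(W_bar) @ Q_bar) / (k * (1 + r1))
```

Dividing by (1 + r1) is what accounts for the imputation uncertainty; a complete-data Wald test that ignores B would overstate significance whenever the missing information is non-negligible.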
Published by GRIN Verlag, which has specialized in academic eBooks and books since 1998.

