: Jean-Paul Fox
: Bayesian Item Response Modeling: Theory and Applications
: Springer-Verlag
: 9781441907424
: 1
: CHF 132.60
: Methods of Empirical and Qualitative Social Research
: English
: 313
: Watermark/DRM
: PC/MAC/eReader/Tablet
: PDF
The modeling of item response data is governed by item response theory, also referred to as modern test theory. The field of inquiry of item response theory has become very large and shows the enormous progress that has been made. The mainstream literature is focused on frequentist statistical methods for estimating model parameters and evaluating model fit. However, the Bayesian methodology has shown great potential, particularly for making further improvements in the statistical modeling process. The Bayesian approach has two important features that make it attractive for modeling item response data. First, it enables the possibility of incorporating nondata information beyond the observed responses into the analysis. The Bayesian methodology is also very clear about how additional information can be used. Second, the Bayesian approach comes with powerful simulation-based estimation methods. These methods make it possible to handle all kinds of priors and data-generating models. One of my motives for writing this book is to give an introduction to the Bayesian methodology for modeling and analyzing item response data. A Bayesian counterpart is presented to the many popular item response theory books (e.g., Baker and Kim, 2004; De Boeck and Wilson, 2004; Hambleton and Swaminathan, 1985; van der Linden and Hambleton, 1997) that are mainly or completely focused on frequentist methods. The usefulness of the Bayesian methodology is illustrated by discussing and applying a range of Bayesian item response models.

Jean-Paul Fox is Associate Professor of Measurement and Data Analysis, University of Twente, The Netherlands. His main research activities are in several areas of Bayesian response modeling. Dr. Fox has published numerous articles in the areas of Bayesian item response analysis, statistical methods for analyzing multivariate categorical response data, and nonlinear mixed effects models.
"8 Response Time Item Response Models (p. 227-228)

Response times and responses can be collected via computer adaptive testing or computer-assisted questioning. Inferences about test takers and test items can therefore be based on the response time and response accuracy information. Response times and responses are used to measure a respondent's speed of working and ability using a multivariate hierarchical item response model. A multivariate multilevel structural population model is defined for the person parameters to explain individual and group differences given background information. An application is presented that illustrates novel features of the model.

8.1 Mixed Multivariate Response Data

Nowadays, response times (RTs) are easily collected via computer adaptive testing or computer-assisted questioning. The RTs can be a valuable source of information on test takers and test items. The RT information can help to improve routine operations in testing such as item calibration, test design, detection of cheating, and adaptive item selection. The collection of multiple item responses and RTs leads to a set of mixed multivariate response data since the individual item responses are often observed on an ordinal scale, whereas the RTs are observed on a continuous scale.

The observed responses are imperfect indicators of a respondent's ability. When measuring a construct such as ability, attention is focused on the accuracy of the test results. The observed RTs are indicators of a respondent's speed of working, and speed is considered to be a different construct. As a result, mixed responses are used to measure the two constructs ability and speed. Although response speed and response accuracy measure different constructs (Schnipke and Scrams, 2002, and references therein), the reaction-time research in psychology indicates that there is a relationship between response speed and response accuracy (Luce, 1986).

This relationship is often characterized as a speed-accuracy trade-off. A person can decide to work faster, but this will lead to a lower accuracy. The trade-off is considered to be a within-person relationship: a respondent controls the speed of working and accepts the related level of accuracy. It will be assumed that each respondent chooses a fixed level of speed, which is related to a fixed accuracy. A hierarchical measurement model was proposed by van der Linden (2007) to model RTs and dichotomous responses simultaneously that accounts for different levels of dependency.

The different stages of the model capture the dependency structure of observations nested within persons at the observational level and the relationship between speed and ability at the individual level. Klein Entink, Fox and van der Linden (2009a), and Fox, Klein Entink and van der Linden (2007) extended the model for measuring accuracy and speed (1) to allow time-discriminating items, (2) to handle individual and/or group characteristics, and (3) to handle the nesting of individuals in groups.

This extension has a multivariate multilevel structural population model for the ability and the speed parameters that can be considered a multivariate extension of the structural part of the MLIRT model of Chapter 6. In this chapter, the complete modeling framework will be discussed, and an extension is made to handle polytomous response data."
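
The two-level structure described in the excerpt can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the author's implementation: it assumes a normal-ogive model for dichotomous responses, a lognormal model for RTs with a time-discrimination parameter, and a standard bivariate normal distribution for ability and speed. All names (`simulate`, `ITEMS`, the keys `a`, `b`, `phi`, `lam`, `sigma`, and the correlation `rho`) and all parameter values are hypothetical and chosen for illustration.

```python
import math
import random


def normal_cdf(x):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def simulate_person(rng, rho):
    """Draw (ability, speed) from a standard bivariate normal with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    ability = z1
    speed = rho * z1 + math.sqrt(1.0 - rho ** 2) * z2
    return ability, speed


def simulate_item_data(rng, ability, speed, item):
    """Simulate one dichotomous response and one response time for an item."""
    # Assumed response model (normal ogive): P(correct) = Phi(a * ability - b)
    p_correct = normal_cdf(item["a"] * ability - item["b"])
    response = 1 if rng.random() < p_correct else 0
    # Assumed RT model (lognormal): log T = lam - phi * speed + noise,
    # where phi plays the role of a time-discrimination parameter.
    log_time = item["lam"] - item["phi"] * speed + rng.gauss(0.0, item["sigma"])
    return response, math.exp(log_time)


def simulate(n_persons, items, rho, seed=1):
    """Simulate mixed (response, RT) data for n_persons on the given items."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_persons):
        # Level 2: each person has a fixed ability and a fixed speed.
        ability, speed = simulate_person(rng, rho)
        # Level 1: observations nested within the person.
        row = [simulate_item_data(rng, ability, speed, item) for item in items]
        data.append({"ability": ability, "speed": speed, "row": row})
    return data


# Hypothetical item parameters, for illustration only.
ITEMS = [
    {"a": 1.0, "b": -0.5, "phi": 1.0, "lam": 4.0, "sigma": 0.3},
    {"a": 1.2, "b": 0.0, "phi": 0.8, "lam": 4.2, "sigma": 0.4},
    {"a": 0.8, "b": 0.5, "phi": 1.1, "lam": 4.5, "sigma": 0.3},
]

if __name__ == "__main__":
    for person in simulate(3, ITEMS, rho=0.5):
        print(round(person["ability"], 2), round(person["speed"], 2), person["row"])
```

In this sketch each person keeps one speed value across all items, mirroring the assumption of a fixed level of speed per respondent, while `rho` carries the relationship between speed and ability at the individual level.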
Preface
Contents
1 Introduction to Bayesian Response Modeling
1.1 Introduction
1.1.1 Item Response Data Structures
Hierarchically Structured Data
1.1.2 Latent Variables
1.2 Traditional Item Response Models
1.2.1 Binary Item Response Models
The Rasch Model
Two-Parameter Model
Three-Parameter Model
1.2.2 Polytomous Item Response Models
1.2.3 Multidimensional Item Response Models
1.3 The Bayesian Approach
1.3.1 Bayes' Theorem
Constructing the Posterior
Updating the Posterior
1.3.2 Posterior Inference
The Role of Prior Information
1.4 A Motivating Example Using WinBUGS
1.4.1 Modeling Examinees' Test Results
WinBUGS
1.5 Computation and Software
Computer Code Developed for This Book
1.6 Exercises
2 Bayesian Hierarchical Response Modeling
2.1 Pooling Strength
2.2 From Beliefs to Prior Distributions
A Hierarchical Prior for Item Parameters
A Hierarchical Prior for Person Parameters
2.2.1 Improper Priors
2.2.2 A Hierarchical Bayes Response Model
Posterior Computation
2.3 Further Reading
2.4 Exercises
3 Basic Elements of Bayesian Statistics
3.1 Bayesian Computational Methods
3.1.1 Markov Chain Monte Carlo Methods
Gibbs Sampling
Metropolis-Hastings
Issues in MCMC
Single Chain Analysis
Multiple Chain Analysis
3.2 Bayesian Hypothesis Testing
3.2.1 Computing the Bayes Factor
Importance Sampling
Using Identities and MCMC Output
Bayes Factor for Item Response Models
3.2.2 HPD Region Testing
3.2.3 Bayesian Model Choice
3.3 Discussion and Further Reading
3.4 Exercises
4 Estimation of Bayesian Item Response Models
4.1 Marginal Estimation and Integrals
4.2 MCMC Estimation
4.3 Exploiting Data Augmentation Techniques
4.3.1 Latent Variables and Latent Responses
4.3.2 Binary Data Augmentation
4.3.3 TIMSS 2007: Dutch Sixth-Graders' Math Achievement
4.3.4 Ordinal Data Augmentation
4.4 Identification of Item Response Models
4.4.1 Data Augmentation and Identifying Assumptions
4.4.2 Rescaling and Priors with Identifying Restrictions
4.5 Performance MCMC Schemes
4.5.1 Item Parameter Recovery
4.5.2 Hierarchical Priors and Shrinkage
4.6 European Social Survey: Measuring Political Interest
4.7 Discussion and Further Reading
4.8 Exercises
5 Assessment of Bayesian Item Response Models
5.1 Bayesian Model Investigation
5.2 Bayesian Residual Analysis
5.2.1 Bayesian Latent Residuals
5.2.2 Computation of Bayesian Latent Residuals
5.2.3 Detection of Outliers
5.2.4 Residual Analysis: Dutch Primary School Mathematics Test
5.3 HPD Region Testing and Bayesian Residuals
5.3.1 Measuring Alcohol Dependence: Graded Response Analysis
Item and Person Fit
Detecting Discriminating Items
5.4 Predictive Assessment
5.4.1 Prior Predictive Assessment
5.4.2 Posterior Predictive Assessment
Overview of Posterior Predictive Model Checks
5.5 Illustrations of Predictive Assessment
5.5.1 The Observed Score Distribution
5.5.2 Detecting Testle