Acta Psychologica Sinica (心理学报), 2012
Dynamic and Comprehensive Item Selection Strategies for Computerized Adaptive Testing Based on Graded Response Model
Abstract:
Item selection strategy (ISS) is a core component of computerized adaptive testing (CAT). Polytomous items provide more information about an examinee than dichotomous items, and using polytomously scored items in tests is an active research direction in CAT. The most widely used ISS is the maximum Fisher information (MFI) criterion, which raises concerns about the cost-efficiency of item pool utilization and poses security risks for CAT programs. Chang & Ying (1999) and Chang, Qian, & Ying (2001) proposed two alternative item selection procedures based on dichotomous models, the a-stratified method (a-STR) and the a-stratified with b blocking method (b-STR), to remedy the item overexposure and underexposure produced by MFI. However, a-STR and b-STR are static, because the items are stratified once, at the beginning of the test, according to the information available at that point. Based on the graded response model (GRM), a technique that reduces the dimensionality of the difficulty (or step) parameters has recently been used to construct several ISSs. The limitation of this dimension reduction technique is that it loses a great deal of information. Thus, to improve on MFI, two new item selection methods based on GRM are proposed: (1) a modified dimension reduction technique for the difficulty (or step) parameters that integrates interval estimation; (2) dynamic a-STR and dynamic b-STR methods that are applied during the testing process.
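As a point of reference for the MFI baseline discussed above, the following is a minimal sketch of item information under a logistic GRM and of greedy MFI selection. It is not the authors' implementation: the function names, the representation of an item as a tuple `(a, [b_1, ..., b_m])`, and the use of the logistic form without a scaling constant are assumptions made for illustration.

```python
import math

def grm_boundaries(theta, a, bs):
    """Cumulative probabilities P*_0..P*_{m+1} of scoring at or above each step.

    Assumes the logistic GRM with no scaling constant (an assumption of
    this sketch); bs are the ordered step (difficulty) parameters.
    """
    inner = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in bs]
    return [1.0] + inner + [0.0]

def grm_information(theta, a, bs):
    """Fisher information of one graded item at ability theta.

    Uses I(theta) = sum_k (dP_k/dtheta)^2 / P_k, where
    P_k = P*_k - P*_{k+1} is the probability of score category k.
    """
    p_star = grm_boundaries(theta, a, bs)
    info = 0.0
    for k in range(len(p_star) - 1):
        p_k = p_star[k] - p_star[k + 1]
        w_k = p_star[k] * (1.0 - p_star[k])
        w_next = p_star[k + 1] * (1.0 - p_star[k + 1])
        dp_k = a * (w_k - w_next)  # derivative of P_k with respect to theta
        if p_k > 1e-12:            # skip categories with negligible probability
            info += dp_k ** 2 / p_k
    return info

def select_mfi(theta_hat, pool, administered):
    """Index of the unused item with maximum information at theta_hat."""
    candidates = [i for i in range(len(pool)) if i not in administered]
    return max(candidates, key=lambda i: grm_information(theta_hat, *pool[i]))

# Hypothetical three-item pool: (discrimination a, step parameters b_1..b_m).
pool = [(0.8, [-1.0, 0.0, 1.0]), (2.0, [-0.5, 0.5]), (1.5, [2.0, 3.0])]
first = select_mfi(0.0, pool, set())
second = select_mfi(0.0, pool, {first})
```

Because the greedy rule always favors highly discriminating items located near the current ability estimate, such items tend to be selected repeatedly across examinees, which is the exposure problem that stratified methods aim to address.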
On the one hand, these new ISSs avoid the limitations of MFI while retaining the advantages of the Fisher information function (FIF): FIF condenses all item parameters and the ability parameter, so it is by nature a comprehensive summary of all parameters. On the other hand, the new ISSs exploit the property that the FIF represents the inverse of the variance of the ability estimate. Let ε be the square root of the reciprocal of the Fisher information, and let d be the absolute deviation between the estimated ability and a function of the parameters of an item that may be chosen; these quantities can change during the course of the CAT. The inequality d < ε has the form of an interval estimate, and its effect can be pictured as a more flexible shadow item pool. A simulation study based on GRM was conducted. Four item pools with different structures were simulated, and 1000 examinees were generated, with abilities drawn at random from the standard normal distribution N(0, 1). Each pool consisted of 1000 polytomous items, and the maximum score of each item was randomly selected from the set {3, 4, 5, 6}. The prior distribution of ability was assumed to be standard normal, and the Bayesian expected a posteriori (EAP) method was used to estimate the ability parameter. The CAT stopped when the accumulated information reached the predetermined value M (M = 16) or when the test reached the pre-assigned length of 30 items. The results of the simulation study show that the new item selection methods required sho