Score Comparability

Score Comparability and Differential Item Functioning
Under the Common Core State Standards (CCSS), tests developed by each consortium are based on the same common core standards; however, states within one consortium may adopt different curriculum and instruction and student populations from different states could be very diverse. As acknowledged by PARCC, test score comparability across states is an important issue to be addressed. In this part, we will discuss briefly methods for detecting DIF items across multiple groups as well as multiple-group IRT models for dealing with DIF.
Software Packages For Multiple Group IRT Analysis and Accuracy of Parameter Estimates
In this report, we compare several IRT software packages for multiple group analysis including BILOG-MG, MULTILOG, IRTPRO, flexMIRT, Mplus, BMIRT, and FLIRT (R package). Given that different software programs employ different defaults and/or options for model identification and commonality, this report provides information on these two issues (model identification and commonality). This review focuses on use of software programs for multiple group IRT analysis in the context where the same test form is administered to different groups.
Context Effect on Item Parameter Invariance
Context effects occur when item parameters are influenced by item location, order effects, or the characteristics of other items in a test. Though a large amount of research on context effect showed changes in item positions can have great impact on both item parameter estimates and the subsequent equating results, inconsistent findings showed context effects did not always significantly affect item difficulty or item discrimination. Based on a thorough literature review, this project summarized the research findings on item parameter invariance as well as equating under the influence of context effects. Recommendations from literature on test construction or development were also provided.
Student Characteristics and CBT Performance: An Overview of the Literature
One big change in the field of education and assessment under the influence of modern technology is the transition from paper-based to computer-based assessment. Computer-based testing (CBT) is gaining popularity over the traditional paper-and-pencil test (PPT) due to many advantages that computer-based assessment provides. Meanwhile, more and more educators and researchers have shown interest in investigating the factors that influence students’ CBT performance. That is, for whom is CBT best suited? Or, what student characteristics are important in effective use of CBTs? The objective of this project was to examine the relationship between student characteristics and CBT, compared with PPT. In the literature, factors related to student characteristics, such as student demographic attributes, learning style, computer familiarity and test anxiety, were found to have somewhat different relations with CBT performance compared with PPT.
Student Characteristics and CBT Performance / Annotation-Abstract and Key Points/ Reference for the Literature Review

Maryland Assessment Research Center (MARC)

Score Comparability