Next Generation Adaptive Testing: From Algorithms to Empathy

October 31, 2025



The start of fall, particularly September and October, is typically filled with conferences around the world, especially across Eurasia and North America, where I had the opportunity to speak and engage with experts in the field of assessment. At conferences in Zambia, Uzbekistan, Mexico, and Kazakhstan, I reached a clear conclusion: computer adaptive testing is increasingly addressing the challenges of personalization and the expanding role of technology, bringing us closer to the reality of personalized, agentic learning and assessment.

As assessment organizations advance in their digital transformation, many have already embraced adaptive testing to improve efficiency, fairness, and measurement precision. Yet, the landscape is shifting again. What was once a mechanism for tailoring item difficulty to a learner’s ability is now becoming an intelligent system capable of understanding how a student thinks, not only how well they perform (Minn et al., 2022). The classical CAT (Computerized Adaptive Testing) approach, where each question is selected based on previous responses, has set the foundation for this transformation. However, recent developments in AI, process data analytics, and cognitive modeling are reshaping the adaptive testing paradigm (Yu et al., 2025). In addition to measuring proficiency, the next generation of adaptive testing systems dynamically interprets response behavior, predicts performance patterns, and adapts to multiple cognitive dimensions in real time (Luo et al., 2022).

This article explores how adaptive testing has evolved from a one-dimensional, efficiency-driven model into an intelligent, evidence-based approach that supports both precision and personalization. It builds on the discussion of test delivery models in the article Test Delivery Models: Choosing or Balancing Priorities?, published earlier this year (Huseyn, 2025), and focuses on the cutting edge of adaptivity, where psychometrics meets AI.

From Classical CAT to Smart Adaptive Systems

The original CAT model, grounded in IRT (Item Response Theory), revolutionized how assessments could estimate student ability by selecting items that best matched the examinee’s current estimated level of proficiency. Such a CAT model reduced test length while maintaining measurement accuracy, a milestone in the evolution of large-scale assessments. However, its primary limitation lies in its single-dimensional focus, as the classical CAT model assumes that a learner’s performance can be represented by one latent ability parameter.
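
To make the mechanics concrete, the short Python sketch below simulates a classical CAT loop under a two-parameter logistic (2PL) model: each step administers the remaining item with the greatest Fisher information at the current ability estimate and then re-estimates ability from the accumulated responses. The item bank, the simulated test-taker, and the simple grid-based estimator are illustrative assumptions, not an operational implementation.

```python
import math
import random

def prob_correct(theta, a, b):
    """2PL IRT: probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information contributed by a 2PL item at ability theta."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def estimate_theta(responses):
    """Grid-based maximum-likelihood estimate of theta from (a, b, score) triples."""
    grid = [g / 10.0 for g in range(-40, 41)]  # -4.0 to 4.0 in steps of 0.1
    def loglik(theta):
        return sum(math.log(prob_correct(theta, a, b) if score == 1
                            else 1.0 - prob_correct(theta, a, b))
                   for a, b, score in responses)
    return max(grid, key=loglik)

def run_cat(item_bank, answer_item, test_length=20):
    """Classical CAT loop: pick the most informative item, score it, re-estimate theta."""
    theta, responses, available = 0.0, [], list(item_bank)
    for _ in range(test_length):
        item = max(available, key=lambda it: item_information(theta, it["a"], it["b"]))
        available.remove(item)
        score = answer_item(item)                 # 1 = correct, 0 = incorrect
        responses.append((item["a"], item["b"], score))
        theta = estimate_theta(responses)         # updated ability estimate
    return theta

# Synthetic demo: a random item bank and a simulated test-taker with true ability 1.2.
random.seed(7)
bank = [{"a": random.uniform(0.8, 2.0), "b": random.uniform(-3, 3)} for _ in range(200)]
simulee = lambda item: 1 if random.random() < prob_correct(1.2, item["a"], item["b"]) else 0
print(round(run_cat(bank, simulee), 2))           # should land near 1.2
```

An operational engine would add exposure control, content balancing, and a stopping rule based on the standard error of the ability estimate.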

To address this limitation, the field moved toward Multistage Computerized Adaptive Testing (MST). Unlike classical CAT, which selects items one at a time, the MST model assembles test modules (or “stages”) that adapt at broader intervals (Yan, von Davier, & Lewis, 2022). This method improves operational control and psychometric stability while allowing partial adaptivity within a structured framework, a model now widely used by assessment organizations. While the MST model offers structure and operational stability to adaptive testing practices, its adaptivity remains largely rule-based, driven by pre-defined pathways and psychometric parameters. The next step goes further: in smart adaptive systems, every click, pause, and navigation choice contributes to a richer picture of how students engage with content.
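
The rule-based nature of this adaptivity is easy to see in code. The sketch below shows a minimal 1-3 multistage routing rule, where the module names and number-correct cut scores are purely illustrative assumptions.

```python
# Illustrative 1-3 multistage design: one routing module feeding three second-stage
# modules. Module names and number-correct cut scores are assumptions for the sketch.
ROUTING_CUTS = [3, 6]            # thresholds on an 8-item routing module

STAGE_TWO = {
    "low":  "stage2_easy",       # items targeted below the lower cut
    "mid":  "stage2_medium",     # items around the middle of the scale
    "high": "stage2_hard",       # items above the upper cut
}

def route(number_correct: int) -> str:
    """Rule-based MST routing: map a routing-module score to a second-stage module."""
    if number_correct < ROUTING_CUTS[0]:
        band = "low"
    elif number_correct < ROUTING_CUTS[1]:
        band = "mid"
    else:
        band = "high"
    return STAGE_TWO[band]

print(route(2), route(5), route(8))  # stage2_easy stage2_medium stage2_hard
```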

To gain deeper insights into learner behavior, the next logical evolution is Intelligent Adaptive Testing (IAT): systems that retain MST’s structured design but augment it with artificial intelligence, process data, and cognitive modeling (Yu et al., 2025). With this modernized framework, assessment organizations can take advantage of machine learning, reinforcement learning, and Bayesian networks to predict student responses, detect disengagement, and adapt dynamically across multiple abilities or constructs (Minn et al., 2022; İnce, 2025). For example, an IAT system could detect hesitation before a response and adjust subsequent items not just for difficulty but for cognitive process: whether the student is reasoning, guessing, or applying learned strategies. In doing so, adaptivity becomes a tool not only for precision but also for understanding and supporting learning. By incorporating Multidimensional IRT (MIRT), IAT systems can now map performance across several cognitive domains rather than along a single latent trait, effectively merging the psychometric rigor of MST with the real-time intelligence of AI-driven systems.
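
As a small illustration of what process-aware adaptivity can look like, the sketch below flags likely rapid-guessing responses by comparing each response time with the test-taker's own median time; the threshold ratio is an assumed value chosen for the example, not a validated cut-off.

```python
from statistics import median

def flag_rapid_guessing(response_times, ratio=0.2):
    """Flag responses whose time is far below the test-taker's own median time.
    The 0.2 ratio is an illustrative assumption, not a validated cut-off."""
    baseline = median(response_times)
    return [t < ratio * baseline for t in response_times]

times = [42.0, 38.5, 5.1, 40.2, 4.8, 36.9]   # seconds per item (synthetic)
flags = flag_rapid_guessing(times)
if any(flags):
    # An IAT engine might down-weight these responses when estimating ability,
    # switch to a lower-stakes item, or surface motivational feedback.
    print("Possible disengagement on items:", [i for i, f in enumerate(flags) if f])
```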

Advanced Adaptive Testing Models in Practice

Having reviewed the systems, we now turn to the models. A new generation of adaptive models is designed to improve psychometric precision and operational flexibility by integrating innovations in artificial intelligence, item banking, and multidimensional psychometrics, delivering assessments that are both more accurate and more reflective of real learning behaviors (Gibbons et al., 2024; Halkiopoulos & Gkintoni, 2024). Recent research and pilot initiatives highlight three major directions shaping this evolution: hybrid adaptive architectures, dynamically assembled multistage models, and AI-assisted adaptivity.

On-the-Fly Assembled Multistage Adaptive Testing (OF-MSAT). A promising model, On-the-Fly Assembled Multistage Adaptive Testing (OF-MSAT) (Zheng & Chang, 2015), combines the strengths of Linear-On-The-Fly Testing (LOFT) and Multistage Adaptive Testing (MST). In this model, test modules are dynamically assembled during delivery, integrating Multidimensional Item Response Theory (MIRT) to evaluate several cognitive traits simultaneously. This hybrid approach enables assessment organizations to maintain tight control over content coverage while allowing real-time adaptivity. OF-MSAT also supports test security by ensuring that each student receives a unique but psychometrically equivalent pathway through the test, reducing the risk of item exposure.
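
A simplified picture of on-the-fly assembly, under a compensatory multidimensional 2PL model, is sketched below: a module is built at delivery time from the items that are most informative at the current multidimensional ability estimate while respecting per-domain content quotas. The scalar information summary, item parameters, and quotas are assumptions made for brevity and do not reproduce the published OF-MSAT procedure.

```python
import math
from collections import Counter

def p_correct(theta, a, d):
    """Compensatory multidimensional 2PL: theta and a are same-length vectors."""
    return 1.0 / (1.0 + math.exp(-(sum(t * w for t, w in zip(theta, a)) + d)))

def info_scalar(theta, item):
    """Scalar summary of an item's multidimensional information (a simplification)."""
    p = p_correct(theta, item["a"], item["d"])
    return p * (1.0 - p) * sum(w * w for w in item["a"])

def assemble_module(bank, theta, size, quota):
    """Build a module on the fly: the most informative items at the current
    multidimensional ability estimate, honoring per-domain content quotas."""
    chosen, counts = [], Counter()
    for item in sorted(bank, key=lambda it: info_scalar(theta, it), reverse=True):
        if counts[item["domain"]] < quota.get(item["domain"], size):
            chosen.append(item)
            counts[item["domain"]] += 1
        if len(chosen) == size:
            break
    return chosen

# Synthetic two-dimensional item bank tagged by content domain.
bank = [{"id": i,
         "a": [0.5 + 0.1 * (i % 5), 1.0 - 0.1 * (i % 3)],
         "d": (i % 7) - 3,
         "domain": "algebra" if i % 2 else "geometry"} for i in range(40)]
module = assemble_module(bank, theta=[0.4, -0.2], size=5, quota={"algebra": 3, "geometry": 2})
print([it["id"] for it in module])
```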

Hybrid Adaptive Systems. Some organizations are adopting hybrid models that blend LOFT, CAT, and MST features into a single framework. For instance, an assessment might begin with a short routing testlet (MST stage) and then transition to fully adaptive item-level selection (CAT), allowing policymakers to balance efficiency, psychometric requirements, and content validity (Yan et al., 2022). Furthermore, in standardized assessments, adaptive routing is often combined with fixed anchor items to support cross-year comparability within this hybrid approach.
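
The sequencing logic of such a hybrid design can be sketched in a few lines: a fixed routing testlet seeds a provisional ability estimate, an item-level adaptive block refines it, and a fixed anchor block closes the test for linking. All probabilities, cut scores, and the crude ability update in the sketch are synthetic assumptions.

```python
import random
random.seed(3)

def answer(p_success):
    """Simulate a dichotomous item response with the given success probability."""
    return 1 if random.random() < p_success else 0

def hybrid_delivery():
    """Sketch of the sequencing only: routing testlet -> adaptive block -> anchors."""
    routing_score = sum(answer(0.6) for _ in range(4))              # fixed MST-style routing testlet
    theta = {0: -1.5, 1: -0.5, 2: 0.0, 3: 0.5, 4: 1.5}[routing_score]
    for _ in range(15):                                             # CAT-style item-level block
        # Under maximum-information selection, a well-matched item succeeds roughly
        # half the time; the +/- 0.2 step is a crude stand-in for re-estimating ability.
        theta += 0.2 if answer(0.5) else -0.2
    anchors = [answer(0.6) for _ in range(5)]                       # fixed anchor items for linking
    return theta, anchors

print(hybrid_delivery())
```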

AI-Assisted Adaptivity. Artificial intelligence now amplifies the power of adaptive systems by enabling predictive and process-aware adaptivity. Machine learning models trained on response patterns, timing, and interaction data can anticipate a learner’s next likely step, detect disengagement, and adjust test flow accordingly (Minn et al., 2022). Reinforcement learning algorithms continuously refine item selection based on live data from previous test administrations, producing adaptive systems that “learn” alongside their users (Yu et al., 2025). Such AI-driven adaptivity is increasingly integrated into platforms with cognitive diagnostic assessment modules, where adaptive pathways are informed by process data such as clicks, revisits, and problem-solving sequences that reflect real cognitive engagement.
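
To convey the flavor of this kind of refinement, the sketch below uses a simple epsilon-greedy bandit that gradually prefers items yielding higher reward across simulated administrations. The reward signal and item values are synthetic assumptions; operational systems would work with far richer state, psychometric constraints, and safeguards.

```python
import random
random.seed(11)

class BanditSelector:
    """Epsilon-greedy selection whose item 'values' are refined from live data."""
    def __init__(self, item_ids, epsilon=0.1):
        self.epsilon = epsilon
        self.values = {i: 0.0 for i in item_ids}   # running mean reward per item
        self.counts = {i: 0 for i in item_ids}

    def pick(self):
        if random.random() < self.epsilon:                   # explore occasionally
            return random.choice(list(self.values))
        return max(self.values, key=self.values.get)         # otherwise exploit

    def update(self, item_id, reward):
        self.counts[item_id] += 1
        self.values[item_id] += (reward - self.values[item_id]) / self.counts[item_id]

selector = BanditSelector(item_ids=range(20))
for _ in range(500):                                # simulated past administrations
    item = selector.pick()
    reward = random.gauss(0.5 + 0.01 * item, 0.1)   # synthetic reward signal
    selector.update(item, reward)
print(selector.pick())                              # tends to prefer higher-reward items
```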

Together, these innovations mark a turning point in adaptive testing, from improving psychometric efficiency to deepening understanding of learning, setting the stage for the next section’s exploration of their cognitive and pedagogical implications for teaching and curriculum design.

Cognitive and Pedagogical Gains

Advanced adaptive testing models also bring a fundamental shift in how we interpret performance and design learning pathways, thanks to their ability to capture cognition in action: how students reason, make choices, and persist through challenges. By linking psychometrics with learning analytics, these systems transform test data into diagnostic evidence and formative feedback (Gibbons et al., 2024), as explained in more detail below.

From Measurement to Understanding. Traditional testing measures outcomes; adaptive systems measure processes of thinking. Through multidimensional modeling (such as MIRT and cognitive diagnostic modeling), assessment organizations can distinguish between conceptual understanding, procedural fluency, and strategic reasoning (Luo et al., 2022), three pillars often merged in conventional scoring. For instance, adaptive algorithms can identify whether a student’s repeated attempts stem from conceptual confusion or from testing hypotheses, providing educators with insights into why mistakes occur, not only where they occur. This modernized approach means that assessment reports can move from generic score summaries to learner profiles that map competencies across sub-skills or cognitive traits and inform individualized instruction, targeted remediation, and curriculum design at scale.
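
In code, the shift from a single score to a learner profile is largely a change in how evidence is aggregated. The sketch below groups item results by sub-skill tag and flags dimensions that may warrant targeted remediation; the sub-skill labels, data, and the 0.5 threshold are illustrative assumptions.

```python
from collections import defaultdict

# Item results tagged by sub-skill; the labels and data are synthetic.
results = [
    {"skill": "conceptual understanding", "correct": 1},
    {"skill": "conceptual understanding", "correct": 0},
    {"skill": "procedural fluency",       "correct": 1},
    {"skill": "procedural fluency",       "correct": 1},
    {"skill": "strategic reasoning",      "correct": 0},
    {"skill": "strategic reasoning",      "correct": 0},
]

totals = defaultdict(lambda: [0, 0])                 # skill -> [correct, attempted]
for r in results:
    totals[r["skill"]][0] += r["correct"]
    totals[r["skill"]][1] += 1

profile = {skill: correct / attempted for skill, (correct, attempted) in totals.items()}
needs_support = [skill for skill, rate in profile.items() if rate < 0.5]   # illustrative threshold
print(profile)
print("Targeted remediation:", needs_support)
```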

Process Data as a Learning Lens. Unlike traditional adaptive testing, which adjusts item difficulty based only on correct or incorrect answers, intelligent adaptive systems use process data, such as keystrokes, response times, and navigation patterns, to understand how students think and interact with tasks. When integrated into adaptive algorithms, this data helps detect disengagement, cognitive overload, or guessing behavior (Halkiopoulos & Gkintoni, 2024). For example, an intelligent system might reduce item complexity if it detects fatigue or introduce motivational feedback when it senses rapid, inconsistent responses. These design shifts transform assessment from a performance-monitoring tool into a learning-support system. Process-driven adaptivity also improves fairness and inclusivity by recognizing diverse problem-solving behaviors as valuable evidence rather than penalizing them, a key step toward culturally responsive assessment design, as explored in our article titled Do Culturally Responsive Assessments Matter? (Huseyn, 2025).
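
A minimal rule-based sketch of this kind of process-driven adjustment is shown below; the event fields, thresholds, and interventions are assumptions chosen for illustration rather than a validated policy.

```python
def adapt_from_process(events):
    """events: per-item dicts with response_time (seconds), revisits, answer_changes.
    Thresholds and interventions are illustrative assumptions, not a validated policy."""
    recent = events[-3:]                                    # look only at the last few items
    mean_time = sum(e["response_time"] for e in recent) / len(recent)
    rapid = all(e["response_time"] < 5 for e in recent)     # very fast responses: likely disengaged
    churning = any(e["answer_changes"] >= 4 for e in recent)
    if rapid:
        return "show_motivational_feedback"
    if mean_time > 120 or churning:
        return "reduce_item_complexity"                     # possible fatigue or cognitive overload
    return "continue"

log = [
    {"response_time": 95,  "revisits": 1, "answer_changes": 1},
    {"response_time": 140, "revisits": 2, "answer_changes": 5},
    {"response_time": 150, "revisits": 3, "answer_changes": 2},
]
print(adapt_from_process(log))   # -> reduce_item_complexity
```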

Challenges and Ethical Considerations

As adaptive testing systems gain autonomy, they introduce technical and ethical challenges that demand stronger governance and infrastructure to support fairness, transparency, and trust. These operational and conceptual challenges primarily emerge across three interconnected areas: algorithmic bias and equity, transparency and explainability, and data privacy and human oversight, each shaping how adaptive testing systems can remain both innovative and ethical.

Algorithmic Bias and Equity. The first concern is algorithmic bias, as adaptive algorithms trained on unbalanced or unrepresentative data can replicate and amplify existing inequities. For example, if item calibration data reflects patterns from a narrow demographic or linguistic group, the adaptive model might overestimate ability for some test-takers while underestimating it for others, meaning small systematic biases can compound across testing pathways. Ensuring fairness therefore requires continuous monitoring of item exposure rates, test difficulty parameters, and response-time distributions across demographic groups. To intervene, assessment organizations can mitigate potential risks through bias audits, differential item functioning (DIF) analyses on adaptive item pools, and transparent communication about how adaptive decisions are made (Yan et al., 2022). Modernized assessment organizations can even run bias dashboards for real-time monitoring of adaptive performance across subpopulations.
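
As a concrete starting point, the sketch below screens synthetic response data for large group differences in proportion correct per item. A flagged gap is only a prompt for proper DIF analysis (for example, Mantel-Haenszel), not evidence of bias on its own.

```python
from collections import defaultdict

# Synthetic response records: item, demographic group, and scored response.
records = [
    {"item": "Q1", "group": "A", "correct": 1}, {"item": "Q1", "group": "B", "correct": 0},
    {"item": "Q1", "group": "A", "correct": 1}, {"item": "Q1", "group": "B", "correct": 0},
    {"item": "Q2", "group": "A", "correct": 1}, {"item": "Q2", "group": "B", "correct": 1},
    {"item": "Q2", "group": "A", "correct": 0}, {"item": "Q2", "group": "B", "correct": 1},
]

stats = defaultdict(lambda: defaultdict(lambda: [0, 0]))   # item -> group -> [correct, seen]
for r in records:
    stats[r["item"]][r["group"]][0] += r["correct"]
    stats[r["item"]][r["group"]][1] += 1

for item, groups in stats.items():
    rates = {g: c / n for g, (c, n) in groups.items()}
    gap = max(rates.values()) - min(rates.values())
    if gap > 0.3:                                          # illustrative screening threshold
        print(f"{item}: flag for DIF review (group p-value gap = {gap:.2f})", rates)
```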

Transparency and Explainability. The second challenge is transparency. In classical test delivery models, item selection and scoring logic are visible and easily documented. In contrast, modern adaptive systems, especially those driven by AI, operate as “black boxes,” producing outputs that even psychometricians may struggle to interpret fully. This lack of explainability can weaken public trust, particularly when assessment results influence high-stakes decisions such as university admission or teacher selection. Therefore, transparency mechanisms, such as model cards, algorithmic documentation, and open rubrics for AI scoring, become essential to sustaining credibility (Yu et al., 2025). Ultimately, adaptive testing should not obscure the human reasoning behind item design and score interpretation; AI is most effective when it remains auditable and guided by human intent, allowing interpretability to transform it from a decision-maker into a decision-support tool.
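
One lightweight way to operationalize this is a machine-readable model card for the adaptive engine, as sketched below. The fields and values are illustrative assumptions about what such documentation could record, not an established schema.

```python
# Fields and values are illustrative assumptions about what such documentation
# could record; this is not an established or required schema.
model_card = {
    "model": "adaptive-item-selector",
    "version": "2025.10",
    "intended_use": "low-stakes formative mathematics assessment, grades 6-8",
    "out_of_scope": ["admissions decisions", "teacher evaluation"],
    "training_data": "2023-2024 field-trial responses; demographic coverage documented separately",
    "selection_logic": "maximum-information selection under 2PL IRT with exposure control",
    "fairness_checks": ["annual DIF review", "exposure-rate monitoring by subgroup"],
    "human_oversight": "psychometric review board approves all item-pool and algorithm changes",
}
print(model_card["intended_use"])
```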

Data Privacy and Human Oversight. Another major concern relates to data privacy and oversight. Intelligent adaptive systems collect extensive behavioral and biometric data, including keystrokes, timing, and interaction traces, raising legitimate questions about ownership, consent, and ethical use. Without clear governance structures, this data could be misused or repurposed beyond its intended educational scope (Halkiopoulos & Gkintoni, 2024). Therefore, assessment bodies need to establish privacy-by-design frameworks, anonymization protocols, and explicit consent mechanisms before deploying adaptive solutions. In short, the ethical future of adaptivity depends on the governance ecosystems built around the technology, systems that balance innovation with accountability and efficiency with empathy.

As these ethical and operational challenges are addressed through stronger governance and transparent design, adaptive testing can move confidently toward its next generation, one that bridges the gap between assessment and learning while keeping human empathy at its core.

Toward the Next Generation of Adaptive Assessment

Finally, the next generation of adaptive systems minimizes the gap between assessment and learning, so that test-takers are understood not merely as data points but as complex thinkers whose reasoning patterns, persistence, and engagement reveal deep insights into how they learn. This evolution, from classical CAT to intelligent, AI-assisted systems, signals a fundamental shift in how assessments measure and interpret learning. Modern adaptive testing now interprets the process of thinking, integrating psychometric models, process data, and cognitive diagnostics to create a more holistic picture of learner performance.

At the same time, the more powerful adaptive testing systems become, the greater the emphasis on their ethical and operational dimensions, and the more algorithmic fairness, transparency, and data privacy must evolve in parallel with technical innovation. Ultimately, the new way of approaching adaptive testing lies in harmonizing algorithmic intelligence with human empathy, creating systems that are more fair, explainable, and learner-centered. As assessment organizations prepare for their modernization journeys, staying informed through blogs like this one about emerging practices that shape the future of personalized and intelligent assessment systems, and integrating dynamic databases into their platforms, will provide a competitive advantage by enabling more personalized, action-oriented, and accurate measurement.


About the Author 

Vali Huseyn is an educational assessment expert and quality auditor, recognized for promoting excellence and reform-driven scaling in assessment organizations. He mentors edtech & assessment firms on reform-aligned scaling by promoting measurement excellence, drawing on his field expertise, government experience, and regional network.

He holds a master’s degree in educational policy from Boston University (USA) and a Diploma of Educational Assessment from Durham University (UK). Vali has supported national reforms in Azerbaijan and, through his consultancy with AQA Global Assessment Services, works with Kazakhstan and the Kyrgyz Republic to align assessment systems with international benchmarks such as CEFR, PISA, and the UIS technical criteria. He also works as a quality auditor in partnership with RCEC, most recently auditing CENEVAL in Mexico. Fluent in Azerbaijani, Russian, Turkish, and English, he brings a deep contextual understanding to cross-country projects.


References

Gibbons, R. D., Lauderdale, D. S., & Wilson, R. S. (2024). Adaptive measurement of cognitive function based on multidimensional item response theory. Alzheimer’s & Dementia: Translational Research & Clinical Interventions, 10(4), e70018. https://doi.org/10.1002/trc2.70018 
Halkiopoulos, C., & Gkintoni, E. (2024). Leveraging AI in e-learning: Personalized learning and adaptive assessment through cognitive neuropsychology. Electronics, 13(18), 3762. https://doi.org/10.3390/electronics13183762 
Huseyn, V. (2025, February 28). Do culturally responsive assessments matter? Vretta Buzz. https://www.vretta.com/buzz/culturallyresponsive/ 
Huseyn, V. (2025, March 31). Test delivery models: Choosing or balancing priorities? Vretta Buzz. https://www.vretta.com/buzz/testdeliverymodels/ 
İnce, A. H. (2025). Enhancing ability estimation with time-sensitive IRT: The impact of response time in adaptive testing. Applied Sciences, 15(13), 6999. https://doi.org/10.3390/app15136999 
Luo, H., Chen, Y., Wang, C., & Xue, H. (2022). Combining cognitive diagnostic computerized adaptive testing. Frontiers in Psychology, 13, 911893. https://doi.org/10.3389/fpsyg.2022.911893 
Minn, S., Kim, J., & Park, S. (2022). AI-assisted knowledge assessment techniques for adaptive testing. International Journal of Artificial Intelligence in Education, 32(4), 881–900. https://doi.org/10.1016/j.ijai.2022.03.015 
Yan, D., von Davier, A. A., & Lewis, C. (Eds.). (2022). Computerized adaptive and multistage testing: Theory and applications. Springer. https://doi.org/10.1007/978-3-030-91191-7 
Yu, J., Zhuang, Y., Sun, Y., Gao, W., Liu, Q., Cheng, M., Huang, Z., & Chen, E. (2025). TestAgent: An adaptive and intelligent expert for human assessment. arXiv preprint arXiv:2506.03032. https://arxiv.org/abs/2506.03032
Zheng, Y., & Chang, H.-H. (2015). On-the-fly assembled multistage adaptive testing. Applied Psychological Measurement, 39(2), 104–118. https://doi.org/10.1177/0146621614544519