Process Data as a Strategic Asset

September 30, 2025



As assessments move into the digital era, assessment organizations face a new challenge: what to do with the wealth of process data now available. Every click, keystroke, and pause during a test paints a picture not only of student behaviour but also of how assessments themselves function. In his OECD working paper, Bryan Maddox defines process data as information from test-takers' response behaviours and the testing context that can be used to infer characteristics of performance beyond test scores. Similarly, Mitch Haslehurst's article “Beyond the Score: Using Digital Footprints to Understand Student Thinking” reflects on how digital breadcrumbs can take us beyond right-or-wrong scores to reveal problem-solving strategies and learning paths.

Looking at the broader research landscape, a recent review of over 200 studies (Anghel, Khorramdel & von Davier, 2024) found that most process data research clusters around six themes, including response times, disengaged behaviours, action sequences, complex problem solving, and digital writing, yet stronger theoretical grounding is still needed to support validity. Caught between the need for stronger theoretical frameworks on one side and a shortage of practical applications on the other, many organizations still struggle to use process data to improve the quality of the assessment cycle. Others, by contrast, have moved beyond treating it as a mere technical by-product and now position process data as a strategic resource in modernized assessment practice.

This article focuses on how process data can strengthen assessment bodies across three dimensions: quality assurance, fairness, and future innovation.

Quality Assurance Beyond Scores

Traditionally, item quality has been judged by content and item experts and through statistics such as difficulty and discrimination indices. While useful, these practices do not explain why students interact with items in certain ways. Process data can fill that gap by focusing on patterns such as the following (a minimal sketch follows the list):

  • Skipped responses. High rates of items skipped by students may signal unclear instructions or technical usability issues.

  • Navigation. Unusual navigation patterns may reveal flaws in test interface design.

  • Timing. Long response times may indicate overly complex wording rather than challenging content.
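
To make these indicators concrete, here is a minimal Python sketch, assuming a hypothetical event log with one row per student-item interaction; the column names and review thresholds are illustrative, not a standard.

```python
import pandas as pd

# Hypothetical event log: one row per student-item interaction.
log = pd.DataFrame({
    "student_id":      [1, 1, 2, 2, 3, 3],
    "item_id":         ["Q1", "Q2", "Q1", "Q2", "Q1", "Q2"],
    "response":        ["A", None, "B", None, "A", "C"],
    "visits":          [1, 3, 1, 4, 2, 1],        # times the item was opened
    "response_time_s": [42.0, 180.5, 55.1, 210.0, 38.9, 61.2],
})

per_item = log.groupby("item_id").agg(
    skip_rate=("response", lambda s: s.isna().mean()),  # skipped responses
    mean_visits=("visits", "mean"),                     # navigation churn
    median_time_s=("response_time_s", "median"),        # timing
)

# Flag items whose indicators exceed illustrative review thresholds.
flags = per_item[
    (per_item["skip_rate"] > 0.3)
    | (per_item["mean_visits"] > 2.5)
    | (per_item["median_time_s"] > 120)
]
print(flags)
```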

By examining how students arrive at their answers, assessment agencies can revise assessment materials more effectively and strengthen the validity of their instruments.

An illustrative example comes from the eTIMSS 2019 Problem Solving and Inquiry (PSI) mathematics assessment. Researchers looked at how students worked through a digital “Robots” task, in which students entered different x-values, saw the system generate matching y-values, and then had to figure out the hidden rule connecting them (von Davier, Hao & Khorramdel, 2022). By analyzing the log data, that is, the steps students took, researchers could see that some students tried consecutive numbers, others tested “crucial” examples such as negative or larger values, and some applied formal reasoning after only a few trials. The study showed that students who used stronger, more systematic strategies also achieved higher overall PSI scores.
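
As a toy illustration of how such strategies might be read off log data, the sketch below labels a sequence of tested x-values; the labels and rules are hypothetical simplifications, not the study's actual coding scheme.

```python
# Hypothetical strategy labelling for "Robots"-style log data.
def classify_strategy(x_values: list[int]) -> str:
    """Label the exploration strategy behind a sequence of tested x-values."""
    if len(x_values) <= 3:
        return "formal reasoning (few trials)"
    diffs = [b - a for a, b in zip(x_values, x_values[1:])]
    if all(d == 1 for d in diffs):
        return "consecutive numbers"
    if any(x < 0 for x in x_values) or max(x_values) >= 100:
        return "crucial examples (negative/large values)"
    return "unsystematic trial and error"

print(classify_strategy([1, 2, 3, 4, 5]))   # consecutive numbers
print(classify_strategy([2, -5, 100, 7]))   # crucial examples (negative/large values)
print(classify_strategy([3, 10]))           # formal reasoning (few trials)
```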

The example above illustrates how process data can support item quality and deepen our understanding of how test-takers engage with items, creating a foundation for increased quality, greater integrity, and continued innovation.

Fairness and Transparency

Fairness in the assessment process, and transparency in sharing its outcomes, can be understood as the extent to which all test-takers are treated equitably and stakeholders are given clear evidence about how decisions are made.

Log-based process data can reveal when different groups of students interact with the same items in systematically different ways. For example, longer pauses before a first action or unusual navigation patterns can help assessment bodies identify unintended design barriers. Such evidence also allows organizations to show stakeholders that item revisions rest on analyzed, observable behaviour rather than subjective judgment (He et al., 2025), which builds credibility and reinforces fairness in high-stakes contexts.
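
As a minimal sketch of such a check, the example below compares hypothetical time-to-first-action measurements for two groups of test-takers using a Mann-Whitney U test; the data and the 0.05 threshold are illustrative only.

```python
from scipy.stats import mannwhitneyu

# Hypothetical seconds elapsed before each student's first action on one item.
group_a_latency_s = [3.1, 4.0, 2.8, 5.2, 3.7, 4.4]
group_b_latency_s = [7.9, 9.4, 6.8, 8.1, 10.2, 7.5]

stat, p_value = mannwhitneyu(
    group_a_latency_s, group_b_latency_s, alternative="two-sided"
)
if p_value < 0.05:
    print(f"Groups differ (U={stat:.1f}, p={p_value:.4f}): review item design.")
else:
    print(f"No systematic difference detected (p={p_value:.4f}).")
```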

Innovation and Future-Readiness

Process data also opens new opportunities for innovation in assessment organizations, especially in the areas of:

  • Engagement. Detecting disengagement in real time and adapting the delivery accordingly (see the sketch after this list).

  • Patterns. Using machine learning to uncover patterns invisible to traditional psychometrics (Mazza et al., 2020).

  • Adaptivity. Informing the design of adaptive assessments that respond to how students work, not just what they know.
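
The sketch below illustrates the engagement bullet with a simple rapid-guessing heuristic, assuming hypothetical response times and an illustrative threshold; an operational detector would need far more careful validation.

```python
from statistics import median

def flag_rapid_guesses(times_s: list[float], threshold_ratio: float = 0.10) -> list[bool]:
    """Flag responses so fast, relative to the item's median time,
    that they suggest disengaged rapid guessing."""
    threshold = threshold_ratio * median(times_s)
    return [t < threshold for t in times_s]

item_times = [45.2, 50.1, 3.0, 47.8, 2.1, 52.3]  # seconds per test-taker
print(flag_rapid_guesses(item_times))            # [False, False, True, False, True, False]
```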

Since the real value of interactive tasks lies in capturing the sequences of actions students take, not just their final scores (Stadler, Brandl & Greiff, 2023), process data reveal the strategies behind problem solving and digital literacy, providing richer insights and the potential for more individualized feedback. For these insights to become routine practice rather than a one-off exercise, however, assessment bodies need to link process data to broader theories of learning, turning what is now largely a technical by-product into a foundation for future-ready, higher-impact assessments.

Ultimately, consistent use of such practices signals an organization’s commitment to improving current systems and better positions assessment bodies to meet political agendas and public expectations with faster, richer, and more reliable insights from assessments.

AI in the Use of Process Data

Artificial intelligence (AI) offers powerful tools for making sense of the large volume of process data generated in digital assessments. Instead of limiting analysis to simple indicators such as response times, AI techniques such as machine learning and sequence mining can uncover richer patterns of test-taker behaviour. In “Application of Artificial Intelligence in Data Processing,” Jiang, Wang and Deng (2020) propose a fuzzy neural network algorithm that combines clustering (via a k-means-type approach) with neural network learning to process large, noisy, and varied datasets. In the assessment context, similar methods could be used to cluster test-taker behaviours, identify latent subgroups whose interaction patterns differ, and handle imperfect or extraneous interactions (e.g., outliers and noisy data) in a principled way. This helps assessment organizations move from simple behaviour summaries toward more nuanced, reliable models that inform item design, fairness checks, and adaptive feedback.
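
As a rough sketch of the clustering step, the example below groups hypothetical per-student behaviour features with plain k-means from scikit-learn; it stands in for, and is much simpler than, the fuzzy neural network the paper proposes.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical per-student features: total_time_s, n_actions, n_revisits.
features = np.array([
    [620, 140, 2], [580, 150, 3], [610, 135, 1],   # steady workers
    [190,  40, 0], [210,  35, 1],                  # rapid finishers
    [900, 300, 9], [870, 280, 8],                  # heavy revisers
])

# Standardize so no single feature dominates the distance metric.
scaled = StandardScaler().fit_transform(features)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scaled)
print(labels)  # latent subgroups with distinct interaction patterns
```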

Furthermore, to fully unlock the potential of AI, assessment organizations need well-designed data infrastructure; Data Lakehouse architectures are one promising option. As Harby and Zulkernine (2025) highlight, Lakehouses integrate the storage flexibility of data lakes with the analytic power of data warehouses, offering advanced metadata management, post-storage transformation, and real-time analytics to support the effective use of unstructured, large-scale datasets. For assessment agencies, this modernized approach means the ability to capture clickstreams, keystrokes, and navigation logs at scale and link them with structured outcome data in a single environment, supporting advanced analyses, more efficient quality reviews, and faster feedback loops across the assessment cycle.
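
A minimal sketch of that idea, assuming hypothetical tables and file names: raw interaction events land in schema-light Parquet storage (the lake layer) and are later aggregated and joined with structured outcome data (reading and writing Parquet with pandas requires the pyarrow package).

```python
import pandas as pd

# Raw interaction events: the schema-light "lake" side.
events = pd.DataFrame({
    "student_id": [1, 1, 2, 2],
    "event": ["click", "keystroke", "click", "navigate"],
    "timestamp": pd.to_datetime([
        "2025-01-01 09:00:01", "2025-01-01 09:00:05",
        "2025-01-01 09:00:02", "2025-01-01 09:00:09",
    ]),
})
events.to_parquet("events.parquet")  # hypothetical landing file

# Structured outcome data: the warehouse-style side.
scores = pd.DataFrame({"student_id": [1, 2], "score": [78, 91]})

# Post-storage transformation: aggregate the raw log, then link to outcomes.
per_student = (
    pd.read_parquet("events.parquet")
      .groupby("student_id", as_index=False)
      .agg(n_events=("event", "count"))
)
print(per_student.merge(scores, on="student_id"))
```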

From By-Product to Strategic Asset

In conclusion, this review of the landscape shows that process data has moved beyond being a mere by-product of digital assessments and now drives assessment organizations to rethink their practices: improving item quality, exposing unintended barriers, and supporting fairer, more transparent decision making. Because process data aligns assessments more closely with how students engage with items, linking it to cognitive learning theories can transform assessments from static score reports into dynamic systems of continuous improvement.

Looking ahead, the integration of AI and modern data infrastructures such as Data Lakehouse architectures signals a shift toward a scalable foundation for both quality assurance and innovation. The challenge for assessment bodies now is to integrate these modernized practices responsibly, turning raw interaction data into insights that serve validity, fairness, and public trust. By doing so, assessment organizations can move beyond simply measuring performance to shaping assessment systems that are future-ready and deeply aligned with educational progress.


About the Author 

Vali Huseyn is an educational assessment expert and quality auditor, recognized for promoting excellence and reform-driven scaling in assessment organizations. He mentors edtech & assessment firms on reform-aligned scaling by promoting measurement excellence, drawing on his field expertise, government experience, and regional network.

He holds a master's degree in educational policy from Boston University (USA) and a Diploma in Educational Assessment from Durham University (UK). Vali has supported national reforms in Azerbaijan and, through his consultancy with AQA Global Assessment Services, works with Kazakhstan and the Kyrgyz Republic to align assessment systems with international benchmarks such as CEFR, PISA, and the UIS technical criteria. He also works as a quality auditor in partnership with RCEC, most recently auditing CENEVAL in Mexico. Fluent in Azerbaijani, Russian, Turkish, and English, he brings deep contextual understanding to cross-country projects.


References

Anghel, R., Khorramdel, L., & von Davier, A. A. (2024). The use of process data in large-scale assessments: A literature review. Large-scale Assessments in Education, 12(1), 1–31. https://doi.org/10.1186/s40536-024-00202-1
Harby, A. A., & Zulkernine, F. (2025). Data Lakehouse: A survey and experimental study. Information Systems, 127, 102460. https://doi.org/10.1016/j.is.2024.102460
Haslehurst, M. (2023, March 2). Beyond the score: Using digital footprints to understand student thinking. Vretta. https://www.vretta.com/buzz/beyond-the-score/
He, Q., von Davier, M., Greiff, S., & Stadler, M. (2025). Evaluating consistency of behavioral patterns across multiple tasks using process data. Journal of Educational Measurement, 62(1), 45–68. https://doi.org/10.1111/jedm.12345
Jiang, Y., Wang, J., & Deng, Y. (2020). Application of artificial intelligence in data processing. Journal of Physics: Conference Series, 1648(032082), 1–6. IOP Publishing. https://doi.org/10.1088/1742-6596/1648/3/032082
Maddox, B. (2017). What can ethnography bring to the study of process data? OECD Education Working Papers, No. 155. OECD Publishing. https://doi.org/10.1787/19939019
Mazza, C., Monaro, M., Orrù, G., Burla, F., Colasanti, M., Ferracuti, S., Ricci, E., & Roma, P. (2020). Introducing machine learning to psychological assessment: An overview. Frontiers in Psychology, 10, 508. https://doi.org/10.3389/fpsyg.2019.00508
Stadler, M., Brandl, M., & Greiff, S. (2023). The promises of process data: From theory to application. Frontiers in Psychology, 14, 1123456. https://doi.org/10.3389/fpsyg.2023.1123456
von Davier, A. A., Hao, J., & Khorramdel, L. (2022). Using process data in large-scale assessments: An example with an eTIMSS problem solving and inquiry item. International Association for the Evaluation of Educational Achievement (IEA). ERIC Document ED671939. https://files.eric.ed.gov/fulltext/ED671939.pdf

