
26 February 2026
Why ATP 2026 Is the Right Moment to Revisit These Guidelines
Twenty years ago, when technology-based testing was still finding its footing, the conversation was mostly about delivery. Could we move from paper to screen? Could we scale? Could we score faster?
Today, those questions appear relatively narrow in scope. Technology no longer sits at the edges of assessment. It shapes the entire lifecycle, from item development to scoring algorithms, from data storage to cross-border regulation. And when technology shapes experience, it shapes validity, fairness, and trust. That is why the updated Guidelines for Technology-Based Assessment (Version 1.1, July 2025), authored by the International Test Commission (ITC) and the Association of Test Publishers (ATP), matter.
The updated Guidelines are comprehensive and technical. Most professionals will not read them cover to cover, and they do not need to. What follows is a structured orientation: who the document is really for, what has changed, and how it can be used strategically in practice.
These Guidelines are not written for academics alone. They speak directly to:
Assessment developers and psychometric teams
Ministries of education and national examination bodies
Test publishers and edtech platform providers
Certification and licensure organizations
Leaders responsible for governance, compliance, and quality assurance
What unites these stakeholders is a collective obligation to uphold validity, fairness, and integrity in technology-enabled assessment.
Importantly, this is not a regulatory document. It does not impose standards. Instead, it provides structured, internationally recognized guidance for responsible practice in a digital assessment environment. In an era of AI-assisted scoring, remote delivery, and data integration, that distinction matters.
The strength of the Guidelines lies in their structure. They recognize that technology-based assessment is not a single decision: it is an ecosystem.
The document is organized into four parts:
Part I sets the context: purpose, scope, and rationale. It clarifies that guidance, not prescription, is the goal.
Part II revisits foundational measurement principles (validity, fairness, reliability, and construct representation) and reminds us that technology does not replace these principles; it amplifies the need to protect them.
Even as technology reshapes assessment, foundational principles such as validity, fairness, and reliability remain central because digital systems introduce new sources of construct-irrelevant variance, accessibility challenges, data governance complexities, and algorithmic risks that must be anticipated and managed.
Part III forms the operational core. It covers the entire assessment lifecycle:
Test development and technology-enhanced items
Design and adaptive assembly
Delivery environments and disruptions
Scoring (including automated and AI-assisted methods)
Results reporting
Data management, security, and privacy
Fairness, accessibility, and global considerations
Part IV addresses emerging applications. Here the document shifts from theory to contemporary reality: artificial intelligence, generative AI, automated item generation, biometric tools, and regulatory developments. The section consistently ties each of these back to core measurement principles, demonstrating that technological innovation must be governed by validity, fairness, reliability, and responsible data practices so that new tools strengthen, rather than undermine, assessment quality.
This structure signals something important: technology is no longer a feature of assessment. It is the infrastructure of assessment.
The 2025 update does not redefine what good assessment means. Validity, fairness, and reliability remain central. What has changed is the recognition of new risks.
The foundational sections now explicitly acknowledge how digital interfaces, AI systems, and remote environments can introduce construct-irrelevant variance or inequities if poorly governed. The risk is no longer hypothetical; it is embedded in daily practice.
The core guideline sections go deeper into:
Ethical and responsible use of artificial intelligence
Data governance and privacy management
Transparency in automated and AI-assisted scoring
Integration of assessment data with other digital systems
Accessibility in increasingly complex digital environments
But perhaps the most visible shift appears in the treatment of emerging technologies. Generative AI is no longer treated as an experimental edge case. It is addressed as a present reality requiring validation, oversight, and regulatory awareness.
AI and other advanced technologies can significantly increase efficiency. They can accelerate item generation, support large-scale scoring, detect anomalies, enhance accessibility features, and provide richer data insights. In many operational areas, they reduce manual workload and improve speed and scalability.
However, these efficiencies do not eliminate professional responsibility; they redistribute and, in some areas, intensify it. The responsibility shifts from manual execution to design oversight, validation, governance, and transparency.
When AI generates items, experts must verify construct alignment and detect bias. When automated scoring is deployed, psychometricians must evaluate model performance and fairness. When systems integrate large volumes of candidate data, governance leaders must ensure privacy, security, and regulatory compliance.
The message is therefore more precise: technological capability can streamline processes, but it increases the need for deliberate oversight, documented validation, and ethical accountability. Efficiency expands; so does the responsibility to govern it properly.
The Guidelines are modular by design. They are meant to support diverse organizations at different stages of digital maturity, and to be revisited as questions arise rather than read sequentially from beginning to end.
Their real value lies in how they are applied.
First, as a tool for internal policy alignment. Are AI tools in item generation or scoring documented and validated? Is data governance consistent across systems? Are accessibility considerations embedded from design rather than retrofitted? The Guidelines offer a structured lens for asking these questions before external stakeholders do.
Second, as evidence in audits, tenders, and accreditation processes. Alignment with internationally recognized guidance strengthens defensibility. In an environment where assessment decisions are scrutinized by regulators, media, and courts, shared professional reference points matter.
Third, as a foundation for staff onboarding and capacity building. As teams transition from traditional digital testing to AI-supported systems, they need a shared language. The Guidelines provide that common vocabulary across technical, psychometric, and governance domains.
In short, the Guidelines are both a technical manual and a governance framework for digital assessment maturity.
Guidelines do not change practice on their own. Organizations do.
At the ATP Innovations in Testing 2026 Conference, this shift from principle to practice becomes visible. The presentation by the Education Quality and Accountability Office (EQAO) and Vretta does not treat the ITC/ATP Guidelines as a document to reference. It treats them as an operational framework. Through a structured professional learning program, cross-team dialogue, and ongoing reflection, the Guidelines were used to anchor modernization efforts, strengthen psychometric procedures, enhance data lifecycle governance, and guide an AI implementation roadmap.
This is precisely why ATP 2026 matters.
The updated TBA Guidelines arrive at a time when organizations are moving beyond the adoption of digital tools toward alignment and governance. AI scoring transparency, data privacy, accessibility, and regulatory pressures are not abstract debates. They are implementation decisions being made now.
What the EQAO-Vretta case illustrates is that alignment with the Guidelines is not a compliance exercise. It is a catalyst for professional learning, institutional clarity, and innovation grounded in best practice.
For example, through the Professional Learning Program, cross-functional teams systematically reviewed each chapter of the Guidelines against existing EQAO practices, examining areas such as automated scoring transparency, data lifecycle governance, and accessibility design. This reflection led to documented refinements in psychometric procedures, strengthened test security protocols, and a structured roadmap for responsible AI implementation.
Professional interpretation happened through dialogue, case sharing, and peer exchange rather than checklist compliance. Conferences create a space where principles are tested against real-world implementation.
For those attending ATP Innovations in Testing 2026, the full conference program is available here:
In particular, I encourage participants to attend the session “Maximizing the ITC/ATP Guidelines: From Practice to Innovation” on Tuesday, March 3, 2026, from 1:45 PM – 2:30 PM (Bolden 5 - 2nd Floor) to experience the shared journey of how structured reflection on the Guidelines can support organizational transformation.
This session offers an opportunity to examine how organizations are actively aligning their modernization efforts with the Guidelines, strengthening governance, advancing innovation responsibly, and embedding internationally recognized principles into everyday assessment practice.
The real question is no longer whether we can build more advanced digital systems. It is whether we can align them responsibly with shared professional standards.
That is exactly what these Guidelines invite us to do.

Vali Huseyn is an educational assessment expert and quality auditor, recognized for promoting excellence and reform-driven scaling in assessment organizations. He mentors edtech and assessment firms on reform-aligned scaling, drawing on his field expertise, government experience, and regional network.
He holds a master's degree in educational policy from Boston University (USA) and a Diploma of Educational Assessment from Durham University (UK). Vali has supported national reforms in Azerbaijan and, through his consultancy with AQA Global Assessment Services, works with Kazakhstan and the Kyrgyz Republic to align assessment systems with international benchmarks such as CEFR, PISA, and the UIS technical criteria. He also works as a quality auditor in partnership with RCEC, most recently auditing CENEVAL in Mexico. In addition, he promotes awareness of the use of technology across the assessment cycle through his work with Vretta. Fluent in Azerbaijani, Russian, Turkish, and English, he brings a deep contextual understanding to cross-country projects.
If you would like to discuss your approaches to assessment modernisation or explore opportunities to showcase and promote your work, please feel free to contact Vali Huseyn at: vali.huseyn@vretta.com | LinkedIn