Reliability and Validity of International Large-Scale Assessment


Book Description

This open access book describes and reviews the development of the quality control mechanisms and methodologies associated with IEA’s extensive program of educational research. A group of renowned international researchers, directly involved in the design and execution of IEA’s international large-scale assessments (ILSAs), describe the operational and quality control procedures that are employed to address the challenges associated with providing high-quality, comparable data. Throughout the now considerable history of IEA’s international large-scale assessments, establishing the quality of the data has been paramount. Research in the complex multinational context in which IEA studies operate imposes significant burdens and challenges in terms of the methodologies and technologies that have been developed to achieve the stated study goals. The demands of the twin imperatives of validity and reliability must be satisfied in the context of multiple and diverse cultures, languages, orthographies, educational structures, educational histories, and traditions. Readers will learn about IEA’s approach to such challenges, and the methods used to ensure that the quality of the data provided to policymakers and researchers can be trusted. An often neglected area of investigation, namely the consequential validity of ILSAs, is also explored, examining issues related to reporting, dissemination, and impact, including discussion of the limits of interpretation. The final chapters address the question of the influence of ILSAs on policy and reform in education, including a case study from Singapore, a country known for its outstanding levels of achievement, but which nevertheless seeks the means of continual improvement, illustrating best practice use of ILSA data.







The promise of large-scale learning assessments


Book Description

This report addresses the more contentious aspects of large-scale learning assessments (LSLAs). Drawing on UNESCO's extensive experience in the area, from involvement in the direct implementation of assessments and as a knowledge broker and convener of networks, this publication presents the Organization's critical take on such initiatives. It aims to balance the debate on LSLAs by reviewing their benefits while raising awareness of their potential risks and pitfalls. The focus of discussions in this publication is on LSLAs conducted in formal and school-based education. It includes an Annex outlining key international studies. [Executive summary, ed]




Classroom Assessment and Educational Measurement


Book Description

Classroom Assessment and Educational Measurement explores the ways in which the theory and practice of both educational measurement and the assessment of student learning in classroom settings mutually inform one another. Chapters by assessment and measurement experts consider the nature of classroom assessment information, from student achievement to affective and socio-emotional attributes; how teachers interpret and work with assessment results; and emerging issues in assessment such as digital technologies and diversity/inclusion. This book uniquely considers the limitations of applying large-scale educational measurement theory to classroom assessment and the adaptations necessary to make this transfer useful. Researchers, graduate students, industry professionals, and policymakers will come away with an essential understanding of how the classroom assessment context is essential to broadening contemporary educational measurement perspectives. The Open Access version of this book, available at http://www.taylorfrancis.com, has been made available under a Creative Commons Attribution-Non Commercial-No Derivatives 4.0 license.




Advancing Human Assessment


Book Description

This book is open access under a CC BY-NC 2.5 license. This book describes the extensive contributions made toward the advancement of human assessment by scientists from one of the world’s leading research institutions, Educational Testing Service. The book’s four major sections detail research and development in measurement and statistics, education policy analysis and evaluation, scientific psychology, and validity. Many of the developments presented have become de facto standards in educational and psychological measurement, including in item response theory (IRT), linking and equating, differential item functioning (DIF), and educational surveys like the National Assessment of Educational Progress (NAEP), the Programme for International Student Assessment (PISA), the Progress in International Reading Literacy Study (PIRLS) and the Trends in International Mathematics and Science Study (TIMSS). In addition to its comprehensive coverage of contributions to the theory and methodology of educational and psychological measurement and statistics, the book gives significant attention to ETS work in cognitive, personality, developmental, and social psychology, and to education policy analysis and program evaluation. The chapter authors are long-standing experts who provide broad coverage and thoughtful insights that build upon decades of experience in research and best practices for measurement, evaluation, scientific psychology, and education policy analysis. Opening with a chapter on the genesis of ETS and closing with a synthesis of the enormously diverse set of contributions made over its 70-year history, the book is a useful resource for all interested in the improvement of human assessment.




Validity in Educational and Psychological Assessment


Book Description

Validity is the hallmark of quality for educational and psychological measurement. But what does quality mean in this context? And to what, exactly, does the concept of validity apply? These apparently innocuous questions parachute the unwary inquirer into a minefield of tricky ideas. This book guides you through this minefield, investigating how the concept of validity has evolved from the nineteenth century to the present day. Communicating complicated concepts straightforwardly, the authors answer questions like: What does 'validity' mean? What does it mean to 'validate'? How many different kinds of validity are there? When does validation begin and end? Is reliability a part of validity, or distinct from it? This book will be of interest to anyone with a professional or academic interest in evaluating the quality of educational or psychological assessments, measurements and diagnoses.




Everyday Assessment in the Science Classroom


Book Description

Designed as a ready-to-use survival guide for middle school Earth science teachers, this title is an invaluable resource that provides an entire year's worth of inquiry-based and discovery-oriented Earth science lessons, including 33 investigations or labs and 17 detailed projects. This unique collection of astronomy, geology, meteorology, and physical oceanography lessons promotes deeper understanding of science concepts through a hands-on approach that identifies and dispels student misconceptions and expands student understanding and knowledge. In addition, this field-tested and standards-based volume is ideal for university-level methodology courses in science education.




Monitoring Student Achievement in the 21st Century


Book Description

This book draws together leading student assessment academics from across Europe to explore student monitoring policies and practices in a range of countries across 22 chapters. The chapters in the first part offer a broad overview of student assessment, covering history and current status, aims and approaches, as well as methodological challenges of international student assessment. The second part presents country-specific chapters that provide an in-depth look at the policy, practices, and findings of national and/or international assessments in each country. Findings are critically discussed and recommendations are made for further development of each country's assessment context. The book shows similarities and differences within the educational assessment landscape as well as complexity and similarities in assessment policy documents and strategies. Given the globalized world we live in today, this book fills a need in the higher educational context and is intended for policy makers in different countries as well.




Test Fairness in the New Generation of Large-Scale Assessment


Book Description

The new generation of tests is faced with new challenges. In the K-12 setting, the new learning targets are intended to assess higher-order thinking skills and prepare students to be ready for college and career and to keep American students competitive with their international peers. In addition, the new generation of state tests requires the use of technology in item delivery and embedding assessment in real-world, authentic situations. It further requires accurate assessment of students at all ability levels. One of the most important questions is how to maintain test fairness in the new assessments with technology-innovative items and technology-delivered tests. In the traditional testing programs such as licensure and certification tests and college admission tests, test fairness has constantly been a key psychometric issue in test development and this continues to be the case with the national testing programs. As test fairness needs to be addressed throughout the whole process of test development, experts from state, admission, and licensure tests will address test fairness challenges in the new generation assessment. The book chapters clarify misconceptions of test fairness including the use of admission test results in cohort comparison, the use of international assessment results in trend evaluation, whether standardization and fairness necessarily mean uniformity when test-takers have different cultural backgrounds, and whether standardization can ensure fairness. More technically, chapters also address issues related to how compromised items and test fairness are related to classification decisions, how accessibility in item development and accommodation could be mingled with technology, how to assess special populations with dyslexia, using Blinder-Oaxaca Decomposition for differential item functioning detection, and differential feature functioning in automated scoring.
Overall, this book addresses test fairness issues in state assessment, college admission testing, international assessment, and licensure tests. Fairness is discussed in the context of culture and special populations. Further, fairness related to performance assessment and automated scoring is a focus as well. This book provides a very good source of information related to test fairness issues in test development in the new generation of assessment where technology is highly involved.




IEA 1958-2008


Book Description