TIMSS, PISA, and NAEP: What to Know Before Digging into the Results

This is a banner season for the release of national and international test results in mathematics and science.

  • In late October, science results from the 2015 National Assessment of Educational Progress (NAEP) were released following an earlier release of the mathematics results.
  • Results from the 2015 Trends in International Mathematics and Science Study (TIMSS) were released yesterday.
  • And coming December 6, results from the 2015 Program for International Student Assessment (PISA) will be released, focusing on science literacy.

It’s very rare that science and mathematics results from all three assessments are released in the same year.

We will have estimates from three studies of how our students are doing over time and estimates from two studies of how our students’ scores compare with those in other countries. Chances are the results from the various assessments won’t all tell the same story. So how can we make sense of this bumper crop of assessments?

Here’s the key: Each test has a different purpose, a different design, and a different target population of students.

NAEP, often called the Nation’s Report Card, measures student mastery of content taught in U.S. classrooms in grades four, eight, and 12, including mathematics, science, reading, writing, and social studies. NAEP largely assesses academic content. Since there is no single curriculum across the U.S., expert panels guide the development of NAEP frameworks based on a broad consensus of the content in the various state curricula, while also incorporating more cutting-edge knowledge thought important for students to know.

TIMSS is designed to assess mastery of mathematics and science content at grades four and eight in countries around the world. Like NAEP, TIMSS largely assesses academic content. Its frameworks, and the test questions generated from them, are not based on the curricula of any particular country; instead, they are based on content that a committee of representatives from the participating countries agrees is important for students to know. Periodically (including in 2015), TIMSS Advanced adds an assessment of students taking advanced mathematics and physics in the final year of secondary school.

PISA differs from NAEP and TIMSS in at least two critically important ways. First, unlike NAEP and TIMSS, which are heavily focused on academic content, PISA is designed to measure the extent to which students can apply their skills and competencies in reading, mathematics, and science to real-world problems in real-world contexts, with an emphasis on one of these three subject areas each assessment cycle. In 2015, the emphasis was on science. Second, unlike NAEP, which assesses at grades four, eight, and 12, PISA assesses students at age 15, near the end of compulsory education in most countries. In the U.S., 15-year-olds are primarily in grade 10, a grade at which we have no national assessment to compare with PISA.

But there are additional caveats as well:

  • The assessments’ scales differ. TIMSS and PISA scales range from 0 to 1,000, while NAEP scales range from 0 to 500 for mathematics at grades four and eight and from 0 to 300 at grade 12. NAEP’s science scale also ranges from 0 to 300. A one-point difference on NAEP is therefore not the same as a one-point difference on TIMSS or PISA.
  • NAEP samples many more U.S. students than TIMSS or PISA; consequently, NAEP measures U.S. students’ performance with greater precision, allowing it to detect smaller statistically significant differences than TIMSS and PISA can (see the brief illustration after this list).
  • NAEP, PISA, and TIMSS use different test questions to assess the same subject areas. Many of the questions from one assessment could fit into the frameworks of the other two, though some would fit less well at different grade levels, and some would likely not fit at all.
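To make the precision point concrete, here is a minimal sketch in Python of how sample size drives the smallest score gap a test can reliably detect. The sample sizes, the score standard deviation, and the helper name min_detectable_diff are hypothetical, chosen only for illustration; the formula also assumes simple random sampling and ignores the clustering and weighting that complex survey samples like NAEP, TIMSS, and PISA use in practice, which inflate real standard errors.

import math

SCORE_SD = 35   # hypothetical standard deviation of scale scores
Z_95 = 1.96     # two-sided 95% critical value

def min_detectable_diff(n_per_group, sd=SCORE_SD):
    """Smallest difference between two independent group means that is
    statistically significant at the 95% level, under simple random
    sampling (real assessments use complex samples, so true standard
    errors are larger than this)."""
    se_diff = sd * math.sqrt(2.0 / n_per_group)  # standard error of a difference in means
    return Z_95 * se_diff

for label, n in [("smaller sample", 5_000), ("larger sample", 140_000)]:
    print(f"{label}: n = {n:,} -> detectable gap ~ {min_detectable_diff(n):.2f} points")

With these illustrative numbers, the smaller sample can only distinguish gaps of about 1.4 scale points, while the larger sample can distinguish gaps of roughly a quarter of a point. This is the basic statistical reason a bigger sample lets NAEP flag smaller changes as significant.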

So with all these caveats, is the U.S.’s participation in the international tests worth it? 

Absolutely.

Understanding how our students compare and compete internationally can help educators prepare graduates for the world of work they will face. But we must be cautious about drawing hasty conclusions about the relative performance of U.S. students when the results are first released.

That said, all three of these assessments collect rich contextual information that can be very useful when judging the relative performance of U.S. students.

Behind the Numbers

NAEP is administered by the National Center for Education Statistics. Its results for mathematics and reading at grades four and eight come out every two years; science, writing, and other subjects are assessed less often.

Thirty-five industrialized countries of the Organisation for Economic Co-operation and Development (OECD), including Germany, the United Kingdom, France, Spain, Japan, the Republic of Korea, and the U.S., administer PISA. Important non-OECD economies, including some regions of China and the Russian Federation, also participate. PISA is administered every three years with a rotating focus on mathematics literacy, reading literacy, or science literacy; science literacy was the focus in 2015.

TIMSS is administered every four years by the International Association for the Evaluation of Educational Achievement (IEA), an international organization of national research institutions and governmental research agencies. In 2015, more than 50 countries participated. Importantly, not all countries that participate in TIMSS participate in PISA, and vice versa. Both TIMSS and PISA include developed and developing countries, but TIMSS includes a larger proportion of developing countries. TIMSS Advanced was also administered in 2015; it targets students engaged in advanced mathematics and physics studies that prepare them to enter STEM programs in higher education. Also of note, both TIMSS 2015 and TIMSS Advanced 2015 provide 20-year trend measures for countries, such as the U.S., that participated in the first TIMSS assessments in 1995.

George Bohrnstedt is a senior vice president, sociologist, and AIR Institute Fellow who studies student achievement and other outcomes through analysis of NAEP data. He also chairs the NAEP Validity Studies Panel.

Fran Stancavage is a managing director at AIR. She is a co-author of the 2015 NAEP Validity Studies Report, Study on the Alignment of the 2015 NAEP Mathematics Items at Grades 4 and 8 to the Common Core State Standards (CCSS) for Mathematics. She is Project Director of the NAEP Validity Studies Panel.