This is a highly controversial issue in education because it addresses fundamental questions about test validity and fairness. If a modest intervention such as a test prep program can result in a large increase in test scores, then what does that say about the validity of scores earned, both by students who received the intervention and by those who did not?
A thoughtful piece by Jim Jump published last month in Inside Higher Ed (5/22) raises some of the same issues and questions about recent claims of large score increases on the SAT based on moderate amounts of “instruction.”
The interest in this topic provides an opportunity to review the research on test preparation in general and to make some connections to similar claims made about the impact of other types of interventions on achievement.
To cut to the chase: The research clearly suggests that short-term test prep activities, while they may be helpful, do not produce large increases in college admission test scores.
There are some fundamental principles about admissions test scores that have remained constant across numerous redesigns, changes in demographics, and rescaling efforts. They include the following:
- Scores on college admissions tests (as well as most cognitive tests) generally increase with retesting, so any claim about score increases must statistically account for the portion of the change attributable to additional time, practice, and growth apart from any intervention.[1]
- John Hattie’s exhaustive synthesis of over 800 meta-analyses related to achievement shows that almost any type of intervention (more homework, less homework, heterogeneous grouping, homogeneous grouping) will show some positive effect on student achievement; it is hard to stop learning.[2] In general, though, smaller and shorter interventions have less impact on student achievement. Web-based instruction, for example, has an effect size of .30, which is certainly good but still below the average effect size of .40 across all interventions.
- Students who participate in commercial coaching programs differ in important ways from other test takers. They are more likely than others to be from high-income families, have private tutors helping them with their coursework, use other methods to prepare for admission tests (e.g., books, software), apply to more selective colleges, and be highly motivated to improve their scores. Such differences need to be examined and statistically controlled in any study of the impact of an intervention; without such controls, claims about the effects of instructional interventions and test preparation programs on test scores have been shown to be greatly exaggerated.
- There have been about 30 published studies of the impact of test preparation on admissions test scores, and their results are remarkably consistent: a typical student in a test prep program can expect a total score gain of 25 to 32 points on the SAT 1600-point scale, with similarly modest results for the ACT and GRE. The reality falls far short of the claims.
- In 2009, Briggs[3] conducted the most recent comprehensive study of test preparation on admissions tests. He found an average coaching boost of 0.6 point on the ACT Math Test, 0.4 point on the ACT English Test, and -0.7 point on the ACT Reading Test.[4] Test preparation effects for the SAT were similarly small: 8 points on the reading section and 15 points on the math section. The effects of computer-based instruction, software, tutors, and other similar interventions appear no larger than those reported for test preparation. (A rough conversion of these gains into effect-size units follows this list.)
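One way to see how modest these gains are is to express them in standard-deviation units, the same metric as Hattie’s effect sizes. The short sketch below does that arithmetic; the standard deviations it uses (roughly 200 points for the SAT total and about 5.5 points for an ACT subject test) are assumed, approximate values for illustration, not figures taken from the studies cited.

```python
# Rough conversion of the reported score gains into effect-size units
# (gain divided by a standard deviation), so they can be compared with
# meta-analytic effect sizes such as Hattie's. The SDs below are assumed,
# approximate values for illustration only, not figures from the cited studies.

ASSUMED_SD = {
    "SAT total (1600 scale)": 200.0,   # assumption: roughly 200 points
    "ACT subject test (1-36)": 5.5,    # assumption: roughly 5 to 6 points
}

reported_gains = [
    ("Typical test-prep gain, SAT total (low)",  25.0, "SAT total (1600 scale)"),
    ("Typical test-prep gain, SAT total (high)", 32.0, "SAT total (1600 scale)"),
    ("Briggs: ACT Math coaching boost",           0.6, "ACT subject test (1-36)"),
    ("Briggs: ACT English coaching boost",        0.4, "ACT subject test (1-36)"),
    ("Briggs: ACT Reading coaching effect",      -0.7, "ACT subject test (1-36)"),
]

for label, gain, scale in reported_gains:
    d = gain / ASSUMED_SD[scale]   # effect size in SD units
    print(f"{label:44s} {gain:+6.1f} pts  ->  d = {d:+.2f}")
```

Under those assumed standard deviations, the reported gains work out to effect sizes of roughly 0.1 in absolute value, well below the .40 average noted above.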
Many more studies have attempted to examine the impact of instructional programs on achievement. Such studies are equally difficult to conduct and equally unlikely to show effect sizes larger than the typical growth students experience simply from another year of instruction, coursework, and development. Research-based interventions that are personalized to the learner can improve learning, and increased learning will affect test scores. To support such claims, however, studies must be published that address the representativeness of the sample, the equivalence of control groups, the extent and type of participation in the intervention, and many other contextual factors. This is how we advance scientific knowledge in education, as in any other field.
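To make the idea of statistical control concrete, here is a minimal sketch of the kind of adjusted comparison such studies require. The data are simulated and the variable names are hypothetical; the point is the design, not the numbers. Students who buy coaching also differ in ways that raise their scores anyway, so an unadjusted comparison overstates the coaching effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulated (hypothetical) data: higher-income students are more likely to buy
# coaching AND tend to gain more on a retest anyway, so a naive coached-vs-
# uncoached comparison confounds the two.
income = rng.normal(0.0, 1.0, n)                       # standardized family income
coached = (rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * income))).astype(float)

true_coaching_effect = 20.0                            # assumed small true effect (points)
retest_gain = (25.0 + 40.0 * income
               + true_coaching_effect * coached
               + rng.normal(0.0, 60.0, n))

# Naive estimate: difference in mean score gains between the two groups.
naive = retest_gain[coached == 1].mean() - retest_gain[coached == 0].mean()

# Adjusted estimate: ordinary least squares of the gain on coaching and income.
X = np.column_stack([np.ones(n), coached, income])
beta, *_ = np.linalg.lstsq(X, retest_gain, rcond=None)

print(f"naive coached-minus-uncoached gain:   {naive:5.1f} points")
print(f"income-adjusted coaching coefficient: {beta[1]:5.1f} points")
```

In this toy example the naive comparison comes out far larger than the effect actually built into the data, while the adjusted coefficient recovers something close to it. Real studies such as Briggs (2009) face the same problem with far messier data, which is why the controls listed above matter.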
Jim Jump’s previously referenced column identified many questions and possible problems with the claims about the effect of participation in Khan Academy’s programs on SAT scores. However, few if any of these questions can be answered, simply because the College Board has not made the research behind the claims available for review; all we have is a press release, and claims can neither be supported nor refuted when there is no methodology to examine. Further speculation about the efficacy of this intervention is not helpful. But there are some additional facts about testing, interventions, and claims of score increases to consider when we read any claims or research on the subject.
First, while test preparation may not lead to large score increases, it can be helpful. Students who are familiar with the assessment, have taken practice tests, understand the instructions, and have engaged in thoughtful review and preparation tend to be less anxious and more successful than those who haven’t. Such preparation is available for free to all students on the ACT website and from other sources.
Second, the importance of scores on tests such as the ACT and SAT continues to be exaggerated. What is ultimately important is performance in college.
Suppose we learn that some intervention can increase test scores by two-thirds of a standard deviation. The question should then be whether there is evidence of a similar increase in college grades (which is the outcome that admissions tests are designed to predict). Claims that test preparation can result in large score increases require serious investigation because they threaten to undermine the validity of admission test scores. Simply put, if an intervention increases test scores without increasing college grades, then some scores are biased.
It is possible that the college grades of students participating in test prep or another intervention are being over-predicted by their scores, meaning the score gains will not be accompanied by similar gains in college grades. Or it could be that the grades of students who have not engaged in test prep are being under-predicted.
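A standard way to check for this kind of prediction bias is to fit one common prediction equation from test score to first-year GPA and then compare the residuals (actual minus predicted GPA) for the two groups. The sketch below illustrates the idea with simulated data and hypothetical variable names; the score bump given to the "prepped" group deliberately does not carry over to grades, which is the biased scenario described above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Simulated (hypothetical) data: a common readiness factor drives both scores
# and grades; the "prepped" group gets a score bump that does NOT carry over
# to grades, so their scores promise more than their grades deliver.
readiness = rng.normal(0.0, 1.0, n)
prepped = rng.random(n) < 0.3

score = 1050 + 150 * readiness + 60 * prepped + rng.normal(0, 50, n)
fygpa = 2.9 + 0.45 * readiness + rng.normal(0, 0.4, n)   # first-year college GPA

# Fit one common prediction line, GPA ~ a + b * score, for all students.
X = np.column_stack([np.ones(n), score])
(a, b), *_ = np.linalg.lstsq(X, fygpa, rcond=None)

residual = fygpa - (a + b * score)   # actual minus predicted GPA

# Negative mean residual = grades over-predicted by scores; positive = under-predicted.
print(f"mean residual, prepped group:   {residual[prepped].mean():+.3f} GPA points")
print(f"mean residual, unprepped group: {residual[~prepped].mean():+.3f} GPA points")
```

With these assumed numbers, the prepped group’s grades come in a bit below what their scores predict and the unprepped group’s a bit above, which is exactly the over- and under-prediction pattern described here.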
Hardison and Sackett[5] demonstrated that a 12-hour intervention could increase performance on an SAT writing prototype while also increasing performance on other writing assignments. While this was a preliminary experimental study of the coachability of a new writing assessment, it demonstrated that instruction could result in better writing both on a test and in course assignments.
This type of study highlights the types of questions that are raised whenever claims of large score increases are reported. When results are too good to be true (and even when they are not), it is always better to verify.
Claims that test preparation, short-term interventions, or new curricular or other innovations can result in large score gains on standardized assessments are tempting to believe. Such activities demand far less than enrolling in more rigorous high school courses or other endeavors that require years of effort and learning.
If we find an intervention that increases only a test score without a similar effect on actual achievement, then we need to be concerned about the test score. And when we hear claims about score increases that appear to be too good to be true, we need to conduct research based on the same professional standards to which other scientific research adheres. Because if it sounds too good to be true, it very likely is.
[1] See the What Works Clearinghouse criteria: https://ies.ed.gov/ncee/wwc/
[2] Hattie (2009). Visible Learning: A synthesis of over 800 meta-analyses related to achievement. New York: Routledge.
[3] Briggs (2009): http://www.soe.vt.edu/highered/files/Perspectives_PolicyNews/05-09/Preparation.pdf
[4] ACT scores are on a 1-36 scale, so these raw numbers can’t be compared directly to the SAT. These effects come from a full model that controls for differences in socioeconomic status, ability, and motivation between a baseline group and a test preparation group. Yes, students in the coached group saw a decrease in ACT Reading scores relative to uncoached students.
[5] Hardison and Sackett (2009). Use of writing samples on standardized tests: Susceptibility to rule-based coaching and resulting effects on score improvement. Applied Measurement in Education, 21, 227-252.