In my last post, I highlighted reasons why we are not quite ready for individual teacher incentives; the second of those reasons involves growth models. Compared to the status model (NCLB) of examining test scores, growth models have many advantages, the primary one being that comparisons are based on each student's individual growth, not on comparing last year's 4th graders to this year's 4th graders (a comparison of entirely different students, one of the major complaints of educators).
Within the growth model community, there is a huge push for value-added growth as opposed to growth measured by gains. A gains growth model compares scaled scores from one test to the next (assuming the tests are vertically scaled); a value-added growth model uses linear regression to predict a student's growth. Value-added advocates point to its statistical superiority, which is largely true: the model can single out the effects of various factors that impact achievement, allowing educators to compare students with other students of similar demographics.
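To make the distinction concrete, here is a minimal sketch using made-up scaled scores for five students. The gains model simply subtracts one score from the other; the value-added model here is greatly simplified to a single predictor (last year's score), with each student scored by the residual from a fitted ordinary least-squares line:

```python
# Made-up scaled scores for five students
# (pre = last year's test, post = this year's test).
pre  = [400.0, 420.0, 450.0, 480.0, 500.0]
post = [430.0, 445.0, 470.0, 515.0, 520.0]

# Gains model: growth is simply the change in scaled score
# (this assumes the two tests are vertically scaled).
gains = [b - a for a, b in zip(pre, post)]

# Value-added model (simplified to one predictor): fit an ordinary
# least-squares line predicting this year's score from last year's,
# then score each student by the residual -- how far they landed
# above or below the prediction.
n = len(pre)
mean_pre, mean_post = sum(pre) / n, sum(post) / n
slope = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post)) \
        / sum((x - mean_pre) ** 2 for x in pre)
intercept = mean_post - slope * mean_pre
value_added = [y - (slope * x + intercept) for x, y in zip(pre, post)]

print(gains)        # every student gained ground
print(value_added)  # residuals: some positive, some negative by construction
```

Note that ordinary least-squares residuals necessarily sum to zero, so some students come out positive and some negative even though every single scaled score went up.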
Problem 1: While value-added advocates tout the model's statistical superiority, there is little talk of its philosophical inferiority. Linear regression is a statistical tool that compares students against other students. It analyzes the score distribution of a large group (usually statewide data) and predicts where each student should have scored based on everyone else's results. In other words, it is the testing equivalent of grading on the curve!
If value-added models are applied to incentives for a smaller segment of the large group, say a dozen districts measured against state data, you could make a weak case that all 12 districts still have the potential to earn incentives. The limitations of value-added growth models surface when incentive programs are broadened to the level of the large group, say a state-wide program. Even if every teacher makes significant progress over the course of a year, half will land above the prediction and half below. It is the old game of having to score better than your neighbors to get your incentive.
Yes, value-added models can add in factors so that students are compared to others with similar demographics, but the scores are still based on variance from the mean. Students still have to beat the curve of their demographic group.
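As a rough illustration of that point, the sketch below uses made-up data and a deliberately crude stand-in for real demographic controls (one regression per group). Every student gets a genuine gain of 20 to 40 points, yet a large share of each group still lands below its own group's predicted growth:

```python
import random

random.seed(0)

# Made-up data: 200 students in two demographic groups; every student
# gains between 20 and 40 points, so everyone genuinely improves.
students = []
for _ in range(200):
    group = random.randint(0, 1)
    pre = random.gauss(450 + 30 * group, 25)
    post = pre + random.uniform(20, 40)
    students.append((group, pre, post))

def residuals(pairs):
    """Ordinary least-squares residuals of post regressed on pre."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    slope = sum((x - mx) * (y - my) for x, y in pairs) / \
            sum((x - mx) ** 2 for x, _ in pairs)
    intercept = my - slope * mx
    return [y - (slope * x + intercept) for x, y in pairs]

# Fit a separate regression per group -- a crude stand-in for a model
# that "controls for demographics" -- and score students by residual.
for g in (0, 1):
    pairs = [(p, q) for grp, p, q in students if grp == g]
    res = residuals(pairs)
    below = sum(1 for r in res if r < 0) / len(res)
    # Residuals average to zero within each group, so a large share of
    # the group is "below expected growth" despite universal gains.
    print(f"group {g}: {below:.0%} below their group's curve")
```

The point is not the particular numbers but the structure: within each demographic group, the residuals average to zero by construction, so someone must come out below the curve no matter how much everyone improved.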
Problem 2: Value-added is a complex "black box" process that isn't easy for educators to grasp or to translate into action at the individual level. If the essential information any growth model provides is a "yes-you-made-it/no-you-didn't" verdict, it gives educators few useful tools for working toward improvement.
Why should educators care? On the national scene, there is a growing push for broadening the utilization of incentives and using growth models as a major determiner of the incentive (Arne1 and Arne2).
The most valuable lesson from our own incentive project so far has been this observation: effective systems reinforce educators (administrators and teachers) working, sharing, and communicating together to improve the achievement of all students. The weaknesses of value-added models are systemic challenges that work against these principles. That is not to say gains growth models don't have just as long a list of problems when applied to incentives. But the issue is less about the merits of each growth model and more about the lack of depth in the discussion of how growth models are used to improve instruction (another future blog topic). Having a statistical model to differentiate the incentives is only a small part of the picture.
Nationally, we are diving in head first without enough thought to the systemic ramifications. It is vitally important that educators get up to speed on incentives and growth models so the right questions are asked before programs are implemented.