I like the idea of comparing a student's score on a standardized test before and after the teacher, and adding in a long-term effect so that students' results 2+ years down the line also impact your evaluation. IMO, it's going to be hard to teach to this test and next year's test without covering the subject in some detail. Also, by inspiring students to really enjoy the subject you will improve their long-term performance. Basically, take 30 students at the 50th percentile up to the 55th, that's great, and if they keep scoring at the 55th percentile for the next 5 years you probably did a great job.
PS: Students at the edges will tend to regress to the mean, so you can still measure performance on a class that scored in the 95th percentile last year.
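To make the before/after comparison concrete, here is a rough sketch of how a regression-adjusted gain might be computed. Everything in it is an assumption for illustration: the linear shrink toward the 50th percentile, the year-to-year correlation of 0.8, and the function names are all hypothetical, not how any real evaluation system works.

```python
from statistics import mean

POPULATION_MEAN = 50.0  # midpoint of the percentile scale
YEAR_TO_YEAR_R = 0.8    # assumed correlation between years, illustrative only


def expected_score(prior, r=YEAR_TO_YEAR_R, center=POPULATION_MEAN):
    """Score we'd expect this year from regression to the mean alone."""
    return center + r * (prior - center)


def value_added(prior_scores, later_scores, r=YEAR_TO_YEAR_R):
    """Average gap between actual scores and regression-expected scores."""
    gaps = [
        later - expected_score(prior, r)
        for prior, later in zip(prior_scores, later_scores)
    ]
    return mean(gaps)


# A class that moves from the 50th to the 55th percentile.
average_class = value_added([50.0] * 30, [55.0] * 30)

# A class that started at the 95th and "dropped" to the 92nd.
top_class = value_added([95.0] * 30, [92.0] * 30)

print(f"average class: {average_class:+.1f}")
print(f"top class: {top_class:+.1f}")
```

Under these assumptions, the top class still comes out with a positive number even though its raw score fell, because it fell less than regression to the mean alone would predict. The same function could be run against scores from later years to get at the long-term effect.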
It's easy to train students to perform better on tests by teaching them things orthogonal to the actual material.
Further, you can "teach to the test" by giving students only the classes of problems that will appear on the test and drilling them on those and little else. I imagine that you could end up with lots of functional-illiteracy type issues, where students can only do a task in the tightly controlled circumstances they have been trained on but have difficulty generalizing.
A simple multiple choice test covering very limited subject matter is one option, but we can do better than that. I remember taking a very comprehensive series of tests when I was ~10 years old that included such things as attempting to sound out words that don't exist in English. Granted, the possibility of extremely high quality testing in this country is one thing; I suspect the actual creation of standardized tests in the public school system has become extremely political.
PS: I think AP tests are fairly good indicators of how well students know their subject matter. Even teachers who "teach to the test" still cover the subject to an acceptable degree.
Pretty much any test is teachable. I raised my ACT scores by 3 or 4 points with some studying. Even IQ tests are teachable. I believe taking the WAIS IV a second time will invalidate the result if it is taken sooner than six months after the first testing date.