How much authority should we give to such work in our policy decisions? The question is important because media reports often seem to assume that any result presented as “scientific” has a claim to our serious attention. But this is hardly a reasonable view. There is considerable distance between, say, the confidence we should place in astronomers’ calculations of eclipses and a small marketing study suggesting that consumers prefer laundry soap in blue boxes.
A rational assessment of a scientific result must first take account of the broader context of the particular science involved. Where does the result lie on the continuum from preliminary studies, designed to suggest further directions of research, to maximally supported conclusions of the science? In physics, for example, there is the difference between early calculations positing the Higgs boson and what we hope will soon be the final experimental proof that it actually exists. Scientists working in a discipline generally have a good sense of where a given piece of work stands in their discipline. But, as I have pointed out for the case of biomedical research, popular reports often do not make clear the limited value of a journalistically exciting result. Good headlines can make for bad reporting.
Second, and even more important, there is our overall assessment of work in a given science in comparison with other sciences. The core natural sciences (e.g., physics, chemistry, biology) are so well established that we readily accept their best-supported conclusions as definitive. (No one, for example, was concerned about the validity of the fundamental physics on which our space program was based.) Even the best-developed social sciences like economics have nothing like this status.
Consider, for example, the report President Obama referred to. By all accounts it is a significant contribution to its field. As reported in The Times, the study, by two economists from Harvard and one from Columbia, “examined a larger number of students over a longer period of time with more in-depth data than many earlier studies, allowing for a deeper look at how much the quality of individual teachers matters over the long term.” As such, “It is likely to influence the roiling national debates about the importance of quality teachers and how best to measure that quality.”
But how reliable is even the best work on the effects of teaching? How, for example, does it compare with the best work by biochemists on the effects of light on plant growth? Since humans are much more complex than plants and biochemists have far more refined techniques for studying plants, we may well expect the biochemical work to be far more reliable. For making informed decisions about public policy, though, we need to have a more precise sense of how large the difference in reliability is. Is there any work on the effectiveness of teaching that is solidly enough established to support major policy decisions?
The case for a negative answer lies in the predictive power of the core natural sciences compared with even the most highly developed social sciences. Social sciences may be surrounded by the “paraphernalia” of the natural sciences, such as technical terminology, mathematical equations, empirical data and even carefully designed experiments. But when it comes to generating reliable scientific knowledge, there is nothing more important than frequent and detailed predictions of future events. We may have a theory that explains all the known data, but that may be just the result of our having fitted the theory to that data. The strongest support for a theory comes from its ability to correctly predict data that it was not designed to explain.
While the physical sciences produce many detailed and precise predictions, the social sciences do not. The reason is that such predictions almost always require randomized controlled experiments, which are seldom possible when people are involved. For one thing, we are too complex: our behavior depends on an enormous number of tightly interconnected variables that are extraordinarily difficult to distinguish and study separately. Also, moral considerations forbid manipulating humans the way we do inanimate objects. As a result, most social science research falls far short of the natural sciences’ standard of controlled experiments.