To get anywhere, or even live a long time, a man has to guess, and guess right, over and over again, without enough data for a logical answer.

Robert A. Heinlein

I’ve been thinking hard about the nature of education research and I’m worried that it might be broken. If I develop a theory but have no evidence for it, then it is dismissed as ‘mere speculation’. “Show me the evidence!” comes the crowd’s shout, and currently in the sphere of education, evidence is all. But can we really trust the evidence we’re offered?

Clearly, sometimes we can. I don’t want to be cast as dismissing all evidence. My point is that we place too much faith in it, and we might possibly be mistaken to do so. Especially in education. As Bertrand Russell pointed out, “The most savage controversies are those about matters as to which there is no good evidence either way. Persecution is used in theology, not in arithmetic.” So too, I contend, in education.

The following is an attempt to untangle Egan’s objections to the bloated claims of education research made in his fascinating but frustratingly self-indulgent tome, Getting it Wrong from the Beginning.

Let’s imagine we want to conduct some research on the effectiveness of a new teaching strategy (Strategy X). How would we go about it? Well, we’d probably want to test its effectiveness across a range of different groups of pupils, and we’d probably consider getting several different teachers to try it out. We’d also want a control group who didn’t get the intervention, so that we could try to establish what sorts of things happen without Strategy X. A particularly reputable researcher might also want to set up a double-blind trial to try to avoid such confounds as the Hawthorne Effect, but it’s pretty tricky to keep teachers in the dark about how they’re teaching their pupils, so in practice this is something that very rarely happens. We’d then need to decide our success criteria – how will we know if Strategy X works? For that we need something to measure, but what? Test results maybe?

OK, so we’ve set up our study and, guess what? It turns out Strategy X is effective! It works! The overwhelming majority of studies show successful implementation of ideas, frameworks, teaching materials, methods, technological innovations and so on. It doesn’t seem to matter whether studies are well-funded or small-scale, or are analyses synthesising other studies: almost everything studied by education researchers seems effective. Of course there are some studies which report failures, but they’re rare. We have acres of information on how to improve pupils’ learning – so much that it seems inconceivable that learning does not then improve. Yet almost every one of these successful studies has absolutely no impact on system-wide improvement. Why is this?

I’m sure readers will be able to point me in the direction of hugely important international studies which conclusively prove all sorts of things, and argue that clearly we are on the brink of major breakthroughs. This, as far as I can see, has been the case for decades, but again, I’m sure the studies cited will be claimed to be free of the misunderstandings and technological limitations which bedevilled previous studies. But where does it end? And, most importantly, when will we start reaping the rewards?

One of the problems we have is the limitation of the scientific method in telling us what’s effective in education. Biesta, who I’ve been critical of before, tells us education is so saturated with values, and so contested in its aims, that it cannot really be operated on in the same way as the physical sciences. We pay lip service to this fact but still make the mistake of believing that learning is a part of the natural world and therefore conforms to the same rules that govern the rest of nature. I’m not sure that’s true. Rather, learning is shaped by a combination of evolution, culture, history, technology and development, and as such it’s a slippery devil; scientific method has to be appropriate to whatever it’s being applied to. Methods may not be transferable between fields of study.

Another issue is that our methodology is not always properly matched to the problems we want to solve, resulting in what Wittgenstein described as a clash of “experimental methods and conceptual confusion”. As has been pointed out to me before, if the most reliable empirical evidence suggested that beating children was the most effective way to get them to learn, we would reject this finding as being both unpalatable and at odds with our values. Likewise, if a progressively aligned academic found rote drilling to be more effective than discovery methods, they would find it straightforward to dismiss the finding as too narrowly defined, or harmful in some other, less empirical way. This is something we all do. Any evidence, no matter how robust, has to align with our ideologies and values, otherwise it is useless.

And that’s a third problem: empirical evidence in education isn’t empirical in the right way. As Wittgenstein observed, “The existence of the experimental method makes us think we have the means of solving the problems which trouble us; though problem and method pass one another by.” Education research is founded on the proposition that it’s possible to establish causal links between discrete things, such as the link between Strategy X and pupils’ test results. But can it? That depends on the degree of conceptual confusion. Let’s say I wanted to conduct an experiment to determine how many students in Birmingham schools were under the age of 20. I could do all kinds of data analysis and design as many questionnaires as I pleased; whatever I found would be banal, because the causal connection I’m seeking to establish already exists as a conceptual connection. All I need to know is that school education in Birmingham ends at age 19 to work out that the existence of 20-year-old students is a logical impossibility. This is an obviously stupid example, but it would appear this is exactly the mistake made in much education research; it’s just that the pre-existing conceptual connection is more subtle, and the findings are pseudo-empirical. Egan offers the example of a research study attempting to establish how we should teach by using such principles as “To develop competence in an area of inquiry, students must a) have a deep foundation of factual knowledge, b) understand facts and ideas in the context of a conceptual framework, and c) organize knowledge in ways that facilitate retrieval and application”. He points out that a), b) and c) are definitions of ‘competence in an area of inquiry’. No amount of empirical research could ever demonstrate that these things are not connected!

Added to all this we have the research finding (O! The irony) that less than 1% of the education research that gets published consists of replication studies. (A replication study is one where researchers attempt to replicate results with different test subjects.) Now, apparently the majority of replication studies in education (68%) manage to replicate the original findings, but when replication studies are conducted by completely different teams of researchers, only 54% of studies are found to be replicable. A cynic might suggest that there’s a degree of vested interest at work here.
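To see just how little of the published record those figures actually verify, here’s a rough back-of-the-envelope sketch. The pool of 10,000 published findings is a hypothetical number of my own; the 1%, 68% and 54% rates are the ones quoted above:

```python
# Back-of-the-envelope: how many published findings ever get independently confirmed?
PUBLISHED = 10_000          # hypothetical pool of published findings (illustrative only)
REPLICATION_SHARE = 0.01    # less than 1% of published studies are replications
SAME_TEAM_RATE = 0.68       # success rate when the original team replicates
INDEPENDENT_RATE = 0.54     # success rate when an independent team replicates

attempts = round(PUBLISHED * REPLICATION_SHARE)        # studies that get re-tested at all
expected_same = round(attempts * SAME_TEAM_RATE)       # confirmed by the original team
expected_indep = round(attempts * INDEPENDENT_RATE)    # confirmed by independent teams
untested = PUBLISHED - attempts                        # findings never re-tested

print(f"Re-tested: {attempts} of {PUBLISHED} findings")
print(f"Confirmed (same team): {expected_same}; confirmed (independent): {expected_indep}")
print(f"Never re-tested: {untested}")
```

On these (hypothetical) numbers, only around 100 of 10,000 findings are ever re-tested at all, and barely half survive an independent re-test – leaving 9,900 findings resting on a single study.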

This might suggest that instead of relying so enthusiastically on evidence, we could instead put a little more faith in reasoning and analysis. If I present a reasoned analysis of why I think Strategy X is likely to be effective, with no supporting data, it’ll be dismissed as ‘mere speculation’. But my contention is this: I could conduct research on something that is analytically sound and ensure it cannot fail to produce favourable evidence. Yes, there will be all sorts of variation between different groups of students and their teachers, but where a teacher is enthusiastic, research will likely produce favourable findings. This seems obvious. If I can convince a teacher of the merits of Strategy X, they’ll work hard to get me the positive data I’m after, with no connivance needed. Similarly, if they were sure I was a charlatan, there’s no way they’d use Strategy X unless they were forced, and in that case the likelihood that the research findings would be positive is remote in the extreme.

Maybe, rather than being so quick to say, ‘the research shows…’ we might be better to formulate our thinking with ‘analysis has concluded…’? Of course we would still have to contend with just as much nonsense and dogma, but we’d waste a lot less cash!