The AfL debate: does it matter who’s right?

If you’re not already aware of my critique and Dylan Wiliam’s defence of formative assessment, I do recommend getting up to speed before reading this post.

Dylan’s defence rests on the idea that although we can never be sure what’s going on in a child’s mind, “teaching will be better if the teacher bases their decisions about what to do next on a reasonably accurate model of the students’ thinking.”

He makes a rather interesting and surprising point: it doesn’t matter that we can’t know what’s going on in our students’ minds because his “definition of formative assessment does not require that the inferences we make from the evidence of student achievement actually improve student learning”.

Hang on, what’s formative assessment then?

Here’s what Wikipedia has to say:

Formative assessment or diagnostic testing is a range of formal and informal assessment procedures employed by teachers during the learning process in order to modify teaching and learning activities to improve student attainment.[1] It typically involves qualitative feedback (rather than scores) for both student and teacher that focuses on the details of content and performance.[2] It is commonly contrasted with summative assessment, which seeks to monitor educational outcomes, often for purposes of external accountability.[3]

And while we could pick a few holes in this definition this is, by and large, what I was under the impression most people believed formative assessment to be.

Well, it turns out that maybe this isn’t what Dylan Wiliam believes. From his comment I think we can infer that formative assessment is about “increas[ing] the odds that we are making the right [teaching] decision on the basis of evidence rather than hunch”, and that “as long as teachers reflect on the activities in which they have engaged their students, and what their students have learned as a result, then good things will happen.”

The essential elements of ‘formative assessment’ would appear to be these:

  • We should make teaching decisions based on evidence.
  • We should reflect on the activities students have engaged in and what they have learned as a result.

On the face of it this advice is both sound and wise. No one is seriously arguing that we should avoid making teaching decisions based on evidence, or that it’s a bad idea to reflect on the process of teaching and learning. But, and it’s a big but, what evidence? And what learning?

If we are to accept that it is best to “use evidence about learning to adapt teaching and learning to meet student needs” we need to be pretty clear about what we’re doing.

Maybe the best evidence might be our understanding of cognitive science and the limits of working memory? Or possibly our knowledge of the role of retrieval induced forgetting? If we were to accept, for instance, Bjork’s concept of ‘desirable difficulties’ and his finding that current performance is a very poor indicator of future learning then maybe the very worst thing we could be doing is imagining that this evidence should be overthrown just because our students can respond to cues and prompts in the classroom?

If we were to accept these ideas about learning and memory then possibly the most sensible approach is to amass evidence on the most effective ways to teach before embarking on a teaching sequence, and then to reflect on how successfully we believe we stuck to these principles. To refute this Dylan cites the example of car manufacture and the Japanese practice of “building in quality” to the manufacturing process. He says, “Similarly, in teaching, while we are always interested in the long-run outcomes, the question is whether attending to some of the shorter term outcomes can help us improve the learning for young people.”

Well, I’d contend that it is overwhelmingly more complex to see whether a student has learned something during a lesson than it is to check whether a car has been sufficiently well constructed. Any attempt to elicit evidence from students during the teaching process is fraught with difficulty, and the only really helpful response is one along the lines Dylan refers to on the properties of wood and metal. As he says, knowing a student is labouring under a misapprehension is better than not knowing it. Definitely. But what about when students are able to answer your questions? If we are to take this as a measure of our teaching’s effectiveness then we could be sadly and spectacularly mistaken. Finding out that students know the right answer during a lesson is the most useless piece of feedback we can get. Who cares? It is only in ascertaining whether a change has taken place over the long term that we will get any useful feedback on the effectiveness of our teaching.

So in conclusion, all that formative assessment within lessons can tell us is what students haven’t learned, never what they have learned. That’s not to say that it isn’t extremely useful to check out misconceptions and reveal areas of ignorance, but it might be incredibly damaging to use formative assessment in lessons as justification for believing we can ‘move on’ in the belief that we know with any certainty whether students are ‘making progress’.

Does any of this matter? Yes, I think it does. If Dylan’s right then we should carry on with his ‘5 strategies and one big idea’. But if I’m right, then formative assessment in lessons (a long-winded way of saying AfL) could be counterproductive and prevent us from doing what is best. Maybe the ‘nose for quality’ we need is in searching out the most effective ways to teach.

57 Responses to The AfL debate: does it matter who’s right?

  1. Ian Lynch says:

    How many angels can dance on the head of a pin? Look, this is really quite simple. Assessment can determine what someone knows and can do as well as what they don’t know and can’t do. Whether or not the learning took place in that lesson might or might not be possible to determine. The information gleaned from this process can be useful in informing future teaching. That’s it. What is it with the need to over-complicate things?

    • David Didau says:

      If Dylan (and you) are right then formative assessment is essential for testing how effective our teaching is.

      If I’m right, then formative assessment will just prevent us from doing what is best.

      I’m happy for you to disagree but it really isn’t productive to berate me for blogging. You have been warned 😉

      • Martin Harris says:

        Could you answer a few questions please?

        What should be used to enable teachers to do what is best?

        Is this approach feasible or utopian?

        Or is rote learning of facts and testing of said facts the only way to judge what has been learned?

        • David Didau says:

          Well, I think the best guide to how we should teach is by following the research on what cognitive science has to tell us about learning. With so many bloggers trying to digest this for us, yes I think this is feasible.

          Are you under the impression that I’m advocating rote learning and testing? I linked in my original post to my ideas on a proposed teaching sequence:

          While we’re at it, what distinction would you draw between testing and assessment?

      • Ian Lynch says:

        Where did I say it was essential? I said what it is and that it can be useful. You are free to blog what you like, I’m simply commenting on what seems incredibly obvious. On your logic it is a disadvantage to check whether someone has grasped something before moving on, or ok to carry on teaching them the same stuff when they already know it. The argument that current level of knowledge is a poor predictor of future learning is a confusion of change and rate of change.

        Let’s take learning to lift weights. Let’s not bother testing the pupil’s strength, let’s just put 300 kg on the bar or maybe 30 kg since we’d better start low and current strength is no indicator of future gains. It’s a very, very well proven fact that to optimise the gains you need to stress the person optimally and feedback from quantitative testing is the only way to do that efficiently. Before you say brains are different to muscles, I agree they are but PE is a legitimate educational activity and you haven’t specified to which education fields your theory applies. So even as a minimum you need to say what scope you are considering.

        Another example. I set a test recently and found one group of children could not spell professional, another group could. Now do I feedback to the former group that they need to learn to spell professional or not? After all that is an instance of AfL informing what is taught in the future. Do I need to feedback how to spell professional to the second group when they already know how? We can worry about why one group can spell and the other can’t but the immediate issue is identifying an issue and fixing it. Without the assessment how would that ever come to light?

        • Martin Harris says:

          I suppose I am a little wary of Round Earthers. Those that follow new science and believe that is the truth. Until a new truth arrives.
          In the end teaching is about many things and how we ensure the best progression is down to many factors. However, if children are taught how to learn, how to self-evaluate, reflect and target improvement, it helps. Sequencing will help, I suspect. What also helps is engagement, resourcing, smaller teaching ratios in certain socio-economic situations, etc.
          I believe there is not one answer, but many that can help. The best teachers will know what works for the circumstance.
          No Round Earthers please ;)

          • David Didau says:

            You’d prefer Flat Earthers perhaps?

          • Martin Harris says:

            I didn’t expect such a response. A little terse to a genuine response.
            No, Flat Earthers are not a preference; they are more obvious. My point seems to have touched a nerve.

          • David Didau says:

            Not at all Martin – maybe I’m misunderstanding something but what’s wrong with people who believe the Earth is round? Is this an argument that actually the earth is an oval and not a sphere or something? Surely anyone who believes the Earth isn’t round is deluded and you appear (?) to be inviting delusion?

          • Ian Lynch says:

            Of course, scientifically the earth is an oblate sphere. It approximates to flat and round in particular circumstances. Playing the science card is only safe if you understand the role of context in uncertainties, precision and accuracy 😉


  3. Dylan Wiliam says:

    In his post “The AfL debate: so who’s right?” David Didau strains mightily to maintain the idea that we are in disagreement, but I can’t honestly see anything that he has written that I disagree with, and he seems to find disagreement with aspects of AfL that I have never endorsed.

    So, first, I try not to use the term “assessment for learning” or AfL. I found it unhelpful before the government really screwed it up by making a watered-down form of the research on formative assessment part of the national strategies, and as Randy Bennett (2011) has pointed out, equating AfL with formative assessment is unhelpful, and the lack of clear definitions is making reasoned debate in this area difficult.

    The Wikipedia entry for formative assessment is of little help here being a listing of different views of the subject, with little attempt to synthesize them. I have probably spent more time than most people thinking about how to define formative assessment, and while I don’t expect everyone to agree with me, presenting my definition will at least allow discussion to take place on an agreed basis.

    My definition of formative assessment proposes that an assessment functions formatively:

    “to the extent that evidence about student achievement is elicited, interpreted, and used by teachers, learners, or their peers, to make decisions about the next steps in instruction that are likely to be better, or better founded, than the decisions they would have taken in the absence of the evidence that was elicited.” (Black & Wiliam, 2009, p. 9)

    Here are some important features of this definition:

    1. The definition of formative assessment does not entail any view of what students should be learning—it can be used to improve the learning of anything. Most of the psychological work on memory has been done in laboratory settings on learning that is easy to test. Sometimes, we want to develop dispositions or attitudes that are very different from the kinds of knowledge that feature in most psychological studies.

    2. The definition of formative assessment does not entail any view of what happens when learning takes place—in other words, it is independent of any psychology of learning.

    3. The distinction between summative and formative is grounded in the function that the evidence elicited by the assessment actually serves, and not on the kind of assessment that generates the evidence.

    4. Anyone—teacher, learner or peer—can be the agent of formative assessment.

    5. The focus of the definition is on decisions, not evidence.

    6. The decisions are about “instruction” which may sound to UK readers as espousing a transmission approach to teaching, but is here used in the US sense of the planned engagement of students with activities that will result in learning.

    7. The definition does not require that the inferences about next steps in instruction are correct. Given the complexity of human learning, it is impossible to guarantee that any specified sequence of instructional activities will have the intended effect. All that is required is that the evidence collected improves the likelihood that the intended learning takes place.

    8. The definition does not require that instruction is in fact modified as a result of the interpretation of the evidence. The evidence elicited by the assessment may indicate that what the teacher had originally planned to do was, in fact, the best course of action. This would not be a better decision (since it was the same decision that the teacher was planning to make without the evidence) but it would be a better founded decision.

    On this basis, I can’t see any basis of disagreement with what David is saying. On the other hand, if David can suggest any reasons why the following five strategies:

    Sharing with students what they are meant to be learning
    Finding out what they are learning
    Providing feedback that promotes further learning
    Enabling students to support each other in their learning
    Helping students to be more active independent learners

    are, in principle, bad for students, then we may have grounds for a serious disagreement…


    Bennett, R. E. (2011). Formative assessment: a critical review. Assessment in Education: Principles Policy and Practice, 18(1), 5-25.

    Black, P. J., & Wiliam, D. (2009). Developing the theory of formative assessment. Educational Assessment, Evaluation and Accountability, 21(1), 5-31.

    • David Didau says:

      I must say Dylan, I really appreciate you taking the time to help me out with all this – I’m learning an awful lot. You say I’m “straining mightily” to put some distance between our positions, but in fact the only mighty straining I’m doing is in trying to understand what is and isn’t important for effective teaching.

      Sorry for continuing to use the phrase ‘AfL’ – I recognise that you stopped using it some time ago, but in the minds of most teachers AfL and formative assessment are synonymous. I realise that perhaps it has become a toxic brand due to government interference and enforcement and so I will, from now on, confine myself to your preferred term.

      I chose the Wikipedia definition, despite its limitations as both easily available and widely held. If you google “formative assessment” it’s the first term that comes up and will therefore be revealing about what most people believe. I’m genuinely startled that your definition has moved on so far; possibly the word ‘assessment’ is misleading?

      The definitional points you outline are fascinating and worth exploring and sharing more widely as I suspect they may be little known or understood. I may need to think about them some more but on the face of it, as you say, it appears we have little cause for disagreement.

      The next job is to demonstrate how the 5 strategies can be aligned with this refined understanding of formative assessment. That will be the subject of a separate post.

      Many thanks, David

      • Dylan Wiliam says:

        Actually, David, I don’t think my thinking on formative assessment has evolved much in the last ten years. It’s just that most of my writing has been in academic journal articles rather than in sources more easily accessible to teachers.

        The phrase “assessment for learning” was first used by Harry Black in 1987 and adopted by the Assessment Reform Group in 1999 but Paul Black and I never used it much. We did entitle our 2003 book “Assessment for learning: putting it into practice” partly because of the popularity of the term but also to emphasise that what was important was that the assessment evidence was acted upon. Assessment for learning describes the intent behind the assessment process, but too often, assessment evidence is collected, but not used. Since about 2004, we have suggested that assessment for learning becomes formative assessment when the assessment evidence is used in an attempt to increase learning.

        Interestingly, my AERA paper this year (on my web-site) tries to demonstrate why I still think it’s fruitful to regard the processes we are talking about as an assessment process. Fundamentally, assessments are devices for improving decisions. When the decisions relate to the status of students, the assessments are functioning summatively. When the decisions relate to the optimisation of future learning, they function formatively.

        • David Didau says:

          Thanks Dylan – I’ll look up your paper and give it a read.

          Also, much to muse on re. decision making

        • mrbenney says:

          “too often, assessment evidence is collected, but not used” – oh absolutely. This to me is where AfL becomes gimmicky. It’s not collecting the information but doing something with it that is important. If formative assessment can only tell us what pupils can’t do (and highlight misconceptions) that is more than enough for me.

    • Ian Lynch says:

      Hattie found feedback to be the attribute with the most significant effect size. How can feedback be given without assessment? What is AfL if it’s not assessment and feedback? If the claim to want to use evidence from research is genuine then let’s not cherry pick one source that is two steps removed from the classroom and focus on empirical data directly taken from the classroom. Better still state the contextual limits on that data clearly and unambiguously. That would be getting somewhere towards a scientific approach. Anyone claiming the scientific high ground needs to demonstrate that their arguments hold scientific water.

      • David Didau says:

        If you want to comment on feedback, why not comment on this post in which I try to unpick Hattie’s findings:

        Currently I’m writing a book on all this. If at the end of this lengthy process you feel I’ve cherry picked evidence two steps removed from classrooms and ignored actual classroom practice I will have some sympathy. For now, I am groping my way to a fuller understanding. I’m not claiming anything; I’m merely asking “What if…?” about a question you claim is “incredibly simple”. To my mind it is anything but simple, a point well made by Dylan.

        Bjork (1999) suggests that both teachers and students “gain illusions of competence based on their interpreting good performance during the acquisition as evidence that learning, as measured by long-term retention and transfer, has been achieved.”

        Let’s take the two examples you gave in an earlier comment. Firstly the weight lifting example. Interestingly, the evidence on feedback in the motor domain is far more compelling than in the verbal domain; the separation between performance and learning here seems even more stark. Consider for instance Christina & Bjork (1991); Lee (2012); Schmidt & Bjork (1992); Schmidt & Lee (2011); Wulf & Shea (2002); Baddeley & Longman (1978); Goode & Magill (1986); Hall, Domingues & Cavazos (1994); and Lin, Wu, Udompholkul & Knowlton (2010) for starters.

        I could go on…

        Your spelling test example is exactly the type of activity I had in mind when writing the post. If I find out someone doesn’t know how to spell a word, that is useful: I can then remediate. If I find out someone can spell a word without having taught them, then this will have usefully established prior knowledge. But if I find out someone can spell a word based on what I have just taught them, I have learned nothing.

        • Ian Lynch says:

          My puzzle is why you would compartmentalise feedback as something completely separate from AfL. To me and I suspect most others the two are fundamentally linked.

          • David Didau says:

            Oh is that your puzzle? I wouldn’t have, but that’s what Dylan has suggested in his definition of formative assessment. He chose to define it as being where “evidence about student achievement is elicited, interpreted, and used by teachers, learners, or their peers, to make decisions about the next steps in instruction that are likely to be better, or better founded, than the decisions they would have taken in the absence of the evidence that was elicited.”

            As this post is based on a conversation between me and Dylan it seemed most reasonable to engage with this definition. By all means write your own blog setting out what you think.

        • I have no comment on the substantive debate going on here – I just want to say what a great quote that is from Bjork – hits the nail on the head.


  5. Nigel W says:

    I have to confess to being slightly confused by this, David. I enjoy much of what you write, but this obsession with the effectiveness of AfL seems a distraction and, if one were cynical, just seems like you’re trying to pick a fight for the sake of it.

    In addition to classroom teaching I coach sport (once at a top level, too old now!). Formative assessment is not an event. It is an ever-present state of mind of the educator. A vigilant observer who absorbs all cues to inform the next action. Whether that assessment be based on recall ability, skill, or even emotional readiness for learning.

    There is no debate for me here, start teaching from where the learner is – Always.

    • David Didau says:

      Clearly you are a cynic Nigel :). I’m not picking a fight at all, in fact I wasn’t under the impression I was having a fight? For me, this is an incredibly useful exercise in refining my understanding and I’m very grateful to Dylan for indulging me.

      My contention is that the “ever present state of mind of the educator” may be leading us down the wrong road. Of course you’re welcome to dismiss this as a distraction but if I’m right it would seem to be of fundamental importance.

      I agree that we ought to start teaching from where the student is in terms of prior knowledge; any new instruction will make so much more sense if it is relevant after all, but from that point on I think we should design instructional sequences based on what cognitive science reveals. Therefore we should carefully plan an interleaved and spaced curriculum which acknowledges that students forget at a fairly predictable rate and will only learn (no matter what our vigilant observation of their current performance tells us) with appropriate retenda reminders.

      Anyone who takes the position that there is no debate might be guilty of putting their head in the sand. There is a debate; I’m having it. You can ignore it, but it seems foolish to deny its existence.

  6. Keith says:

    But surely we would therefore need to use formative assessment in order to elicit which students have forgotten and which have not? I think your disagreement seems to be that we should not use evidence to make a decision that the piece of information/skill has been learnt and as a consequence move on (never to return). We will need to return to develop, refine and re-teach as necessary. But it is better to use some form of formative assessment in order to make an informed judgement on the steps we need to take and with whom, surely? Students do forget at a fairly predictable rate ‘on average’ and this is why the formative assessment is essential.

    • David Didau says:

      Is it better to use some form of formative assessment in order to make an informed judgement on the steps we need to take and with whom?

      Well that rather depends on whether you believe that current performance is a useful indicator of future learning. If it’s not, how can the judgement be informed? If forgetting is predictable, why spend time finding out what you already know?

      • Alx w says:

        Current performance is some kind of indicator of future learning in that it
        A) demonstrates that the individual has the capacity to correctly understand what they are meant to be learning
        B) confirms that the initial ‘chunking’ of this new knowledge was done correctly

        As importantly, this assessment and feedback loop allows us to reflect on what has NOT been understood, that which we have learnt or, worse, that we have learnt incorrectly. Surely that is vital for your own interleaved model, in so much as it informs aspects of your teaching that you may not have emphasised previously when first covering.

        Finally, how would you look upon assessing knowledge a week or two weeks after the initial teaching to confirm or disconfirm success? Is your objection to AfL simply around a narrow focus on assessing at the time of teaching a concept?

        • David Didau says:

          Yes – the last point. I’m only objecting to the use of formative assessment as an in-class decision-making guide.

          That being the case, maybe we should concentrate more on diagnostic assessment and on providing formative feedback for summative assessments.

          • alx w says:

            But do you not think formative assessment might be useful in class decision making, to tell us what students have not understood, rather than using it as proof that they have understood? Moving on does not have to presume that they have learnt that idea, simply that they can at least respond correctly to what you have been teaching.
            Evolving the position, it seems to me that formative assessment in lessons is useful to dispel assumptions that teachers (including myself) make that students have even understood what they have just covered. If this part of learning has not been ‘transmitted’ correctly then no amount of cognitive psychology is going to help.
            Let’s call in-class formative assessment part of phase 1 – do the students understand the content? What misconceptions do they hold?
            Our understanding of how well the initial concept is processed will then help us to direct our subsequent teaching (in lesson or across lessons) as to what proportion may be required to reground the concept and challenge misconceptions, and what proportion to give to the process of securing it in memory and building strong retrieval success. Phase 2.

            Alongside this, both the AfL and the diagnostic assessments can help provide formative feedback. The former will be around how well do I understand this, and the latter a combination of how well do I understand this and how well have I remembered it.

            Not sure your position automatically dispels the need for some element of AfL, although I absolutely agree that the idea that, on its own, it is the answer is mistaken. I think that maybe the latter is your position? That if Dylan Wiliam is suggesting that it is the sole use of assessment (which seems to have been one interpretation of some of his work) we have a disagreement?

  7. Nigel W says:

    Me, cynical… certainly! I have a predisposition to assume that everything can be improved upon. “Repetition is the mother of all learning” a friend of mine used to quote. Is that what you are suggesting? That irrespective of whether a learner demonstrates competence they should be encouraged to re-visit the skill / knowledge regularly to help move from short term to long term memory? If so, I completely agree. Understanding is the critical bit though, isn’t it? In sport we might talk about closed practice: a technique used in isolation in a constant environment. I think parallels can be drawn with many skills in English. Only when we move into open practice, where the environment can be unpredictable, can we see if the learner has developed skill as opposed to mastering a technique.

    Like Dylan, I don’t see much to disagree with, so I’m still confused where the debate is. Look at what people can or can’t do and structure learning appropriately. To assume that a person is competent because they demonstrated a successful outcome once is not enough to convince me that they have “got it”. Can they apply what they have learned when it matters? – when the chips are down if you like. Only when they can demonstrate they have the understanding convincingly should they move on, and even then we don’t abandon the learning, we frequently revisit it as we use it as a launch pad for the next part of the learning.

    I think it’s great to question what goes on. I perpetually scrutinise what I do and how I do it. It’s exhausting, but means I can sleep at night believing I am always doing what I think is the best thing.

    So I’m not ignoring the fact you want a debate, but I’m still a bit unclear which bit is contentious 🙂

    • David Didau says:

      Yes, I am saying “that irrespective of whether a learner demonstrates competence they should be encouraged to re-visit the skill / knowledge regularly to help move from short term to long term memory”.

      The contentious bit is the finding that decreasing current performance increases long term retention and transfer (learning). Most teaching is predicated on the belief that if we improve performance we will improve learning. If this isn’t true, then that seems antithetical to the practice of formative assessment because evidence of struggle and misunderstanding might in fact be desirable.

      I may have fallen down a rabbit hole with this but I at least want to explore it thoroughly while I’m here.

  8. Heidi says:

    Hi David, I think this could link to your summary? I know of a school where they were relying heavily upon exit polls (as a whole school approach) to adapt future lessons to try to pitch them at the correct level for students/groups of students. They have now pulled the plug on that practice and felt that young pupils and those with SEN were not assessing themselves accurately at all. Heidi

    • Ian Lynch says:

      Any school that relied only or heavily on self-assessment by inexperienced and SEND pupils as the main means of informing planning is at best professionally naive. It doesn’t mean there is no value in getting children to reflect on what they have learnt.

      • Heidi says:

        Yes I agree and some children are very good at self assessment. I lecture to trainee teachers on the area of formative assessment. I am reading this to check I am approaching it from a helpful angle.

        • David Didau says:

          This is what Dylan said:

          “…in terms of self-assessment, it is, of course, tragic that in many schools, self-assessment consists entirely of students making judgments on their own confidence that they have learned the intended material. We have over 50 years of research on self-reports that show they cannot be trusted. But there is a huge amount of well-grounded research that shows that helping students improve their self-assessment skills increases achievement. David specifically mentions error-checking, which is obviously important, and my thinking here has been greatly advanced by working (in Scotland) with instrumental music teachers. Most teachers of academic subjects seem to believe that most of the progress made by their students is made when the teacher is present. Instrumental music teachers know this can’t work. The amount of progress a child can make on the violin during a 20 or 30 minute lesson is very small. The real progress comes through practice, and what I have been impressed to see is how much time and care instrumental music teachers take to ensure that their pupils can practice effectively.”

          Does that help?

  9. Keith says:

    Current performance is used in order to make an assessment on the steps you need to take with the individual. If I have remembered something from the last time we studied something but my peer has not surely we are now at different ‘starting points’.

    “I agree that we ought to start teaching from where the student is in terms of prior knowledge; any new instruction will make so much more sense if it is relevant after all,”

    But it is new instruction when the subject content is met again. We need to know where each student is starting from. Surely this is formative assessment.

    I think you missed my point on students forgetting at a predictable rate. This is ‘on average’ – no use for when I have a class of 30 who will be displaying the full range from this average. I need to assess what they know so I can build on it effectively.

    • David Didau says:

      I didn’t miss your point – I understood what you meant but this is where, perhaps, we part ways. I’m in favour, broadly, of whole class instruction that makes assumptions about the needs of most learners. If I assume that they need a spaced reminder after 3 days, this will be true for all – memory may well have ‘decayed’ differently for all but everyone will benefit from encountering the material again – overlearning is an important concept here. This being the case, it is a waste of time for me to find out what they might have remembered as I’m continuing with my expertly designed scheme of learning regardless.

  10. Keith says:

    We do part ways here. I agree they all will definitely benefit from encountering the material again. In fact I see this as very important. However within the expertly designed scheme of learning I would like to have some way of ensuring there is specific re-teaching for those who are now really struggling and more advanced materials for those who have ‘remembered more’. I don’t see finding out as a waste of time but essential in ensuring that each student is faced with challenging material. This for me is where formative assessment becomes essential.

    • David Didau says:

      Hence the debate. If you’re right, AfL becomes essential; if I’m right, it’s a waste of time. I can ensure each student is faced with challenging material by considering difficulty and complexity when designing my instructional sequence.

      • Dylan Wiliam says:

        This is, of course, an empirical question. For what it’s worth, my own experience leads me to the belief that because of the complexity of the learning process, the actual effects on students of an instructional sequence, no matter how well designed, will not be predictable, at least at any level of specificity useful enough to guide teaching. And that’s why I think it is essential to build checks on understanding into the instructional sequence.

        There’s also the issue of grain size, which seems to vary from subject to subject. In mathematics, if I want to teach adding fractions tomorrow, I need to check that students can generate sequences of equivalent fractions today. In history, however, because understanding of cause and effect, of chronology, and of documentary provenance develop over a much longer timescale, and students’ ability to demonstrate these capabilities is messy (in that they can often do things in one context but not in another) frequent checking on understanding may well be a waste of time.

        One final point. In the “Children’s mathematical frameworks” research done at Chelsea/King’s College in the 1980s, it was found that for a typical topic being taught in KS3:

        one third of the students already knew the content before the topic
        one third of the students still didn’t know it at the end
        one third of the students actually learned the intended material, but half of these had forgotten what they had learned within three weeks.

        However, a number of students who failed to demonstrate understanding of the content at the end of the topic were able to do so three weeks later, despite having had no further teaching on the topic. Somehow, the teaching they had received was incorporated into their thinking over the three week period after the topic had been concluded. It’s all very complicated…

        • Ian Lynch says:

          Which rather reinforces the fact that making generalisations about teaching methods across the whole curriculum based on isolated aspects of brain physiology is both unscientific and dangerous.

        • David Didau says:

          Again, thanks: I also need to read that study. It seems to corroborate the research of Graham Nuthall who found something similar with students studying science. He said “…students know, on average, about 50 percent of what a teacher intends his or her students to learn through a curriculum unit or topic. But that 50 percent is not evenly distributed. Different students will know different things, and all of them will know about 15 percent of what the teacher wants them to learn.” (Hidden Lives of Learners p35)

          He went on to observe that students, no matter how good their understanding in the lesson, only seemed to transfer knowledge from working to long term memory if they encountered it at least 3 times in 3 different ways.

          I’m also interested in the ‘grain size’ you mention. I’m sure you’re right that learning & memory are different in different domains but in the example you give, does it necessarily follow that just because students can generate sequences of equivalent fractions today that they will still be able to do it tomorrow? I would suggest that this will be the case only if you didn’t teach equivalent fractions today. And if that’s the case isn’t the assessment diagnostic rather than formative?

          • Ian Lynch says:

            Perhaps the reason most exams have traditionally had a 50% pass mark is that this is the amount of knowledge most people retain from the 100% intended at that stage of their learning. I think probably emotional experience at the time is a way of transfer to long term memory.


          • David Didau says:

            I don’t disagree that emotion has a role to play. But so does thoughtful application of cognitive science.

            The average figure of what most people are likely to retain is nearer 30% according to Ebbinghaus. Where do you get the 50% figure?

          • Ian Lynch says:

            I think it is a mistake to separate them. Hydrogen peroxide has a lot of oxygen in it. Putting it with manganese dioxide releases the oxygen. You can get oxygen out of hydrogen peroxide in other ways; it’s just a lot less easy. Studying cognitive science in isolation when the interest is in wider learning that involves all, not just one part of the brain is like studying dynamics in the absence of friction and expecting to get reliable results in real world behaviour of moving objects. There is a time and a place for controlling variables in science but all variables need to be understood and their interactions to apply the science effectively in technological solutions. Teaching is a technological solution, not pure science.

          • David Didau says:

            Well, yes. Trouble is, a great many teachers exclude what science has to teach us (either wilfully or through ignorance). Many (most?) teachers rely on an intuitive feel for what will work best. Sometimes they’re wrong. My point is that allowing ourselves to be informed by the science of how we learn should be a no brainer.

          • Ian Lynch says:

            Yes, my problem I suppose is that as a scientist it is rather like an English specialist seeing people complain that grammar has gone to hell in a handcart and then suggest solutions with incorrect or at best extremely limited understanding of grammatical methods.

            These are essentially political debates. Just because one side claims science does not mean they are any more scientific than the other when they apply the science invalidly. If I had a £ for every politician that started with “The fact of the matter is…” I’d be a rich man. Science is about caution, scepticism and trying to highlight where theories go wrong.

          • David Didau says:

            I’m all for caution, skepticism and identifying errors. And I’m doing my best to make sure my understanding of cognitive science isn’t incorrect or limited. I’m doing this by reading everything I can get hold of. But, by necessity, my understanding is of course limited. Twas ever thus. But are you saying my reading is incorrect?

          • Ian Lynch says:

            I don’t know the extent of your reading but I wouldn’t limit it to one aspect of the brain important as it might be. If we were looking at a computer and trying to get it to operate most effectively, reading all there is to know about the Central Processor would be of limited use even though that component is central to a computer’s performance; it’s only central in the way it works with other components. Or take astronomy. Most obviously studying stars but visible light only provides a limited part of the picture. And in the visible range, visible light interacting with a mirror is the engineering required to see detail not just knowing the nature of light itself. This is why the relationships between knowledge are at least as important as learning more knowledge. It’s obvious that some knowledge is needed to start with but that isn’t particularly interesting in itself. Einstein wasn’t that great at maths compared to the best mathematicians of the day but he had enough maths to combine with imagining extrapolations to extreme physical scenarios like travelling close to the speed of light or sitting in a very strong gravity field. It was the combination of these that resulted in the theories of relativity not only more and more knowledge in one focused field.

  11. Keith says:

    Got to say I’m with the first paragraph from Dylan on this. You could design an instructional sequence that considers difficulty and complexity but surely the effects on an individual will not be predictable? I know I’m repeating myself now but some form of assessment is therefore needed to ensure the next instructional steps you take are appropriate.

  12. Michael Dorian says:

    David, I think what you have to say has huge value. Well designed instructional programmes which build in opportunities for learners to encounter material on several occasions in differing contexts clearly address the issues of the forgetting curve which all teachers know everyone suffers from. However, before designing such instructional programmes, and I think you may agree with this, a teacher should use their past experiences to design and decide how instruction should occur. Moreover, this design of instruction, from my perspective at least, is based on my reflections from over ten years of pupils studying the subject I teach. For me, what I am doing is using formative processes to design my instruction which I think Dylan Wiliam is arguing to some extent. I know this sounds like I am sitting on the fence, but in my opinion everything that has been argued above has value in assisting pupil learning.

  13. Ali Messer says:

    Fascinating stuff! I agree with Ian Lynch that differences in subjects matter. This is because to plan for or assess learning we don’t just need cognitive science and formative assessment practices. We need to understand subjects as forms of thinking or knowledge, and what progression means in these forms of knowledge.
    As Dylan Wiliam says there are some ideas in History that can develop in different ways and at different speeds in different pupils and these may be uncoupled from each other. This may be messy but it is still worthwhile in History to explore student understandings frequently. These checks form a common part of any History lesson that is part of what Neil Mercer (in another context) calls the long conversation that a teacher has with pupils over a whole year.
    Key concepts can then be revisited, but in the context of different topics, and understandings developed further… In this way teaching is informed by learning, not just by testing what is remembered, but also by designing activities in which students reason historically using what they remember.
    We must attempt to assess this reasoning to plan for progression, and we will need to reconsider how we teach a lesson sequence on the basis of what our attempts at assessment suggest. Our attempts may be incomplete and flawed but research into children’s learning in History (Lee, Wineburg, and others) suggests that if we do not consider what is being learned as we attempt to teach, then children may take away from our lessons many things we did not intend them to learn.

  14. Well I seem to have missed the boat on this one – what was I doing last April? But I can’t resist adding some more thoughts, hoping that they are not too late – and partly because this strikes me as an important discussion which really shouldn’t have petered out a year ago.

    Having said that it is an important debate, I also sympathise with Nigel W when he says: “I’m still confused where the debate is”.

    I agree with David that most people learn in similar ways, broadly speaking, so that repeated review and practice is likely to be justified for all learners within a particular learning sequence – given that they all learnt it to a similar level the first time round. If some people already knew the material at the beginning of the sequence, the repetition may well quickly become demotivating.

    Although we all benefit from a similar approach to repetition, we are likely to “understand” or model new information in very different ways, generating very different types of misunderstanding. So some way of addressing those different misconceptions is surely also important. The process of learning might be similar but the “cognitive content” is likely to be very different.

    I also agree with David that capability is invisible and can only be inferred through performance. But the loop can also be closed: an inference of capability is predictive of future performance, and if that future performance does not materialize, then we might conclude that our inference was faulty. Another explanation is that the capability has degraded – but that can be checked independently. The point is that inference is not guesswork – it can be checked – and if it is done on the basis of repeated sampling, its accuracy becomes increasingly reliable. So I don’t follow David’s reasoning in saying that, just because performance and capability are different, they are not closely related. And I am baffled by the statement that:

    Most teaching is predicated on the belief that if we improve performance we will improve performance [you mean capability/learning?]. If this isn’t true, then that seems antithetical to the practice of formative assessment because evidence of struggle and misunderstanding might in fact be desirable.

    Taking this backwards: experience of failure/struggle/acknowledged misunderstanding is how we improve both performance and capability. In fact, we only experience these things through performance. And it doesn’t work that we try something, screw it up completely, reflect on what happened, and then walk away saying to ourselves, “Ah, now I understand how to do it”. We have to keep on practising (performing) until we get it right. So I do not see any rhyme nor reason to say that performance and capability are not closely related, or that believing in the power of performance suggests that failure is not an important part of the learning process.

    I *do* see that different pedagogies might suggest different amounts of failure. One might coach the student by locking them into a harness which rehearses the muscle/mental movements exactly until they are thoroughly drilled. Another might allow them to pursue different hypotheses and come to their understanding through trial and error. I suspect that different combinations of the two approaches may be appropriate in different situations – and my hunch is that in our current belief system we underplay the value of coaching – getting it right first time – in favour of a more liberal but in some circumstances confusing, “you’re on your own, mate” approach. But that is down in the weeds – they are both implementations of a practice-assessment-feedback cycle, aiming at a defined objective.

    Similarly with self-assessment and peer-assessment – it’s down in the weeds. I agree with Dylan that much self-assessment might be wrong: this is why connectivism, social networking and the so-called wisdom of the crowd are so unhelpful in K-12. But it may be helpful in focusing the mind of the students on the difference between their own performances and the objective – this might aid the process of reflection which I suspect is important. It might also be a useful way of scaling feedback and interaction, which is always a challenge. Even incorrect feedback might be better than no feedback at all if it prompts reflection and self-examination, particularly if authoritative feedback comes along later. It strikes me as a very useful weapon in the armoury if it is properly structured – I like Eric Mazur’s approach. This puts peer-assessment (which I am not distinguishing from peer instruction) within a structured environment and enables it to scale, not replace, feedback from and interaction with the teacher.

    Do you tailor instruction on the basis of your monitoring of student performance or do you plough ahead regardless with your preferred programme of study? I suspect a bit of both. I suspect that you are agile at micro level (Johnny is confused because he thinks y – this misconception needs to be identified and addressed), and at the macro level (Johnny cannot do multiplication or subtraction so he is not yet ready to learn long division). This is what Keith is saying. But at the middling level (everyone learnt something new on Wednesday so they will need to review this new learning regularly over the next few weeks and months), little adaptability is required. I am not sure where the argument is because it seems to me that time and again, the answer is that we need to do both.

    In saying that these things are down in the weeds, I strongly agree with you, David, when you say that “Many (most?) teachers rely on an intuitive feel for what will work best. Sometimes they’re wrong. My point is that allowing ourselves to be informed by the science of how we learn should be a no brainer”. But I think that the “down in the weeds” issues will often require mixed and context-appropriate approaches and, though I am not an academic, it seems to me that the state of research is not yet sufficiently good to bring the scientific method to bear on these questions. I think our first steps ought to be to create the top level framework, including the definitions, which will support this further, more detailed, quantitative research. The existing research, I also agree with you, is often useful, but only at this higher level.

    That framework and terminology, I would argue, should include an understanding of how performance, capability and learning relate to each other. It should include a generic model of pedagogy which includes a practice-assessment-feedback-progression cycle. Whether that progression decision is invariable or whether it is adaptive based on teacher judgement, machine algorithm or student choice will all depend on the particular pedagogy and context.

    Finally, to return to the original article, I am sympathetic to David’s argument that we should be wary of being too explicit about the success criteria that are applied to a single assessment. I think the problem with this sort of metacognition is that the success criteria are often going to correspond to the mark-scheme – and revealing the mark scheme to students is inevitably going to mean that they coach themselves in the mark-scheme, which is going to distort the true learning objective. As Daisy Christodoulou argued at ResearchEd2014, high-stakes assessments represent a sample from a wider domain (the curriculum) and for that reason the sample should be selected on a random – i.e. unpredictable – basis. Pre-publishing the limits of the sample does the opposite – and pre-publishing and coaching students in the mark-scheme does something very similar.

    This brings us back to the relationship between capability and performance. How, then, do you define what you mean by a capability? One of the major problems that we had with criterion referencing is that it attempted to define capabilities (i.e. learning objectives) in terms of short rubrics, which everyone interpreted in different ways. In my view, the only way to communicate real meaning around a capability is by exemplification of the capability through performances. For example, past papers, mark-schemes and sample scripts. This underlines the close relationship between capability and performance. The difficulty which comes with narrowing the curriculum by teaching to the sample, and not to the whole domain, must therefore be met by ensuring an extensive and open-ended set of examples, taken from all areas of the curriculum. We should try and find ways to vary the mark-scheme, or to make it less formulaic, or more dependent on context. “We gave Johnny 3/4 because…” and not “To get 3/4 you needed to…” The problem is then to ensure consistent interpretation in this larger body of examples. I think there are two approaches to this problem.

    First, by ensuring that consistency is stewarded by a particular authority – this is what happens at the moment in the case of Awarding Bodies and tends to encourage the narrowing of the exemplified sub-domain (a fancy term for “what is assessed”). But by “stewarding authority”, I do not only mean formal Awarding Bodies – they might be a publisher or software house, school LEA or academy chain.

    Second, by using analytics software to demonstrate correlations (or lack of correlation) not only between the different performances but also between the different stewarding authorities. This will help to widen out again the exemplified domain, so that although any particular assessment continues to represent a sample, the entire body of assessment examples maps closely to the whole domain, giving real meaning to our understanding of the learning objective.

    This is where I think ed-tech provides the answers, both to implementing formative assessment consistently and in doing the analytics that will make sense of the learning objectives. This will allow us, I think, to move away from making a distinction between formative and summative assessment. There is no reason why all practice should not be tracked (i.e. assessed) and the outcome of that tracking used for a multitude of purposes, including diagnostic, research, monitoring, certification, and helping to define more clearly the objective.

    Finally finally, it is not possible to re-read this thread without mourning again the recent death of Ian Lynch. He was a very thoughtful and much valued member of our community.


  15. […] we have, cherish them. I spent many hours arguing with Ian Lynch. I still feel his loss keenly. Here’s a taste of his abrasive, but useful […]
