Saturday, 23 January 2016

Slave to the algorithm

A few months back someone posted a screenshot from an Ofsted report on Twitter. The paragraph in question stated that 'according to the school's own tracking data, most pupils are not making expected progress'. Ouch! The school appeared to have shot itself in the foot with its own system.

It's tempting to write this off as an isolated case: a naive headteacher who made an error of judgement. More fool them. But this is far from being an isolated case; it's actually quite common. I regularly go into schools and get shown tracking systems that are awash with red. Loads of children are apparently below 'age-related expectations' and are not making 'expected progress'. Yet, invariably, the headteacher will claim that 'this is not a true reflection of the pupils in our school' and that 'if you were to look in their books you'd see the progress they've really made', which raises the simple question:

What is the value of a system that is at complete odds with reality?

It does the school no favours whatsoever, yet they soldier on with it because they've 'already paid for it' and they've been 'using it for years' and 'the staff understand it' and 'the governors like the reports'. The fact that it is not painting the school in a favourable light is not enough to overcome their inertia. It's like changing banks: hardly anyone does it.

This term the issue has become particularly apparent due to the simplistic algorithms used in some systems. Essentially, systems are counting how many objectives have been assessed as achieved or secured, and presenting this as a percentage of the total number of objectives to be taught that year. If the system has the in-built, crude expectation that pupils will achieve a third of the objectives per term (a common approach), then any pupil that has not achieved over 33% of the year's objectives this term will be classified as below age-related expectation and will not be awarded the additional point that indicates that they have made so-called expected progress. But is 'age-related expectation' really a simple on-off thing, or is it more subtle than that? More realistically, the vast majority of pupils are likely to be working within the age-appropriate curriculum; it's just their security within it and the support they require that differs.
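
To make that crudeness concrete, here is a minimal sketch of the counting logic in Python. It illustrates the approach described above rather than any vendor's actual code; the function name, the field names and the 36-objective total are all invented.

    def term_summary(objectives_achieved, objectives_total, term):
        # Crude tracking logic: coverage of the year's objectives judged
        # against a flat, built-in expectation of a third per term.
        coverage = objectives_achieved / objectives_total
        expected = term / 3.0
        at_are = coverage >= expected  # the binary 'age-related expectation' flag
        return {
            'coverage_pct': round(coverage * 100),
            'at_ARE': at_are,                       # False -> the pupil shows red
            'progress_points': 1 if at_are else 0,  # the 'expected progress' point
        }

    # A pupil secure in 10 of 36 objectives at the end of the autumn term:
    print(term_summary(10, 36, term=1))
    # {'coverage_pct': 28, 'at_ARE': False, 'progress_points': 0}

On that logic, a pupil working comfortably within the age-appropriate curriculum, but secure in 28% of its objectives rather than 33%, turns red; and the flag records nothing about how securely they are working or the support they needed to get there.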

So 'ARE' is far more nuanced than the data suggests, yet schools are putting up with stark binary approaches. Rather than turning off their systems and attempting to do something more meaningful instead, they find workarounds, export the data and convert it into something else, or try to ignore vast swathes of the system that make their school look bad. But what happens when someone asks how much progress the pupils are making? With a sigh they will inevitably turn back to the data, because they have nothing else, and risk shooting themselves in the foot in the process.

It appears that we have become hard-wired to quantify progress. To distil pupils' learning down to a neat linear point scale even when it does us no favours whatsoever. Even when it bears no relation to the achievement of our pupils. Even when it jeopardises the standing of our schools. We are evidently finding it exceedingly difficult to break the chain.

But break the chain we must. Legacy tracking systems - those originally designed to measure progress through levels - only serve one group of pupils well: those that are catching up. These pupils can gain extra points by rapidly progressing through the curriculum. They make 'better than expected progress' in the traditional sense. However, a pupil that starts and finishes the year broadly at age-related expectations appears to have made less progress despite deepening their learning; and the progress of the pupil that is closing gaps from the previous year is also not properly recognised. As for the SEN pupil that has not covered much in terms of curriculum but has overcome significant barriers to learning, their progress hardly even registers on the scale.
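
To see the bias at work, here is a sketch with invented numbers (not any real system's data): each of the four pupils described above, scored as a legacy system scores them, by the difference between two points on a linear curriculum scale.

    # Purely illustrative start and end points on a linear coverage scale.
    pupils = {
        'catching up': (2.5, 4.5),                # races through missed content
        'deepening learning': (4.0, 5.0),         # at ARE all year, but deeper
        'closing gaps from last year': (3.5, 4.3),
        'overcoming barriers (SEN)': (1.0, 1.1),  # little coverage, huge personal progress
    }

    for name, (start, end) in pupils.items():
        print(f'{name}: {end - start:+.1f} points')

    # catching up: +2.0, deepening learning: +1.0,
    # closing gaps from last year: +0.8, overcoming barriers (SEN): +0.1

Only the catch-up pupil beats the expected point per year; the depth, the closed gaps and the barriers overcome never reach the scale.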

Catching up, filling gaps, deepening understanding and overcoming barriers to learning - it is clear that we need more intelligent systems that are capable of recognising these different types of progress and treating them as equals. I can't see a simple algorithm doing this. Surely only a human is capable of identifying such complexities of learning and making an accurate assessment of progress. Unfortunately we have become accustomed to having a system make the decision for us. We have effectively absolved ourselves of responsibility for assessment and handed it over to a machine. Tick the box and press a button. This might have made us feel a bit less accountable in the past but now it's starting to backfire. All too often we find ourselves at odds with our systems.

As daunting as it sounds, it's time we started to wrest back control of our data before it bites us on the backside. Do we really want to present data that erroneously suggests that only half of the pupils are at age-related expectations and few are making expected progress? No, of course not.

Now ask yourself this: if your tracking system ceased to exist would you reinvent it?

No?

So, what would you do instead?

Whatever your answer, it probably makes a lot more sense than many of the pseudo-level approaches that are currently on offer. I am not saying that we should ditch systems altogether; I'm simply saying we need to find better ways of doing things that more realistically reflect the subtleties and complexities of pupils' learning.

What will be the consequences if we don't?

Thursday, 7 January 2016

A Brief History of the Primary Future: Part 1

The pace of change over the past 2 years has been extraordinary, bewildering and probably unprecedented. Reforms to assessment and accountability have come so thick and fast that I have got into the habit of checking the DfE and Ofsted websites and Twitter feeds for updates just before I do a talk; and I've included last-minute, 'hot off the press' items in presentations on a number of occasions. But now that things have settled down (a bit) I thought it'd be a good opportunity to try to summarise these changes - well, the key ones anyway - and provide a rundown of the main documents that have been thrown our way. This is as much for my benefit as it is for anyone else's. Sometimes you have so much stuff piling up in your brain that it's a good idea to dump it all out onto a page just to try to make sense of it all. Bit of a task in this case but I'll give it a go.

It all kicked off in earnest back in June 2011 with the publication of the final report of Lord Bew's independent review of key stage 2 testing, assessment and accountability, following a consultation that ran from November 2010 to February 2011. Amongst numerous other things, this report recognised the inconsistencies and failings of the levels system and paved the way for the removal of levels.

*Jumps into TARDIS and zips forward 2 years, probably missing loads of important stuff*

The DfE set out their stall on 17th July 2013 with the publication of the consultation on Primary Assessment and Accountability under the new National Curriculum. This was the document that proposed decile banding of pupils and a KS2 floor standard of 85%. It also included the phrase 'secondary ready'. All these things have now gone, except they haven't. Not really. This document also confirmed the removal of levels, which prompted various software companies to announce that they'd sussed the whole problem by calling levels something else. Oh, and there was the small matter of the reception baseline. Note the timing of publication of the consultation, by the way: just before the summer break. Yay!

There then followed the publication of the new National Curriculum and programmes of study on 11th September 2013, which gave us all the key objectives/statements/indicators that we are now assessing and building into our tracking systems. Actually, the consultation states that the programmes of study were published on 8th July, yet the official document states 11th September. Maybe they're referring to different things. I'm not sure. I'm just the data guy.

The next key document came in March 2014. This was the Government's response to the consultation on primary assessment and accountability, which had closed on 11th October 2013. This 24-page document is essentially the DfE saying "OK, we won't do the decile banding thing, but we're going ahead with everything else". Call me a conspiracy theorist but I still reckon the decile banding idea was a red herring.

Next up was the consultation on Performance descriptors for use in key stage 1 and 2 statutory teacher assessment for 2015/2016, launched on 23rd October 2014. No one seemed to be particularly excited or upset by this, and responses were few until Michael Tidd galvanised a Twitter army into action. The deadline for responses was 18th December 2014 (end of term. Hurrah!). We waited with bated breath.

The government published their response to the consultation on the performance descriptors in February 2015. Evidently Michael Tidd's army had made themselves heard, and the DfE, to their credit, listened - the performance descriptors were to be withdrawn and replaced in September 2015.

Then, in March, some exciting news: the Commission on Assessment without Levels was announced. At last, something progressive and useful was happening. Unfortunately the final report wouldn't be published until September 2015, months later than promised, but those of us keeping an eye on Twitter over the summer were treated to a leaked version via the Guardian's Warwick Mansell. It's a great report, full of reassuring and supportive statements, which has since been endorsed by both the DfE and Ofsted's Sean Harford. Finally, schools could feel empowered to go forth and develop meaningful methods of assessment, and leave those 'levels by another name' systems behind. To be honest, we could have done with this a year earlier but I'm not one to look a gift horse in the mouth.

July 2015 saw Nicky Morgan announce the DfE's plans to tackle so-called 'coasting schools'. Many are still trying to work out what a 'coasting school' actually is, but essentially it's the floor standard taken to the max: schools that are below the 85% attainment floor and below all progress thresholds across 3 years are deemed to be coasting. And since 2014 is the first year of this 3-year rolling programme, the first batch of coasting schools will be identified in autumn 2016. FFT estimated that around 5% of primary schools will be affected based on the last 3 years' data. It may well be higher than that.
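
In pseudo-terms, the coasting test is a simple conjunction. Here's a sketch of the definition as just described; the data layout and field names are invented, and the real progress thresholds would sit behind the boolean.

    def is_coasting(last_three_years):
        # Coasting, as described above: below the 85% attainment floor AND
        # below every progress threshold, in all three years of the window.
        return len(last_three_years) == 3 and all(
            year['attainment_pct'] < 85 and year['below_all_progress_thresholds']
            for year in last_three_years
        )

    # A school below the floor and all progress thresholds since 2014:
    print(is_coasting([
        {'attainment_pct': 80, 'below_all_progress_thresholds': True},
        {'attainment_pct': 82, 'below_all_progress_thresholds': True},
        {'attainment_pct': 84, 'below_all_progress_thresholds': True},
    ]))  # True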

Then, in the dead of summer, the DfE snuck out this document. The floor standard would remain at 65%, not be raised to 85% as originally proposed. Well, except in the coasting measure, but that's different, obviously. Still, it's nice to have some good news for a change.

Summer over and, as promised, in September 2015, the Interim Frameworks for Teacher Assessment at the end of Key Stage 1 and Key Stage 2 were published. These replaced the withdrawn performance descriptors with something that was, well, a bit similar. The key difference was the disappearance of 'mastery' and its replacement with the snappy 'working at greater depth within the expected standard'. At least it's definitive. The elephant in the room here of course is the use of the word 'interim'. Things are no doubt set to change again next year.

Then came the Assessment and Reporting Arrangements for EYFS, KS1 and KS2 in October 2015. These set out schools' responsibilities for administering the tests and provide key dates for testing and submission of data to LAs and the DfE. They also confirm that only the teacher assessments, not the test scores, will be submitted for KS1. Will that still be the case in 2017? I doubt it.

The ARA documents were updated in December 2015 to take account of the recommendations of the Rochford Review. This report provides an interim solution for pupils working below the standard of the key stage tests, in the same format as the interim teacher assessment frameworks described above. The proposed pre-key stage standards contained therein have caused a lot of head-scratching, a bit of consternation and even some amusement. Now, I'm more than happy to see the back of levels, but the arguments against them on the grounds that they were complicated begin to look rather shaky when they are replaced with classifications such as these. On the plus side, at least such terminology is unlikely to be adopted for the purposes of formative assessment. Perhaps that's the point. And of course, there's that word interim again. What will happen next year is anyone's guess, but will 'growing development of the expected standard' still exist? Possibly not. And is statutory teacher assessment an endangered species?

No doubt there are many more seismic shifts on the assessment horizon.