Monday, 25 May 2015

The Rise of the Assessbots: do systems influence assessment?

"And we would go on as though nothing was wrong" Joy Division, Transmission

I recently got into a discussion on Twitter about the differences between tracking and assessment, and whether there is any such thing as an assessment system. I'd suggested that many systems auto-assess (I referred to them as Assessbots): that the teacher ticks a few boxes and waits to see what comes out the other end. That was wide of the mark and, with hindsight, actually rather patronising, so I apologise for my glib remark. Obviously systems don't teach, question, mark work, feed back and decide whether pupils have achieved or not (well, not yet anyway). But do systems have tools that guide assessment? Yes. And can these tools actually influence assessment in some way?

Probably.

That was the point I was trying to make in my rather clumsy way and it's certainly something worth exploring further. 

Take a look at most of the popular systems in use in schools and you will find some sort of assessment tool built in. This usually involves an APP-style series of statements or objectives against which a teacher ticks an appropriate box or enters a code to signify the pupil's level of understanding. Obviously, here the teacher is carrying out the assessment, but perhaps there are some grey areas regarding who's the master and who's the servant:

1) Some systems are inflexible, offering a set list of objectives - the provider's interpretation of what is and is not important - and so map out what is to be taught, perhaps in a particular order. Moreover, some systems may have just a few key objectives (e.g. the NAHT KPIs) to guide assessment whilst others have many more. The former relies more on a teacher's judgement within a broad framework whereas the latter is more prescriptive and definitive. Clearly some schools are happy with just a handful of key indicators to guide assessment whilst others seem to find security in more exhaustive lists of objectives. Whilst there is no single right approach, perhaps the latter risks sidelining the teacher's professional judgement, reducing assessment to a 'painting by numbers' exercise. It's possible that systems are providing a crutch that users become dependent on but which does little to develop a teacher's skills in assessment.

2) The ticks, codes or scores entered against each statement are then converted into numerical values, weighted, aggregated and compared against a series of thresholds to arrive at an overall judgement (e.g. an ARE band) used for the purposes of tracking pupil progress. The teacher has assessed the pupil to be developing in some areas and secure in others, but it is the system that decides whether the pupil is Year 4 Developing or Year 4 Secure overall. Many schools may use these labels as a guide and manually adjust them as appropriate, but some clearly don't, choosing instead to accept the category the system assigns pupils to.

And even if you don't agree with the system and change the outcome accordingly, any adjusted judgement is still constrained by the system's pre-defined parameters. In other words, you may change outcomes to something more realistic and find that the tracking sheet turns bright red, causing you to question the changes you've just made. It might then be tempting to go back and tweak a few assessments at the objective level to alter the end result. We then succumb to Goldilocks Assessment Syndrome: adjusting the input until the output is just right. Believe me, it happens. I've had a few teachers admit that this goes on in their school because the system they use wasn't producing the data they'd expect. So, just go back and tweak it until it does. Unfortunately, what you are left with is not formative assessment. It does not provide an accurate record of a pupil's strengths and weaknesses. It does not help identify gaps in a pupil's learning. Instead it is useless data, gerrymandered to gain the desired outcome.
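To illustrate the mechanism, here is a minimal sketch of that kind of aggregation. Every code, threshold and band name below is invented for the purpose of the example (equal weighting is assumed for simplicity); real systems use their own, usually undisclosed, parameters:

```python
# Hypothetical aggregation of objective-level codes into an overall band.
# All values here are invented for illustration; equal weighting assumed.

CODE_VALUES = {"not yet": 0, "developing": 1, "secure": 2}

def overall_band(codes, year=4):
    """Score the teacher's objective-level codes, aggregate them,
    and compare the result against fixed thresholds to pick a band."""
    score = sum(CODE_VALUES[c] for c in codes) / (2 * len(codes))  # 0.0 to 1.0
    if score < 0.33:
        return f"Year {year} Emerging"
    if score < 0.66:
        return f"Year {year} Developing"
    return f"Year {year} Secure"

codes = ["secure", "developing", "developing", "developing", "not yet", "secure"]
print(overall_band(codes))  # Year 4 Developing (7/12 = 0.58)

# Goldilocks Assessment Syndrome: nudge one objective and the label flips.
codes[4] = "developing"
print(overall_band(codes))  # Year 4 Secure (8/12 = 0.67)
```

One nudged tick at the objective level and the headline category jumps a band: that is the influence being described.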

All of the data produced by these systems are based on observations made by teachers, but there is a concern that the important detail of assessment risks being lost or supplanted by clumsy, auto-generated judgements, which can influence next steps, reinforce misconceptions, have a bearing on future teaching and assessment, and even cause recent assessments to be re-evaluated in light of the outcome.

Perhaps the issue is best illustrated by the following statements. Ask yourself if you've ever heard anything like these being uttered in your school:

"they can't be secure because they were only emerging last term"

"I ticked all these objectives and they've only made one step"

"They need to achieve at least 33% of the objectives to move up a band"

"They have to make at least 3 steps per year"

All of the above are warning signs that the system is exerting an influence over assessment in a school; that there may be a temptation to allow what we want out of the system to affect what we put in. A classic case of the tail wagging the dog.

System breakdown 

A few months ago I helped a school extricate itself from its existing system and set up a new one. The previous headteacher had left and the new head was keen to implement something more user-friendly. The old system was not popular with staff - they found it clunky, overcomplicated and unintuitive - and everyone wanted to try something new.

So I began the process of extracting the data and getting it into a format that I could import into the new system. Whilst going through the spreadsheet I'd put together, I noticed more and more spurious data and weird anomalies: pupils that hadn't changed level for over a year, some that had jumped a level in the space of a term, and others that had spiked and then gone dramatically backwards. After much head scratching I turned to the new head:

Me: "I don't think I can use this. It's all over the place. It just looks all wrong"

HT: "well, what are we going to do then? We have to have something"

Me: "Yes, but not this. This is nuts. Do teachers have anything else?"

HT: "I doubt it - can't see why they would - but I'll ask"

Within an hour teachers came to the office bearing gifts: Excel files on memory sticks, Word tables, hand-drawn grids on A3. A complete, alternative set of assessment data. The headteacher looked fairly stunned:

HT: "Where has this come from? Why have you got all this?"

Teacher: "We all made our own assessments. You don't think we trust the crap that system churns out do you?"

It turns out that teachers had been expected to use the system's APP tool despite their serious misgivings and lack of faith in the data it generated. It was felt that the system provided assurance and made teachers less accountable for outcomes; that it was somehow better to allow a system to categorise pupils, regardless of accuracy, than to let teachers use their professional judgement. This was evidently a big mistake.

I've encountered similar situations in other schools. Due to the massive pressures of accountability and the need for evidence there is, quite understandably, a growing desire for systems that don't just lessen the administrative burden, but also reduce risk, deflect blame, and devolve responsibility. A system we can point at and say "it wasn't me, it was him". Such systems thrive in a culture of fear and high stakes.

So, no, systems don't assess; teachers do. But systems can certainly influence the assessment process. Through its metrics, algorithms and parameters, a system converts assessment data into tracking data, and there is therefore a very real risk of allowing the desired outcome to dictate the input. A couple more 'exceedings' here, a few more 'secures' there and - voila! - just right.

Hopefully none of this will ring true with you, in which case there's nothing to see here; move along. However, if there's a grain of truth in any of the above then maybe it's time to take a good hard look at your system and do some soul-searching about your rationale for assessment.

Again it comes back to this:

Your tracking system must be tailored to fit the way you assess your curriculum, not the other way round.

And perhaps it's time we went cold turkey and weaned ourselves off our dependence on systems that dictate what we do.


Friday, 8 May 2015

The #TwitteratiChallenge

Many thanks to Mary Myatt (@MaryMyatt) for nominating me for the #TwitteratiChallenge. I feel most honoured. I hope I'm completing this within the required time frame; it's taken me a while to sort out a list in my head.
The #TwitteratiChallenge was started by Ross McGill of @TeacherToolkit fame with the following aim:
“In the spirit of social-media-educator friendships, this summer it is time to recognise your most supportive colleagues in a simple blogpost shout-out. Whatever your reason, these 5 educators should be your 5 go-to people in times of challenge and critique, or for verification and support”
There are only 3 rules.
1. You cannot knowingly include someone you work with in real life.
2. You cannot list somebody who has already been named if you are aware of them having been listed on #TwitteratiChallenge.
3. You will need to copy and paste the title of this blogpost and (the rules and what to do) information into your own blog post.
What to do?
This is what to do:
1. Within 7 days of being nominated by somebody else, you need to identify the colleagues that you regularly go to for support and challenge. They have now been challenged and must act as participants of the #TwitteratiChallenge.
2. If you’ve been nominated, please write your own #TwitteratiChallenge blogpost within 7 days. If you do not have your own blog, try @staffrm.
3. The educator that is now (newly) nominated has 7 days to compose their own #TwitteratiChallenge blogpost and identify who their top 5 go-to educators are.

So here are my nominations in no particular order:
Michael Tidd (@MichaelT1979) for knowing everything there is to know about primary curriculum and assessment and changing my mind about pretty much everything.
Jack Marwood (@icingonthecake) the caped crusader of school data, never tires of fighting injustice and nonsense.
Hayley Earl (@hayleyearl) forward-thinking Assessment Lead at a Gloucester primary with whom I had a fantastic chat today. Also writes a heartfelt blog about her experiences as a teacher.
Peter Atherton (@DataEducator) for late night data chat on twitter, helping me get things clear in my head.
Karen Horne (@mrskhorne) hardworking and hilarious headteacher of a Birmingham primary. Common sense and comedy in equal amounts. Wonderful.

Thursday, 7 May 2015

The Progress Myth II: alternative approaches to measuring progress

Anyone who's read my recent tweets or blogs, or been to one of my ranting training sessions, will know that the concept of linear progress - that pupils should all progress at the same rate regardless of start point - has become a bit of an obsession of mine. As I've blogged about previously, it concerns me that so many established tracking systems are maintaining a continuous point scale (points now called steps) in order to track progress. Deep down we all know this is not how children learn - it's one of the main reasons why getting rid of levels is a good thing - but it makes tracking easier if we distil progress down to a simple number: an expected rate of progress that applies to all. So, we define a unit of progress and expect all pupils to make three of them per year. Why three? Well, because there are three terms per year and that's what our systems have always done, so redevelopment is minimised. Meet the new boss, same as the old boss. We therefore continue with the deep-rooted, universal expectation of progress for all pupils because it makes tracking easy. The system dictates the measure.

Seriously, if systems dictated everything then we'd probably shop in alphabetical order.

'4 steps good, 3 steps bad' (or average, or expected, or not good enough) has become doctrine and lives on in a new guise. Hardly assessment without levels. Just look at your system and ask yourself this: how does it define better than expected progress? Does a pupil need to tip into the next year's curriculum in order to gain that all-important 4th point? If so, then you should be concerned. Unless we seek out 'real' alternative approaches to tracking progress we are destined to continue down the same path, focussing on pace at the expense of depth of understanding, and repeating the mistakes of the past.

On two separate occasions recently - once on Twitter and once during one of my sessions - I've been asked what the alternative is to the linear approach to measuring progress. On both occasions I've suggested simply comparing the percentage of objectives in a particular subject that are deemed secure at, say, the start and end of the term. And on both occasions the response has been the same: "but that is linear progress!".

No it's not.

It only becomes linear if we assume and apply a common expected rate to the data and make a judgement about the pupil's progress by comparing the percentage change against an arbitrary threshold. For example, we expect the percentage of 'secure' objectives to increase by 33 percentage points each term, which is a common rule in many established tracking systems. So, if a pupil moves from 40% to 70% secure, they have failed to make the expected progress, but if they progress from 40% to 75% they've done OK. And if one pupil progresses from 70% to 100% secure they've made the same amount of progress as a pupil progressing from 40% to 70%. And perhaps if they progress by 40 percentage points or more across the term they've made better than expected progress, because that's a nice, neat figure that we can all easily remember.
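To make the arbitrariness concrete, here is a minimal sketch of that kind of rule. The 33 and 40 percentage point thresholds are the hypothetical ones from the examples above, not any particular system's actual parameters:

```python
# A sketch of the '33 percentage points per term' style of rule.
# Threshold values are hypothetical, taken from the examples above.

def judge_progress(start_pct, end_pct, expected_gain=33, exceeding_gain=40):
    """Label a term's progress by comparing the change in the % of
    objectives secure against fixed, arbitrary thresholds."""
    gain = end_pct - start_pct
    if gain >= exceeding_gain:
        return "better than expected"
    if gain >= expected_gain:
        return "expected"
    return "below expected"

print(judge_progress(40, 70))   # below expected (a gain of 30)
print(judge_progress(40, 75))   # expected (a gain of 35)
print(judge_progress(70, 100))  # below expected - same 30-point gain as 40 -> 70
```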

So, stating that a pupil has progressed from X% to Y% isn't the problem, as I've not yet drawn a conclusion from those two observations. The problem is when we then seek to categorise the progress pupils make by applying a common rule to the data - a universal constant of learning. We simplify the data and neaten it up to make it easier to understand; so that we can make a quick and easy judgement about pupil performance. Ultimately we want a magic number, a standard unit of progress that allows us to compare the progress of pupils of differing abilities, from different start points, even in different years and subjects, so that 3 steps in maths in year 2 supposedly has the same value as 3 steps in reading in year 6.

The truth is that linear progress is an easy concept to get our heads round, so we accept it despite knowing that it's wrong. However, as is often the case with data, the more we simplify it, the less useful or meaningful it becomes. We're just pigeonholing pupils for the sake of convenience. It doesn't mean anything; it just makes our lives easier. Plus, it's how the system works, so we have no choice, right?

So what are the alternatives to measuring progress in this way? Here are a few options:

Option 1: Do nothing

As I've discussed in my previous post, The Progress Myth, maybe progress isn't something we can quantify or categorise; and maybe we shouldn't attempt to. Instead, perhaps we should simply use our tracking systems to identify gaps in learning so that teachers can better support their pupils. A brave and radical move away from making judgements about pupil progress based on arbitrary thresholds, but maybe it's what we should be doing. Maybe we should stop trying to quantify the unquantifiable.

Option 2: A teacher assessment of progress

Almost as brave as the above but makes sense when you think about it. If teachers can make an assessment of attainment, then why not progress? Why do we need to rely on a system to quantify and make a judgement on progress based on some calculation that we don't agree with? Imagine if the teacher made an assessment of the progress pupils made that took into account start point, expectations, targets, attitude to learning, effort, learning difficulties, and other influencing factors. Is that really so radical?

Option 3: Establish progress pathways

If we accept that different pupils learn at different rates in different subjects from different start points, then we could attempt to establish progress pathways for a more meaningful approach to tracking pupils' learning journeys. Essentially, this is individualised target setting, whereby progress is checked against appropriate interim milestones set for the end of each term or year. Not easy to establish, but more meaningful than a straight line.
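A minimal sketch of what such pathways might look like; the pupil names and milestone figures are invented, and a real implementation would need far more care in setting the milestones:

```python
# Individualised progress pathways: each pupil gets their own interim
# milestones rather than a universal straight line. Figures are invented.

# % of objectives expected to be secure at the end of each term, per pupil,
# set from the pupil's own start point rather than a universal rate.
pathways = {
    "pupil_a": [20, 45, 80],  # lower start point, expected to accelerate
    "pupil_b": [40, 55, 65],  # steadier trajectory from a higher start
}

def on_track(pupil, term, actual_pct):
    """Compare an actual assessment against that pupil's own milestone."""
    milestone = pathways[pupil][term - 1]
    return actual_pct >= milestone

print(on_track("pupil_a", 1, 25))  # True - ahead of their own pathway
print(on_track("pupil_b", 1, 25))  # False - behind theirs, same raw score
```

The same raw score is judged differently for each pupil, which is exactly what a universal expected rate cannot do.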

Option 4: Track towards an end of Key Stage VA estimate

This is attractive on two counts:

1) VA will be the only measure of progress from 2016 onwards and, considering the anxiety around the proposed attainment floor standards (85% meeting the expected standard at Key Stage 2), schools are going to want to pull apart and understand VA more than ever.

2) VA involves comparing a pupil's attainment against the average outcome for pupils with the same prior attainment nationally, which means that it does not involve a universal expected rate of progress like the levels of progress measure. It is therefore fairer.

The problem with tracking towards such a distant target is that systems generally do an awful job of it. The common methodology involves generating an end of key stage prediction based either on an expected rate of progress or on extrapolating the pupil's actual rate of progress to date. The former is meaningless (see above) and the latter is highly inaccurate, because we can't assume that pupils will continue to progress at the same rate; progress is not, well, linear. Instead, to do this properly we'd need to establish appropriate interim milestones that don't necessarily sit along a straight line. Obviously, VA estimates are going to be vitally important; it's just that we haven't really worked out how to track towards them yet.
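The underlying VA comparison itself is simple to sketch, even if tracking towards it isn't. A minimal illustration, assuming invented national averages and a crude prior attainment grouping; real VA is calculated from national datasets and scaled scores:

```python
# The basic VA idea: compare a pupil's outcome with the average outcome of
# pupils nationally who had the same prior attainment. Figures are invented.

# Average end-of-key-stage outcome for each prior attainment group.
national_average = {"low": 95.0, "middle": 100.0, "high": 106.0}

def value_added(prior_group, actual_score):
    """Positive VA means the pupil did better than similar pupils nationally;
    negative VA means worse. No universal expected rate is involved."""
    return actual_score - national_average[prior_group]

# Two pupils with the same raw outcome but different start points:
print(value_added("low", 100))   # +5.0 - well above similar pupils
print(value_added("high", 100))  # -6.0 - below similar pupils
```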

(Right now, with so much ambiguity around assessment and tracking, I'm favouring option 2.)

And so...

We know that the concept of progress is changing. We know now that it is more about depth and less about pace, and that our systems are struggling to adjust. I'm seeing too many systems that are awash with red because pupils are apparently below age-related expectations and are making poor progress or have gone backwards. Some schools wearily accept this whilst others attempt to work around it, perhaps tinkering with the assessments to arrive at a more acceptable range of figures and colours. This can't continue.

We don't really have a clear idea how any of this will work and that's why we need to stop and think long and hard about how we track and monitor pupil progress. We certainly shouldn't blindly accept the approach employed by the systems we use. Ask yourself this question: Does your current system provide a fair, accurate and meaningful representation of the progress pupils make? If it does and you like your system, that's great, but if it doesn't then ask your provider about alternative approaches. If there is no alternative approach, then seriously consider changing the system. Do not make compromises to your preferred approach. 

And always remember the mantra: the system must be tailored to fit the way you assess, not the other way round.