Friday, 4 May 2018

Test score upload and progress analysis in Progress Bank/Insight

Andrew Davey at Insight (@insightHQ) has been busy building a very neat, intuitive interface for the quick uploading of standardised test scores into Progress Bank and Insight, and for analysis of the data.

As stated previously, the aim of Progress Bank is to provide schools with a simple, online system that will capture any standardised test data from any provider and measure progress between any two points. The sort of data that could be uploaded and analysed includes:
  • NFER tests
  • GL progress tests
  • CAT4
  • STAR Assessment
  • KS1 scaled scores
  • KS2 practice SATS results
  • KS2 actual SATS results
  • Reception baseline scores
Ultimately, we want to be able to build up enough data to enable the calculation of VA between any two points. This will involve a DfE-style calculation whereby pupils are placed into prior attainment groups based on a previous assessment, and their score on a following assessment is compared to the average score of pupils in the same prior attainment group. This could be from reception to KS1, or from KS1 to KS2, or from Y1 autumn to Y5 spring, or Y3 entry to Y6 SATS (useful for junior schools). In theory, if we get enough data, we can measure progress between any two points. The progress scores will be shown for pupils, key groups and cohorts, for reading and maths (and possibly SPaG if you are testing that too). By measuring progress using standardised tests, it is hoped schools will stop reinventing levels and use teacher assessment purely in the classroom, for formative purposes.
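The DfE-style calculation described above can be sketched in a few lines of Python. This is a minimal illustration with invented scores; the real measure uses national data, finer-grained prior attainment groups and confidence intervals:

```python
from collections import defaultdict

def va_scores(pupils):
    """pupils: list of dicts with 'prior' (baseline score) and 'current' (later score).
    Each pupil is placed in a prior attainment group (PAG) based on their baseline,
    and their VA score is their current score minus the PAG's average current score."""
    # Group pupils' later scores by prior attainment group
    # (here, simply the baseline score itself)
    pags = defaultdict(list)
    for p in pupils:
        pags[p["prior"]].append(p["current"])
    pag_means = {pag: sum(scores) / len(scores) for pag, scores in pags.items()}
    return [p["current"] - pag_means[p["prior"]] for p in pupils]

pupils = [
    {"prior": 95, "current": 98},
    {"prior": 95, "current": 102},
    {"prior": 110, "current": 112},
    {"prior": 110, "current": 108},
]
print(va_scores(pupils))  # [-2.0, 2.0, 2.0, -2.0]
```

A school's overall progress score would then simply be the average of its pupils' VA scores.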

Until we reach the point where we have enough data to calculate VA, we will instead track changes in standardised scores or percentile rank for cohorts, groups and pupils (bearing in mind that standardised scores do not always go up, and no change is often fine). 

The system involves a three-step process:
  1. Upload CTF (it is secure and GDPR compliant)
  2. Upload test scores
  3. Analyse data 
It is fairly quick to do. Once a CTF file has been uploaded, users can then upload test scores via a simple process that allows them to copy and paste data onto the screen.

Then paste the names and choose the order of surname and forename. This will enable the system to match pupils to those already in the system:

Then validate the data. Any pupils that don't match will be flagged and can be matched manually.

We can then select the assessment for which we want to upload scores for this particular cohort:

And add the scores on the next screen, again by copying and pasting from a spreadsheet:

That gets the data into the system (you can retrospectively upload data for previous years and terms by the way) and all we need to do now is analyse it. This is done via a simple pivot table tool within the system. The following screen shows a summary of year 5's NFER test scores for the autumn and summer terms, broken down by key group. There are various options to select cohorts, start and end points, assessments, column and row values, and cell calculations. Note that the progress column currently shows change in standardised score, and the plan is to move that to a VA measure when enough data is available.

And finally, by clicking on a cell, we can drill down to pupil level; and by clicking on a progress cell we can access a clickable scatter plot, too.

Red dots indicate those pupils whose scores have dropped, and green dots show those whose scores have gone up. Clicking on the dots will identify the pupil, their previous and current test scores, and progress score between the two points selected. The colours are not intended to be a judgement, more an easy way to explore the data.

That's a quick tour of the Progress Bank concept, as it currently stands. The upload tool is already available to Insight users, and the pivot table report will be rolled out very soon. Progress Bank, featuring data upload, pivot tables and scatter plots, will be launched as a standalone tool in the Autumn term, for those schools that just want to capture and analyse their standardised scores without the full tracking functionality of Insight. It will therefore complement existing systems, and provide a quick and simple way of generating progress scores for Ofsted, governors and others.

Prices to be announced. 

More info and register your interest at

Thursday, 26 April 2018

5 Things primary governors should know about data. Part 5: pupil groups

This is the 5th and final part in a series of blog posts on data for primary governors. Part 1 covered statutory data collection, part 2 was on sources of data, part 3 explained progress measures, and part 4 dealt with headline measures. In this post we're going to discuss those all-important pupil groups.

When we look at school performance data in the performance tables, Analyse School Performance (ASP) system, the Ofsted Inspection Data Summary Report (IDSR), and FFT Aspire, we can see that all those headline figures are broken down by pupil characteristics. Keeping tabs on the performance of key groups is evidently vital; and senior leaders and governors have an important role to play in monitoring the progress of these groups and the attainment gaps between them. Broadly speaking we are dealing with four key types of data: threshold measures (percentages achieving expected or higher standards), average scores, progress scores, and absence figures. Officially, we only have average scores and progress scores at KS2, although your school's internal data may have other measures you can track, including data from standardised tests. Also note that Ofsted, in the IDSR, have a pseudo-progress measure for KS1 whereby attainment is broken down by start point based on Early Years (EYFSP) outcome. More on that later.

Before we push on to look at the main pupil groups and what the various sources of data show us, it is important to note that it is easy to read too much into analysis of data by group. If we take any two groups of pupils - eg those with last names beginning A-M vs those beginning N-Z - there will be an attainment gap between the two groups. What can we infer from this? Nothing.

The main pupil groups are: gender, disadvantaged, SEN (special educational needs), EAL (English additional language), mobile pupils, term of birth, and prior attainment. Some of these require more explanation.

Disadvantaged
This group includes pupils that have been eligible for free school meals (FSM) in the last 6 years, have been in care at any point, or have been adopted from care. It does not include Forces children. Previously this group was referred to as pupil premium (and still is in FFT reports). When we look at reports we may see reference to FSM6 (or Ever 6 FSM). These are pupils that have been eligible for FSM in the last 6 years, and usually this is the same as the disadvantaged group, although numbers may differ in some cases. We may also have data for the FSM group, which usually refers to those that are currently eligible for free school meals; numbers will therefore be smaller than the disadvantaged/FSM6 groups. 24% of primary pupils nationally are classified as disadvantaged.

SEN
SEN is split into two categories: SEN Support and EHCP (Education, health and care plan). Note that EHCP replaced statements of SEN, but your school may still have pupils with statements. Nationally, 12.2% of primary pupils have SEN Support whilst 1.3% have an EHCP/statement.

Mobile pupils
The DfE and FFT have quite a strict definition here: it relates to those that joined the school during years 5 or 6. If they joined before year 5 they are not counted in this mobile group. Your school's tracking may have other groupings (eg on roll since reception).

Term of birth
Quite simply, this refers to the term in which the pupil was born. Research shows that summer-born pupils tend to do less well than their older autumn- or spring-born peers, but that the gap narrows over time. ASP and IDSR do not contain any data on these groups, but FFT reports do.

Prior attainment
This could be a blog post all on its own. Here we are talking about pupils categorised on the basis of prior attainment at the previous statutory assessment point (i.e. EYFS for KS1, or KS1 for KS2). Whilst there are 24 prior attainment groups used in the KS1-2 progress measure, for the purposes of reporting we are just dealing with three groups: low, middle and high. Unfortunately, it's not as simple as it seems.

At KS1, pupils' prior attainment is based on their level of development in the specific subject (reading, writing or maths) at foundation stage (EYFSP). The prior attainment groups are not referred to as low, middle and high; they are referred to as emerging, expected or exceeding (terms used for assessment in the reception year). The percentages achieving expected standards and greater depth at KS1 are then compared to the national figures for the same prior attainment group. This data is only shown in IDSR.

At KS2, pupils' prior attainment is based on their results at KS1, and the main method involves taking an average of KS1 results in reading, writing and maths, rather than just looking at prior attainment in the specific subject. Broadly speaking, if the pupil averaged a Level 1 or below at KS1, they go into the low group; if they averaged a Level 2 then they slot into the middle group, and if they are Level 3 average then they fall into the high group. However, please note that a pupil with two 2As and a L3 at KS1 will also be categorised as high prior attaining; they don't need L3 in all subjects. This is the main method used in ASP and IDSR.
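As a rough sketch of that averaging method (the point scores and band thresholds below are the commonly quoted ones, but check the primary accountability technical guidance for the definitive values):

```python
# Approximate KS1 point scores per level, from the old national curriculum scale
POINTS = {"W": 3, "1": 9, "2C": 13, "2B": 15, "2A": 17, "3": 21}

def overall_pa_band(reading, writing, maths):
    """Band a pupil by average KS1 points across the three subjects.
    Thresholds are illustrative: below 12 = low, 12 to under 18 = middle, 18+ = high."""
    aps = (POINTS[reading] + POINTS[writing] + POINTS[maths]) / 3
    if aps < 12:
        return "low"
    return "middle" if aps < 18 else "high"

# The example from the text: two 2As and a L3 averages just over 18, so 'high'
print(overall_pa_band("2A", "2A", "3"))  # high
# And a pupil with L1, L1 and L3 averages 13 points, so 'middle'
print(overall_pa_band("1", "1", "3"))    # middle
```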

This means that at KS1, prior attainment relates to the specific subject at EYFS, whilst at KS2 it depends on an average across three subjects, known as overall prior attainment. But it doesn't end there. ASP, as well as offering us data for those overall prior attainment bands for KS2, also offers us subject-specific prior attainment bands. Therefore, a pupil that was L1 in reading and writing and L3 in maths at KS1, who is categorised as 'middle' based on the main method, will be low or high depending on the subject using the second method.

And then there's FFT who take a different approach again (and it's important we know the difference because it can cause problems). FFT use average prior attainment across subjects at EYFS (for KS1), or KS1 (for KS2), rank all pupils nationally by prior attainment score, and split the national pile into thirds. Pupils falling into the bottom third are referred to as lower, those in the middle are middle, and those in the top third are higher. Schools will have more lower and higher prior attainers in an FFT report than they will in ASP or IDSR.
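FFT's rank-and-split approach could be sketched like this (entirely invented national data; `fft_thirds` is a hypothetical helper for illustration, not an FFT API):

```python
def fft_thirds(national_scores, pupil_score):
    """Place a pupil into lower/middle/higher by where their prior attainment
    score falls in the national distribution, split into thirds."""
    ranked = sorted(national_scores)
    n = len(ranked)
    lower_cut = ranked[n // 3]       # score at the top of the bottom third
    higher_cut = ranked[2 * n // 3]  # score at the bottom of the top third
    if pupil_score < lower_cut:
        return "lower"
    return "middle" if pupil_score < higher_cut else "higher"

national = list(range(60, 141))  # made-up national spread of prior attainment scores
print(fft_thirds(national, 70), fft_thirds(national, 100), fft_thirds(national, 130))
# lower middle higher
```

Because each national third is bigger than the tails used elsewhere, this is why schools see more lower and higher prior attainers in FFT reports than in ASP or IDSR.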

Sources of data and national comparators
Once we have results for our various groups, we need something to compare them to so we can ascertain how well they are doing. And again, this is not as straightforward as you might think. FFT simply compare the attainment of the group in the school against the result of the same group nationally. Seems fair enough. But what if we are comparing an underperforming group to an underperforming group? Is this going to give a false impression of performance, result in lowering of expectations and possibly a widening of the gap? This is why the DfE (in the ASP system) and Ofsted (in the IDSR) take different approaches.

In ASP, by clicking on an 'explore data in more detail' link, we can access a table that summarises data for numerous key groups and compares the results to national figures. If we look at the national benchmark column we will notice that it is not a fixed figure; it keeps changing. That's because the DfE use different benchmarks depending on the group. These benchmarks can be split into three different types: all, same, and other.
  • All: The group is compared to the overall national average (i.e. the result for all pupils nationally). This applies to the school's overall results and to EAL, non-EAL, and SEN groups. The comparison of the SEN group's results to overall national figures is particularly problematic and it is worth seeking out national figures for SEN pupils as a more suitable comparator. These can be found in DfE statistical releases, and in FFT. 
  • Same: The group is compared to national figures for the same group. This applies to boys, girls, non-SEN, and prior attainment groups. The key issue here is that girls do better than boys in reading and maths at KS2, which means that girls are compared to a higher benchmark than boys. This is not likely to solve the gap problem.
  • Other: The group is compared to the national figure for the opposite group. This applies to disadvantaged/FSM pupils and to looked after children. The aim is to focus schools on closing the gap between low attaining groups and their peers. Note that the data compares the results of these groups in school to the results of other pupils nationally; it does not measure the 'in-school' gap. 
The problem with ASP, despite all the pupil group data on offer for EYFS, phonics, KS1 or KS2, is that the presentation is a bit bland. It provides no visual clues as to whether results are significantly above or below average or significantly improving or declining. It's just a load of numbers in a table. FFT's pupil groups' report is clearer. 

Unlike ASP, which contains data for numerous groups, IDSR just has four: disadvantaged, and low, middle and high prior attainers. Whilst schools certainly need to be able to talk about the performance of other groups, Ofsted have chosen not to provide data for them. Clearly tracking progress of disadvantaged pupils and the gaps between those pupils and others is essential. It is also important that schools are tracking the progress of pupils from start points, and it is recommended that tracking systems are set up for that purpose to enable quick identification of pupils in these key groups.

As in ASP, IDSR compares the results of low, middle and high prior attainers to the national figures for the same groups. There is however a difference in IDSR when it comes to disadvantaged pupils: they are not only compared to the national figures for 'other' (i.e. non-disadvantaged) pupils, but also to the overall national figure. The former is no doubt the more important benchmark.

FFT, like ASP, have numerous key groups but tend to do a better job of presenting the data. Bearing in mind the difference between FFT prior attainment groups, comparators and terminology (FFT use the term 'pupil premium' rather than disadvantaged) explained above, FFT reports are undeniably clearer and easier to understand. They provide three year trends, and indicators to show if results are significantly above or below average (green/red dots), and/or significantly improving or declining (up/down arrows). The report ranks groups in order of progress scores so it is quick to identify the lower and higher performing groups; and can show three year averages for each group, which is useful where numbers are small. In addition, the overview page of the FFT dashboard lists up to three lower and higher performing groups overall and in each subject. This is done for both KS1 and KS2. FFT also have a useful report on disadvantaged pupils; and, as mentioned above, provide data on pupils by term of birth.

A word about FFT and progress measures
The default setting in FFT is VA (value added). This means that progress is measured in the same way as it is in ASP and IDSR. It is simply comparing each pupil's result to the national average result for pupils with the same start point, and scores should match other sources. When we look at group level progress data in FFT and focus on, say, disadvantaged pupils, the scores are VA scores and will be the same as those calculated by the DfE. Using the VA measure in FFT, disadvantaged pupils' progress is not compared to disadvantaged pupils nationally; it is compared to any pupil nationally with the same prior attainment. A like-for-like comparison will only happen if you click the CVA button (which takes numerous factors into account to compare pupils with similar pupils in similar schools). Some people may be dismissive of FFT data because they mistakenly believe it to be contextualised. Progress data is only contextualised if the CVA button is clicked, otherwise it is no different to progress data found elsewhere. The difference - as explained above - is in the attainment comparisons, where results are compared to those of the same group nationally.

I hope this series has been useful. Feel free to print, share and copy. I just ask that you credit the source when doing so.

Many thanks. 

Sunday, 22 April 2018

Capped scores in 2018 progress measures

The DfE recently announced that they would cap extremely negative progress scores (but not extremely positive progress scores, by the way) in order to reduce the impact that such scores can have on a school's overall measures. This is a welcome move considering how damaging these scores can be, particularly for schools with high numbers of pupils with SEND or with small cohorts.

The guidance states that 'the limit will mean that there is a minimum progress score that is assigned to pupils within the prior attainment groups (PAGs) where extremely negative scores exist. Where a pupil’s score is more negative than the minimum score, the minimum score will replace the pupil’s original progress score when calculating a school’s progress average. The minimum score for each PAG will be determined based on the variation in pupil progress scores for pupils across the country within that PAG (as measured by the standard deviation). The minimum scores will be fixed at a set number of standard deviations below the mean so that approximately 1% of pupils are identified nationally (we anticipate this will normally be no more than 1 or 2 pupils in any school).'

Essentially, this means that the threshold at which progress scores will be capped will depend on the PAG the pupil is in; and the threshold will represent the progress scores of the bottom 1% within each PAG nationally. Whilst the guidance states that 'predicting which pupils will, and will not, have their score affected by this methodology change, in advance of progress scores being made available, will not be possible', we can get a rough idea of the capped score thresholds by using the 2017 standard deviations for each PAG (Ofsted make these available on their IDSR guidance website here).

To calculate the capped score thresholds, the 2017 standard deviation for each PAG has been multiplied by -2.3. This is because the bottom 1% on a bell curve sit approximately 2.3 standard deviations below the mean.
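That calculation is a one-liner per PAG. The standard deviations below are invented for illustration; the real 2017 figures are published in Ofsted's IDSR guidance:

```python
# Illustrative standard deviations per prior attainment group (made up here)
pag_sds = {"PAG 5": 4.1, "PAG 12": 5.6, "PAG 20": 6.8}

# Minimum (capped) progress score per PAG: roughly 2.3 standard deviations
# below the mean, marking off approximately the bottom 1% of pupils nationally
capped_thresholds = {pag: round(sd * -2.3, 1) for pag, sd in pag_sds.items()}
print(capped_thresholds)  # {'PAG 5': -9.4, 'PAG 12': -12.9, 'PAG 20': -15.6}
```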

A spreadsheet with indicated capped score thresholds for KS2 and KS4 is available to download here

As always, please note that the thresholds in this spreadsheet are NOT definitive; they are just for guidance. They are intended to show the variation in scores between PAGs and also indicate how low the scores could be even after the cap is applied.

Further reading:

Key stage 2: See pages 8-9 of Primary Accountability Guidance 

Key Stage 4: See pages 12-13 of Progress 8 Technical Guidance

Saturday, 21 April 2018

Converting standardised scores to scaled scores

Many schools are using standardised tests from the likes of NFER, GL and Rising Stars to monitor attainment and progress of pupils, and to predict outcomes; and yet there is a lot of confusion about how standardised scores relate to scaled scores. The common assumption is that 100 on a standardised test (eg from NFER) is the same as 100 in a KS2 test, but it's not. Only 50% achieve 100 or more in a standardised test (100 represents the average, or the 50th percentile); yet 72% achieved 100+ in the KS2 reading test in 2017 (the average was 105 that year). If we want a standardised score that better represents expected standards then we need one that captures the top 72%, i.e. around 92. However, to be on the safe side, I recommend going for 94 (top 66%), or maybe even 95 (top 63%) if you want to be really robust. Whatever you do, please bear in mind that standardised test scores are not a prophecy of future results, they are simply an indicator. Michael Tidd (@MichaelT1979) has written an excellent blog post on this subject, which I recommend you read if you are using standardised scores for tracking.
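Since standardised scores are designed to follow a normal distribution with mean 100 and standard deviation 15, the conversion from 'percentage achieving the standard' to an equivalent standardised score can be sketched with Python's standard library (a rough illustration, not the exact method test publishers use):

```python
from statistics import NormalDist

# Standardised scores typically have mean 100 and standard deviation 15
dist = NormalDist(mu=100, sigma=15)

def score_capturing_top(pct_above):
    """Return the standardised score that roughly the top pct_above% of pupils reach."""
    return round(dist.inv_cdf(1 - pct_above / 100))

# 72% scored 100+ on the 2017 KS2 reading test, so the equivalent
# standardised score is the one the top 72% reach:
print(score_capturing_top(72))  # 91 - close to the 'around 92' figure above
print(score_capturing_top(66))  # 94
print(score_capturing_top(63))  # 95
```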

The purpose of this blog is to share a conversion table that will give you a rough idea of how scaled scores convert to standardised scores. It is based on the distribution of 2017 KS2 scores in reading and maths, taken from national tables (table N2b). The cumulative percentages in the KS2 national tables are converted to standardised scores via this lookup table.

The conversion table can be downloaded here.

Please note: this is not definitive; it is a guide. It will also change next year, when 2018 national data is released, but hopefully it will demonstrate that one score does not directly convert into another.


Wednesday, 18 April 2018

5 things primary governors should know about data. Part 4: headlines and trends

This is the fourth part in a series of five blog posts for primary governors. Part 1 covered statutory assessment, part 2 dealt with sources of data, and part 3 explained the progress measures. Here, we will look at the headline measures governors need to be aware of.

Inspection Data Summary Report (IDSR) Areas to investigate
This is an important place to start. The IDSR lists your school's strengths and weaknesses (under the banner of 'areas to investigate'), as well as information relating to floor standards and coasting, on its front page, and governors definitely need to have sight of this. The list of areas to investigate is not exhaustive - your school no doubt has more strengths than those listed (and possibly more weaknesses).

Early Years Foundation Stage
Key measure: % achieving a good level of development
As explained in part 1, pupils at the end of reception are assessed as 'emerging', 'expected' or 'exceeding' in each of the 17 early learning goals (ELGs). If a pupil reaches the expected level of development (i.e. assessed as expected or exceeding) in the 12 main ELGs, this is described as a 'good level of development' (GLD). Our first key measure is therefore:
  • % achieving a good level of development at end of reception
This data is not available in the performance tables (i.e. it is not in the public domain) but can be found in Analyse School Performance (ASP), where the school's result is shown against both LA and national figures; and in Ofsted's IDSR, which shows a 3 year trend against national figures. Pay attention to the trend: is it going up or down, and how does the school compare to national figures? Always consider the context when comparing the results of different cohorts.
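The GLD rule - the expected level of development in all 12 main ELGs - can be expressed as a simple check (the ELG names below are abbreviated placeholders, not the official titles):

```python
# The 12 main early learning goals counted for GLD cover the prime areas
# plus literacy and maths; names abbreviated here for illustration
MAIN_ELGS = ["listening", "understanding", "speaking", "moving", "health",
             "self_confidence", "feelings", "relationships",
             "reading", "writing", "numbers", "shape"]

def good_level_of_development(assessments):
    """assessments maps ELG name -> 'emerging', 'expected' or 'exceeding'.
    GLD means reaching the expected level (expected or exceeding) in all 12 main ELGs."""
    return all(assessments.get(elg) in ("expected", "exceeding") for elg in MAIN_ELGS)

pupil = {elg: "expected" for elg in MAIN_ELGS}
print(good_level_of_development(pupil))  # True
pupil["numbers"] = "emerging"            # one main ELG below expected...
print(good_level_of_development(pupil))  # ...means no GLD: False
```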

Phonics in year 1 (with possible retake in year 2)
Key measure: % attaining the expected standard
The phonics check is carried out in year 1 and if a pupil does not achieve the pass mark - which, since its inception in 2012, has been 32 words correctly decoded out of 40 - then they take it again in year 2. The key measures that governors should be aware of are:
  • % attaining expected standard in year 1
  • % attaining expected standard by end of year 2
Note: % attaining expected standard by end year 2 takes the whole cohort into account, not just those that retake in year 2. 

Again, this data is not in the public domain. ASP provides a comparison against LA and national figures for year 1 results only (no 'end of year 2' measure) and does not provide a trend; IDSR shows a 3 year trend against national figures. Again, note how the school compares to national figures, and whether or not standards are improving; and always take context into account when looking at trends.

Key Stage 1
Key measures: % attaining expected standard, % attaining greater depth
KS1 assessment, made at the end of year 2, is mainly focussed on reading, writing and maths (but don't completely ignore science!). Pupils can be assessed as 'below' or 'pre-key stage' if they are below the standard of the curriculum, but the vast majority of pupils are either working towards the expected standard, working at expected standards, or working at greater depth. The key measures that governors should be aware of are as follows:
  • % attaining expected standards or above in reading, writing and maths (3 separate measures)
  • % attaining greater depth in reading, writing and maths (3 separate measures)
Unlike at KS2, where the DfE produce a single, combined result for reading, writing and maths (see below), here they are kept separate. However, if your school uses FFT you can get a combined result for KS1 (i.e. the dashboards show the percentage of pupils attaining expected standards in all three subjects). 

ASP provides us with percentages attaining expected standards in each subject (3 measures) and the same for greater depth; and school results are compared against LA and National figures. Note if your school is above or below these comparators, but make sure you consider prior attainment of pupils (based on EYFS outcomes) when you do this. For this reason, IDSR is more useful because it breaks the results down by prior attainment, namely emerging (low), expected (middle), and exceeding (high), thus providing useful context.

Governors should at least be aware of percentages attaining expected standards and greater depth in reading, writing and maths at KS1, how those results compare to national figures (note that IDSR will indicate if results are in top or bottom 10% nationally), and whether or not they have improved on the previous year. Neither IDSR nor ASP currently provide trend data for KS1, due to there being only two years of comparable data, but you can view and download previous year’s data from ASP if you have access. We can compare 2017 to 2016 results but please ensure you consider context of cohorts (e.g. prior attainment, SEND, EAL etc) when doing this. 

The FFT KS1 dashboard does provide previous year's data, and the overview page can be particularly insightful. It provides a combined reading, writing and maths result for both expected standards and greater depth, displayed as a neat, easy-to-understand speed dial. Unlike ASP and IDSR, the data will indicate if results are significantly above or below national average (green or red dot), and will also show if results are significantly improving or declining (up or down arrow). The right hand side of the report shows how the school's KS1 results compare to estimated outcomes based on pupils' start points (using EYFS data). This is a form of progress measure, and it will reveal if results are above 'expected' despite being below national, or below 'expected' despite being above national, depending on pupils' development at the end of foundation stage. 

Key Stage 2
Key measures: % attaining expected and high standards, average scaled scores, progress scores, floor and coasting standards
Let's face it: there are a lot of measures at KS2. The key measures that schools have to display on their websites (and that are shown in the public performance tables) are a good place to start:
  • % attaining expected standard in reading, writing and maths combined*
  • % attaining the higher standard in reading, writing and maths combined**
  • Average progress in reading
  • Average progress in writing
  • Average progress in maths
  • Average scaled score in reading
  • Average scaled score in maths
* score of 100+ in reading and maths test and expected standard in writing
** score of 110+ in reading and maths test and greater depth in writing
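Those two combined measures, as defined in the footnotes, can be sketched as follows (a minimal illustration with invented results; WTS/EXS/GDS are the usual working towards/expected/greater depth teacher assessment codes):

```python
def combined_measures(pupils):
    """pupils: list of dicts with 'reading' and 'maths' scaled scores and a
    'writing' teacher assessment ('WTS', 'EXS' or 'GDS'). Returns the percentages
    attaining the combined expected and higher standards."""
    exs = sum(1 for p in pupils
              if p["reading"] >= 100 and p["maths"] >= 100
              and p["writing"] in ("EXS", "GDS"))
    high = sum(1 for p in pupils
               if p["reading"] >= 110 and p["maths"] >= 110
               and p["writing"] == "GDS")
    n = len(pupils)
    return round(100 * exs / n), round(100 * high / n)

pupils = [
    {"reading": 105, "maths": 101, "writing": "EXS"},  # expected standard
    {"reading": 112, "maths": 115, "writing": "GDS"},  # higher standard too
    {"reading": 98,  "maths": 104, "writing": "EXS"},  # reading below 100
    {"reading": 103, "maths": 99,  "writing": "GDS"},  # maths below 100
]
print(combined_measures(pupils))  # (50, 25)
```

Note how a single subject below the threshold keeps a pupil out of the combined measure, which is why the combined figure is always lower than the individual subject figures.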

Unlike EYFS, phonics, and KS1 data, which is not in the public domain, the KS2 data listed above is neatly presented on the main page of the performance tables for each school, and governors are advised to be aware of this. The school results are shown alongside LA and national figures, and previous years' results are now available (just 2016 at time of writing) for comparison. Again, context of cohorts needs to be taken into account when evaluating performance over time. 

The DfE does not categorise attainment data (i.e. it does not indicate if it is significantly above or below average - you'll need FFT reports for that information) but the IDSR will show if results are in the top or bottom 20% nationally (this will be stated on the front page as an 'area to investigate'). Progress scores, however, are categorised (in both ASP and performance tables) as follows:
  • Well above average (dark green): progress is significantly above average and in top 10% nationally
  • Above average (light green): progress is significantly above average but not in top 10%
  • Average (yellow): progress is broadly in line with national average
  • Below average (orange): progress is significantly below average but not in bottom 10%
  • Well below average (red): progress is significantly below average and in bottom 10%
It is vital that governors are aware of their school's progress category for each subject, and most importantly are able to discuss progress in broad terms, particularly issues that have resulted in low progress scores, or what has led to high scores.

As at KS1, FFT's KS2 reports show if attainment and progress is significantly above or below average (green and red dots) and indicate if standards have significantly improved or declined (up and down arrows). Again, the overview page of the FFT governor dashboard is incredibly useful for quick reference. 

Other KS2 headlines that governors should know about are:
  • Floor standards: is your school below floor or has it been in the past? (see IDSR front page)
  • Coasting: similar to floor standards but over 3 years. Is your school defined as coasting? (again, see IDSR front page)
  • Absence: it's not an academic measure but it's vital we know about the school's overall and persistent absence figures, how they compare to national figures, and whether they are going up or down
There are of course other KS2 measures including percentages attaining expected and higher standards in individual subjects. It may be that the combined result is low due to underperformance in all subjects or just one, and it's important that we investigate this. Also, don't ignore science and grammar, punctuation and spelling (GPS/SPaG), results for which can be found in IDSR. Also note that FFT dashboards show progress data for grammar, punctuation and spelling, and - as for other subjects - indicate if results are improving or declining.

This is a lot to take in and governors cannot be expected to carry all this information around in their heads. Focus on the headline measures in the bullet point lists above; be aware of how results relate to national figures, and whether or not those results are improving.

And, of course, we also need to know the performance of key groups of pupils, and that's the subject of the last post in this series. 

Saturday, 14 April 2018

The Progress Horizon

Following the release of further details on the proposed reception baseline and future progress measures this week, and the inevitable battle for the soul of primary education already in full swing, I find myself distracted by the possible mechanics of these measures, specifically what will happen to those pupils that change schools.

The issue of measuring the progress of 'mobile' pupils is a murky and complex one. Currently, it is straightforward in design and yet deemed by many to be extremely unfair. When a pupil moves schools they take their baseline - their KS1 or KS2 results - with them (if they have KS1 or KS2 results of course), and they are included in the new school's progress measures. The new school is solely responsible for the progress that pupil makes, even if they arrive late in Year 6 or Year 11. Of course, in some cases a school may benefit by admitting a pupil that does very well in relation to their specific start point, but often pupils that change schools do less well than their more rooted peers.

But what of this new reception baseline to KS2 progress measure? How will it deal with mobile pupils? Will it include them or not? Over the last year or so a number of people have told me that the new progress measure would be 'a cohort-level measure'; that it would not involve the progress of individual pupils and would not take account of movement of pupils in and out of the school. If this were true then it would be a radical departure from the current measure, which does exactly that. I assumed that this resulted from a misinterpretation of information in the primary assessment consultation, which states that reception baseline data will be used 'to calculate their school’s cohort-level progress measures'; in other words, that it would not be used to evaluate the progress of individual pupils.

This is no different to the guidance on the current progress measure, which is 'a school-level accountability measure. Progress is calculated for individual pupils solely in order to calculate the school’s overall progress scores. There is no need for schools to share individual pupil progress scores with their pupils or parents.'

On the subject of whether or not so-called mobile pupils will be included in future progress measures, we are getting mixed messages even from the experts. In the TES (16th March 2018, p14), Greg Watson, chief executive of GL Assessment, lists 'three key challenges: matching the pupil data accurately in the first place, keeping track of the data as pupils move between schools, and, in the cases where pupils have moved, deciding how much credit each school gets for progress.' This suggests that the issue of mobility is high on the agenda, and the last point - apportioning credit for progress between schools in the case where a pupil moves - is an interesting and new development that deserves some serious consideration.

And yet Professor Robert Coe, director of the Centre for Evaluation and Monitoring at Durham University, is quoted in the same article as saying: "Does it make sense to wait seven years from the time children start school to make a punitive judgement about the school based on the performance of whatever proportion of that small number of children are still at the same school? Not remotely."

Putting aside the main point - which I agree with - this implies that the measure will only involve those pupils retained since the start of reception. I assumed that the current methodology would continue, whereby individual pupils' progress scores are calculated and aggregated to generate the school's progress score; and that any pupil that changes school will be matched back to their baseline score and included in the new school's measures - no matter how unfair that seems.

But there is an interesting sentence in the primary assessment consultation response* which is maybe the source of much of the confusion:

In addition, we will work with analytical experts to develop the rules around the new progress measures, for example the minimum cohort size required and the minimum proportion of pupils that need to have been in the same school between reception and key stage 2.

But this can be interpreted in two ways:
  1. Mobile pupils ARE included in the progress measures. The DfE calculate the percentage of pupils retained since reception and do not publish progress data if retention falls below a certain threshold.
  2. Mobile pupils are NOT included in the progress measures. Progress is calculated only for those pupils retained since reception, and the DfE do not publish progress data if the number of retained pupils falls below a certain threshold.
A measure of retention would certainly provide useful contextual information, and perhaps progress measures should be withheld for those schools with high mobility, but I'm not sure I want to see mobile pupils omitted from measures full stop. What percentage of pupils actually remain in a school from reception to KS2 anyway? We could see a lot of pupils excluded from progress measures. 

The way I see it, we have four choices:
  1. Simply compare average attainment at the start of reception to the average attainment at the end of KS2, and ignore any movement in between. This would be a crude and meaningless measure. You only need look at the difference between the KS1 prior attainment of the current year 6 in a school and the KS1 results four years ago to see that such an approach would not work. This measure takes no account of movement in and out of the school.
  2. Measure only the progress of those pupils retained since reception, and not include any new arrivals. Many schools will therefore have small numbers of matched pupils, and, according to the statement above, could end up having no published progress measures if retention falls below a certain threshold. This measure removes those that leave but does not add those that arrive.
  3. Carry on as now, including all pupils with a baseline in the school's progress measure, regardless of where that baseline was administered and how long the pupil has been in the school. This measure takes account of those that leave and arrive.
  4. As above but apportioning progress between schools in the cases where pupils have moved. This measure takes account of those that leave and arrive but is proportional (but no doubt complicated).
I know all of this is years away - the first cohort of reception baseliners reach the end of KS2 in 2027 - but I dwell on these things and some clarity, or at least some vague proposals, would be welcome. Otherwise we'll all continue to speculate and worry. I have tweeted the DfE for an answer.

I eagerly await their response.

*Many thanks to Kate Barker (@K8ebarker) for bringing this to my attention.

Tuesday, 10 April 2018

5 things primary governors should know about data. Part 3: progress measures

The key stage 1-2 (KS1-2) progress measure is a value added (VA) measure, and this is nothing new. We have had VA measures for years, both at KS2 and at KS4. But previously these VA measures - which took up pages of the old RAISE reports - played second fiddle to the levels of progress measure. This was for a number of reasons:
  1. Levels of progress was a key measure with floor standards attached
  2. It was in the same language as everyday assessment
  3. It made target setting easy (just add 2 levels to KS1 result)
  4. It was simple and everyone understood it
But levels have gone, and for good reason: they labelled children; they were best-fit, so pupils could have serious gaps in learning but still be placed within a level; and progress became synonymous with moving on to the next level rather than consolidating learning and developing deeper understanding. Plus, they were never designed to be split into sublevels or points and used for progress measures anyway.

Most confusing of all, the two progress measures - VA and levels of progress - often contradicted one another. It was possible, for example, for a school to have all pupils make 'expected' progress of 2 levels, and yet have a VA score that was significantly below average. This was because - contrary to popular belief - the VA measure had nothing to do with levels; it was all to do with average KS2 scores from KS1 start points. 2 levels might be enough from one start point but nowhere near enough from another. 

But this is all rather academic now because levels have gone and we are left with a single progress measure: VA.

So, what is VA? 

VA involves comparing a pupil's attainment score at KS2 to the average score for pupils with similar prior attainment. There are a few myths we need to bust first, before we continue:
  1. We do not need data in the same format at either end of the measure to calculate VA. Currently we have KS1 (sub)levels at the beginning and KS2 scaled scores at the end. These data are not in the same format. We needed compatible data for the levels of progress measure but not for VA. This misconception is a hangover from levels, and it's something that is better understood in secondary schools where they have KS2 scores at one end and GCSE results at the other.
  2. We do not even need the same subjects at either end. Again, this is better understood in secondary schools, where the baseline comprises KS2 scores in reading and maths (note: no writing) and the end point is any GCSE the student sits. VA can be measured from KS2 test scores in reading and maths to GCSE result in Russian or Art, for example. 
  3. KS1-2 VA has nothing to do with that magic expected standard score of 100. Plenty of pupils get positive progress scores at KS2 without achieving a score of 100 in KS2 tests. They just need to exceed the national average score of pupils with the same prior attainment, and scoring 92 might be enough, depending on start point. And pupils that achieved 2b at KS1 (often referred to as 'expected' in old money) do not have to achieve 100 to make 'good' progress; in 2017 they had to exceed 102!
Each pupil's KS1 result - their prior attainment or start point - is therefore crucial to this process. Each p-scale, level and sublevel in reading, writing and maths at KS1 has a point value, which enables the DfE to calculate a KS1 average point score (APS) across the three subjects for every child that has a KS1 result (note: pupils without a KS1 result are excluded from progress measures). Their KS1 APS is then used to place pupils into a prior attainment group (PAG), of which we currently have 24, ranging from pupils that were on p-scales at KS1 (pupils with SEND) up to pupils that were Level 3 in all subjects. There is even a PAG for pupils that were Level 4 at KS1, but there aren't many pupils in that group.

All pupils with KS1 results are therefore slotted into PAGs alongside thousands of other pupils nationally. The DfE then take in all the KS2 test scores and calculate the average KS2 score for each PAG. Let's look at two examples for reading at KS2 (the process is the same for maths):
  • We have two pupils in a class that have KS1 prior attainment of 16 APS (2b in reading and writing and 2a in maths at KS1). They are placed into the same PAG as thousands of other children nationally with 16 APS at KS1. The DfE take in all the thousands of reading test scores for all the pupils in this PAG and calculate the average score, which for this PAG is 105 (note: in reality benchmarks are to 2 decimal places e.g. 104.08). 105 therefore becomes the benchmark for this group. Our two pupils scored 108 and 101 in their KS2 tests and both have met the expected standard. However, only one pupil has a positive progress score. The pupil scoring 108 has beaten the national benchmark by 3 whilst the other has fallen short by 4. These pupils' VA scores are therefore +3 and -4 respectively.
  • We have two other pupils in our class who have KS1 prior attainment of 10 APS (2c in reading and Level 1 in writing and maths). They are in the same PAG as thousands of other children nationally with 10 APS at KS1. The DfE collect the reading test scores for all pupils in the group nationally and calculate the average KS2 score, which in this case is 94 (again, in reality this would be to 2 decimal places). 94 therefore becomes the benchmark for this group. Our two pupils scored 97 and 87 in their KS2 tests. Neither has met the expected standard, but the first pupil has beaten the national benchmark by 3 whilst the other has fallen short by 7. These pupils' VA scores are therefore +3 and -7 respectively.
This process is repeated for each pupil that has a KS1 result. All pupils are placed into PAGs and their scores in KS2 tests are compared to the national average score (the benchmark) for pupils in the same PAG. If a pupil beats the benchmark, they have a positive progress score; if they fall short, their progress score is negative. Pages 17-18 of the primary accountability guidance contain a table of all PAGs with their corresponding KS2 benchmarks in reading, writing and maths.
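To make the mechanics concrete, here is a minimal Python sketch of the pupil-level comparison. The function name and the benchmark lookup are mine, and the benchmarks are the rounded figures from the examples above; the DfE's real benchmarks run to two decimal places.

```python
# Sketch of the KS1-2 VA calculation for reading.
# Benchmarks here are the rounded figures from the worked examples;
# the published benchmarks are given to two decimal places per PAG.
PAG_BENCHMARKS = {16: 105, 10: 94}  # KS1 APS -> national average KS2 reading score

def va_score(ks1_aps, ks2_score):
    """A pupil's VA score is their KS2 test score minus the
    benchmark for their prior attainment group."""
    return ks2_score - PAG_BENCHMARKS[ks1_aps]

print(va_score(16, 108))  # beats the benchmark: 3
print(va_score(16, 101))  # falls short: -4
```

The same lookup-and-subtract logic applies whatever the subject; only the benchmark table changes.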

What happens next?

The DfE take all progress scores for all pupils in the year 6 cohort in your school, and calculate the average. In our example above we have four pupils (two with prior attainment of 16 APS and two with 10 APS). Let's imagine that is our entire Y6 cohort (it's a small school!). We add up the progress scores (3 + -4 + 3 + -7 = -5) and calculate the average (-5 / 4 pupils = -1.25). This school's VA score is therefore -1.25, and you will see these aggregated progress scores presented in the performance tables and ASP (where they are colour coded and categorised), in Ofsted's IDSR (where they inform the areas to investigate), and in FFT reports (where they are shown to be in line with, or significantly above or below average). 
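The aggregation step is just an average. A sketch using the four progress scores from the worked example:

```python
# School-level VA = mean of the pupil progress scores for the Y6 cohort.
pupil_progress_scores = [3, -4, 3, -7]  # the four pupils above

school_va = sum(pupil_progress_scores) / len(pupil_progress_scores)
print(school_va)  # -1.25
```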

And what does -1.25 mean? Putting it crudely, it tells us that, on average, pupils scored 1.25 fewer points in their tests than similar children nationally. And when the DfE say 'similar children', they are basing this on prior attainment alone, not contextual factors. A progress measure that takes context into account is called Contextual Value Added (CVA), which the DfE scrapped in 2011, but which FFT still offer. CVA is an attempt to create a like-for-like progress measure but is not favoured by government.

Are there any issues with KS1-2 progress measures? Whilst VA is preferable to levels of progress, there are numerous problems:
  1. Writing! There is no test for writing at KS2 but there is still a progress measure. As in reading and maths, pupils are set benchmarks in writing that are fine-graded to two decimal places (see p17-18 here), but because pupils do not have a test score, these benchmarks are essentially unachievable. Instead, the DfE have assigned 'nominal' scores to teacher assessments for writing, which makes for a very clunky measure. The vast majority of pupils are assessed as either working towards the expected standard, working at the expected standard, or working at greater depth. These attract values of 91, 103, and 113 respectively. In reading and maths, pupils can achieve test scores in the range of 80-120; in writing, they get 91, 103 or 113. It doesn't work.
  2. Pupils below the standard of the tests/curriculum are also assigned nominal scores, which range from 59 for the lowest p-scales, up to 79 for the highest of the pre-key stage assessments. These pupils often have SEND and tend to end up with big negative progress scores, which can have a detrimental impact on a school's overall progress scores. The system is therefore punitive towards those schools that have large groups of pupils with SEND (or towards small schools with just one such pupil). The DfE plan to mitigate this issue by capping negative scores this year. 
  3. It can't be predicted. The benchmarks change every year (they are the national average scores for each PAG that year), and we don't know what they are until after pupils have left. This is a headache for many headteachers and senior leaders.
  4. It relies on the accuracy of KS1 results. I’ll say no more about that. 
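Point 1 above - the nominal scores for writing - can be expressed as a simple lookup. A sketch, with the values quoted above; the helper function and the example benchmark are hypothetical:

```python
# Nominal scores assigned to KS2 writing teacher assessments,
# used in place of a test score in the writing progress measure.
WRITING_NOMINAL_SCORES = {
    "working towards the expected standard": 91,
    "working at the expected standard": 103,
    "working at greater depth": 113,
}

def writing_progress(ta_outcome, pag_benchmark):
    # A pupil's writing VA: nominal score minus the fine-graded
    # benchmark for their prior attainment group.
    return round(WRITING_NOMINAL_SCORES[ta_outcome] - pag_benchmark, 2)

# A pupil at the expected standard, in a PAG with a benchmark of 104.08:
print(writing_progress("working at the expected standard", 104.08))  # -1.08
```

With only three possible nominal scores set against benchmarks graded to two decimal places, the clunkiness of the measure is plain to see.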
Now you know how these progress measures are calculated, and what the issues are. But what do they mean in terms of school accountability? 

That's the subject of the next post in this series: headlines and trends.