Geography
This video focuses on how general assessment principles are applied in Geography to ensure fairness and accessibility. It looks at the importance of stimulus material in question design and the different approaches to mark schemes.
Transcript
Stacey Hill and Simon Oakes
Stacey Hill Hello everyone and welcome to this geography assessment training. My name is Stacey Hill. I'm the Head of Curriculum for geography at AQA and I'm joined by Simon Oakes, who is the Chair of Examiners for GCSE Geography at AQA, and today Simon and I will take you through this content relating to GCSE geography assessment. So, Simon, how do we ensure that our assessments are valid and reliable?
Simon Oakes Well there’s two big ideas, validity and reliability and they're sort of the two cornerstones for everything we do in assessment. If I take validity, first of all, the idea of ‘what is a valid assessment?’, and a valid assessment does what it says on the tin. A geography qualification is about learning information about the world, about places and processes and systems. But it's not just knowledge, it's not just knowledge and understanding, maybe like it was more when I was young and did my geography exams; what we used to call ‘capes and bays geography’. Modern geography qualifications have focused a lot more around skills; the softer, evaluative and interpretive skills, also making sure that students are very good at data storytelling, that they can work with information and they can show that they can walk and talk like a geographer when it comes to actually interpreting information. There's a whole load of things that we're supposed to be testing as part of the geography qualification, so the validity of our assessments is ‘are we actually doing what we set out to do?’. And so that's why we have to think incredibly carefully about the questions that we ask to make sure that that's going to be the outcome.
Stacey Hill So, why is it important that we use specification language in our assessments?
Simon Oakes Well, an important part of asking valid questions is that every student has to understand the question you've asked them, otherwise it's immediately not a level playing field for trying to measure performance. I mean, a good example of that, with a subject as old as geography and with so many different ways of teaching, is that there's a huge vocabulary out there that exists for different kinds of glacial landforms and bits of population geography. Now, our specification doesn't use all of those words because you wouldn't want to have a huge dictionary of geography there. So, only certain words and phrases and terminology appear within a specification. So, you know, people like myself who have taught for a long time and who are setting exams, we have to be really careful that we don't start reaching for words and vocabulary that we've always used for a topic. Otherwise you might lock some learners out from being able to actually answer the question. The best way to do this is probably to look at an example.
Stacey Hill That's really true. And if we look at this example from the specification, this is taken from unit two, the Changing Economic World, and Economic Futures in the UK. When you look at quite a lot of the content here, there are some natural, real key terms that a lot of geography teachers will probably inevitably use in their teaching, things like quaternary industries or even specialist terms like counter-urbanisation. Now, we can't assume that all students in all schools are going to be taught this content using those key specialist terms, so it would be really poor practice if we were to use some of those key terms, for example, in our assessments. And therefore it's really important that when we're writing good questions we really stick to the language as it exists in our specification to really benefit all students.
Simon Oakes Absolutely, I mean if you look at that bit there is some quite stretching terminology in there. I think deindustrialisation is in there, post-industrial. So you've really got a couple of quite complex terms in there. Obviously, when it was written, we weren't wanting to overload it by putting quaternary and counter-urbanisation in as well. And yet you know that some teachers out there might use those terms in their teaching, but you absolutely can't assume it would be all right therefore to use one of those words in a resource, or even worse in a question, because it would lock out some learners whose teachers hadn't used those terms.
Stacey Hill So, can you tell us a bit about reliability then?
Simon Oakes Well, yeah. Once you've got your validity sorted, which is making sure you've asked the right question, the reliability is making sure it gets marked the right way. So this is when we move from question paper to the actual mark scheme. Reliability is about satisfying everyone who matters. First and foremost the students, then their teachers, their parents, the schools and their governors, the regulators, all the stakeholders satisfying everybody that we can trust the mark that you've got on the front of the script is the right one, and whoever's got the highest mark, well, they were the strongest candidate that year when it came to their knowledge and all their other skills and competencies that we've been talking about.
Stacey Hill So, can you tell us a little bit about how mark scheme creation works?
Simon Oakes Sure. I mean, I think maybe a lot of people don't realise just how much work has to go into a mark scheme because, particularly with a subject like geography which is always changing, you sometimes find with a course that you're asking questions that have never been asked before in a high stakes terminal exam. So, you don't know how 190,000 students are going to behave under pressure when they see this question that hasn't been asked before. So, the senior examiners will, at the initial design stage, produce a mark scheme that they think, based on their professional expertise, is reasonable: ‘this is probably what a great answer would look like, and then this is what it will look like at the top end or the bottom end’. But they may find that when they actually come to the day of the exam and they start looking at the scripts, there's quite a big mismatch between what they thought the students were going to do and what's actually happened.
Stacey Hill And with the best will in the world, we don't have a crystal ball where we can determine exactly how students are going to respond to something when they sit down to write their exam.
Simon Oakes No, you don't. So at that point, what you have to do, and this is what the senior examiners will do, is they'll spend about a week taking a sample of scripts, reading through them, looking across the whole ability range to see what students seem to be doing within that range from the highest performance down to the lowest. And then they might have to modify the mark scheme really quite significantly compared with their first draft because you don't want a mark scheme that has everybody pushed down towards the bottom end or pushed up to the top end of the marks. We do need to have a scheme that produces a wide range of marks because that is the job of what we're trying to do with these terminal assessments is ultimately take the cohort and divide them into a wide range of grades and we want those marks to be quite far apart from each other for all kinds of reasons, and that involves refining, reflecting on the mark scheme until it is doing that job.
Stacey Hill Okay. So if we look at this example from paper one, which is about extreme environments, can you talk a little bit about how levels based mark schemes work?
Simon Oakes Yeah, I mean, I'm sure all the teachers watching this know all too well that there's two components to a level-based mark scheme. You've got the table which has in it the overall key instructions about what level you think a candidate's response should go into. And then you've got the indicative content. Now, the table is so important because we talked a moment ago about how modern geography exams aren't just about testing recall, they're also about testing what students can do, that they can think like a geographer, work like a geographer and so forth. And the table is there to offer a holistic view of overall when you take into consideration what they know, but also what they're doing with the information, does this feel midband? Does it feel upper or does it feel lower band? But of course, as the tariff goes up as you move up from 6 to 9 marks, and at A-level you've even got 20; these levels become quite big. So, you also then with the indicative content, you need to give teachers and assessors this extra ammunition to actually sort of work out how am I going to decide if it's the bottom or the top of the level? And that's where the indicative content comes from. These are all things that students could possibly say that you might want to take into consideration when you're deciding where you're going to pitch within the level your final mark.
Stacey Hill And is it fair to say that this indicative content isn't exhaustive?
Simon Oakes No, I don't think it's ever exhaustive with a subject like geography with so many great teachers who do things in different ways. You might start off; we’ve talked about the first draft of the mark scheme, how you then have to adapt it once you've seen what students write. Well, we work in isolation, a single person thinking ‘well if I taught it this way I would put this in’. That's the first draft of the indicative content from the writer, but once they've seen what all the schools around the country are putting in there, we will begin to add some of that to the indicative content as well, saying, ‘well, actually a lot of schools are making this point’. Which is great, and which we hadn't thought of when we originally set the paper. So we'll add that. And so the indicative content could just keep growing forever if you're not careful because, with a subject like geography, there's always something else you can bring in. So you have to have markers aware that even if something isn't in the indicative content, if it's a great point, you should be crediting it.
Stacey Hill And how is that different to point mark questions?
Simon Oakes Well, like chalk and cheese. I mean, it's a completely different approach, point marking questions. If we think just about, say, a one mark question and the amount of thought that has to go into just deciding how you're going to mark a one mark point question: if you ask a fairly straightforward one mark question, you don't want to be in a position where you end up giving everybody the mark, or nobody the mark because it was too hard; because everything is about trying to produce a stretch in a range of marks, because our job is to differentiate candidates into grades nine, eight, seven, six, five and so forth. So we can look at a one mark question, at what originally we thought would be worth the mark, and we might find, to our surprise, a good surprise at first, that everybody's getting that mark, and we might at that point decide that maybe we should be a little bit more discerning on what we want. Or we might find that nobody's getting the mark and that clearly we've pitched too high initially in terms of what we want. And, you know, this is for the teachers watching, when you're setting your own tests for internal testing, a really good thing to think about is whether you need to modify what you've done at any point. If we just have a quick look at an example from, I think it's the November ’21 paper, there's a question, ‘what is energy insecurity?’, for one mark. And have a think about how we would develop a mark scheme for that. If you just, as a first draft of the mark scheme, used the kind of academic definition we've got here, that would probably be a mistake. ‘Energy insecurity is a multi-dimensional construct that describes the interplay between physical conditions of housing, household energy expenditures and energy-related coping strategies.’ I think we can all agree it would be quite unfair to ask a typical 16 year old to have remembered all of that before you award them the mark.
So, the senior examiners have to look at what students have written and work out where they're actually going to draw the line in terms of how much understanding is necessary for the mark and that really involves reviewing a big sample of students’ work before you produce the final draft of the mark scheme.
Stacey Hill So previously Simon, you and I have talked about the halo effect and what that does to mark scheme and reliability. Could you just talk a bit about that?
Simon Oakes Yeah, a really good bit of advice someone gave me when I was first training as an examiner is that when you're marking and you're trying to make these holistic assessments, particularly of longer responses at GCSE or essays at A-level, you've got to watch out for something called the halo effect when you're making your final assessment of how many marks you're going to give; the halo effect being a sort of slightly angelic effect where the student says something particularly good just at one point in the essay and you're so blown away by that that you kind of lose your objective judgement and start reaching for the high marks just because they used that one bit of fabulous terminology that you like to use in your own teaching, you know. So maybe they've heard of carbon capture and storage or they've used the phrase carbon colonialism, there's something that's really surprised you that a GCSE student has known and used this concept, this idea, without prompting. And you think ‘oh, that's got to be a top grade student’ and you lose your objectivity for a minute. You forget that you're trying to make a holistic judgement of all of the different skills and competencies and you end up giving the student a very high mark just because you were really personally impressed by this one phrase they used. I think I did that a little bit in my early career as a teacher, and that's, yeah, it's called the halo effect and we have to try to make sure that we avoid that when we're marking.
Stacey Hill Now surely that has to work the other way as well.
Simon Oakes Yeah, it definitely works the other way and that can be even worse as well. We all know as teachers, those times when you're reading a really good essay and then the student says something about Africa as a country, or they are writing about Sheffield and they tell you with great confidence how Sheffield's docks shut down in the 1950s, and you think, ‘oh, how could you have said that?’. So there are certain things that students say where you can just picture the teachers scratching their heads and saying, ‘I can't believe you said that’. And you have to be really careful as well that you don't mark students down with a big red ring around something because they said that particular thing. You have to remind yourself, this is a student under a great deal of pressure, in a high stakes exam at the end of two years. They don't really think Africa is a country. They know it's a continent but they were writing under pressure. They don't really think that it's Sheffield that had a docks industry, they actually meant Liverpool. But in the exam, you know, they've panicked, they've rushed, they've made a mistake. We don't want to be pulling them down a level because, as geographers, there was something that we found particularly disappointing in their answer. That's when the halo effect would work negatively.
Stacey Hill Okay, so if we move on now to think about the assessment objectives, why are assessment objectives so important?
Simon Oakes Well we've already talked a bit, well we've talked quite a lot so far, haven't we, about how geography is more than just knowledge and understanding. It's also about what skills and competencies students are going to acquire through study of the discipline of geography that makes it such a great subject to do, and so, you know, so good for the workplace. Assessment objectives are there to keep us focused on making sure that every year; year on year, our assessments are functioning in the same way and keeping the same high standards up. At GCSE level, you've got your four assessment objectives which you can see on the screen at the moment. AO1 and AO2, they're the knowledge and the understanding, the recall elements of the qualification. But then you move on to AO3, which is, it's a real mixed batch of an assessment objective. It's got interpret, analyse and evaluate within it, which can include some quite basic skills in terms of your analytic ability just to extract some information from a resource. But it also includes the evaluation and the judgement arriving at a judgement which in some ways is the most demanding thing that happens at GCSE level. So AO3 is this big, really important assessment objective of a lot of different things students will show us they can do. And then AO4 is the skills component where a lot of the mathematical questions are tagged as AO4 and also the field work.
Stacey Hill It's probably really important at this point to talk about the regulatory side of these assessment objectives. So obviously AO1, AO2, AO3 and AO4 are the same, no matter which exam board your students sit exams for, and really that's designed to make sure that there's parity across all those exam boards. But the interesting thing is that exam boards like AQA have a great deal of flexibility in terms of how we can utilise those AO breakdowns. And you can see on screen, if you look at the table, there are different ways that different exam boards take those assessment objectives and apply them across their suite of papers. So we can pay particular attention based on what you just said Simon, about the importance of AO3. So just how much of that appears in paper three for example? Now, if we're talking about AO3 as a very analytical and evaluative assessment objective, you can see why so many of those marks appear in that paper because students are expected to take their pre-release material and use that understanding that they've gained from their preparations in the classroom and apply that to the decision making exercises that they're expected to respond to in the exam and very similarly their field work experiences as well.
Simon Oakes Yeah, absolutely. Each exam board, the qualification, always has to have the same balance of assessment objectives, but you're obviously right. We and all the other boards are deciding to use the combinations in different ways across different papers and certainly papers one and two, as the table shows, are very heavily weighted towards the knowledge and the understanding. And students who do well on one of those papers are probably quite likely to do well on another one as well. But paper three is a real shift, isn't it? Because it's much more focusing down this skills route of how they're able to work with information, and also telling us what they've done as part of their fieldwork.
Stacey Hill Now, you've mentioned that assessment objectives can be used in isolation or in combinations. Could you talk a bit more about that?
Simon Oakes Yeah, most of the level based questions are combining assessment objectives in different ways, as we've seen in the examples already, because they're trying to test more than one or two things at the same time. So if we just look at a couple of examples of this, if we have a look at the summer 2022 physical paper, there's a four mark question here. ‘Explain why earthquakes and volcanic eruptions take place along destructive margins.’ That was an AO1 and AO2 question. It's pretty much knowledge and understanding that's being tested there in equal measure. If you look at the six marker, it's shifted its assessment objectives considerably and there's this very important steer for students there which says, in bold, ‘use figure 16 and your own understanding’, because what we flip to here is an AO2 and AO3 question where the resource has to be used as part of the answer, because the students are assessing the benefits of using hard engineering and soft engineering, using that resource as a springboard for what they're going to talk about. But they then have to apply their own understanding to this particular context, they've got to put it through an evaluative filter and show that they can work with information that they've not seen before. So we want to have both assessment objectives in there, we want to credit their ability to work with something new, to interpret and analyse it and possibly evaluate it. But we also want to credit them for bringing their own understanding to their answer as well.
Stacey Hill Now you've mentioned that these questions that combine the AO2 and AO3 element, the resources that act as a springboard for students. How is that different if there isn't a resource for students to refer to?
Simon Oakes Yeah, well, if there's not a resource it won't have that command to use the figure and your own knowledge, it'll be more ‘talk about case studies you've learned and your own understanding’. And yeah, the assessment objectives would shift again then. So we might have a nine mark question such as the one that was on that same paper about tectonic hazards. And there you've got three marks AO1, three marks AO2 and three marks AO3, because there you are rewarding much more recall as part of that answer, but you're still putting it through the evaluative filter because you've said ‘to what extent do you agree with the statement?’
Stacey Hill So how do we utilise all that advice to really come up with good questions?
Simon Oakes Well, we can take a look at the challenges that face urban areas part of the specification and think about how we would have generated a question to ask about this, bearing in mind we've got to set questions, as we keep emphasising, that are not just testing students on what they've learned in class about the topics, but also that they can put this information through an evaluative filter and give us an assessment or an evaluation as well. So we put something together which talks you through the thought process that you would go through when trying to work out what is a good evaluative question to ask about this part of the spec. Yeah, here's a first draft at a question, it's a terrible question but it's just to illustrate the thought process. You could ask them to discuss this statement: ‘The myriad problems created by rapid urban growth in LICs/NEEs are insurmountable. To what extent do you agree?’ And some of your colleagues might say, ‘well, that's a really hard sounding question’. Yeah, very academic sounding. That's a good question. But somebody else might say it's not a good question at all, because the words myriad and insurmountable are not in a typical 16 year old’s vocabulary. That might help you differentiate grade seven, eight and nine students, but half the candidates are not even going to attempt that question because they might be very uncomfortable with the words that have been used in it. So it's really important to be looking at the language that we use. And here's a second stab at a question: ‘Discuss urban growth in LICs/NEEs’. Nice short statement, hardly any words in it, easy to read, but actually really, really hard to assess reliably because what are the students actually going to discuss? Are they going to discuss how big the urban growth is, the reasons for it, what the consequences of it are?
You could actually write a book about that; you've asked them a massive question and it's going to be very hard to compare different candidates' work to actually work out who should get high marks and who should get a low mark, because you haven't given the students any scaffolding. And again, some students might run away from that question because it is just too big and you haven't really given them an entry point. And here's a third attempt at trying to set a question on this part of the spec: ‘To what extent can the challenges of housing, clean water, health, education, unemployment and crime ever be tackled in LICs/NEEs?’ I mean, that question is verbatim from the spec, everything's there, it tells the students exactly what they should be writing about. But if you look at that, it'd be quite an unmanageable question. There are six things in there for them to cover. You know, what would you do with a really great answer that covered three of those things really well but didn't tackle the other three as well? There's too much scaffolding now in that one if you're actually requiring students to write about all of those things. What we finally pitch up with is this question, which featured in the exam: ‘Assess the extent of the challenges created by urban growth in an LIC or NEE.’ It's got a more narrow focus on the challenges, it's not too prescriptive in insisting that you've got to write about five or six of them, and it's got ‘assess the extent’, so it's got the evaluative filter on it as well. So overall, that would be the thought process towards finding a question that's big but not too big, that's limited but not too limited.
Stacey Hill The last thing that I thought we should talk about today is accessibility. Now, this has come up a few times across the various different topics that we've talked about here, for example, language. But why is it really important that our assessments are accessible to students?
Simon Oakes Yeah, we have touched on it already, haven't we? I think it was when we were talking about the language we use and, you know, you can't use words that aren't in the spec or you can't use words that are not in an ordinary student's vocabulary, because all the questions have got to function like a big tent in which everybody can join in the conversation. Nobody gets left behind. Every single student, whether their outcome is going to be a grade one or a grade nine, they can have a go at that question. They can all participate and join in. I mean, that's such an important kind of equity point in terms of ‘everybody's got to be able to join in’ like that. So you've got to create the language, create the questions that are going to facilitate that. But at the same time, those questions have also got to give those that might be able to get a grade seven, eight or nine the chance to show us that as well. So we're trying to create these accessible questions that everybody can read and decode and have a go at, but that also allow for fine grained differentiation right at the top level, which allow a grade eight and nine to stand out but also allow us to maybe work out what's a grade two or a grade one. That is the big challenge that I hope I've explained something about today when it comes to some of the extended writing questions.
Stacey Hill How does this apply to accessibility of the vast amounts of stimulus that we've got across the papers?
Simon Oakes Oh, okay. So slightly different thing here. Yeah, which is that we've got so many visual resources on the papers. Yeah, it's not just the actual questions we've got to be on our toes about accessibility for, it's all of the maps and charts and figures and photographs that a typical geography exam is just absolutely packed full of. We've got to look at each one of those and say, ‘is this also something which is going to have this big tent capacity to not scare some students off’, but also has got something in it that really allows the top end to run with. Yeah, I thought we could have a look at a few questions from the November ’21 paper and look at how those are being constructed with accessibility in mind.
Stacey Hill So if we look at this example then what is it about this example that shows it's accessible for students?
Simon Oakes Yeah, I think this is a good example of a really fair resource to be using with candidates given there's only two marks available. There's just 15 data points here. There aren't any in North and South America, so it's quite a constrained distribution they're looking at, where the majority are in Asia with a couple of outliers in Europe and Africa. So it's a very manageable task. What you wouldn't want to do is use a map that's got 50 data points on it spread all over the place and set students up with a really unfair amount of work to do, particularly if they're very diligent students who feel they've got to say absolutely everything. So there's a bit of an art in terms of knowing what is enough for a resource. And I think this is a great example of that.
Stacey Hill There are lots of ways that we use maps like this in our assessment, not just where we've got data points and are asking questions about distribution. So if we look at another example where we've got a choropleth map, what is it about this example that makes it accessible for students?
Simon Oakes Well, I think probably a lot of teachers watching this, you might have seen a version of this in textbooks where there's seven, eight, nine different classes on the map and it's much more complex. This is a nice one because it's been adapted just to show three basic levels, high, medium and low. And that means that if you're thinking about data storytelling in a high stakes final exam, you've given the students something that's very manageable to be able to work with. If there's teachers watching and you're trying to set your own internal tests as part of ongoing assessment and you want to find and select good maps and visual stimuli to use, be very careful about selecting things that have got nine or ten different classes in the key, because it's really, really hard for students to work their way around that amount of information. I remember once, quite early in my teaching career, creating a test for my students where I had found in a newspaper a really brilliant chart of all the world's transnational corporations, the top 50 TNCs in the world, and thought ‘oh that would be really good to set a test around’. And I gave the poor students a table with 50 companies in it and said, ‘describe that’. You can't expect students to have to do that. You need to edit that down to, maybe, just five would be enough. I think the thing about resources is you have to remember in an exam what they're there to do, which essentially is to help you differentiate candidates into different grades. And you just need to see how much information, visual information, is enough to do that and to do it in an inclusive way. And if just having five companies in a table is enough then only use five companies. And indeed, if you think you can do a distribution map with just five or ten points on it, that would be enough.
Stacey Hill We've talked a lot about accessibility here in terms of all students being able to access the assessment. We've talked about it from the point of view of written questions. We've talked about it from the point of view of resources and stimulus; but are there other measures of accessibility that we think are really important when it comes to principles of geography assessment?
Simon Oakes I think there are, and people are becoming more aware of them as time goes by. Colour vision deficiency is another area where all boards are moving towards trying to adapt resources where possible. Our November 2021 papers in the options, if viewers want to go and look at them, have got some bar charts which are adapted in ways that work very well for students who've got colour vision deficiency because, rather than using different colours, they're using textures: they're black and white, some of them are cross-hatched, others have got dots on them. And I mean, that is where you can do that, which you can't always in geography because maps can be such a challenge, but where possible, it's always good to try and make those kinds of adaptations.
Stacey Hill Thanks, Simon. Now, we've come to the end of this, and we've talked a lot about validity, reliability and accessibility. It'd be really nice, I think, if we could just give a bit of a summary of some of the key takeaways that we'd really like the audience to take away from this discussion today.
Simon Oakes Sure, I can go through each of those very briefly. With validity, what are we assessing? Fundamentally, what are we asking? Are we asking the right questions? Are we asking questions that don't just show what students have learned, but let us know if they've become good data storytellers, whether they can think like a geographer, whether they can make judgements like a geographer, whether they've picked all this up from their course as well, in line with what the assessment objectives have told us the qualifications have got to do? So, what are we assessing fundamentally? Reliability, how are we assessing? How are we asking questions and how are we writing mark schemes that are going to mean that no matter who marks the script, what teacher, what assessor marks it, the students are going to get the same overall mark, because there's a transparent process, there's a fair and easily decipherable road path for whoever's marking it to be able to follow? Is it a reliable assessment? And finally, we're more and more getting focused on whether our testing is truly inclusive, whether it's accessible, whether we've given all the thought we can to the language we use, to the prompts that we give, and, where we're using resources, that we've really thought critically about making sure those resources are really open and accessible, so they're actually going to draw students into answering the questions that we wanted to ask them.
Stacey Hill Thanks so much Simon for all the expertise that you brought to this today.
Simon Oakes Thanks Stacey for having me on.
Stacey Hill And thank you everyone for watching. I really hope this is giving you some insight into the general principles of geography assessment at AQA.
Questions you may want to think about
- How can you use these insights to prepare your learners for exams?
- Do your internal assessments reflect the approach of the exam? To what extent do you want them to?
- What’s the most important or surprising thing that you’ve learned? How might it influence your teaching?