5. Analyzing

Analysis of digital exams

After your exam has taken place, you can analyze the results to get a sense of the quality of the assessment. For digital exams, this analysis is available immediately afterwards in TestVision or Ans Exam. The results themselves are accompanied by a psychometric analysis which can tell you whether any of your questions were unclear or too difficult for students. By making adjustments based on this so-called item analysis, you can increase the quality of the exam and the reliability of the results.  

The most important indicators, at the level of both the individual questions and the exam as a whole, are the following (their standard formulas are sketched after the list): 

  • Cronbach’s alpha, which gives an indication of the reliability of the exam as a whole; 
  • the difficulty index, or p-value, which indicates how difficult the exam as a whole and each individual question turned out to be;
  • the discrimination index, or rit-value, which expresses the extent to which a question discriminates between stronger and weaker students. 
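
For reference, these indicators are commonly defined as follows. This is a sketch of the standard psychometric formulas; the exact computation in TestVision or Ans Exam may differ in details, such as using the item-rest rather than the item-total correlation.

    p_i = \frac{\bar{s}_i}{m_i}, \qquad
    r_{it} = \operatorname{corr}(s_i, T), \qquad
    \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma_i^2}{\sigma_T^2}\right)

Here s_i is the score on question i, m_i the maximum score on question i, T the total exam score, k the number of questions, σ_i² the variance of the scores on question i, and σ_T² the variance of the total scores. A p-value close to 0 indicates a very difficult question and one close to 1 a very easy question; a rit-value around 0 or below means the question does not separate stronger from weaker students.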

If it turns out the exam was more difficult and/or less reliable than you anticipated, take a closer look at the quality of the individual questions. Questions with both a low p-value (= a difficult question) and a low or even negative rit-value (= higher-scoring students answered incorrectly, or lower-scoring students answered correctly) are a particular cause for concern. It could be that the question was unclear, the material wasn’t covered (sufficiently), the answer alternatives overlap (in MC questions), or the answer key is incorrect. In such cases, you could, for instance, update the answer key to accept another alternative. This can improve the reliability of the exam as a whole.

If you want to learn more about how to interpret psychometric analysis results and use them to increase the quality of your exam, you can follow the TLC’s e-learning module about analyzing exam results. 

Analysis of paper exams

Even if your exam is on paper rather than digital, it is very useful to analyze the results. To do so, make a list of student scores per question in Excel. This allows you to calculate the average score per question and for the exam as a whole, which gives you the difficulty index, p. You can also use software to carry out a more detailed psychometric analysis. The TLC has also created an Excel template that can calculate p-values, rit-values and the alpha for your exam. Download the template and user guide here.
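
If you would rather script this analysis than use a spreadsheet, the minimal sketch below computes the same three indicators in plain Python (3.10 or later, for statistics.correlation). It assumes a hypothetical file scores.csv with a header row and one row per student, where each column holds the points scored on one question; if you do not pass the maximum attainable score per question explicitly, the sketch falls back on the highest observed score, which is only an approximation.

    import csv
    import statistics

    def analyze_exam(path, max_scores=None):
        # Read the score matrix: one row per student, one column per question.
        with open(path, newline="") as f:
            reader = csv.reader(f)
            next(reader)  # skip the header row
            rows = [[float(cell) for cell in row] for row in reader]

        questions = list(zip(*rows))          # per-question score columns
        totals = [sum(row) for row in rows]   # per-student total scores
        k = len(questions)

        # Fallback assumption: the best observed score equals the maximum
        # attainable score. Pass max_scores explicitly for exact p-values.
        if max_scores is None:
            max_scores = [max(column) for column in questions]

        # Difficulty index p: average score divided by the maximum score.
        p_values = [statistics.mean(column) / m
                    for column, m in zip(questions, max_scores)]

        # Discrimination index rit: correlation between the score on a
        # question and the total exam score. (Raises StatisticsError if
        # every student scored the same on a question.)
        rit_values = [statistics.correlation(column, totals)
                      for column in questions]

        # Cronbach's alpha from the item variances and total-score variance.
        sum_item_var = sum(statistics.variance(column) for column in questions)
        alpha = k / (k - 1) * (1 - sum_item_var / statistics.variance(totals))

        return p_values, rit_values, alpha

As described above, questions that combine a low p-value with a low or negative rit-value are the first candidates for a closer look.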

What should you look at?

Pass rate for the exam as a whole

The first thing you can look at is the pass rate for the exam as a whole. In principle, students who meet the entry requirements for the degree or have passed the first year (propedeuse) should be able to pass their exams. If this is not the case, it might be because students did not study enough, but it could also be because of problems in the exam or the teaching itself.  

If more than 30% of first-year students fail an exam, it is helpful to investigate whether the exam was too difficult or the questions were unclear. If it turns out, for instance, that strong students all got a certain question wrong, there’s probably an issue with that question. Take a closer look at the phrasing of the question: is it open to multiple interpretations? Is it not clearly linked to the course material? Or are other answers possible that are not in the answer key? If so, you may need to adjust the key.

Differences between graders

Another problem that may emerge when you analyze your assessment is grade differences due to team grading. This is most common in the case of papers, assignments, oral exams and presentations, but it also happens in exams marked by several staff members. It’s important to avoid big scoring discrepancies in team grading, especially when they directly concern the difference between passing and failing for students. You can spot such issues by comparing average grades: for instance, look at how far each instructor’s average deviates from the overall average, as sketched below.
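
As a minimal sketch of that comparison, assuming the grades are available together with the name of whoever graded them (a hypothetical data layout):

    from collections import defaultdict
    from statistics import mean

    def grader_deviations(records):
        # records: iterable of (grader, grade) pairs.
        records = list(records)
        by_grader = defaultdict(list)
        for grader, grade in records:
            by_grader[grader].append(grade)
        overall = mean(grade for _, grade in records)
        # Per grader: their mean grade and its deviation from the overall mean.
        return {grader: (mean(grades), mean(grades) - overall)
                for grader, grades in by_grader.items()}

    # Hypothetical example: grader B looks noticeably stricter.
    records = [("A", 7.0), ("A", 6.5), ("B", 5.0), ("B", 5.5),
               ("C", 6.8), ("C", 7.2)]
    for grader, (avg, dev) in grader_deviations(records).items():
        print(f"{grader}: mean {avg:.1f}, deviation {dev:+.1f}")

Keep in mind that a deviating average can also reflect the particular batch of work a colleague happened to receive, so treat it as a signal to investigate rather than as proof of overly strict or lenient grading.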

One way of resolving this issue is by dividing up the grading per question, so that each question is graded consistently by one instructor. This does not completely prevent differences in team grading, but it’s a fairer way of grading student work. You can also calibrate beforehand, by having team members grade a few questions, exams or assignments together to ensure a consistent approach.   

Cut-off point

If you have resolved any issues relating to the questions themselves and grading differences, and there are still too many failing grades, you can consider adjusting the cut-off point (pass/fail mark). There are several ways of doing this (see step 2 in the assessment cycle, constructing), but don’t make any changes without discussing them with your faculty’s assessment specialist. 

Exam too easy?

Analyzing your exam results may also indicate that it was not too difficult but, rather, too easy. For instance, if (nearly) all students pass a first-year exam, it may have been too easy. You could consider adjusting the cut-off point in such a situation, too. However, you must be able to explain this to students – that’s why it’s good to warn them that the cut-off point may change based on the results. In this case, too, first consult with your faculty’s assessment specialist. 

Frequently asked questions

Students performed very poorly on the exam. Is there anything I can do about this? 

If exam results are lower than expected, it can be tempting to blame the students: they didn’t study hard enough or pay attention during class. However, the culprit may also be the quality of the teaching or the exam itself. First of all, have a look at the individual questions. Are there any questions students scored particularly poorly on? Are those questions clearly phrased? If strong students also scored poorly on certain questions, you can check if there’s something wrong with the phrasing. Or maybe the question concerns something that wasn’t covered properly during class. You could consider scoring this question more leniently, accepting other answers than originally in the key, or as a final measure, adjusting the pass/fail cut-off.  

Everyone has passed the exam. Is this normal?

Of course it may be the case that you happen to have a very talented group of students, but it’s more likely that the exam was too easy, in retrospect. Consider scoring certain questions more strictly or adjusting the pass/fail cut-off, and make sure that next year’s exam is pitched at the right level.

Students scored very poorly on a certain question on the exam. What went wrong?

It’s not necessarily a problem if there are difficult questions on your exam, as this allows strong students to distinguish themselves. But if it turns out that strong students also scored low on that question, there might be something amiss. Have another look at the way the question is phrased: could it be confusing in any way? Consider scoring the question more leniently or accepting other answers than originally in the key. 

Sometimes you may need to exclude a question from the final grade, or mark it as correct for all students. This has different consequences for students’ grades. Excluding a question lowers the final grade for students who had answered it correctly, and raises it for students who had answered it incorrectly. If you mark it as correct for everyone, the final grade stays the same for students who had answered it correctly, and increases for students who had answered it incorrectly. This option is the least detrimental for students.
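
To make the difference concrete, here is a small worked example for a hypothetical exam with ten questions worth one point each, graded as a simple percentage (an assumption; your grading scheme may differ):

    def percentage(score, max_score):
        return 100 * score / max_score

    max_total, flawed_points = 10, 1  # ten 1-point questions, one is flawed

    # Student A answered the flawed question correctly (7/10 overall);
    # student B answered it incorrectly (6/10 overall).
    a_score, b_score = 7, 6

    # Option 1: exclude the question, rescaling over the remaining 9 points.
    a_excluded = percentage(a_score - flawed_points, max_total - flawed_points)
    b_excluded = percentage(b_score, max_total - flawed_points)
    print(round(a_excluded, 1), round(b_excluded, 1))
    # 66.7 (down from 70.0), 66.7 (up from 60.0)

    # Option 2: mark the question correct for everyone.
    a_credit = percentage(a_score, max_total)
    b_credit = percentage(b_score + flawed_points, max_total)
    print(round(a_credit, 1), round(b_credit, 1))
    # 70.0 (unchanged), 70.0 (up from 60.0)

Option 2 never lowers anyone’s grade, which is why it is the least detrimental for students.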

We are grading in a team, and it turns out a colleague’s grades are consistently higher or lower than everyone else’s. What can we do about this?

When you’re grading in a team, it’s important to ‘calibrate’ the results, in other words, to make sure everyone is equally strict or lenient. One way to do this is to grade a few exams or answers together first, to make sure everyone ends up with the same grade. If it’s too late for this and it turns out there are grading discrepancies after everyone has finished grading, have one of the other instructors grade a few of the exams or answers in question and see if they arrive at a different result. If this is the case, you should re-examine all exams or answers that colleague graded and adjust the grades, if necessary.

 

  • Designing: How do I choose a form of assessment that accurately measures my learning outcomes?
  • Constructing: How do I construct effective questions and assignments?
  • Administering: What should I keep in mind while administering an exam?
  • The previous step, Grading: How can I make sure my grading is efficient and reliable?
  • Analyzing: How do I evaluate and improve assessment quality after the fact?
  • The next step, Reporting: What should I keep in mind when returning grades and feedback?
  • Evaluating: How do I improve my assessment next year?