Testing submissions stops when a test case fails to run

by a Salem -

Hello, 

I am using CodeRunner to set up a cpp_program question type. In some cases, students' submissions produce errors while CodeRunner is running them, specifically time-limit-exceeded errors, so I guess their code has fallen into an infinite loop or something like that.

The problem is that when this happens, CodeRunner doesn't skip the failing test case and move on to the remaining ones, so students don't get the points for any subsequent test cases their code would have passed.

I am wondering if there is a way to mitigate this, or a workaround?

Thank you in advance.

~A

In reply to a Salem

Re: Testing submissions stops when a test case fails to run

by Richard Lobb -
There are two reasons for aborting testing when a test times out.

Firstly, I strongly prefer all-or-nothing grading, which gives students the opportunity to fix their bugs rather than awarding them part marks when some tests pass and some fail. See my old diatribe here. However, I accept that that's very much a personal view and CodeRunner should support different viewpoints where practicable.

The second reason is perhaps more compelling. At least in the courses I teach, by far the most common cause of a timeout is a broken while loop, such as one that fails to increment the loop control variable. Such code is doomed to time out on every test. If there's a 3 sec timeout and ten test cases, that's 30 sec of Jobe sandbox CPU time. Not only is that a long time for a student to wait under test or exam conditions, but it also has the potential to overload the Jobe server, resulting in denial of service to all other users. Consider an exam with several hundred students, of whom 10 have such a bug: that's 5 minutes of lost CPU time on the Jobe server.

In summary: I don't wish to change the existing approach of aborting on a timeout.

However, there are workarounds.

The easiest workaround is to build your own timeout into your tests, making sure that your timeout expires well before the Jobe timeout. This is highly language dependent, but in Python you might write a test in the following form:

from subprocess import run, TimeoutExpired
try:
    run(...., timeout=2)  # .... is whatever command runs the student's program
except TimeoutExpired:
    print("This test timed out")
You might want to hide the test infrastructure in either the global_extra field or the individual test.extra fields.
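
For a C++ question like yours, a fuller test along these lines might look something like the sketch below. This is only illustrative: it assumes a question type whose template runs under Python and has already compiled the student's submission to an executable named ./prog, neither of which is part of the standard cpp_program type.

from subprocess import run, TimeoutExpired, PIPE

# Run the (hypothetical) compiled student program ./prog with one test's
# stdin, using a 2 sec timeout that expires well before Jobe's own limit.
try:
    result = run(["./prog"], input="3 4\n", stdout=PIPE, stderr=PIPE,
                 timeout=2, universal_newlines=True)
    print(result.stdout, end="")
except TimeoutExpired:
    print("This test timed out")

If the timeout fires, the test then fails with an ordinary output mismatch instead of aborting the whole run.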

A better but vastly more complex solution is to write your own custom question type that uses a combinator template grader to run all the tests in whatever way you fancy. Our own in-house question types have a template parameter abortonerror that determines whether or not to continue testing after an error. An overall watchdog timer goes off before the Jobe server's timeout, to prevent the tester from losing control if the total time budget is exceeded. With this approach you need to set a very long timeout - say 50 secs - for the total run, which again raises the spectre of a Jobe server overload.
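
To give a flavour of that approach, here's a rough sketch of the control flow only. It is not our in-house code: the test data, the 2 sec per-test limit and the 20 sec watchdog budget are all invented for illustration, and a real combinator template grader must print the full JSON result record described in the CodeRunner documentation (only the fraction and testresults fields are shown here).

import json, time
from subprocess import run, TimeoutExpired, PIPE

abort_on_error = False  # would come from a template parameter like abortonerror
total_budget = 20       # watchdog budget (secs); must be well under the Jobe timeout
tests = [("3 4\n", "7\n"), ("0 0\n", "0\n")]  # illustrative (stdin, expected) pairs

start = time.time()
results = [["iscorrect", "got"]]  # header row, then one row per test
num_right = 0
for stdin_data, expected in tests:
    if time.time() - start > total_budget:
        break  # watchdog fired: stop before Jobe kills the whole job
    try:
        outcome = run(["./prog"], input=stdin_data, stdout=PIPE,
                      timeout=2, universal_newlines=True)
        got = outcome.stdout
    except TimeoutExpired:
        got = "*** Timed out ***"
    is_correct = got == expected
    num_right += is_correct
    results.append([is_correct, got])
    if not is_correct and abort_on_error:
        break

print(json.dumps({"fraction": num_right / len(tests),
                  "testresults": results}))

Each test gets its own short timeout, so one infinite loop costs at most 2 sec rather than the whole run, and the watchdog caps the total cost no matter how many tests time out.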