Using CODE RUNNER to score running time

by Osama Radwan -
Number of replies: 2
While setting up my questions I came across some where I want to tell CODE RUNNER to accept an O(n) solution and reject an O(n^2) one. I am willing to provide a big test case where the slower solution will take significantly more time. But I want to ask if there is a way I can measure such time, or a way to tell Jobe or the program to run for only 3000 ms (for example) and then stop?

Can you point me in the right direction, please?

Thank you!
In reply to Osama Radwan

Re: Using CODE RUNNER to score running time

by Richard Lobb -

All CodeRunner questions run with a time limit. The time limit for a particular question is usually set by whichever of the following is defined first:

  1. The setting of the TimeLimit (secs) field for a specific question, if customised. This setting is in the Advanced customisation section. Or, if this field is left blank ...
  2. The setting of the TimeLimit (secs) field of the prototype for the selected question type. Or, if this field is left blank ...
  3. The Jobe server default cputime parameter value (see https://github.com/trampgeek/jobe#run_spec-parameters), currently 3 seconds.

A maximum time limit of 30 seconds is enforced by the Jobe server, although this can be raised by a system administrator.
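
To make that concrete, the cputime parameter travels inside the run_spec record that gets posted to Jobe. Here's a rough sketch of a direct submission to a Jobe server (endpoint and field names as described in the Jobe README linked above; the host name and source code are made up for illustration):

    import json
    import urllib.request

    # Hypothetical run_spec asking Jobe for a 5-second CPU time limit
    # instead of the 3-second default.
    run_spec = {
        "language_id": "python3",
        "sourcecode": "print(sum(range(10**7)))",
        "parameters": {"cputime": 5}
    }
    req = urllib.request.Request(
        "http://jobe.example.com/jobe/index.php/restapi/runs",
        data=json.dumps({"run_spec": run_spec}).encode(),
        headers={"Content-Type": "application/json; charset=utf-8"})
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))  # response includes an outcome code, stdout and stderr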

If you're trying to set a question that will reject solutions of the wrong big-O complexity, the easiest approach is usually the following.

Using a standard question type, customise the question and set its TimeLimit to some suitable value (if the default isn't suitable). Then provide a series of test cases, usually including a few small simple ones plus at least one test case large enough that it won't run within the preset time unless the solution is in the correct complexity class. CodeRunner will usually try to package all the test cases into a single job and run that. However, if a timeout occurs, CodeRunner backs off to running each test case as a separate Jobe submission, so students still get a result table showing *** TIMEOUT ERROR *** on the tests that timed out, with valid results for the smaller test cases.
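
As a hypothetical illustration, the "sufficiently large" test case might be generated with a little script along these lines (the input format and the value of n are just assumptions you'd tailor to your own question):

    import random

    # Generate one large test case: an O(n) solution should handle a
    # million values comfortably, while an O(n^2) one won't finish
    # within a few seconds.
    n = 1_000_000
    print(n)
    print(' '.join(str(random.randrange(1, 10**9)) for _ in range(n)))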

That's usually good enough, but if you want more control over the grading and feedback of such questions, you can instead use a custom grader. The CodeRunner documentation has a section on supporting or implementing new languages, which shows how to run the student code from within the template script. The key bit of code in that example is 

    try:
        output = subprocess.check_output(["./prog"], universal_newlines=True)
        print(output)
    except subprocess.CalledProcessError as e:
        ...  # handle a non-zero exit status from ./prog here

If you wish to catch a timeout error you can modify that example to something like

    try:
        output = subprocess.check_output(["./prog"], timeout=2, universal_newlines=True)
        print(output)
    except subprocess.CalledProcessError as e:
        ...
    except subprocess.TimeoutExpired:
        ... set up whatever feedback/grading option you want here ...

You must make sure that the subprocess timeout you set is at least 1 or 2 seconds less than the timeout set for the question itself, otherwise the Jobe timeout will kill the whole job and the exception handler will never run.

Another advantage of that approach is that, if the student job doesn't time out, you can measure how long it took by recording the time before and after the subprocess.check_output call, e.g. using Python's time.perf_counter. That time is measured in fractions of a second, whereas the timeouts are specified in whole seconds, so this gives you an alternative approach: rather than using a large test case that times out, you can use smaller ones, see how long they take, and grade accordingly.
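
A minimal sketch of that timing idea, building on the example above (the 0.5-second threshold is purely illustrative and would need tuning for your own server):

    import subprocess
    import time

    start = time.perf_counter()
    try:
        output = subprocess.check_output(["./prog"], timeout=2,
                                         universal_newlines=True)
        elapsed = time.perf_counter() - start
        # Grade on elapsed time, e.g. full marks if under 0.5 seconds.
        print(f"Ran in {elapsed:.3f} s")
    except subprocess.TimeoutExpired:
        print("*** TIMEOUT ***")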

For even more flexibility, you can switch from a per-test template grader, as used above, to a combinator template grader, which allows you to run all the tests within a single Jobe submission. That lets you do pretty much anything you want, but at a considerable increase in complexity.
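
For instance, a combinator template grader finishes by printing a JSON record from which CodeRunner builds the whole result table. A stripped-down sketch of that final step, with field names as documented in the CodeRunner template-grader section and entirely made-up test data, might look like:

    import json

    # Hypothetical results gathered by running each test with subprocess
    # and timing it as sketched above. The first row is the header row.
    results = [["iscorrect", "Test", "Time (s)"],
               [True, "small case", "0.02"],
               [False, "large case", "*** TIMEOUT ***"]]

    print(json.dumps({
        "fraction": 0.5,        # overall mark as a fraction of the maximum
        "testresults": results
    }))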