CodeRunner: Feature suggestion: add hints, and show model answer after X tries

Hi Richard,

Another request that I'm passing on from STEM colleagues. This is again for use purely for formative/informal teaching and learning purposes, usually when the CR questions themselves are ungraded.

Many other Moodle Question types have a hint system (e.g. show some text) after a student gets an attempt wrong (this assumes an Interactive with Multiple Tries [IMT] behaviour).

Obviously CodeRunner has its own behaviour, so if this is feasible, maybe it would be harder to build this into the existing behaviour than to allow CR to work with IMT behaviour?

In terms of what my STEM colleagues are asking, the process would be:

If first attempt ("Check" button pressed) incorrect, show Hint 1.
If second attempt incorrect, show Hint 2.
If third attempt incorrect, show Hint 3.
If fourth attempt incorrect, show model answer.

(But if you have a blank hint for the first attempt, you can delay a hint from showing until their second/third incorrect attempt).
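
The sequence above could be sketched in Python roughly as follows. This is only an illustration of the requested behaviour, not existing CodeRunner code: the function name feedback_after_failure, its parameters and the sample hint strings are all made up for the example.

```python
from typing import Optional, Sequence


def feedback_after_failure(failed_attempts: int,
                           hints: Sequence[str],
                           model_answer: str) -> Optional[str]:
    """Return the extra feedback to show after a failed "Check".

    failed_attempts counts incorrect checks so far, including the
    current one (1 = first failure). All names here are hypothetical.
    """
    if failed_attempts < 1:
        return None
    if failed_attempts > len(hints):
        # Every hint has been used up, so reveal the model answer.
        return model_answer
    hint = hints[failed_attempts - 1]
    # A blank hint means "show nothing yet" for this attempt.
    return hint if hint.strip() else None


# Hypothetical hints: the first is blank, so hinting starts at attempt 2.
hints = ["", "Check your loop bounds.", "Remember that range() excludes its end value."]
for attempt in range(1, 5):
    print(attempt, feedback_after_failure(attempt, hints, "(model answer here)"))
```

With three hints configured, the fourth failed attempt falls through to the model answer, which matches the "show answer after X turns" request with X equal to the number of hints plus one.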

So two feature requests, really:

Option for hints.
Ability to show answer after X turns.

If this is agreeable, I appreciate it could be a fairly big task. We could probably assist with the development if it'd help.

Best wishes,

Chris.

Re: Feature suggestion: add hints, and show model answer after X tries

by Richard Lobb - Tuesday, 27 October 2020, 12:09

As you say, CodeRunner ignores the quiz's default behaviour and uses its own adaptive_adapted_for_coderunner behaviour. It would be possible to have other behaviours, such as interactive_adapted_for_coderunner but that would soon turn into a maintenance nightmare.

Adding hints to the existing authoring interface would be much better but I really do worry about the complexity of the existing authoring UI and am very reluctant to add more stuff to it.

I use combinator-grader templates for nearly all my questions now, which gives me almost unlimited control over feedback. It should be possible to provide the number of the current attempt as a template parameter and, with an additional template parameter hintlist (a list of strings), the question type itself could deliver whatever form of feedback was wanted. With that approach, each user can develop their own question types without cluttering the authoring UI for everyone. I grant that template parameters aren't quite as convenient as html form entry fields in simple cases but they're vastly more flexible. Here, for example, are the template parameters we use for our standard first-year python3 question type:

abortonerror: true to abort testing when a runtime error occurs. Default: true
allowglobals: set this to true to allow global variables (i.e. to allow lowercase globals, not just "constants"). Default: false.
allownestedfunctions: set this to true to allow functions to be declared with a non-global scope. Default: false.
checktemplateparams: set this false to bypass the usual check for validity of template params (e.g. when doing randomisation, although prefixing the extra template params with '_' is preferred).
echostandardinput: if false, the standard builtin Python input function will be used. Otherwise, it will be replaced with a version that echoes the prompt to standard output to mimic the behaviour observed when standard input comes from the keyboard. Default: true
extra: should be a string, one of "", "pretest" or "posttest". If set and not empty, the TEST.extra field is inserted into the program before or after TEST.testcode for the values "pretest" and "posttest" respectively. Default: ""
floattolerance: a floating point number which, if defined and non-zero, changes the test for correct output as follows. The expected and got outputs are both right-stripped then broken into lines. If the number of lines doesn't match, the answer is deemed wrong. Then (after compressing white space to a single character if strictwhitespace is false) the got and expected outputs are compared line for line. Each line is split by a regular expression pattern that matches any floating point number and the two lines are compared token by token. If both tokens are floating point numbers they must be equal to within a given absolute or relative error of floattolerance. Otherwise they must match exactly. The absolute error is the absolute difference between the two numbers, the relative error is the ratio of the absolute difference to the expected value. Default: 0.0.
globalextra: should be a string, one of "", "pretest" or "posttest". If set and not empty, the QUESTION.globalextra field is inserted into the program before or after TEST.testcode for the values "pretest" and "posttest" respectively. If TEST.extra and QUESTION.globalextra are both being inserted before the test or both are being inserted after the test, the globalextra precedes the TEST.extra. Default: ""

imagewidth: if this is given it sets the width in pixels of any matplotlib images inserted into the result table. Height is automatically scaled to match. Otherwise the image is inserted unscaled. Ignored unless usesmatplotlib is true. Default: None.
imports: this is a list of python import strings. Each string is either just a python module name or a full python import string. If just name is given, the import statement is simply "import name", otherwise the import string is used as given. For example:
```
{ "imports": ["math", "from blah import thing as twaddle"] }
```
Imports go at the very start of the generated program. This mechanism can be used to import test support functions, too, and is preferred over the use of a _prefix.py file.
isfunction: unless this is explicitly set to false, or the student's code already begins with a docstring, a dummy module docstring will be inserted at the start of the program. Also, if isfunction is true, the supplied code will be run stand-alone to check if it generates any output and an error message will be generated if it does. Thus, if your question is of the "write a program" variety, you should set this to false. Otherwise omit it. Default: true.
maxfunctionlength: this is the maximum number of statements that a function body can contain. Statements within statements are counted. Blank lines and comments aren't statements. This is a more-reliable alternative to the pylint max-statements parameter, which behaves strangely at times.
maxnumconstants: the maximum number of constants (i.e. uppercase globals) allowed. An integer, defaulting to 4.
maxoutputbytes: the maximum allowed number of output bytes. Default 10000.
maxstringlength: the maximum allowed length of the output string or error string in the result table. Strings longer than this have their inner content snipped out. An integer defaulting to 2000.
norun: if set to true, the normal execution of the student's code will not take place. Any test code provided will however still be run.
nostylechecks: true to suppress all normal style checking, including the checkers listed in "precheckers". Default: false
notest: if present and set to true, the test code will not be inserted into the code to be executed. Its role is then just as documentation for the student (as it still appears in the result table).
precheckers: a list of the names of programs to be run when prechecking the correctness of the code. Currently only "pylint" and "mypy" are supported. Default: ["pylint"].
prelude: a possibly multi-line string that is inserted into the file after any imports and other template-generated code but before the student answer (and before the _prefix.py file, if supplied).
proscribedbuiltins: this is a list of the Python built-in functions that cannot be used. Default: ["exec", "eval"].
proscribedconstructs: this is a list of Python constructs (if, while, def, slice, listcomprehension, etc) that must not appear in the student's program.
proscribedfunctions: this is a list of functions (sum, product, etc) that must not appear in the student's program. Default: []
proscribedsubstrings: this is a list of strings that must not appear anywhere in the student's program (even in comments). Default: []
pylintoptions. A list of strings to be added to the default list of options to pylint (relevant only if "pylint" is specified as one of the precheckers). For example, the Template parameters string in the question authoring form might be set to
```
{"isfunction": false, "pylintoptions":["--max-statements=20","--max-args=3"]}
```
to suppress the insertion of a dummy module docstring at the start and to set the maximum number of statements and arguments for each function to 20 and 3 respectively. Default options:
- "--disable=C0303,C0325,C0330,R0903,R0915,star-args,unbalanced-tuple-unpacking,consider-using-enumerate,simplifiable-if-statement,consider-iterating-dictionary,trailing-newlines"
- "--enable=C0326"
- "--good-names=i,j,k,n,s,c,_"
requiredconstructs: this is a list of Python constructs (if, while, def, etc) that must appear in the student's program. Default: []
requiredfunctiondefinitions: this is a list of the names of the functions that must be defined within the student's program. Default: []
requiredfunctioncalls: this is a list of the names of functions that must be explicitly called within the student's code
requiretypehints: if True all functions must have type hints for all parameters and the return type. Default: False
restrictedfiles: this specifies which files the student's program is allowed to open. It is a dictionary with two optional keys 'onlyallow' and 'disallow'. These map, respectively, to a list of filenames that are allowed to be opened and a list of filenames that are not allowed to be opened. The filenames in the lists can be regular expressions.
Default:
```
{"disallow": ["__.*", "prog.*", "pytester.py"]}
```
restrictedmodules: A dictionary that specifies what modules are to be restricted. Keys are the names of modules and the values are a dictionary with two keys 'onlyallow' and 'disallow'. Each of these is a list of the names of objects within the module which are allowed or disallowed. The names of objects in these lists can be a regex. This is a runtime check only, not part of style checker.
Default:
```
{
  "restrictedmodules": {
    "builtins": {"onlyallow": []},
    "imp": {"onlyallow": []},
    "importlib": {"onlyallow": []},
    "os": {"disallow": ["system", "_exit", "_.*"]},
    "subprocess": {"onlyallow": []},
    "sys": {"onlyallow": []}
  }
}
```
runextra: if set (to any value) the Extra Template Data is added to the program as test code before the usual testcode. [Deprecated: use the extra parameter instead.]
strictwhitespace: by default when checking correctness trailing blank lines and trailing white space on each line are ignored but otherwise white space must match exactly. If this parameter is set to false, white space within a line may vary, i.e., multiple spaces are treated as 1 space. Default: true
stripmain: if set to true, the program is expected to contain a global invocation of the main function, which is a line starting "main()". That line is deleted from the program. If the line is not present a "Missing call to main" exception is raised.
stripmainifpresent: if set to true and the program contains a global invocation of a main function, which is a line starting "main()", that line is deleted from the program. Otherwise nothing happens (cf stripmain).
suppresspassiveoutput: if set to true, any output generated by the student code even without any CodeRunner tests being run is ignored. This can be used, for example, to ignore output from any test code the student has included and/or to ignore the main output from a "write a program question". Only the output generated by CodeRunner tests will be displayed and marked. Default: false.
timeout: number of seconds allowed for each test case. Default: 5 secs. Be careful to ensure that the total time for all test cases can not exceed totaltimeout, particularly if abortonerror is false.
totaltimeout: total number of seconds allowed for the whole run. Cannot exceed the maximum allowed by Jobe, which is 50 seconds (and which is the default value for this parameter).
useanswerfortests: if true, a run with the sample answer precedes the run with the student answer and the results from the sample answer are used for correctness testing but only if no expected output is supplied. However, because this takes place at runtime, this doesn't work for "Use as example" tests, for which the expected output must be supplied by the question author.
usesmatplotlib: if true, header text is inserted at the start of the program to switch matplotlib graph output to use the non-interactive 'Agg' backend, which writes images to disk as .PNG files. After each test, the current state of the pyplot figure is saved to a new file. When all tests have been run and graded, the set of image files is inserted row-by-row into the result table with each figure below any text in the cell. For this to work correctly at least the first test must create a figure. The image is not graded - it is provided only for reference, so usually the test code will need to extract and display attributes of the current figure independently. See also the template parameters useanswerfortests, which results in the expected images being inserted into the table too, and imagewidth, which sets the width (and hence height by uniform scaling) to a desired number of pixels. Note that if usesmatplotlib is selected, most of the pylint options relating to imports (ordering, reimporting, positioning etc) are disabled. Also, you may need to increase the timeout value for the question.

Also, if usesmatplotlib is true, a function print_plot_info(data_type, x_samples=None) is made available for use in the test code or post-test extra. This prints various properties of the current plot for grading purposes. data_type should be one of 'points', 'lines' or 'bars'. In the case of 'lines', the 'x_samples' can also be supplied, as a set of abscissae at which the plotted line should be read out. The y values are read out using a second-order interpolation, and are printed to 2 decimal digits of accuracy. If x_samples is not supplied, the first 5 and last 5 points only are displayed.

Default: false.
usesnumpy: if true, the line import numpy as np is inserted at the start of the program, and the usual check for unused imports is turned off. Additionally, since numpy is used in a mathematical context where it is hard to define what variable names might be legitimate, the checking for valid names by pylint is disabled.
warnifpassiveoutput: if set to true and isfunction is also true, generate a style error if the student's code seems to produce output even without any CodeRunner tests being executed. This is probably the result of the student pasting test code as well as requested function(s) into their answer. Default: true
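
As an illustration of how a parameter like floattolerance in the list above can work, here is a rough Python sketch of the comparison it describes. It is not the question type's actual code: the regular expression and the function names are made up for the example, the strictwhitespace handling is left out, and it assumes a pair of numbers passes if either the absolute or the relative error is within the tolerance.

```python
import re

# Hypothetical pattern for a number such as 3, -2.5, .5 or 1e-3.
FLOAT_PATTERN = re.compile(r'([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)')


def lines_match(expected: str, got: str, tolerance: float) -> bool:
    """Compare two lines token by token, numbers within tolerance."""
    # With a capturing group, re.split keeps the numbers as tokens.
    exp_tokens = FLOAT_PATTERN.split(expected)
    got_tokens = FLOAT_PATTERN.split(got)
    if len(exp_tokens) != len(got_tokens):
        return False
    for exp_tok, got_tok in zip(exp_tokens, got_tokens):
        if FLOAT_PATTERN.fullmatch(exp_tok) and FLOAT_PATTERN.fullmatch(got_tok):
            exp_val, got_val = float(exp_tok), float(got_tok)
            abs_err = abs(exp_val - got_val)
            # Relative error is measured against the expected value.
            rel_err = abs_err / abs(exp_val) if exp_val != 0 else float('inf')
            if abs_err > tolerance and rel_err > tolerance:
                return False
        elif exp_tok != got_tok:
            # Non-numeric text must match exactly.
            return False
    return True


def outputs_match(expected: str, got: str, tolerance: float) -> bool:
    """Right-strip both outputs, then compare them line for line."""
    exp_lines = expected.rstrip().split('\n')
    got_lines = got.rstrip().split('\n')
    if len(exp_lines) != len(got_lines):
        return False
    return all(lines_match(e, g, tolerance)
               for e, g in zip(exp_lines, got_lines))


print(outputs_match("Area: 3.14159 cm\n", "Area: 3.1416 cm", 0.001))
```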

Let me look into the possibility of providing the number of the attempt as a template parameter to enable the question author to control hinting.

Re: Feature suggestion: add hints, and show model answer after X tries

by Chris Nelson - Tuesday, 27 October 2020, 23:08

Thanks, Richard. Attempt number as a param would definitely help :-)

"I really do worry about the complexity of the existing authoring UI"

As do we for all our question types... question authors want more and more control, but less and less option complexity!

Re: Feature suggestion: add hints, and show model answer after X tries

by Fatih Gelgi - Sunday, 20 June 2021, 06:17

As far as I understand, when you submit code (click on the "Check" button) the result shows up. I didn't see a way to pass the current state, such as the current attempt number, and I didn't see it in the Twig parameters. How do you determine the current attempt number?

Thanks,

Re: Feature suggestion: add hints, and show model answer after X tries

by Richard Lobb - Monday, 21 June 2021, 21:01

The current state is part of the Twig QUESTION variable, specifically QUESTION.stepinfo. See here for the documentation.

To make effective use of this information you really need to be using a combinator template grader. Then, if the student's current submission fails and the student has already made a certain number of tries you can add hints or the sample answer after the result table. If you're not using a combinator template grader then you can only provide hints and feedback within the result table itself and without knowing if the current submission has succeeded or failed. That's rather clumsy.
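
A minimal Python sketch of that idea follows, showing only the step that builds the grader's JSON output. The function grader_output and its parameters are hypothetical, the count of earlier checks is assumed to have been read from QUESTION.stepinfo (the exact field name is not given in this thread), and the output keys "fraction" and "epiloguehtml" should be checked against the CodeRunner documentation before being relied on.

```python
import html
import json
from typing import Sequence


def grader_output(all_passed: bool, prior_checks: int,
                  hints: Sequence[str], sample_answer: str) -> str:
    """Build a JSON result, adding a hint or the answer on failure.

    prior_checks is the (assumed) number of checks made before this
    submission, so hints[prior_checks] belongs to the current failure.
    """
    result = {"fraction": 1.0 if all_passed else 0.0}
    if not all_passed:
        if prior_checks < len(hints):
            hint = hints[prior_checks]
            if hint:  # A blank hint shows nothing for this attempt.
                result["epiloguehtml"] = "<p>Hint: " + html.escape(hint) + "</p>"
        else:
            # Hints exhausted: show the sample answer after the result table.
            result["epiloguehtml"] = ("<p>Sample answer:</p><pre>"
                                      + html.escape(sample_answer) + "</pre>")
    return json.dumps(result)


print(grader_output(False, 0, ["Check the base case."], "def f(n): ..."))
```

A real template would also run the tests and include the result table in the output; that part is omitted here.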

See this forum discussion for more on this subject, including a very quick prototype of a question that serves up a hint (which may even be the question author's answer) after a number of failed submissions.

Re: Feature suggestion: add hints, and show model answer after X tries

by Fatih Gelgi - Tuesday, 22 June 2021, 06:06

Yes, you're right. I checked the documentation last year but somehow didn't see it.

That's why, instead of using the number of checks, we had a hint system that is parameterised on the score received.

Thanks for the help!