Question Authors' Forum

Programming statistics in R

 
by Chris Sangwin - Wednesday, 23 November 2016, 1:13 AM
 

Is there any support for programming statistics in R?  I've had some interest from statisticians here about this possibility.

If not then how much work would this be to add?

Thanks,

Chris


Re: Programming statistics in R
by Jenny Harlow - Wednesday, 23 November 2016, 6:00 AM
 

I asked about this and we did a little investigation about a year ago, but then I ran out of free time and got distracted ...  We started by looking at a basic R install to see what configuration options would be useful.  I do remember noting down the --silent option.  There did seem to be potential for creating some resources that our stats students, who need to use R but are not programmers, could use to at least start finding their feet with it.

Jenny



Re: Programming statistics in R
by Richard Lobb - Wednesday, 23 November 2016, 11:47 AM
 

Just to give you a bit of background to Jenny's answer ...

When Jenny raised the question of R a year or two ago, my first response was that it would be trivial to ask R questions of the simple "write-an-R-function" or "write-an-R-program" variety. I installed R-base on our Jobe server (5 mins work) and we used a Python3 question (with R as the Ace language) to run R. That was easy enough. Here's a possible template:

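import subprocess

# Twig fills in these placeholders before the template runs;
# e('py') escapes the text so it is safe inside a Python string literal.
r_prog = """{{ STUDENT_ANSWER | e('py') }}"""
r_prog += "\n" + """{{ TEST.testcode | e('py') }}"""
with open('prog.r', 'w') as fout:
    fout.write(r_prog)

# Feed the combined program to R on stdin. --slave keeps R quiet and
# --vanilla ignores saved workspaces and startup profiles.
cmd = "R --slave --vanilla"
with open('prog.r') as fin:
    subprocess.call(cmd.split(), stdin=fin, universal_newlines=True)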

With that question type you can ask questions like the following:

[Screenshot of a simple R question]
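For anyone who wants to try the idea outside CodeRunner, here is a minimal sketch of what the template does once the Twig placeholders have been filled in. The student answer and test code are made-up examples, and build_r_program and run_r are hypothetical helper names, not part of CodeRunner or Jobe.

```python
import shutil
import subprocess


def build_r_program(student_answer, test_code):
    """Join the student's R code and the test code, as the template does."""
    return student_answer + "\n" + test_code


def run_r(program):
    """Pipe the program to R on stdin; return its stdout, or None without R."""
    if shutil.which("R") is None:
        return None  # R is not installed on this machine
    result = subprocess.run(
        ["R", "--slave", "--vanilla"],
        input=program,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Made-up example of a "write-an-R-function" question.
student_answer = "my_mean <- function(x) sum(x) / length(x)"
test_code = "print(my_mean(c(1, 2, 3, 4, 5)))"

program = build_r_program(student_answer, test_code)
output = run_r(program)
print(output if output is not None else "R not found; program was:\n" + program)
```

With R installed, the test line should print "[1] 3", which is the sort of output CodeRunner would compare against the expected answer.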

However, as I recall, that wasn't the sort of question that Jenny thought the stats lecturers would want to ask their students, who aren't really programmers. So then you get into the much harder issue of "What sort of question do you really want to ask?".

Richard

Re: Programming statistics in R
by Chris Sangwin - Wednesday, 23 November 2016, 10:10 PM
 

Thank you both for such swift and helpful replies.

I'm very reassured that CodeRunner can, at least at a technical level, accept R code.  I'm not sure exactly what my colleagues have in mind. We will be running on a server which also has my own STACK question type installed (https://github.com/maths/moodle-qtype_stack) and I think a combination of the normal Moodle questions, STACK and CodeRunner will be an interesting combination of tasks for students which combine mathematical and programming elements.

The $10^6 question is always "What sort of question do you really want to ask?".  I'll talk with colleagues about that one...

Chris


Re: Programming statistics in R
by Chris Sangwin - Thursday, 16 November 2017, 4:53 AM
 

I'm in the process of setting up CodeRunner to call "R" so that we can develop some introduction to R/stats courses here in Edinburgh.  I'm getting a strange setup problem, which I suspect is nothing to do with CodeRunner.

I've set up the latest version of Jobe and CodeRunner ($plugin->version = 2017082200;), etc.

CodeRunner works just fine with Python, Java, C, and also with Octave.

I've created the attached very simple Python code which calls R, and this executes with the following result.

python3 r.py 
[1] 3
[1] 1.581139

So, I think I have R on my Jobe server, and Python can call R in the way Jobe would expect to (permissions notwithstanding....).

I'm using the sample R question referred to above, but, I get the error shown in the screen shot.

***Error***
Fatal error: couldn't allocate node stack

Having done a little bit of digging, I think this error is related to node.js.  I'm not sure.  Does anyone know what is causing this error and how I can fix it please?  

Thanks,

Chris  




Re: Programming statistics in R
by Richard Lobb - Thursday, 16 November 2017, 7:04 AM
 

This is probably just an out-of-memory error. I'd suggest using Customise > Advanced Customisation and setting MemLimit (MB) for this question to something like 500. The default is 200 MB, which is probably not sufficient for Python + R together. Or you could try setting it to 0 to turn off memory limit checking altogether.

Richard

Re: Programming statistics in R
by Chris Sangwin - Saturday, 18 November 2017, 1:59 AM
 

Thanks Richard,

I've tried this.  Even with 0, to remove limit checking, this error persists.

Chris

Re: Programming statistics in R
by Richard Lobb - Saturday, 18 November 2017, 11:11 AM
 

I've downloaded R and found the actual error you're getting. It's in r-source/src/main/memory.c:

    R_BCNodeStackBase =
        (R_bcstack_t *) malloc(R_BCNODESTACKSIZE * sizeof(R_bcstack_t));
    if (R_BCNodeStackBase == NULL)
        R_Suicide("couldn't allocate node stack");

So certainly it's a memory error - a failed call to malloc.
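To see how a memory cap turns into a failed allocation, here is a rough sketch in Python on Linux. It assumes the sandbox enforces its limit as an address-space cap in the style of setrlimit(RLIMIT_AS); that is an assumption made for illustration, not a description of Jobe's internals. run_with_mem_limit and allocation_code are hypothetical helper names.

```python
import resource
import subprocess
import sys


def allocation_code(megabytes):
    """Python source that allocates one buffer of the given size."""
    return "x = bytearray(%d * 1024 * 1024)" % megabytes


def run_with_mem_limit(code, limit_mb):
    """Run code in a child Python process; limit_mb == 0 means no cap."""

    def set_limit():
        # Assumption: the cap is an address-space limit (RLIMIT_AS).
        wanted = limit_mb * 1024 * 1024
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        if hard != resource.RLIM_INFINITY:
            wanted = min(wanted, hard)
        resource.setrlimit(resource.RLIMIT_AS, (wanted, hard))

    return subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=set_limit if limit_mb else None,
        capture_output=True,
        text=True,
    )


# A 512 MB allocation under a 200 MB cap: the underlying allocation
# request is refused and Python reports it as MemoryError.
limited = run_with_mem_limit(allocation_code(512), 200)
print("exit status under 200 MB cap:", limited.returncode)
print("MemoryError reported:", "MemoryError" in limited.stderr)
```

R's startup allocation of its node stack would hit the same kind of refusal, except that R reports it as the fatal error quoted above rather than as an exception.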

I tried making a simple R test question using exactly your prototype and of course it worked fine for me :)  Here's the proof:

[Screenshot: proof it works]

I attach the exported Moodle XML question; please try importing and running that first off. If that works on your system too, then the problem is in the setting of the memory limit. But I suspect it will fail on your system too. In which case: tell me a bit about your Jobe server. What version of Linux is it running on? Were any non-standard actions taken during the install? 

Richard


Re: Programming statistics in R
by Chris Sangwin - Saturday, 25 November 2017, 6:45 AM
 
Thank you Richard,

This is now working.  I think the explicit memory limit 0 has fixed this.

Your help is much appreciated.  We can take this from here.

Chris


Re: Programming statistics in R
by Richard Lobb - Saturday, 25 November 2017, 8:59 AM
 

Good to know the problem is fixed. However, I'd advise against using 0 for the memory limit in a production question, because a memory-gobbling submission might then be able to cripple Jobe. It's safer to find a value at which a typical submission will run OK and then, say, double it.
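One rough way to pick a starting value is to measure a typical run and double it, as in the sketch below. peak_child_rss_mb and suggested_limit_mb are hypothetical helpers, and rounding up to a multiple of 50 MB is an arbitrary choice. Note that this measures resident memory; if the limit is enforced on address space, the working value may need to be higher, so treat the result as a first guess to test, not a guarantee.

```python
import math
import resource
import subprocess
import sys


def peak_child_rss_mb(argv):
    """Run argv and return the peak resident size, in MB, of child processes.

    ru_maxrss for RUSAGE_CHILDREN is the largest value over all children
    waited for so far, so call this from a fresh process for a clean number.
    """
    subprocess.run(argv, check=True)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # Linux reports kilobytes; macOS reports bytes.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def suggested_limit_mb(peak_mb, factor=2.0, round_to=50):
    """Scale the measured peak by factor and round up to a multiple of round_to."""
    return int(math.ceil(peak_mb * factor / round_to) * round_to)


# Stand-in for a typical submission: a child Python process that
# allocates about 50 MB. Replace argv with the real command to measure.
typical_run = [sys.executable, "-c", "x = bytearray(50 * 1024 * 1024)"]
peak = peak_child_rss_mb(typical_run)
print("suggested MemLimit (MB):", suggested_limit_mb(peak))
```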

I'm also curious as to why your Jobe server seems more prone to memory limit problems than ours. Are you perhaps running it on a 32-bit OS? 

Richard