I just want to post about my first real use of CodeRunner. Our Intro to Object-Oriented Programming course went from two semesters to a condensed one-semester version and then to a rather intensive half-semester course. This pretty much precluded taking a week to mark each assignment (for 80 students) - hence CodeRunner (CR) for rapid-return results. But something curious happened. In the past, students would get their marked and commented assignments back and that was that, as they knew there was no way to improve their mark by whining. With CR, however, a good mark is always possible, especially if one of their colleagues has already aced the assignment. This resulted in the marks being distributed as an upside-down Gaussian, with an awful lot of cloned answers. Student cooperation evolved to the point where someone would sacrifice themselves on a question in order to get an acceptable answer, which then propagated to the others. They were certainly learning something, just not a lot of OOP. So finally I had to tell them that the exercises would still be marked, to give them feedback on how they were doing, but would not be included in their course grade. This seemed to cut down significantly on the panic to get a good mark by borrowing code.
Initially, I tried having hidden test cases, but this led to a great deal of frustration as the exercises got more difficult, and to more shared code whenever someone stumbled upon a test-passing answer. I've finally settled on something like this: adaptive mode; all-or-nothing grading with a 0, 10, 20, ... penalty scheme; the first test case shown as an example; all test cases visible; hide-rest-if-fail checked for all tests. I have a sample answer for each question, but that's mostly for testing the test cases; I don't share it with students.
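To make the arithmetic of that grading setup concrete, here is a minimal sketch of how all-or-nothing grading combines with a 0, 10, 20, ... penalty scheme. `PenaltySketch` and its methods are hypothetical names, not part of CodeRunner, and the sketch assumes the n-th entry of the list is the percentage deducted from the n-th submission, capped at 100; the way CodeRunner actually counts submissions may differ.

```java
import java.util.Locale;

// Hypothetical helper, NOT CodeRunner code: models an all-or-nothing question
// graded under a 0, 10, 20, ... penalty scheme.
final class PenaltySketch {

    // Penalty in percent for the n-th submission (n starts at 1).
    // Assumption: the list grows by `step` each time and is capped at 100.
    static int penaltyPercent(int submissionNumber, int step) {
        if (submissionNumber < 1) {
            throw new IllegalArgumentException("submissionNumber must be >= 1");
        }
        return Math.min(100, (submissionNumber - 1) * step);
    }

    // All-or-nothing: zero unless every test passes; otherwise the full mark
    // minus the penalty for this submission.
    static double mark(boolean allTestsPassed, int submissionNumber, double maxMark) {
        if (!allTestsPassed) {
            return 0.0;
        }
        return maxMark * (100 - penaltyPercent(submissionNumber, 10)) / 100.0;
    }

    public static void main(String[] args) {
        // Print the mark a student would get for a correct answer on each try.
        for (int n = 1; n <= 4; n++) {
            System.out.printf(Locale.ROOT, "correct on submission %d: %.1f out of 10%n",
                    n, mark(true, n, 10.0));
        }
    }
}
```

Under these assumptions a student who waits for a classmate's working answer and submits it first time loses nothing, which is consistent with the copying pressure described above.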
Once the exercises get somewhat complex, just evaluating whether the code gives the right answer is not really sufficient - I've gotten truly horrible code which passes all test cases. I'm currently working on getting a prototype template to run FindBugs for some proper static code analysis (maybe next week, DV & WP :^) ). I've already posted about some stuff I've cobbled together for testing things like replacing a loop with a forEach (Java 8), but doing this case-by-case is really tedious. There's always a hole somewhere that pops up as soon as students get loose on it, much to their amusement of course (but it does serve to illustrate to them the notion of incompletely-tested code).
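As an illustration of what a case-by-case check like the loop-versus-forEach one can look like, here is a rough sketch. It is not the code referred to above: `LoopCheckSketch` and its methods are hypothetical, and it works on the submitted source as plain text with regular expressions rather than a real parser, so it is easy to fool.

```java
import java.util.regex.Pattern;

// Hypothetical style check, NOT the author's template: accepts a submission
// only if it calls forEach and contains no for/while/do keyword.
final class LoopCheckSketch {

    // Block comments, line comments, string literals and char literals.
    private static final Pattern NOISE = Pattern.compile(
            "/\\*.*?\\*/|//[^\\n]*|\"(?:\\\\.|[^\"\\\\])*\"|'(?:\\\\.|[^'\\\\])*'",
            Pattern.DOTALL);
    private static final Pattern LOOP = Pattern.compile("\\b(for|while|do)\\b");
    private static final Pattern FOR_EACH = Pattern.compile("\\.\\s*forEach\\s*\\(");

    // Blank out comments and literals so keywords inside them are ignored.
    static String strip(String source) {
        return NOISE.matcher(source).replaceAll(" ");
    }

    static boolean usesExplicitLoop(String source) {
        return LOOP.matcher(strip(source)).find();
    }

    static boolean callsForEach(String source) {
        return FOR_EACH.matcher(strip(source)).find();
    }

    static boolean acceptable(String source) {
        return !usesExplicitLoop(source) && callsForEach(source);
    }

    public static void main(String[] args) {
        String good = "names.forEach(n -> System.out.println(n));";
        String bad = "for (String n : names) { System.out.println(n); }";
        // The hole: a dummy forEach plus a recursive helper still passes.
        String sneaky = "java.util.Collections.emptyList().forEach(x -> {}); printFrom(names, 0);";
        System.out.println(acceptable(good));
        System.out.println(acceptable(bad));
        System.out.println(acceptable(sneaky));
    }
}
```

The third case in `main` is the kind of hole students find quickly: the text-level check is satisfied even though forEach does none of the real work, which is one argument for moving to a proper static analysis tool.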
Ideas? Suggestions? Awesome-ways-of-doing-things?