Hi Angus
I script all my Matlab questions in Python. The script runs to many hundreds of lines and checks lots of style things like maximum function length, code layout and use of comments. Template parameters allow me to configure individual questions to require or ban certain constructs or the use of certain functions. This is all so different from your approach that I'm not sure I have much to offer. I don't generate pools of questions in the way you are doing. I've never used Octave's unit testing framework, so I can't help with that either.
However, a colleague in Maths, Jenny Harlow, does one thing which might be of interest to you. Since she is teaching numerical maths, it's primarily numbers she's interested in. Her question types are also scripted in Python. When grading a student's submission she first runs her sample answer and extracts the integers and floating point numbers from the output. Then she runs the student answer, extracts the numeric output from that, and compares the two sets of numbers to within a specified tolerance. The sample answer can be expanded directly into the template as {{ QUESTION.answer }} or, if you're scripting the process, can be stored in a string with a line like this (for Python):
sample_answer = """{{ QUESTION.answer | e('py') }}"""
It sounds to me like you could do something similar: compare the output from your sample answer with the output from the student's answer.
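To give a rough idea of what that numeric comparison might look like, here is a minimal sketch in Python. It is not Jenny's actual code; the names extract_numbers and outputs_match, the regular expression and the default tolerances are all made up for illustration. It also compares the numbers in the order they appear, which is only one of several reasonable interpretations.

```python
import math
import re

# Matches integers, decimals and scientific notation, e.g. 42, -3.14, 1.5e-3.
# Hypothetical pattern: it will also pick up digits inside names such as "x1".
NUMBER_RE = re.compile(r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?')


def extract_numbers(text):
    """Return every number found in text, in order of appearance, as floats."""
    return [float(token) for token in NUMBER_RE.findall(text)]


def outputs_match(expected_output, student_output, rel_tol=1e-6, abs_tol=1e-9):
    """True if both outputs contain the same count of numbers and each
    corresponding pair agrees to within the given tolerances."""
    expected = extract_numbers(expected_output)
    got = extract_numbers(student_output)
    if len(expected) != len(got):
        return False
    return all(math.isclose(e, g, rel_tol=rel_tol, abs_tol=abs_tol)
               for e, g in zip(expected, got))
```

All non-numeric text is ignored, so differences in labels or layout between the sample answer's output and the student's output don't affect the result. The catch is that any digits in variable names or messages are treated as numbers too, so the output format needs a little care.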
I don't have any particularly nice way of comparing structs either. I usually compare them field-by-field. I've also used the fieldnames function to ensure the right names are present. That function can also be used to implement a custom struct printer. The function orderfields might also be useful to produce a standardised output for structs.
I'm not sure if that helps. But I suspect that the sorts of things you're trying to do would be a lot easier if you scripted the process in another language, running the student code and, probably, the sample answer, in a subprocess. All my colleagues and I now use Python for scripting, but for some years I used Octave for the job, using the system function to run subprocesses. But, really, scripting string manipulation and OS calls in Octave is an exercise in masochism.
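For what it's worth, the subprocess part is only a few lines in Python. The following is a sketch rather than what any of our question types actually contain; run_program and its parameters are hypothetical names, and the demo at the end runs a Python one-liner purely so the example works anywhere.

```python
import subprocess
import sys


def run_program(command, stdin_text="", timeout_secs=10):
    """Run command (a list of arguments) in a subprocess.

    Returns a tuple (stdout, stderr, exit_code). If the program runs for
    longer than timeout_secs it is killed and the exit code is -1.
    """
    try:
        result = subprocess.run(command, input=stdin_text,
                                capture_output=True, text=True,
                                timeout=timeout_secs)
    except subprocess.TimeoutExpired:
        return "", "Timed out", -1
    return result.stdout, result.stderr, result.returncode


# Demo: run a trivial Python one-liner and show what it printed.
stdout, stderr, exit_code = run_program([sys.executable, "-c", "print(6 * 7)"])
print(stdout.strip())  # prints 42
```

To run a student's Matlab/Octave answer you would pass an Octave command line instead, for example something like ["octave", "--no-gui", "--quiet", "student_answer.m"], assuming an octave executable is on the path and the answer has been written to that (hypothetical) file. The same function can then be used to run the sample answer, and the two outputs compared.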
Richard