I think you've started by copying the simple example of a template grader from the documentation, but the assumption there was that the student answer was just a simple bit of text to be graded, rather than a program to be run.
The Twig STUDENT_ANSWER variable is whatever the student enters into their answer box. So in your case it's a C++ program, which you first need to compile and run in order to get the output it generates.
Here's a simplistic per-test template grader that assumes the output from the student program is just a single floating point number to be compared to the expected value within a certain tolerance. I attach a question that uses this template - one that asks students to write their own Newton root finder.
The Twig STUDENT_ANSWER variable is whatever the student enters into their answer box. So in your case it's a C++ program, which you first need to compile and run in order to get the output it generates.
Here's a simplistic per-test template grader that assumes the output from the student program is just a single floating point number to be compared to the expected value within a certain tolerance. I attach a question that uses this template - one that asks students to write their own Newton root finder.
import subprocess, json
TOLERANCE = 0.0001
__student_answer__ = """{{ STUDENT_ANSWER | e('py') }}"""
test = json.loads("""{{TEST | json_encode }}""")
with open("prog.cpp", "w") as outfile:
outfile.write(__student_answer__)
fraction = 0 # Pessimism
abort = False
compile_result = subprocess.run(
['g++', 'prog.cpp', '-o', 'prog'],
capture_output=True,
text=True,
)
if compile_result.returncode != 0:
got = f"*** Compile error ***\n{compile_result.stderr}"
abort=True
else:
# Compile OK. Try running it.
result = subprocess.run(['./prog'], capture_output=True, text=True, check=True)
got = result.stdout
try:
got_float = float(got)
expected_float = float(test['expected'])
if abs(got_float - expected_float) <= TOLERANCE:
fraction = 1
except ValueError:
got = f"Output was '{got.rstrip()}'\nCannot be converted to a float.\n"
got += "Further testing aborted."
abort = True
except Exception as e:
got = f"Unexpected error: {e}"
result = {"fraction": fraction, "got": got, "abort": abort}
print(json.dumps(result))
As the text of the attached question explains, this is inefficient as it does a separate compile-and-run for each test. You really should use a combinator template grader instead, iterating through all the test cases, but that does add extra complexity.