Stray files appearing in Support Files

by Richard Lobb -

I have a question for Tim (or anyone else who might have experienced the problem), relating to a confusing but fortunately extremely rare issue.

On a small handful of occasions in the last few years we've found questions that have acquired additional files in their Support Files section. This seems to occur when we save the question after editing it. On the most recent occasion for which we have some data, there were NINE stray files! All were images from the text of other CodeRunner questions. The teacher to whom this occurred didn't think they even had all those other questions (the ones that were the apparent source of the stray image files) open at the time, but isn't sure. However, he is quite confident (as am I, from earlier occasions) that this wasn't an inadvertent drag-and-drop event or similar UI slip. You can only drag and drop into the filemanager when that section of the editing form is open, and it wasn't open at the time. Also, you don't make such slips nine times in one editing session!

As an example, there was a question with the following HTML in the question text:

<img class="img-fluid atto_image_button_text-bottom" src="https://quiz2024.csse.canterbury.ac.nz/draftfile.php/830/user/draft/884732361/_2021su2cosc122exam1.png" alt="Reload page and contact staff if you can't see the graph" width="567" height="451">

Somehow, that image file has been picked up by the filemanager element for the Support Files of the prototype that was being edited.

Looking through the CodeRunner source, I thought this pickup must have occurred within the following code (simplified) from within edit_coderunner_form.php:

$draftid = file_get_submitted_draft_itemid('datafiles');
$options = $this->fileoptions;
$options['subdirs'] = false;

file_prepare_draft_area(
    $draftid,
    $this->context->id,
    'qtype_coderunner',
    'datafile',
    empty($question->id) ? null : (int) $question->id,
    $options
);
$question->datafiles = $draftid;
 

BUT ... I've looked through the Moodle source code for file_prepare_draft_area and it searches the DB for all files belonging to the component 'qtype_coderunner', marked as 'datafile' and belonging to the given question id (which could be null, though in the example quoted above the teacher was editing an existing question, so it shouldn't have been null). I don't see how this could possibly pull in all those image files embedded in other questions!

So I'm probably looking in the wrong place :-(

Suggestions, anyone?

In reply to Richard Lobb

Re: Stray files appearing in Support Files

by Tim Hunt -
I don't understand this either, and it seems like you have already done most of the debugging that I would have tried. One further idea ...

Are you on Moodle 4.2+? If so, Moodle logs every time a user uploads a file to a form. However, those logs are all linked to the user context (because they relate to draft files), so linking them up to the question can be tricky. (You basically have to look for other actions around the same time by that user, to find what they were doing.)

What the log entries do have, in the 'other' column, is things like the filename and filehash. Postgres has nice helper functions for querying JSON values. I don't know about MySQL. Anyway, those log entries might help rule out user error, though I believe you when you say that is highly implausible. They may also let you see when the stray files were added, which might give some clues.
In reply to Tim Hunt

Re: Stray files appearing in Support Files

by Richard Lobb -
Thanks for the prompt response, Tim. I just looked in the logs. We have a very tight handle on when the problem occurred, since the user was making multiple edits to the question, so we can see in the question history which version acquired the stray files and when he cleared them out again. There is not a single log entry relating to his uploading of files; only a heap of entries when he at last discovered the problem and deleted them. Entries like:

The user with id '8' has deleted file '/_2019s3csc002exam1.svg' from the draft file area with item id 225448252. Size: 17.3 KB. Content hash: f46aa13e8b592ed23600ac7b88acefcfb1fb04ae.

The question concerned was one of the built-in prototypes (the Undirected Graph one), but I don't think that can be significant.

The problem is so rare that I think it might either be a race problem or some sort of hash collision. But it's so rare that it's nearly impossible to debug, so I'm putting it back in the Too Hard bin.
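
For anyone sifting through a downloaded log for the same symptom, here is a minimal Python sketch that pulls the user id, action, file path, draft item id and content hash out of description strings shaped like the entry quoted above. The regular expression is modelled on that single example, so the wording of other entries (uploads, for instance) is an assumption, and `parse_draft_events` and `group_by_itemid` are hypothetical helper names, not part of Moodle.

```python
import re
from collections import defaultdict

# Modelled on the one log description quoted in this post. Other Moodle
# versions, languages or event types may word the entry differently.
DRAFT_EVENT = re.compile(
    r"The user with id '(?P<userid>\d+)' has (?P<action>\w+) file "
    r"'(?P<path>[^']+)' \w+ the draft file area with item id "
    r"(?P<itemid>\d+)\. Size: (?P<size>[\d.]+ \w+)\. "
    r"Content hash: (?P<hash>[0-9a-f]+)\."
)


def parse_draft_events(descriptions):
    """Return one dict per description that looks like a draft-file event."""
    events = []
    for text in descriptions:
        match = DRAFT_EVENT.search(text)
        if match:
            events.append(match.groupdict())
    return events


def group_by_itemid(events):
    """Group (action, path) pairs by the draft area item id they touched."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["itemid"]].append((event["action"], event["path"]))
    return dict(grouped)
```

Grouping by draft item id should make it easier to see whether all the files in one draft area were touched in a single burst, and whether any entry other than a deletion exists for them.
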
In reply to Richard Lobb

Re: Stray files appearing in Support Files

by Mike McDowell -

Funnily enough, I encountered this myself just the other day! Maybe 3 days ago.

I happened to copy/paste question text (that included an image) into the question textbox. Upon saving I got the error "Disallowed file name(s): Screenshot 2024-01-25 at 10.50.31 AM.png" under the answer box. The file had appeared under supporting files. Even after deleting it, the error persisted. 

This was a custom prototype question and the text was copied from a previous coderunner question, and pasted into the new question. Richard, what logs were you digging into? I'll have a look on my server as well.

In reply to Mike McDowell

Re: Stray files appearing in Support Files

by Richard Lobb -
Interesting. It has the same sort of flavour, certainly. I tried to replicate what you'd done, but of course nothing broke :-/

The logs I was looking at are only available to site administrators, under Site administration > Reports > Logs. You can download the logs for the day of interest and then browse them in a spreadsheet.

Thanks for posting. If we get enough such cases, the light may dawn.
In reply to Richard Lobb

Re: Stray files appearing in Support Files

by Richard Lobb -
Coincidentally, yesterday we discovered something that's probably related to the issue. If you export a question as XML, any images in the questiontext also get included as <file> element children of the <questiontext> element. It turns out that you get not only the current images but also any earlier images that were used at any point in the question's editing history. That's because all the old images are still there in the database, and the query that is used to retrieve relevant ones is just matching context_id, module, area, and question_id.

In the case we had where lots of images mysteriously appeared as support files in a question, we observe:
  1. The images were all present as questiontext images in an XML export of one particular question.
  2. The recipient question was the prototype of the source question.

Are you able to confirm please, Mike, that the prototype question that received the rogue images in your case was the prototype of the question that was the source of the image(s)? And that the image(s) were in the question text of that source question?

In reply to Richard Lobb

Re: Stray files appearing in Support Files

by Mike McDowell -

that the prototype question that received the rogue images in your case was the prototype of the question that was the source of the image(s)? 

No, the prototype was not the same. In my case the original question was a custom Python prototype. I created a new question (I didn't clone the question, but rather created a new one). I chose our new prototype to start. Later I copied everything from the question text of the old question, pasted it into the new question text, and pressed save. The issue began then.


And that the image(s) were in the question text of that source question?

Yes, the image was in the original question text.


I've attached an export of the question that is giving issues, along with the prototype it uses. If you try to edit the question, it gives an error regarding the file name. I've exported this from Moodle 4.2 and imported on our test server on 4.5 (using the latest CR as well). The problem persists. 

In reply to Mike McDowell

Re: Stray files appearing in Support Files

by Richard Lobb -
Thanks Mike. In that case, the fact that we were editing prototypes when the problem occurred may be a red herring.

Since my last posting, we've discovered something very interesting. An XML export of the question version immediately after the problem occurred shows that the rogue files appear both as questiontext files AND as support files - the same set of files in both cases. The previous version of the question had no files at all. This is the same set of files that belonged to the questiontext element of the "source" question. So somehow a set of files belonging to one question has been dumped into the database and linked to the recipient question twice - once into the questiontext area and then again into the datafile area. The fact that you observed the problem occurring when copying something into the questiontext makes me think that some low-level Moodle data-import has gone hyperactive.

Looking at your ISSUE.xml file, I see it has a similar but not identical symptom: the file Screenshot 2024-01-25 at 10.50.31 AM.png appears twice, once in the questiontext element and a second time in the testcases element. But in your case the file has landed up in a child element of testcases, the answerfiles element. This is where the files that belong to the sample answer go, whereas the support files go directly into the testcases element. [This is a bit inconsistent, I agree, but I don't think it's the cause of any problems - it's just the way the XML export evolved.]
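
To check other exports for the same pattern, a rough Python sketch along these lines may help. It assumes the layout described above (questiontext files as `<file>` children of `<questiontext>`, support files as direct `<file>` children of `<testcases>`, and sample answer files inside an `<answerfiles>` child of `<testcases>`). The function names are hypothetical, and it compares file names only, not contents.

```python
import xml.etree.ElementTree as ET


def file_areas(question):
    """Map each attached file name to the set of export areas that hold it."""
    areas = {}

    def note(area, parent):
        # An element with no children is falsy, so test against None explicitly.
        if parent is None:
            return
        for file_el in parent.findall("file"):
            areas.setdefault(file_el.get("name"), set()).add(area)

    note("questiontext", question.find("questiontext"))
    testcases = question.find("testcases")
    note("support", testcases)
    if testcases is not None:
        note("sampleanswer", testcases.find("answerfiles"))
    return areas


def replicated_files(xml_text):
    """Report, per question name, files that appear in more than one area."""
    root = ET.fromstring(xml_text)
    report = {}
    for question in root.iter("question"):
        dupes = {name: sorted(found)
                 for name, found in file_areas(question).items()
                 if len(found) > 1}
        if dupes:
            title = question.findtext("name/text", default="(unnamed)")
            report[title] = dupes
    return report
```

A file that legitimately lives in two areas would also be reported, so the output is only a list of candidates to inspect by hand.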

You said the problems persisted after you deleted the offending support file, so it looks to me like the file might have originally been replicated in three different places in your case: as a questiontext file, a support file and a sample answer file.

If you wish to kick the question back into life, I think if you delete both those file elements from the XML and re-import, the question will come right. But of course the key question is - what is causing this sudden replicating of files into file-areas where they don't belong?
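
If hand-editing the XML is awkward, the deletion could be scripted roughly as follows. This is only a sketch on the assumption that removing every `<file>` element with the offending name is what is wanted; `strip_files` is a hypothetical helper name. ElementTree drops comments and rewrites CDATA sections as escaped text, so try it on a copy of the export first.

```python
import xml.etree.ElementTree as ET


def strip_files(xml_text, unwanted_name):
    """Remove every <file> element whose name attribute equals unwanted_name.

    Returns the re-serialised XML and the number of elements removed.
    """
    root = ET.fromstring(xml_text)
    removed = 0
    # Snapshot the elements first so removal does not disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "file" and child.get("name") == unwanted_name:
                parent.remove(child)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed
```

The returned count is a quick sanity check: for the export discussed here it should match the number of places the screenshot turned up.
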
In reply to Richard Lobb

Re: Stray files appearing in Support Files

by Tim Hunt -
The issue where, if you add and remove images (or other files) from a content area in Moodle, they are all stored behind the scenes, is a known thing. (But it is hard for TinyMCE to reliably know that if you delete the ... in the editor, then the associated file should go. For example, what if it is used in two places?)

You can tidy this up manually using the 'Media manager' button in the editor - but only if you know to do that. (I think that dialogue now tries to detect unused files, and make it easy to remove them - MDL-75260.)

This can get particularly bad if people work by always duplicating an existing question, then re-writing the question text. In one of our language courses, we ended up with questions with 70+ audio files in them!
In reply to Tim Hunt

Re: Stray files appearing in Support Files

by Chris Sangwin -
We had exactly the same problem in STACK! Colleagues copy questions over, and the files get left behind. It was causing quite a clog in some courses.

I added in a check for questions which have "unused" files in them.
https://github.com/maths/moodle-qtype_stack/blob/master/question.php#L1597
This kind of check might be a useful addition to coderunner. (Or indeed in Moodle!)

Chris
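
As an illustration of what such a check might look like when run offline against a Moodle XML export, here is a minimal Python sketch. It is not the STACK implementation linked above. It assumes the question text refers to each attached file by its URL-encoded name, and `unused_questiontext_files` is a hypothetical helper name. It is a plain substring heuristic, so a file that is only reached indirectly would be wrongly reported as unused.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote


def unused_questiontext_files(xml_text):
    """List, per question name, questiontext files never named in the text."""
    root = ET.fromstring(xml_text)
    report = {}
    for question in root.iter("question"):
        questiontext = question.find("questiontext")
        if questiontext is None:
            continue
        # Decode %20 and similar so names with spaces can be matched.
        html = unquote(questiontext.findtext("text", default=""))
        unused = [file_el.get("name")
                  for file_el in questiontext.findall("file")
                  if file_el.get("name") not in html]
        if unused:
            title = question.findtext("name/text", default="(unnamed)")
            report[title] = unused
    return report
```

The result is best treated as a list of files to review rather than a list of files to delete.
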
In reply to Chris Sangwin

Re: Stray files appearing in Support Files

by Richard Lobb -
Many thanks, Tim and Chris. I didn't know about the media manager before this thread, although my colleague Jenny Harlow was quick to update me once I had made my posting.

It does seem odd that the media manager is able to give you a list of the files that are unused in the question. Why not just delete them?

To be honest, though, having extra files attached to questions hasn't actually caused us any problems in the past. It's relevant here only because of the rare weird occurrence in which all files attached to the questiontext of one question suddenly appear attached to both the questiontext and the datafile (and perhaps even the sampleanswer in one case) sections of another question. I take it you've never seen such an occurrence with STACK questions, Chris?
In reply to Richard Lobb

Re: Stray files appearing in Support Files

by Tim Hunt -
The issue is that it is using heuristics to try to see what is needed, and they could be wrong. Automatically deleting the wrong files would be bad.

(E.g. the question text could link to an embedded HTML file - and that could link through to some CSS/JS/image files - Moodle is not going to pick that apart.)