VC4 toolchain hardware testing

I’ve been wanting to get testing for my VC4 toolchain running on a real Raspberry Pi, in order to bash out bugs and then go on to do some improvements to the compiler. Using the simulator is OK, but it’s not complete and almost certainly not 100% accurate in the stuff it does, and I’ve made some “best guess” changes without really verifying them against hardware. So it could be wrong in all sorts of ways.

I managed to hack Kristina’s open firmware in order to load code (i.e. GCC regression test binaries) over the serial cable, and then run it. This is pretty much all you need for GCC testing (load test, run, observe output — again over serial), but you need to be careful not to let the state of the board deteriorate over time due to the effects of the tests you are running. E.g. they might miscompile in such a way that they scribble over the memory of the resident “shell” code that is returned to on completion of each test, or throw exceptions (divide by zero, for instance).
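The observe-output step boils down to scanning the serial capture for a result marker. As a rough illustration, here’s a minimal sketch of that classification in Python — the `classify_output` name and the “*** EXIT code” marker string are my assumptions, not necessarily what the real harness or firmware prints:

```python
import re

def classify_output(serial_output: str) -> str:
    """Turn captured serial output into a DejaGnu-style result string.

    Assumes the test binary prints an "*** EXIT code N" marker on
    completion (a convention some embedded test setups use); anything
    else (crash, hang, garbled output) is treated as unresolved.
    """
    if re.search(r"\*\*\* EXIT code 0\b", serial_output):
        return "PASS"
    if re.search(r"\*\*\* EXIT code \d+", serial_output):
        return "FAIL"
    return "UNRESOLVED"  # no marker seen: crashed, hung, or scribbled-on
```

The UNRESOLVED bucket matters here precisely because of the deterioration problem: a test that tramples the resident shell typically produces no clean exit marker at all.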

The latter in particular has been causing some trouble: some aspect of the VC4 processor appears not to have been understood properly (by me?). On taking an exception or handling an IRQ, the VC4 switches to exception mode (possibly aka “secure mode”). After we’ve done our exception or interrupt handling, we want to return to the original processor mode (supervisor mode) with an RTI instruction, but for some reason that doesn’t work and the processor stays stuck in exception mode. That’s a recipe for disaster: exception mode uses a separate stack pointer, and the sticking behaviour means that subsequently the two stacks trample over each other and the board inevitably crashes.

That still needs to be sorted out, but for today, I’ve come up with a workaround: just reset the board after running each test (whether it passes or fails). In a way, that’s kind of better, particularly when we don’t entirely understand the system: assuming the reset works correctly, it means we get a fresh state for each test we run. The method appears stable (though a full test run is rather slow), and results are OK: there are a few differences from a simulator run, which means there’s at least one code generation bug of some description somewhere that the simulator is hiding (something like two wrongs making a right?).
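The reset-per-test loop is simple enough to sketch. In this Python outline the `reset` and `load_and_run` hooks are hypothetical stand-ins for the real serial-cable plumbing (power-cycling or soft-resetting the board, then pushing the binary and capturing output); only the loop structure reflects the workaround described above:

```python
def run_with_reset(tests, reset, load_and_run):
    """Run each (name, binary) test from a known-clean board state.

    `reset` puts the board back into a fresh state before every test,
    so one test's memory scribbles or stuck exception mode can't
    poison the next.  `load_and_run` sends a binary to the board and
    returns its captured serial output.  Both are assumed hooks, not
    the actual harness API.
    """
    results = {}
    for name, binary in tests:
        reset()  # fresh state regardless of what the last test did
        output = load_and_run(binary)
        results[name] = "PASS" if "EXIT code 0" in output else "FAIL"
    return results
```

The cost is obvious: a full reset per test makes the run slow, but it trades speed for the guarantee that every test starts from the same state.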
