The ref tests runner takes screenshots of both the input page and the
expected page, then compares them. Ref testing allows us to catch
painting bugs, which cannot be detected with the layout and text tests
we already have.
With ref tests, we'll likely want to reuse the same expectation page
for multiple inputs. Therefore, there's a `manifest.json` file that
describes the relationship between inputs and expected outputs.