qutebrowser development blog : Config revolution

Since the last blog post, I've worked on the new config for another 5 days, so we're on day 17 now (already! Gnah...).

Sorry there wasn't an update in such a long time. I wanted to write a blog post multiple times, even wrote about half of it, and then other stuff got in the way. I guess it shows that I prefer coding to blogging... :D

My eye surgery went quite well, and I was able to start working again last week.

This is how things look:

218 files changed, 10234 insertions(+), 13768 deletions(-)

This is definitely the biggest change in qutebrowser's history... QtWebEngine was something like ~2500 new lines (and ~900 deleted) when I was able to merge the branch back in, but here it's all or nothing, because I only want one big config breakage for users using the git repo.

All tests are passing on the new-config branch by now:

4888 passed, 170 skipped, 60 xfailed in 605.11 seconds

This means the first phase of the config refactoring is complete. Everything is using the new config (with the exception of the setting completion, which is postponed until the new completion code is in), and everything is cleaned up.

I originally wanted to spend less time on this and postpone tests and such to after the crowdfunded time (and instead focus on the Python config and per-domain settings), but I decided it'd be better to finish what I started first. Otherwise, with such big changes, this would create a lot of trouble down the line.

If you want to try the new-config branch, it's definitely ready to see some testing, and I'd appreciate feedback! There is no config.py file yet, but if you're the kind of person who prefers :set or qute://settings, it should be good to go! Note that on Windows and macOS there will be breaking changes to the config's location, though - see the project board on GitHub to see what's still missing.

With that done, I will now take a longer break from qutebrowser work and focus on my upcoming exams instead. I'll be back for the remaining 3 days (and maybe a bit more full-time work on top, time permitting) in September, after my exams are done.

The T-shirts will also be postponed until then - as shown on the Kickstarter page, they will be sent sometime between September and December.

See the next few sections on what happened in the past few days, though!

Keybindings

Like I mentioned in the last update, my previous solution for handling keybindings wasn't really usable. Originally, all the default bindings were stored in the config like this:

bindings.commands = {
    'normal': {
        'gg': 'scroll-perc 0',
        ...
    },
    ...
}

When adding a new binding, the config would (correctly) detect that bindings.commands was changed, and then add the complete new value (including all default bindings) in your autoconfig.yml. This clearly wasn't the way to go.

I found a straightforward solution though: There are now two settings, bindings.default and bindings.commands (empty by default, i.e. {}) in the config. For keybindings, both are merged, starting with the default one.

This has various benefits:

When you want to add a new binding, you only mutate bindings.commands, so you only get your custom bindings in your config file.
When you don't want to load any default binding at all, you set bindings.default = {} - then only your custom bindings are bound.
When you want the default bindings, but don't want any new defaults automatically, you pin bindings.default to its default value explicitly.

I think this is a great solution - it's straightforward, and it makes things very flexible.
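As a rough illustration of the merge described above (not qutebrowser's actual code): the helper name merge_bindings and the sample bindings are made up for this sketch, and it only assumes that both settings are {mode: {key: command}} dictionaries, with the custom ones applied on top of the defaults.

```python
def merge_bindings(default, commands):
    """Merge two {mode: {key: command}} dicts; `commands` wins on conflicts.

    Hypothetical helper for illustration only.
    """
    # Copy the per-mode dicts so the defaults are never modified in place.
    merged = {mode: dict(keys) for mode, keys in default.items()}
    for mode, keys in commands.items():
        merged.setdefault(mode, {}).update(keys)
    return merged


# Sample data, standing in for bindings.default and bindings.commands.
default = {'normal': {'gg': 'scroll-perc 0', 'G': 'scroll-perc'}}
custom = {'normal': {'gg': 'scroll-perc 10'},
          'insert': {'<Ctrl-E>': 'open-editor'}}

merged = merge_bindings(default, custom)
```

With this shape, only the contents of the custom dictionary ever need to be written to the config file, and setting the defaults to {} leaves just the custom bindings.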

With that in place, I also updated all the code using those keybindings, and simplified some of it. That means :bind and :unbind now work like you'd expect again (and modify bindings.commands).

I also added a bindings.key_mappings which can be used to transparently map a key to another one. For example, by default <Ctrl-[> is mapped to <Escape> and <Ctrl-J> to <Enter>, so for any binding which is bound to <Enter> you can press <Ctrl-J> instead.

This is probably also very useful to adjust to different keyboard layouts, if you want to keep the bindings in the same place without rebinding everything.
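The idea behind such a mapping can be sketched in a few lines. This is only an assumption about how a lookup could work, not the real implementation: lookup_command is a hypothetical name, and the command strings are placeholders.

```python
# The two default mappings mentioned above.
KEY_MAPPINGS = {'<Ctrl-[>': '<Escape>', '<Ctrl-J>': '<Enter>'}


def lookup_command(key, bindings, key_mappings=KEY_MAPPINGS):
    """Translate `key` through the mapping first, then look up its binding.

    Hypothetical helper; returns None if nothing is bound.
    """
    mapped = key_mappings.get(key, key)
    return bindings.get(mapped)


# Placeholder bindings for one mode.
bindings = {'<Enter>': 'follow-selected', '<Escape>': 'leave-mode'}

via_mapping = lookup_command('<Ctrl-J>', bindings)
direct = lookup_command('<Enter>', bindings)
```

Because the translation happens before the lookup, every binding on <Enter> is reachable via <Ctrl-J> without being duplicated.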

Testing

To make sure the new config doesn't end up as the unmaintainable mess the old config was, I decided pretty early on that I wanted 100% test coverage for all new config code, just like for some other modules which are easy to test.

Writing the tests turned up some issues - among them an issue where modifying the configuration also modified the default value stored in qutebrowser internally... This one definitely would've been a big pain to debug further down the road.
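For readers who haven't run into this class of bug: the sketch below shows one common way it happens (handing out a shared mutable default) and one common fix (copying it). The names DEFAULTS, get_buggy and get_safe are invented here, and this isn't necessarily how the problem looked or was fixed in qutebrowser.

```python
import copy

# Stand-in for the internally stored default values.
DEFAULTS = {'bindings.commands': {}}


def get_buggy(name):
    """Return the default object itself, so callers can mutate it."""
    return DEFAULTS[name]


def get_safe(name):
    """Return an independent deep copy of the default."""
    return copy.deepcopy(DEFAULTS[name])


safe = get_safe('bindings.commands')
safe['normal'] = {'gg': 'scroll-perc 0'}
unchanged_after_safe = DEFAULTS['bindings.commands'] == {}

leaked = get_buggy('bindings.commands')
leaked['normal'] = {'gg': 'scroll-perc 0'}
changed_after_buggy = DEFAULTS['bindings.commands'] != {}
```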

After getting those all to pass (with 100% coverage), I also updated all older tests for the new configuration - this turned out to be much more straightforward than I thought it would be, with only a few things requiring a bit more work (mostly tests related to keybindings).

Completion tests are still skipped though, I'll take care of those once the new completion is merged in.

After everything looked nice and green locally, I pushed them and hoped things would look the same on Travis - however, I was in for some not-so-nice surprises with older Python versions:

Python 3.4 is worse at dealing with circular imports than 3.5 is, so I had to move some imports to accommodate that. I really hope I'll be able to drop Python 3.4 for v1.0, though!
In Python 3.6, dictionaries are ordered by default (as an implementation detail), which meant I didn't catch some issues locally where the tests relied on that property.

Then there were some bigger issues...

Unicode is hard!

A test using hypothesis to do some intelligent fuzzing showed an issue I hadn't seen locally, and it was quite interesting. A slightly simplified version of the test:

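import hypothesis
from hypothesis import strategies

from qutebrowser.config import configtypes, configexc


@hypothesis.given(val=strategies.dictionaries(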
    strategies.text(min_size=1), strategies.booleans()))
def test_hypothesis(val):
    d = configtypes.Dict(keytype=configtypes.String(),
                         valtype=configtypes.Bool(),
                         none_ok=True)
    try:
        converted = d.to_py(val)
        expected = converted if converted else None
        assert d.from_str(d.to_str(converted)) == expected
    except configexc.ValidationError:
        # Invalid unicode in the string, etc...
        hypothesis.assume(False)

It uses Hypothesis to get dictionaries which are filled with random data like {'x': True}, converts them to a string (like qutebrowser would when e.g. showing the value in the completion), converts that value back to Python again (like qutebrowser would when using :set) and makes sure the same thing comes out.

The problem here was that qutebrowser uses JSON to convert lists/dicts in the new config to a string (because it outputs compact, one-line representations), but YAML to parse lists/dicts from a string (because it allows for a more lightweight syntax like {Hello: World} instead of {"Hello": "World"}).

This shouldn't be a problem because YAML is supposed to be a superset of JSON - however, it turned out that's not true. Unicode codepoints starting from U+10000 are encoded as a surrogate pair in UTF-16. Since JavaScript (until recently) didn't have escapes for those 4-byte characters, JSON encodes them as a pair of UTF-16 surrogate escapes, which then gets read incorrectly by YAML:

>>> yaml.load(json.dumps({'\U00010000': True}))
{'\ud800\udc00': True}

The "solution" for this was easy: Simply disallowing those characters in the config inside dicts and lists.
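To make the JSON side of this concrete, and to show what such a check could look like: the sketch below uses only the standard library. The function name has_non_bmp is hypothetical, and the actual validation in qutebrowser may well look different.

```python
import json


def has_non_bmp(text):
    """Return True if `text` contains a codepoint at or above U+10000.

    Hypothetical check; such characters need a surrogate pair in UTF-16.
    """
    return any(ord(char) > 0xFFFF for char in text)


# With the default ensure_ascii=True, json.dumps escapes the single
# character U+10000 as two \uXXXX surrogate escapes.
encoded = json.dumps({'\U00010000': True})

# Python's own JSON parser joins the pair back into one character.
roundtrip = json.loads(encoded)
```

Rejecting such strings up front sidesteps the mismatch between the two parsers, at the cost of not supporting those characters inside dicts and lists.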

configdata.yml performance

With the new configuration, all available config options are defined in a YAML file (instead of an almost uneditable Python file like before), see my older blog posts for details.

On every start, qutebrowser reads that config file and generates an internal structure with all available settings and default values. Now for some reason, this takes around 20 seconds (!) on Travis CI, for a ~2200 line YAML file. I've heard about YAML being a bit slow sometimes, but certainly didn't expect this.

I did some tests locally, and checked what difference the C extension of PyYAML makes (it has both an accelerated C implementation with a thin Python layer, and a pure-Python implementation).

With the C extension, reading the file took around 20ms on my machine, which is entirely reasonable. With it disabled, this jumped to 200ms which already isn't as nice anymore, but still bearable. But still, this is all orders of magnitude off from 20 seconds.
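A comparison like this can be reproduced with a small timing helper. This is a sketch under assumptions: it requires the third-party PyYAML package, it guesses that SafeLoader and CSafeLoader are the loaders of interest (the post doesn't say which were used), and best_time and compare_yaml_loaders are made-up names.

```python
import timeit

try:
    import yaml  # PyYAML (third-party); may be missing
except ImportError:
    yaml = None


def best_time(load, text, repeat=3):
    """Return the best time in seconds for a single load(text) call."""
    return min(timeit.repeat(lambda: load(text), number=1, repeat=repeat))


def compare_yaml_loaders(text):
    """Time the pure-Python loader and, if it was built, the C loader."""
    if yaml is None:
        return {}
    results = {
        'SafeLoader': best_time(
            lambda t: yaml.load(t, Loader=yaml.SafeLoader), text),
    }
    # CSafeLoader only exists when PyYAML was built against libyaml.
    c_loader = getattr(yaml, 'CSafeLoader', None)
    if c_loader is not None:
        results['CSafeLoader'] = best_time(
            lambda t: yaml.load(t, Loader=c_loader), text)
    return results
```

Calling compare_yaml_loaders() with the contents of the YAML file returns a dict of loader name to seconds, which makes the difference between the two implementations easy to see on a given machine.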

I still have no idea what happened there - I decided to open an issue (with some ideas like "compiling" the YAML to a Python file), and move on for now (after skipping the benchmark I wrote on Travis, because it took way too long).

Documentation generation

The settings reference in qutebrowser's documentation is autogenerated (but stored in the repository), with Travis making sure it doesn't end up being stale.

Updating the script to generate docs for the new config was relatively easy (especially because every config type already had a .to_str() implemented), but Travis told me that it still detected uncommitted changes in the docs.

After changing the script which checks for those to show a git diff when it fails, it was clear what was happening: It was dictionaries being ordered differently again. A value like {"one": 1, "two": 2} could be shown as either that, or {"two": 2, "one": 1} in the docs.

I ended up doing something I wanted to postpone until some later point: Showing dictionaries and lists nicely as lists in the documentation, by pretty-printing them with a .to_doc() method (which would just fall back to .to_str() for most types).
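A minimal sketch of the idea, assuming the documentation output should not depend on dictionary ordering: this standalone to_doc function, its bullet format and the decision to sort keys are all assumptions for illustration, not the method that ended up in qutebrowser.

```python
def to_doc(value):
    """Render dicts and lists as bullet lists; other values via str().

    Dict keys are sorted so the output is stable across Python versions.
    """
    if isinstance(value, dict):
        return '\n'.join('- {}: {}'.format(key, value[key])
                         for key in sorted(value))
    if isinstance(value, list):
        return '\n'.join('- {}'.format(item) for item in value)
    return str(value)


first = to_doc({"one": 1, "two": 2})
second = to_doc({"two": 2, "one": 1})
```

Both calls produce the same text, so a regenerated reference no longer differs just because a dictionary happened to iterate in another order.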

New release

This week, PyQt 5.9 was finally released. It ships with Qt 5.9, which comes with some long-awaited QtWebEngine fixes.

This means it was finally time to release qutebrowser v0.11, with lots of bugfixes and new features (like the new private browsing supporting QtWebEngine, or pinned tabs).

A lot of the release process is automated already, but unfortunately, things didn't go quite as planned at first...

First, GitHub's API just showed me a "Broken pipe" error when trying to upload the source release. After a while I figured out that the files were kind-of half uploaded, and after removing them again manually, it worked.
My Windows VM constantly used 100% CPU and refused to tell me why - and was unusable as a result of that, with some 10s delay for every keypress. A reboot didn't help, but a hard reset did.
On Windows, I accidentally ran the release script in a virtualenv without the github3 package installed - however, the script only failed some 10-15 minutes in, after the package was built. It now checks for that earlier.
Running qutebrowser from the binary failed because PyInstaller didn't know about some hidden PyQt OpenGL module - so I told it about it.
On macOS, the QtWebEngine resource files weren't copied correctly. I'm not sure why the last release even worked properly...
Unmounting any volumes on my Mac (hdiutil detach) mysteriously failed (with a "device busy" error). I adjusted the script to deal with that.
The script still used the old Windows installer names (.msi instead of .exe), which I fixed too.

This all means this release took more than 3 hours instead of the usual half an hour or so... but I managed to upload everything!
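The "check earlier" fix for the missing github3 package boils down to a preflight check before the slow build starts. The sketch below shows one way to do that with the standard library. The function name missing_modules is hypothetical and the real release script may do this differently.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that can't be found."""
    return [name for name in names
            if importlib.util.find_spec(name) is None]


# Run this before building anything, so a missing upload dependency
# is reported immediately rather than after the build.
missing = missing_modules(['github3'])
if missing:
    print("Missing required modules: {}".format(', '.join(missing)))
```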

Completion refactoring

The other big change which is currently ongoing for v1.0 (the new completion PR from @rcorre) has also seen some work, with me mostly investing some more time in reviewing changes and reviewing the entire contribution (2600 new lines, 2700 deleted) again.

After some more minor things are taken care of, I hope to merge it into master (which is now for v1.0 material) soon.

Miscellaneous

There was a lot of other stuff too, but this blog post would get way too long if I mentioned them all. Some examples:

content.user_stylesheets is now a list taking multiple CSS files.
content.headers.do_not_track now allows not sending the DNT header at all.
Various setting renames and clarifications.

Finally, I wanted to make sure all my thoughts are written down before leaving this alone for the next few weeks. After closing various old/stale pull requests, I also opened various new issues to keep track of everything that's still missing, and so I can close the old big "Config (r)evolution" issue which has gotten quite big (100 comments).

I also made sure the project board on GitHub for the new config is up to date, including a column for all issues which need to be tackled before merging this all to master... which will happen in September, after my exams.