Check that open() calls for reading and writing use binary modes
Use alpine version with py3-pip package
Use python3 in Dockerfile CMD
Remove sys.setdefaultencoding() "hack"
Simplify ensure_directory() in common.common package
Then treat backfilling each channel just like backfilling each quality.
This is conceptually simpler (only one kind of thing, a (channel, quality))
and has better behaviour when a node is down (we only have one lot of error handling around it).
It also means we aren't asking the database for the same info once per channel,
and cuts down on logging noise.
We move all connection handling into get_nodes().
This means that problems connecting won't cause further errors
and cause the application to completely crash.
In turn, this means that the behaviour if the database goes down becomes
"continue backfilling from the nodes we know about" instead of crashing.
In testing, GDQ's stream delay went up over 1min, which caused backfillers to backfill
segments at the same time they were downloaded. We increase the window for now,
and also make it configurable.
This will signifigantly increase throughput when downloading
large ranges of segments.
The max concurrency is exposed as a cli arg.
We also slightly modify the logged info, so it reports segments downloaded,
not just number of missing segments (which we might skip downloading for various reasons).
To make this work, we make type a proper segment field.
We also tell get_best_segments to ignore temp segments, since they might go away
before we can actually use them.
We wrap direct dateutil calls to handle two distinct cases:
* `common.dateutil.parse()`: We want to handle arbitrary timestamps including tz info,
then convert them to UTC.
This is used in HLS parsing, and for command line input for backfiller
* `common.dateutil.parse_utc_only()`: We want to only handle UTC timestamps,
but datetime.strptime isn't flexible enough (eg. can't handle missing fractional component).
This is used for restreamer request params.