This will significantly increase throughput when downloading
large ranges of segments.
The max concurrency is exposed as a CLI argument.
We also slightly modify the logged info, so it reports segments downloaded,
not just number of missing segments (which we might skip downloading for various reasons).
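A minimal sketch of bounded-concurrency downloading that counts what was actually downloaded. It uses a standard-library thread pool purely for illustration; `download_all`, `fetch` and `max_concurrency` are hypothetical names, and `fetch` is assumed to return False when it decides to skip a segment.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(segments, fetch, max_concurrency=4):
    """Fetch segments with at most max_concurrency requests in flight.

    fetch(segment) is assumed to return True if the segment was downloaded
    and False if it was skipped. Returns the number actually downloaded,
    which may be lower than the number of missing segments.
    """
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # pool.map preserves input order and blocks until every call finishes.
        results = list(pool.map(fetch, segments))
    return sum(1 for downloaded in results if downloaded)
```

The returned count is what would be logged, rather than `len(segments)`.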
Since we never got a new conn after failure, we would just keep erroring with
"connection already closed" errors.
This isn't applicable to the main cutter loops since a DB failure there will restart the process.
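One way to avoid reusing a dead connection, sketched under assumptions: `ReconnectingDB` and its `connect` callable are hypothetical and stand in for whatever connection manager the code really uses. The idea is only that a failed operation drops the cached connection so the next attempt opens a new one.

```python
class ReconnectingDB(object):
    """Caches one connection and discards it whenever an operation fails."""

    def __init__(self, connect):
        self._connect = connect  # hypothetical factory returning a new connection
        self._conn = None

    def run(self, operation):
        """Call operation(conn), opening a connection first if needed."""
        if self._conn is None:
            self._conn = self._connect()
        try:
            return operation(self._conn)
        except Exception:
            # Forget the connection so the next call gets a fresh one
            # instead of erroring with "connection already closed" forever.
            self._conn = None
            raise
```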
To make this work, we make type a proper segment field.
We also tell get_best_segments to ignore temp segments, since they might go away
before we can actually use them.
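As an illustration of why the type needs to be a real field, here is a small sketch of choosing among candidate segments while ignoring temp ones. The `Segment` tuple and `pick_best` are hypothetical, not the actual `get_best_segments` code, and the preference for full over partial is an assumption.

```python
from collections import namedtuple

# Hypothetical segment record; type is assumed to be "full", "partial" or "temp".
Segment = namedtuple("Segment", ["start", "type", "path"])


def pick_best(candidates):
    """Return the best usable segment among candidates, or None if there is none."""
    # Temp segments may be renamed or deleted at any moment, so never pick them.
    usable = [s for s in candidates if s.type != "temp"]
    if not usable:
        return None
    # Assumed preference: full segments first, anything else after.
    return min(usable, key=lambda s: 0 if s.type == "full" else 1)
```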
nginx tries to resolve everything at startup, which doesn't work
if some of the services aren't present.
We instead generate the config file from a passed-in env var, so that only
enabled services are present.
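A rough sketch of the generation step, written in Python for illustration only. The space-separated `name:port` format and the `generate_locations` helper are assumptions, not the actual env var layout or script.

```python
def generate_locations(services_spec):
    """Build one nginx location block per enabled service.

    services_spec is assumed to look like "restreamer:8000 thrimshim:8004".
    Services that are not listed produce no config at all, so nginx never
    tries to resolve their hostnames at startup.
    """
    blocks = []
    for item in services_spec.split():
        name, port = item.split(":")
        blocks.append(
            "location /{name}/ {{\n"
            "    proxy_pass http://{name}:{port}/;\n"
            "}}".format(name=name, port=port)
        )
    return "\n".join(blocks)
```

Calling `generate_locations(os.environ.get("SERVICES", ""))` (with a hypothetical `SERVICES` variable) would give the text to splice into the server block.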
Exposes a way to read all rows, and write a single cell.
We need to read all columns of each row so we know what would be modified,
and only update the single cells that don't already hold the correct value.
This keeps us from impacting the sheet load too much with constantly changing values,
which I think might be a thing even if the values are the same.
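The read-all, write-only-differences approach can be sketched as below. `cells_to_update` is a hypothetical helper, and rows are assumed to have been read into dicts keyed by column name.

```python
def cells_to_update(rows, desired):
    """Return the (row_index, column, value) writes actually needed.

    rows is the full sheet as a list of dicts (one per row); desired maps
    a row index to the column values we want that row to hold.
    """
    updates = []
    for row_index in sorted(desired):
        current = rows[row_index]
        for column in sorted(desired[row_index]):
            value = desired[row_index][column]
            # Skip cells that already hold the right value, so the sheet
            # sees no write at all for them.
            if current.get(column) != value:
                updates.append((row_index, column, value))
    return updates
```

Each returned tuple then maps to one single-cell write.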
This allows manual uploads to work without needing to fill all the edit fields
with junk.
We also set a constraint on uploader asserting that any videos from claimed onwards have a known uploader.
Again, an exception is made for DONE to allow manual uploads.
Modified the model to place the responsibility for granular permissions on Thrimshim, rather than having a "Role Table" listing which fields can be updated by a user.
These can happen if a downloader or backfiller dies suddenly.
We treat it similarly to partial but lacking any hash.
At some point in the future we should probably have something
to find any temp segments, hash them and rename them to partials.
We wrap direct dateutil calls to handle two distinct cases:
* `common.dateutil.parse()`: We want to handle arbitrary timestamps including tz info,
then convert them to UTC.
This is used in HLS parsing, and for command line input for backfiller.
* `common.dateutil.parse_utc_only()`: We want to only handle UTC timestamps,
but datetime.strptime isn't flexible enough (e.g. it can't handle a missing fractional component).
This is used for restreamer request params.
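A sketch of the two behaviours, not the real wrappers. To stay standard-library only it substitutes `datetime.fromisoformat` for the dateutil parser and tries two `strptime` formats for the UTC-only case. Returning naive datetimes in UTC is an assumption.

```python
from datetime import datetime, timezone


def parse(timestamp):
    """Parse a timestamp that may carry tz info and return naive UTC."""
    # Stand-in for the dateutil parser; accepts e.g. "2019-01-01T12:00:00+02:00".
    dt = datetime.fromisoformat(timestamp)
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
    return dt


def parse_utc_only(timestamp):
    """Parse a UTC timestamp with or without a fractional seconds component."""
    for fmt in ("%Y-%m-%dT%H:%M:%S.%f", "%Y-%m-%dT%H:%M:%S"):
        try:
            return datetime.strptime(timestamp, fmt)
        except ValueError:
            continue
    # Anything with a UTC offset or another layout is rejected outright.
    raise ValueError("not a plain UTC timestamp: {!r}".format(timestamp))
```

The second function rejects offsets instead of converting them, which is the distinction between the two cases.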
Each method is fairly complicated, but is self-contained and can be examined independently.
cut_jobs in particular contains several extra helpers and directs control flow
via some iterators. This is unfortunately necessary due to the requests interface.
This commit only lays out the main loop, showing the high-level flow
and defining shared utilities. This is for clarity.
The actual methods that do the work will be implemented separately.