An edit has been submitted for the video. The video hasn't been submitted yet, thrimbletrimmer just informs other components how it wants the edit to be.
Adds a built-in "youtube-manual" location which is like "manual" except that it only works
with youtube URLs and populates the video_id column.
The intent is so that we can have playlist_manager manage videos we upload manually,
while still being able to distinguish between that and other manual links that shouldn't
be included (eg. links to third party youtube videos).
This is set when setting a manual link in thrimbletrimmer with a new checkbox, default off.
When comparing old and new video tags, it errors because it's a list, not string.
We change it to apply the transforms to all tags in the list, and also ignore changes in list ordering.
float() is inaccurate and Decimal() is very slow (~3x the cpu usage)
so instead we right-pad with 0s (eg. so 1.2345 -> 1.234500) then convert to int microsec directly.
Floating point error leads to 1us differences in parsed times,
which causes false positives in the overlapping segments check.
By using a Decimal, we get the exact digits from the filepath.
Because the checking process is entirely CPU-bound, it does not give any other
greenlets a chance to run while it is processing. This prevents us from responding
to metrics queries, and prometheus then times out.
By stopping to handle all other traffic in between each hour processed, we ensure metrics
remain responsive while processing.
strptime is very slow. In terms of pure get_best_segments() speed, this change
more than doubles the throughput.
In particular for segment_coverage, this halves the run time for each check.
It causes problems due to the sheer number of unique metrics emitted, which makes
the prometheus endpoint be very expensive / fail a lot.
The data is not useful enough to justify the cost.
The intended behaviour was to log a warning message and retry next time,
but still allow workers to be started for any streams found.
However, due to a missing continue, we fall through to attempting to start a worker
for a non-existent quality which causes a KeyError when looking up
`self.latest_urls[quality]`. This exception means we don't run through the other qualities,
so we never start any other quality.
* Add thrimshim to k8s jsonnet file
* Fix reference to bustime_start in thrimshim
* Add "enabled" config to selectively disable things
* Fix styling and handling of disabled components
* Don't need to "hide" enabled field
* Add port arg to thrimshim deployment
* fix indent nitpick
Co-authored-by: Mike Lang <ekimekim@users.noreply.github.com>
In order for the upcoming playlist manager to be able to use the DB `tags` column to know
what tags a video has, all the tags it needs need to be present.
Previously, this was a problem because the day and category tags only get added at the cutter
and so wouldn't be listed.
This moves them so they are added when parsing the row in sheetsync.
It also adds the poster moment tag if poster moment is checked.
Note that fully static tags that go on all videos are still only added in cutter,
but the playlist manager doesn't need to care about those (since by definition
they will match every video).