585 Commits (suspect-segments)
 

Author SHA1 Message Date
Mike Lang 79ef0b89e4 Add new segment type "suspect"
We've noticed that when nodes have connection problems, they get full segments
with different hashes. Inspection of these segments shows that
they all have identical data up to a point.

Segments that fetched normally will then have the remainder of the data.
Segments that had issues will have a slightly corrupted end.
The data is still valid, and no errors are raised. It just doesn't have all the data.

We noticed that these corrupted segments were all cut off exactly 60 seconds after their requests
began. We believe this is a server-side timeout on the request: the server returns whatever data
it has, then closes the container file cleanly before returning successfully.

We detect segments that take > 59 seconds to receive, and label them as "suspect".

Suspect segments are treated identically to partial segments, except they are always preferred
over partials.
5 years ago
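The classification rule described in this commit can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the names `classify_segment`, `best_segment`, and `SUSPECT_THRESHOLD_SECONDS` are hypothetical, and the ordering full > suspect > partial is inferred from the statement that suspects are always preferred over partials.

```python
# Hypothetical sketch; names and data shapes are assumptions, not the project's API.
SUSPECT_THRESHOLD_SECONDS = 59.0  # from the commit message: "> 59 seconds"

FULL, SUSPECT, PARTIAL = "full", "suspect", "partial"
# Most preferred first. Suspect sits between full and partial.
PREFERENCE = [FULL, SUSPECT, PARTIAL]


def classify_segment(fetch_seconds, completed):
    """Label a downloaded segment by how its fetch went."""
    if not completed:
        return PARTIAL
    if fetch_seconds > SUSPECT_THRESHOLD_SECONDS:
        # The fetch finished without error, but slowly enough that the
        # server-side timeout may have truncated the data.
        return SUSPECT
    return FULL


def best_segment(candidates):
    """Pick the most preferred (segment_type, path) pair from a list."""
    return min(candidates, key=lambda c: PREFERENCE.index(c[0]))
```

With this ordering, a suspect segment is only chosen when no full segment exists for the same time range.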
Mike Lang bb05e37ae4 segments: Use longest segment in bytes if duration is the same
We occasionally see corrupted segments that are slightly shorter in size
but report the same metadata as the full segments. Prefer the largest version
as it's likely the least corrupt.
5 years ago
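A tie-break of this kind can be expressed as a compound sort key. The sketch below is illustrative only: the `Segment` tuple and `pick_version` are hypothetical names, and it assumes that the longer duration wins first, which the commit message implies but does not state.

```python
from collections import namedtuple

# Hypothetical record type; the real project stores more fields than this.
Segment = namedtuple("Segment", ["path", "duration", "size"])


def pick_version(versions):
    """Prefer the longest duration, then the largest size in bytes."""
    # Tuples compare element by element, so size only matters
    # when the durations are equal.
    return max(versions, key=lambda s: (s.duration, s.size))
```

For example, given two versions that both report a 2.0 second duration, the one with more bytes is selected.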
Mike Lang b516917e62 Add new "smart" cut technique 5 years ago
Mike Lang bb3814f9f7 overview.jsonnet: Use a template variable to allow restricting to certain nodes 5 years ago
Mike Lang 4a2fe7a6ed cutter: Explicitly set mime type of uploads correctly 5 years ago
Mike Lang e2f4162ac7 thrimbletrimmer: Fix a bug when trimming controls aren't enabled 5 years ago
Mike Lang fde2758275 cutter: Fix bug where uploader was cleared on non-retryable error
Instead of on retryable error
5 years ago
Christopher Usher 986c9a3413 removed redundant option 5 years ago
Christopher Usher 9c77dd1f40 added the ability to generate a webpage with all coverage maps 5 years ago
Mike Lang 8086c917fe Force correct postgres version 5 years ago
Mike Lang e23387b231 thrimbletrimmer: Document new shortcuts 5 years ago
Mike Lang 40f6a72ad7 thrimbletrimmer: Add keyboard shortcuts -/= to adjust playback speed 5 years ago
Mike Lang 0a379faf5a thrimbletrimmer: Relabel download button to make more sense 5 years ago
Mike Lang fd35c0dc20 thrimbletrimmer: Have a download link instead of an iframe
The iframe doesn't always work; a link is more reliable.
5 years ago
Mike Lang c9f2e8e0a5 thrimbletrimmer: Preserve trim timings when re-loading playlist
Useful if you've already cut the start
but want to extend the range of times before cutting the end.
5 years ago
Mike Lang a002619c4c youtube upload: Explicitly set mime type 5 years ago
Mike Lang a438d86f80 cutter: Fix multiple problems with logging errors 5 years ago
Mike Lang 72003f28d0 downloader: Don't check the age of a worker we just spawned
Not only is this redundant, but it creates a race condition where
the worker fails before the latest_worker = workers[-1] check,
and we get an IndexError.
5 years ago
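The shape of this race can be sketched as below. Everything here is an assumption made for illustration: the `Worker` class, the `tick` function, and the idea that a failing worker removes itself from the shared list are hypothetical stand-ins for the downloader's real structure. Only the `workers[-1]` lookup and the resulting IndexError come from the commit message.

```python
# Hypothetical sketch of the fix; not the downloader's actual code.
class Worker:
    def __init__(self, workers, started, fail_immediately=False):
        self.started = started
        workers.append(self)
        if fail_immediately:
            # Simulates a worker that dies and deregisters itself
            # before the caller looks at the list again.
            workers.remove(self)


def tick(workers, now, max_age, fail_immediately=False):
    """Spawn a worker if there is none, or if the latest one is too old."""
    if not workers or now - workers[-1].started > max_age:
        Worker(workers, now, fail_immediately)
        # Return here instead of falling through to an age check:
        # the new worker cannot be too old, and workers may already be
        # empty again, so workers[-1] could raise IndexError.
        return "spawned"
    return "ok"
```

The age check only runs when the list is known to be non-empty and no spawn has happened in this pass.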
Mike Lang 3fabb2944f prometheus: Add scheme to url 5 years ago
Mike Lang 6e067fab83 prometheus: fix mistake 5 years ago
Mike Lang d76f38bf20 prometheus: include url as a label
for coverage maps
5 years ago
Mike Lang 9a1369cf98 overview: Fix job -> service 5 years ago
Mike Lang e1993c6a79 overview dashboard: Look up services by 'service' label, not job
Job can't be repeated across scrape jobs, service can
5 years ago
Mike Lang ac98d67853 overview dashboard: Hide UNEDITED and DONE states so the others are visible 5 years ago
Mike Lang 8a65d18f74 prometheus config: Support mixed http and https scraping 5 years ago
Mike Lang 94d81d708f Downloader: Change access_token call to match website
The call stopped working. These changes bring it back in line with the website
so that it works again.
5 years ago
Mike Lang aa3ca60b73
make video description slightly narrower
so that with the 1px border it's not too wide
5 years ago
Mike Lang eba5fc498a Remove flask response size tracking
Despite our best efforts, this was causing chunked responses to be fully
buffered into memory as a side effect.

This is really bad because responses can be VERY large.
5 years ago
Mike Lang 4bbcc8bc06 Revert "Merge pull request #155 from ekimekim/mike/manual-uploads"
This reverts commit 99de586353, reversing
changes made to 4b04f70b6f.

We don't need this feature and it complicates things and adds bugs.
5 years ago
Christopher Usher 845744cbf6 use a UTC timestamp 5 years ago
Christopher Usher 6b51734bbf added the ability to change the filename prefix 5 years ago
Christopher Usher 1325ccf280 added a read only user to database setup script 5 years ago
Mike Lang dc7f093ba0 Disable mp4 option for restreamer cuts
It caused our RSS to explode and I'm not sure why
5 years ago
Mike Lang 59d0fa3e40 sheetsync: Don't mis-parse blank as bad time 5 years ago
Mike Lang 99de586353
Merge pull request #155 from ekimekim/mike/manual-uploads
manual upload
5 years ago
Mike Lang 9ccc7e4e8d thrimshim: Allow manual_link to set things from UPLOAD_PENDING to TRANSCODING 5 years ago
Mike Lang c580671da2 Create upload pending state 5 years ago
Mike Lang 4b04f70b6f overview dashboard: Add system-level metrics 5 years ago
Mike Lang 967ac7b856 segment_coverage: Reduce "no hours" warning to info
This is too noisy at warning level, and comes up for non-main channels.
5 years ago
Mike Lang ab157afe20 sheetsync: Clear event counts before each update
Otherwise, no count of 0 ever gets set, and things are left showing
values when they shouldn't.
5 years ago
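The stale-value problem can be shown with a plain dictionary standing in for the metric. This is a simplified sketch: `update_counts` and the `state` key are hypothetical, and the real sheetsync code exports metrics rather than filling a dict.

```python
from collections import Counter


def update_counts(gauge, rows):
    """Replace the per-state counts in gauge with fresh ones from rows."""
    # Without this clear, a state that no longer has any rows keeps its
    # previous count, because nothing ever writes a 0 for it.
    gauge.clear()
    gauge.update(Counter(row["state"] for row in rows))
```

If one scan sees rows in two states and the next scan sees rows in only one of them, the vanished state disappears from the counts instead of lingering at its old value.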
Mike Lang 47c8ebf11f nginx: SSL server should have same options as non-SSL 5 years ago
Mike Lang b936b9ab1c
Merge pull request #153 from ekimekim/mike/cache-builds
Add ability to explicitly pull and re-use layers from other commits when building
5 years ago
Mike Lang d231078048 Add ability to explicitly pull and re-use layers from other commits when building
This is intended mainly for travis CI, because by default it doesn't cache any layers
between builds.

By pulling likely-reusable builds (all parents of the current commit),
we take a fixed cost slowdown but in many cases should see a dramatic speed increase
overall, since we won't need to re-build anything that hasn't changed.

This isn't needed for local builds, where docker will do this on its own
with any previously-built images.
5 years ago
Mike Lang 0ab15672ae
Merge pull request #152 from ekimekim/mike/nginx/ssl
Add SSL to nginx if certs are given
5 years ago
Mike Lang 64766bcf35 Add SSL to nginx if certs are given 5 years ago
Mike Lang cff5c38691 Add new dashboard 5 years ago
Mike Lang 2efe1d6218 Fix a bad logging line when handling errors 5 years ago
Mike Lang 59ee5cf5c0 Only log at INFO about multiple versions of a segment
Since these tend to happen around stream endings, etc,
we don't want them to be crazy noisy and cause us to disregard real problems.

We can use the segment coverage to see in metrics if there are overlaps.
5 years ago
Mike Lang 4be8faf82e
Merge pull request #151 from ekimekim/mike/sheetsync/track-row-stats
sheetsync: Record counts of rows in the DB, segmented by various columns
5 years ago
Mike Lang 89a9e5554c sheetsync: Record counts of rows in the DB, segmented by various columns
This lets us view a number of useful graphs in dashboards, eg. rows by state,
errored rows, rows by day, rows by category, meltdowns per day, fraction of
events that are poster moments by category.

Sheetsync was the natural place to do this since it was already periodically scanning
the entire events table.
5 years ago