40 Commits (2939089edd07b735f59234db13a16cddc491610e)

Author SHA1 Message Date
Mike Lang 1add3c5c22 Implement tombstoning to allow for segment deletion
Rarely, we find ourselves needing to explicitly delete some data, e.g. something that shouldn't
have been public and should be removed from all records.

It would also be nice if we could "clean up" bad versions of the same segment,
which occasionally come up when downloaders have issues.

With our distributed segment database, this is actually rather difficult as deleting the data
from any one server would cause it to be restored from the others. It was only possible
by stopping all backfill, deleting the data on all servers, then starting backfill again.

Here we introduce a more practical approach. An operator creates an empty flag file
with the same name as the segment to be deleted, but with a `.tombstone` extension.
For example, to delete a file `/segments/desertbus/source/2019-11-13T02/45:51.608000-2.0-full-7IS92rssMzoSBQDIevHStbTNy-URRV3Vw-jzZ6pwOZM.ts`,
you would create a tombstone `/segments/desertbus/source/2019-11-13T02/45:51.608000-2.0-full-7IS92rssMzoSBQDIevHStbTNy-URRV3Vw-jzZ6pwOZM.tombstone`.

These tombstone files do two important things:
* They hide the segment from being listed, which means both:
  * It can't be restreamed or put into a video
  * It can't be backfilled to other nodes
* The tombstone files themselves do get backfilled to other nodes, so you only need to mark them on one server.

Once the tombstone has propagated to all nodes, the segment file can be deleted independently on each one.

We chose not to have a tombstone automatically trigger a segment deletion for safety reasons.
2 years ago
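
The listing behaviour described in 1add3c5c22 can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the function name `list_visible_segments` and the assumption that all segments end in `.ts` are hypothetical. It only shows the idea that a segment is hidden whenever a `.tombstone` file with the same base name sits next to it.

```python
import os


def list_visible_segments(hour_dir):
    """Return the segment filenames in hour_dir that have no matching tombstone.

    Hypothetical helper for illustration only; assumes segments end in ".ts"
    and tombstones share the segment's name with a ".tombstone" extension.
    """
    names = os.listdir(hour_dir)
    # Base names (without extension) of every tombstone flag file.
    tombstoned = {
        os.path.splitext(name)[0]
        for name in names
        if name.endswith(".tombstone")
    }
    # Keep only segments whose base name has not been tombstoned.
    return sorted(
        name
        for name in names
        if name.endswith(".ts") and os.path.splitext(name)[0] not in tombstoned
    )
```

Because anything that lists segments would go through a filter like this, a tombstoned segment would be neither served nor offered for backfill, while the tombstone file itself remains an ordinary file that can be copied to other nodes.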
Mike Lang a47c29fff4 Link images to github repo by adding a LABEL
When pushed, this tells github to associate the ghcr.io repo that was pushed to
with the github repo specified (the owner needs to match).

This does a few things.
Most importantly, this automatically gives github actions credentials to push to these
repositories when run in the context of the wubloader repo.
3 years ago
Mike Lang 62bd6539ea Unpin gevent as that was a workaround for a py2 issue 3 years ago
Christopher Usher 6c97bd462e fixed integer division issues introduced by port to Python 3 3 years ago
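
As a reminder of the kind of issue 6c97bd462e refers to: in Python 3 the `/` operator on two integers returns a float, whereas `//` performs floor division and keeps the result an integer. The sketch below uses a hypothetical `bucket_index` helper (not taken from the repository) to show the difference.

```python
def bucket_index(offset_seconds, bucket_seconds=60):
    """Return which fixed-size bucket an offset falls into (hypothetical example)."""
    # `//` keeps the result an int on both Python 2 and Python 3.
    # In Python 3, `offset_seconds / bucket_seconds` would return a float,
    # which breaks uses such as list indexing or range().
    return offset_seconds // bucket_seconds
```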
Mike Lang 21856c68aa Fix all instances of file.write() for py3
In python 3, file.write() may do a partial write and returns the number of characters written.
In order to not lose data, we need to wrap every instance of file.write() with our new
common.writeall() wrapper that loops until the data is actually written.
3 years ago
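
A wrapper of the kind 21856c68aa describes might look roughly like the sketch below. The signature (passing the `write` callable and the data) and the handling of `None` and `0` return values are assumptions made for illustration; the commit message only states that the wrapper loops until the data is actually written.

```python
def writeall(write, data):
    """Call write() repeatedly until all of data has been written.

    Illustrative sketch only; the signature is an assumption.
    Works for bytes or str, as long as data matches what write() expects.
    """
    while data:
        n = write(data)
        if n is None:
            # This write function does not report a count;
            # assume it wrote everything.
            break
        if n == 0:
            # Avoid looping forever if no progress is being made.
            raise IOError("write() made no progress")
        # Drop what was written and retry with the remainder.
        data = data[n:]
```

A call site would then change from `f.write(data)` to `writeall(f.write, data)`.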
Mike Lang a56f6859bb more py3 fixes 3 years ago
Mike Lang f2a8007bf7 Fix build dependency issues 3 years ago
Mike Lang 19f70b1d06 py3 fixes for segment_coverage 3 years ago
HubbeKing 6d790a1b36 Do a first naive pass for py3 compatibility
Check that open() calls for reading and writing use binary modes
Use alpine version with py3-pip package
Use python3 in Dockerfile CMD
Remove sys.setdefaultencoding() "hack"
Simplify ensure_directory() in common.common package
3 years ago
Mike Lang f0546e2ee3 Pin gevent to 1.5a2 to avoid https://github.com/gevent/gevent/issues/1711 3 years ago
Mike Lang 3a19ba744d segment_coverage: Raise default check interval to 5min 4 years ago
Mike Lang 5235c3281a segment_coverage: Allow setting of check interval via cli flag 4 years ago
Mike Lang 15c357509f segment_coverage: Fix a problem where metrics would fail
Because the checking process is entirely CPU-bound, it does not give any other
greenlets a chance to run while it is processing. This prevents us from responding
to metrics queries, and prometheus then times out.

By pausing to handle all other traffic after each hour is processed, we ensure metrics
remain responsive while processing.
4 years ago
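
The fix in 15c357509f amounts to yielding control between units of CPU-bound work. The sketch below shows that pattern in a library-independent way; `check_all_hours`, `check_hour` and `pause` are hypothetical names. Under gevent, the `pause` callable would plausibly be something like `gevent.sleep(0)` or `gevent.idle()`, though which one the actual commit uses is not stated here.

```python
import time


def check_all_hours(hours, check_hour, pause=lambda: time.sleep(0)):
    """Run check_hour on each hour, yielding control after each one.

    Illustrative sketch: `pause` stands in for a cooperative yield such as
    gevent.sleep(0), so other greenlets (e.g. the metrics handler) can run.
    """
    results = []
    for hour in hours:
        # CPU-bound work: never yields to other greenlets on its own.
        results.append(check_hour(hour))
        # Explicitly give other pending work a chance to run.
        pause()
    return results
```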
HubbeKing 86f7823348 Replace calls to gevent.signal() with gevent.signal_handler()
gevent.signal() was removed in gevent 1.5a4, see http://www.gevent.org/api/gevent.signal.html
Removed on Feb 5th, see https://github.com/gevent/gevent/pull/1530
4 years ago
Mike Lang a53786dc2d Add file and make as build dependencies
gevent now requires these to build. I'm not sure when this changed.
4 years ago
Christopher Usher d56801014b added support for suspect segments to segment_coverage 5 years ago
Christopher Usher 9c77dd1f40 added the ability to generate a webpage with all coverage maps 5 years ago
Mike Lang 967ac7b856 segment_coverage: Reduce "no hours" warning to info
This is too noisy at warning level, and comes up for non-main channels.
5 years ago
Mike Lang 6724027e5a segment_coverage: Reduce missing channel/quality to warning 5 years ago
Mike Lang 731ef9e2d0 Refactor dockerfiles for more shared layers
By carefully ensuring most of our dockerfiles are identical in their first few layers,
we only need to build those layers once instead of every time.

In particular, we move installing gevent to before installing common,
so that even when common changes gevent doesn't need to be reinstalled.

This is important because gevent takes ages to install.

Also fixes segment_coverage, which wasn't being installed.
5 years ago
Christopher Usher 28ef77b5a7 error handling changes as suggested by ekim 5 years ago
Christopher Usher 34e8d0a64b Handle the case where an hour directory disappears between listing the
hours and trying to list the segments in that hour. This could happen if the
backfiller is deleting old hours.
5 years ago
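
The race described in 34e8d0a64b can be handled by treating a missing directory as an empty hour. The following is a sketch in Python 3 terms, with a hypothetical function name, and is not the code from that commit (which predates the Python 3 port).

```python
import os


def list_segments_if_present(hour_path):
    """List an hour directory, returning [] if it vanished in the meantime."""
    try:
        return sorted(os.listdir(hour_path))
    except FileNotFoundError:
        # The directory was removed after the hours were listed
        # (e.g. old hours being deleted); treat it as having no segments.
        return []
```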
Christopher Usher 3207dd6878 Fixed temporary segments crashing segment coverage 5 years ago
Christopher Usher 49ccb6df86 warns when a directory does not exist or when there are no hours to make
a map from
5 years ago
Christopher Usher 3130e770c8 made while loop more pythonic 5 years ago
Christopher Usher a1880b2414 fixes based on ekim's suggestions 5 years ago
Christopher Usher bea876e0cc removed obsolete code 5 years ago
Christopher Usher 43e19c3c56 removed unneeded package 5 years ago
Christopher Usher 44390173ed comments, code style and better handling of empty hours 5 years ago
Christopher Usher 003261eae4 Prometheus gauges and new style coverage plots 5 years ago
Christopher Usher 46b7c7a3b6 new plotting 5 years ago
Christopher Usher 9711dbab0e changing what I mean by overlap 5 years ago
Christopher Usher 8e79ac772a started on the gauges 5 years ago
Christopher Usher 92a4cf0d7b bit of a clean up 5 years ago
Christopher Usher 20a8a214d6 working! 5 years ago
Christopher Usher ac72f775c9 functional 5 years ago
Christopher Usher 722cbd20fa first pass at checking for holes and repeats 5 years ago
Christopher Usher 66f5a06a5c basic segment counting working 5 years ago
Christopher Usher 3618510f35 basic functionality 5 years ago
Christopher Usher 929308f3e7 started on the segment_coverage service 5 years ago