Deleting segments
Rarely, we find ourselves needing to explicitly delete some data, e.g. something that shouldn't have been public and should be removed from all records. It would also be nice to be able to "clean up" bad versions of the same segment, which occasionally come up when downloaders have issues.
With our distributed segment database, this is actually rather difficult, as deleting the data from any one server would cause it to be restored from the others. Previously it was only possible by stopping all backfill, deleting the data on all servers, then starting backfill again. Here we introduce a more practical approach.
An operator creates an empty flag file with the same name as the segment to be deleted, but with a `.tombstone` extension. For example, to delete the file `/segments/desertbus/source/2019-11-13T02/45:51.608000-2.0-full-7IS92rssMzoSBQDIevHStbTNy-URRV3Vw-jzZ6pwOZM.ts`, you would create a tombstone `/segments/desertbus/source/2019-11-13T02/45:51.608000-2.0-full-7IS92rssMzoSBQDIevHStbTNy-URRV3Vw-jzZ6pwOZM.tombstone`.
These tombstone files do two important things:
* They hide the segment from being listed, which means:
  * It can't be restreamed or put into a video.
  * It can't be backfilled to other nodes.
* The tombstone files themselves do get backfilled to other nodes, so you only need to create them on one server.
Once the tombstone has propagated to all nodes, the segment file can be deleted independently on each one. We chose not to have a tombstone automatically trigger a segment deletion, for safety reasons.
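Deriving the tombstone path is just an extension swap. This is a minimal sketch in Python (the language the wubloader components are written in), using a temporary directory and a shortened hash in place of a real `/segments` mount; it is illustrative, not the actual wubloader code:

```python
from pathlib import Path
import tempfile

# Illustrative only: a temp dir stands in for a real /segments mount,
# and "XYZ" stands in for a real segment hash.
base = Path(tempfile.mkdtemp())
seg = base / "desertbus" / "source" / "2019-11-13T02" / "45:51.608000-2.0-full-XYZ.ts"
seg.parent.mkdir(parents=True)
seg.touch()  # stand-in for an existing segment file

# The tombstone is the same path with .tombstone in place of .ts.
tombstone = seg.with_suffix(".tombstone")
tombstone.touch()  # an empty flag file is all that's needed
print(tombstone.name)  # 45:51.608000-2.0-full-XYZ.tombstone
```

Since the tombstone itself is backfilled like any other file, creating it on a single node is enough; the flag then propagates before any actual deletion happens.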
Wubloader is a system for saving, re-serving and cutting into videos of a target twitch stream (probably other HLS streams too, but some twitch specifics are assumed).
It was designed to serve the needs of the Video Strike Team as part of Desert Bus For Hope.
A full design doc can be read at initial-design-doc.pdf, but a brief overview of the components:
* `downloader` grabs segments from twitch and saves them to disk
* `restreamer` serves segments from disk, as well as playlist files allowing them to be streamed
* `backfiller` queries the restreamers of other servers in order to pick up segments this server doesn't have already, i.e. it replicates missing segments
* `cutter` interacts with a database to perform cutting jobs
* `sheetsync` syncs specific database columns to a google doc, which is the primary operator interface
* `thrimshim` acts as an interface between the `thrimbletrimmer` editor and the database
* `thrimbletrimmer` is a browser-based video editor
* `segment_coverage` regularly checks whether there is complete segment coverage for each hour
* `playlist_manager` adds videos to youtube playlists depending on tags
* `database` hosts a Postgres database to store events to be edited
* `nginx` provides a webserver through which the other components are exposed to the outside world
* `common` provides code shared between the other components
* `monitoring` provides dashboards to allow the wubloader to be monitored
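The backfiller's core replication step can be pictured as a set difference between a remote restreamer's segment listing and the local one. The sketch below is illustrative only; the function and variable names are hypothetical, not the actual wubloader API:

```python
# Illustrative sketch of the backfill idea: find segments a remote
# restreamer lists that this node doesn't have, so they can be fetched.
# Names here are hypothetical, not the actual wubloader API.
def missing_segments(local_listing, remote_listing):
    """Return names of segments the remote has but we don't, sorted."""
    return sorted(set(remote_listing) - set(local_listing))

local = {"45:51.608000-2.0-full-abc.ts"}
remote = {"45:51.608000-2.0-full-abc.ts", "45:53.500000-2.0-full-def.ts"}
print(missing_segments(local, remote))  # ['45:53.500000-2.0-full-def.ts']
```

Because every node runs this against every other node's restreamer, a segment downloaded by any one server eventually reaches all of them, which is also why deleting a segment requires the tombstone mechanism described above.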
Usage
All components are built as docker images. Components which access the disk expect a shared directory mounted at `/mnt`.
A docker-compose file is provided to run all components. See `docker-compose.jsonnet` to set configuration options, then generate the compose file with `./generate-docker-compose`. Then run `docker-compose up`.
There is also a kubernetes-based option, but it is less configurable and only supports replication nodes. See k8s.jsonnet for details.
Further details of installing and configuring the backfiller are provided in INSTALL.md.