wubloader

Commit Graph

Author	SHA1	Message	Date
Christopher Usher	48d11045d4	Change to backfiller.main to backfill the last 3 hours on start up before doing a full backfill	7 years ago
Christopher Usher	176633bf7d	More messing around with backfill_node to allow finer grained control of order segments are fetched	7 years ago
Christopher Usher	3a7624b107	added a setup file for the backfiller	7 years ago
Christopher Usher	ba499fe835	added more logging to backfiller	7 years ago
Mike Lang	7525b7c135	restreamer: Add basic prometheus stats to all endpoints I had to go to some effort to get nice labelling, which also meant none of the existing libs for this were any good, but this works well enough. Exposes the metrics on /metrics.	7 years ago
Mike Lang	17972b87aa	Allow setting of log level via WUBLOADER_LOG_LEVEL env var By using an env var, it is universal and happens prior to arg parsing, at the same point we do other logging setup.	7 years ago
Mike Lang	c0357680cf	downloader: Use caller's logger inside soft_hard_timeout	7 years ago
Mike Lang	a628676e74	downloader: Log to subloggers instead of the root logger This gives us some context when logging, and is best practice.	7 years ago
Mike Lang	57e665df2e	generate-docker-compose: Clean up the container afterwards I'll never understand why this isn't the default, docker.	7 years ago
Mike Lang	c8cc4a68a0	cutter: Fix bugs that meant things wouldn't actually be cut The calculations were backwards, so instead of cutting a video by, say, 2 seconds, it would cut by -2 seconds, which was clamped to 0. So it would never actually cut, it would always use the closest segment. Also, once we were actually cutting, we hit an issue where ffmpeg would finish and close its input early, because we'd reached the end of the cut video, but not all input had been written yet. This resulted in an EPIPE error (write to closed pipe) in the input feeder. We now ignore that.	7 years ago
Mike Lang	6bf709287a	cutter: Introduce an alternate cutting approach that is much faster This cutter works by only cutting the first and last segments to size, then concatting them with the other segments, so we only ever process a few seconds of video instead of the entire video duration. However, to make this work, care must be taken that the cut segments use the same codecs as the other segments. The reason it's experimental is that we are not yet confident in its ability to cut accurately and without sync issues. We have seen some minor issues when trying to play back the raw output files, but youtube's re-encoding has consistently smoothed out those issues and they seem to be highly player-specific. Vigorous testing is needed. Also note that both methods right now (cat then cut, and cut then cat) only work if all the segments are cattable, that is they all use the same codecs, have the same resolution, etc. If a stream were to change its encoding settings, and we were cutting over that change, both approaches would not work. We should add checks for that scenario (which can only happen over a stream drop), and if so fallback to a slow method using ffmpeg's concat filter, which will work even for disparate codecs, though reconciling mismatched resolutions or frame rates may require further work.	7 years ago
Mike Lang	6815924097	Fix some bugs and linter errors introduced by backfiller I ran `pyflakes` on the repo and found these bugs: ``` ./common/common.py:289: undefined name 'random' ./downloader/downloader/main.py:7: 'random' imported but unused ./backfiller/backfiller/main.py:150: undefined name 'variant' ./backfiller/backfiller/main.py:158: undefined name 'timedelta' ./backfiller/backfiller/main.py:171: undefined name 'sort' ./backfiller/backfiller/main.py:173: undefined name 'sort' ``` (ok, the "imported but unused" one isn't a bug, but the rest are) This fixes those, as well as a further issue I saw with sorting of hours. Iterables are not sortable. As an obvious example, what if your iterable was infinite? As a result, any attempt to sort an iterable that is not already a friendly type like a list or tuple will result in an error. We avoid this by coercing to list, fully realising the iterable and putting it into a form that python will let us sort. It also avoids the nasty side-effect of mutating the list that gets passed into us, which the caller may not expect. Consider this example: ``` >>> my_hours = ["one", "two", "three"] >>> print my_hours ["one", "two", "three"] >>> backfill_node(base_dir, node, stream, variants, hours=my_hours, order='forward') >>> print my_hours ["one", "three", "two"] ``` Also, one of the linter errors was non-trivial to fix - we were trying to get a list of hours (which is an api call for a particular variant), but at a time when we weren't dealing with a single variant. My solution was to get a list of hours for ALL variants, and take the union.	7 years ago
Mike Lang	78a9a4e525	Set up a docker compose file to run all images For ease-of-use, we use a jsonnet file to generate the yaml. Jsonnet is a language for generating JSON documents. In this case it's useful to us because it lets us have comments, references to settings defined at the top, and some basic logic like converting qualities from a list of strings to a comma-separated string. To avoid requiring jsonnet to be installed, we use the official jsonnet docker image in the generate script.	7 years ago
Mike Lang	25185f8f1f	travis.yml: Make script into individual lines Setting -eu fucks up travis's scripts, so instead we should feed it everything command-by-command so it can fail out using its own logic.	7 years ago
Mike Lang	4dc00052f6	Add .travis.yaml to set up CI Nothing fancy, just build the images and push them, and if it's a push to master then also build latest.	7 years ago
Mike Lang	18aadd6b82	restreamer: Also have an endpoint for generating cut videos on demand This is mainly just for testing until we get the database and proper cutter up, but it might prove useful to have in the long run too. This code will probably end up being totally rewritten, as it uses the most naive form of cutting and reencoding, and it has a whole bunch of http-serving specifics intertwined with the cutting logic.	7 years ago
Christopher Usher	b42202434f	Minor Fixes as suggested by ekimekim	7 years ago
Christopher Usher	0b524a72cb	docstrings and a few minor feature additions to the backfiller	7 years ago
Christopher Usher	a59f6e1569	ignore temporary files	7 years ago
Christopher Usher	3b0342b872	added options to limit range of hours backfilled and to randomise hours backfilled	7 years ago
Christopher Usher	fec0975d18	fixed white space and the like	7 years ago
Christopher Usher	afd948576d	Forgot to try to remove temporary file	7 years ago
Christopher Usher	3cdfaad664	moved rename, ensure_directory and jitter to common Move a few useful functions in downloader used in the backfiller to common	7 years ago
Christopher Usher	7d26997b1f	modifications to the backfiller in response to ekimekim's comments	7 years ago
Christopher Usher	ba52bf7f5d	hopefully more robust	7 years ago
Christopher Usher	50bcb84c0c	Moving things around to make the backfiller a bit more like a proper package	7 years ago
Christopher Usher	494725fe34	Getting close to something I can show ekimekim	7 years ago
Christopher Usher	5615c1bdb0	Chipping away at backfiller I'm going to have to learn to write better commit messages	7 years ago
Christopher Usher	2fb17fff59	much closer to being functional	7 years ago
Christopher Usher	05fed36ac8	a few ideas extra	7 years ago
Christopher Usher	0e7ba25b76	start of a rough prototype of the backfiller	7 years ago
Mike Lang	97d77e19d6	restreamer: Add CORS headers to all responses TBH I'm not sure why this is needed (i'm completely clueless about browser stuff), but apparently thrimbletrimmer needs it.	7 years ago
Mike Lang	941b9b017e	build script: Add ability to push to remote repository after building	7 years ago
Mike Lang	afe19ca33e	restreamer: Implement graceful stop on SIGTERM	7 years ago
Mike Lang	7ffa90c7e6	restreamer: Make docker image work, fix missing dependencies setup.py and Dockerfile were both totally out of whack	7 years ago
Mike Lang	1dce14bf77	downloader: Fix and improve the stop mechanism, stop on SIGTERM Allows for graceful shutdown	7 years ago
Mike Lang	7b10429846	downloader: Dockerfile fixes to make it work	7 years ago
Mike Lang	6c3501db6f	downloader: Fix dateutil lib, which is actually called python-dateutil	7 years ago
Mike Lang	7257fb9b73	downloader: Include channel name in path, instead of assuming it's already in base_dir Previously, downloader would put files under BASE_DIR/VARIANT/HOUR/FILE.ts now, it will put files under BASE_DIR/STREAM/VARIANT/HOUR/FILE.ts This brings downloader in line with restreamer's concept of base_dir	7 years ago
Christopher Usher	84097f4bbb	Merge pull request #11 from ekimekim/chrusher/comment Added a comment to highlight recursion	7 years ago
Christopher Usher	efe30c1942	Added a comment to highlight recursion	7 years ago
Christopher Usher	62b184e333	Merge pull request #10 from ekimekim/add-license-1 Licence under MIT	7 years ago
Mike Lang	d0caf79768	Licence under MIT Closes #9	7 years ago
Christopher Usher	9782a3ebd1	Merge pull request #6 from ekimekim/mike/restreamer/improvements restreamer: Multiple improvements and general "finishing"	7 years ago
Mike Lang	b4e627f382	restreamer: When generating playlists, include discontinuities, timestamps and endlist This fills out the incomplete playlist generation functionality to handle holes and communicate extra information. See comments for details.	7 years ago
Mike Lang	201959888a	restreamer: More accurate target duration in playlist	7 years ago
Mike Lang	e34f04cf57	restreamer: Harden generate_media_playlist to handle weird inputs and defaults	7 years ago
Mike Lang	6fa74608fb	common: Improve some docs to note types of things that are ambiguous	7 years ago
Mike Lang	8f5a98a906	restreamer: Don't offer a variant on the master playlist if it's outside requested time range This prevents clients from picking a variant that they then can't play any content for. In general we expect the same content to be available on all variants being captured, but if the set of captured variants changes we still want to handle that gracefully.	7 years ago
Mike Lang	3bbe1ed32d	Prefer longer duration on multiple segments	7 years ago
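
The sorting fix described in commit 6815924097 (coerce the hours to a list before sorting, so that arbitrary iterables work and the caller's list is not mutated) can be sketched roughly as below. This is an illustration only, not the backfiller's actual code: `ordered_hours` is a hypothetical helper, and the `"reverse"` value for `order` is an assumption (the commit message only shows `order='forward'`).

```python
def ordered_hours(hours, order="forward"):
    """Return the hours as a new sorted list, leaving the input untouched.

    Hypothetical helper for illustration; not taken from the wubloader source.
    """
    # list() fully realises any finite iterable (generator, set, ...) and
    # makes a copy, so the in-place sort below cannot affect the caller's list.
    hours = list(hours)
    hours.sort()
    if order == "reverse":  # assumed option name
        hours.reverse()
    return hours


my_hours = ["one", "two", "three"]
print(ordered_hours(my_hours))  # ['one', 'three', 'two']
print(my_hours)                 # ['one', 'two', 'three'] (unchanged)
```

Copying first costs one extra list, which is negligible for a list of hour names, and it removes the surprise shown in the commit message's example.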

1130 Commits (b46d9f30bbec40931f476358b3428582f91ac0c6)