wubloader

Commit Graph

Author	SHA1	Message	Date
Mike Lang	df73225a5b	Enable backdoor in all services, and add telnet to containers	7 years ago
Mike Lang	f9aa6ef0e4	Add gevent.backdoor as an optional arg to all services Backdoor allows the operator to telnet into the given port, and get a python shell running inside the process, from which you can debug, modify state (eg. set the log level), or whatever. This is extremely useful for debugging weird states that you encounter randomly but can't easily reproduce, without restarting the process and needing to wait until it happens again.	7 years ago
Mike Lang	7f9a1dbe45	downloader: Remove implicit source quality arg This brings it in line with backfiller, is more flexible and less surprising	7 years ago
Mike Lang	89d6b3a6be	docker-compose: Add list of peers to backfill from	7 years ago
Mike Lang	0d627715f3	downloader: Track number of downloaded segments This is the most important metric, we can add more later.	7 years ago
Mike Lang	90ccc6d827	backfiller: Track number of successful backfills Other stats can come later, but this one is important as it tells us if a downloader hasn't been doing its job.	7 years ago
Mike Lang	c59892e148	backfiller: Add ability to set nodes as CLI arg	7 years ago
Mike Lang	bdcb217d20	docker-compose: Expose metrics ports for other services	7 years ago
Mike Lang	b4b315b6bc	Expose prometheus metrics for backfiller and downloader	7 years ago
Mike Lang	d90f01b8ce	common: Create general function for timing things, and use it to time get_best_segments The function is quite customizable and therefore quite complex, but it allows us to easily annotate a function to be timed with labels based on input and output, as well as normalize results based on amount of work done to get a better picture of the actual amount of time taken per unit of work. This will help us monitor for performance issues.	7 years ago
Mike Lang	b0ded641c3	Add a logging handler which counts logs for prometheus stats This isn't as good as having a full centralised logging system, but should suffice to know if anything funny is happening.	7 years ago
Mike Lang	c9d02b3318	restreamer: Prevent prom client blowing up after two different endpoints are hit Prom client doesn't like you creating two stats with the same name, even though they have different labels and this makes perfect sense. I feel like I just need to re-write the prom client at some point - it doesn't actually do all that much except get in your way, apart from the actual text encoding which I can steal. Anyway, in the meantime, we get around this by breaking up metrics into two names, a "foo_all" and a "foo_ENDPOINT". The foo_all lacks the detailed labels, but is still labelled by endpoint and can be used more easily. The foo_ENDPOINT labels have more information but require messier PromQL as you need to match on a name regex if you want to look at more than one specific endpoint.	7 years ago
Mike Lang	30c4bbec1d	restreamer: return the actual response from after_request even if untracked otherwise any untracked endpoints don't work	7 years ago
Christopher Usher	96e6904c85	Added monotonic to restreamer setup.py	7 years ago
Christopher Usher	225288980a	Added the backfiller to docker-compose	7 years ago
Christopher Usher	3fcd374449	Moved encode_strings to common	7 years ago
Christopher Usher	93dd216f89	Fixes and suggestions from ekimekim	7 years ago
Christopher Usher	db1b4e6539	Updated logging to match the other components	7 years ago
Christopher Usher	bae039977b	trying getting the backfiller to actually start	7 years ago
Christopher Usher	1fcd9b5b36	Adding in stuff to hopefully get this to run	7 years ago
Christopher Usher	013ad65c68	added a Dockerfile for the backfiller	7 years ago
Christopher Usher	48d11045d4	Change to backfiller.main to backfill the last 3 hours on start up before doing a full backfill	7 years ago
Christopher Usher	176633bf7d	More messing around with backfill_node to allow finer grained control of order segments are fetched	7 years ago
Christopher Usher	3a7624b107	added a setup file for the backfiller	7 years ago
Christopher Usher	ba499fe835	added more logging to backfiller	7 years ago
Mike Lang	7525b7c135	restreamer: Add basic prometheus stats to all endpoints I had to go to some effort to get nice labelling, which also meant none of the existing libs for this were any good, but this works well enough. Exposes the metrics on /metrics.	7 years ago
Mike Lang	17972b87aa	Allow setting of log level via WUBLOADER_LOG_LEVEL env var By using an env var, it is universal and happens prior to arg parsing, at the same point we do other logging setup.	7 years ago
Mike Lang	c0357680cf	downloader: Use caller's logger inside soft_hard_timeout	7 years ago
Mike Lang	a628676e74	downloader: Log to subloggers instead of the root logger This gives us some context when logging, and is best practice.	7 years ago
Mike Lang	57e665df2e	generate-docker-compose: Clean up the container afterwards I'll never understand why this isn't the default, docker.	7 years ago
Mike Lang	c8cc4a68a0	cutter: Fix bugs that meant things wouldn't actually be cut The calculations were backwards, so instead of cutting a video by, say, 2 seconds, it would cut by -2 seconds, which was clamped to 0. So it would never actually cut, it would always use the closest segment. Also, once we were actually cutting, we hit an issue where ffmpeg would finish and close its input early, because we'd reached the end of the cut video, but not all input had been written yet. This resulted in an EPIPE error (write to closed pipe) in the input feeder. We now ignore that.	7 years ago
Mike Lang	6bf709287a	cutter: Introduce an alternate cutting approach that is much faster This cutter works by only cutting the first and last segments to size, then concatting them with the other segments, so we only ever process a few seconds of video instead of the entire video duration. However, to make this work, care must be taken that the cut segments use the same codecs as the other segments. The reason it's experimental is that we are not yet confident in its ability to cut accurately and without sync issues. We have seen some minor issues when trying to play back the raw output files, but youtube's re-encoding has consistently smoothed out those issues and they seem to be highly player-specific. Vigorous testing is needed. Also note that both methods right now (cat then cut, and cut then cat) only work if all the segments are cattable, that is they all use the same codecs, have the same resolution, etc. If a stream were to change its encoding settings, and we were cutting over that change, both approaches would not work. We should add checks for that scenario (which can only happen over a stream drop), and if so fallback to a slow method using ffmpeg's concat filter, which will work even for disparate codecs, though reconciling mismatched resolutions or frame rates may require further work.	7 years ago
Mike Lang	6815924097	Fix some bugs and linter errors introduced by backfiller I ran `pyflakes` on the repo and found these bugs: ``` ./common/common.py:289: undefined name 'random' ./downloader/downloader/main.py:7: 'random' imported but unused ./backfiller/backfiller/main.py:150: undefined name 'variant' ./backfiller/backfiller/main.py:158: undefined name 'timedelta' ./backfiller/backfiller/main.py:171: undefined name 'sort' ./backfiller/backfiller/main.py:173: undefined name 'sort' ``` (ok, the "imported but unused" one isn't a bug, but the rest are) This fixes those, as well as a further issue I saw with sorting of hours. Iterables are not sortable. As an obvious example, what if your iterable was infinite? As a result, any attempt to sort an iterable that is not already a friendly type like a list or tuple will result in an error. We avoid this by coercing to list, fully realising the iterable and putting it into a form that python will let us sort. It also avoids the nasty side-effect of mutating the list that gets passed into us, which the caller may not expect. Consider this example: ``` >>> my_hours = ["one", "two", "three"] >>> print my_hours ["one", "two", "three"] >>> backfill_node(base_dir, node, stream, variants, hours=my_hours, order='forward') >>> print my_hours ["one", "three", "two"] ``` Also, one of the linter errors was non-trivial to fix - we were trying to get a list of hours (which is an api call for a particular variant), but at a time when we weren't dealing with a single variant. My solution was to get a list of hours for ALL variants, and take the union.	7 years ago
Mike Lang	78a9a4e525	Set up a docker compose file to run all images For ease-of-use, we use a jsonnet file to generate the yaml. Jsonnet is a language for generating JSON documents. In this case it's useful to us because it lets us have comments, references to settings defined at the top, and some basic logic like converting qualities from a list of strings to a comma-separated string. To avoid requiring jsonnet to be installed, we use the official jsonnet docker image in the generate script.	7 years ago
Mike Lang	25185f8f1f	travis.yml: Make script into individual lines Setting -eu fucks up travis's scripts, so instead we should feed it everything command-by-command so it can fail out using its own logic.	7 years ago
Mike Lang	4dc00052f6	Add .travis.yaml to set up CI Nothing fancy, just build the images and push them, and if it's a push to master then also build latest.	7 years ago
Mike Lang	18aadd6b82	restreamer: Also have an endpoint for generating cut videos on demand This is mainly just for testing until we get the database and proper cutter up, but it might prove useful to have in the long run too. This code will probably end up being totally rewritten, as it uses the most naive form of cutting and reencoding, and it has a whole bunch of http-serving specifics intertwined with the cutting logic.	7 years ago
Christopher Usher	b42202434f	Minor fixes as suggested by ekimekim	7 years ago
Christopher Usher	0b524a72cb	docstrings and a few minor feature additions to the backfiller	7 years ago
Christopher Usher	a59f6e1569	ignore temporary files	7 years ago
Christopher Usher	3b0342b872	added options to limit range of hours backfilled and to randomise hours backfilled	7 years ago
Christopher Usher	fec0975d18	fixed white space and the like	7 years ago
Christopher Usher	afd948576d	Forgot to try to remove temporary file	7 years ago
Christopher Usher	3cdfaad664	moved rename, ensure_directory and jitter to common Move a few useful functions in downloader used in the backfiller to common	7 years ago
Christopher Usher	7d26997b1f	modifications to the backfiller in response to ekimekim's comments	7 years ago
Christopher Usher	ba52bf7f5d	hopefully more robust	7 years ago
Christopher Usher	50bcb84c0c	Moving things around to make the backfiller a bit more like a proper package	7 years ago
Christopher Usher	494725fe34	Getting close to something I can show ekimekim	7 years ago
Christopher Usher	5615c1bdb0	Chipping away at backfiller I'm going to have to learn to write better commit messages	7 years ago
Christopher Usher	2fb17fff59	much closer to being functional	7 years ago
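
Commit 6815924097 describes coercing the hours iterable to a list before sorting, so that any finite iterable can be sorted and the caller's list is not mutated. The following is a minimal sketch of that idea, not the repository's code: `ordered_hours` is a hypothetical helper name, and the `"reverse"` order value is an assumption (the message only shows `order='forward'`).

```python
def ordered_hours(hours, order="forward"):
    """Return a new sorted list of hours, leaving the caller's iterable untouched."""
    # Realise any finite iterable (generator, set, tuple, ...) as a fresh list.
    hours = list(hours)
    if order == "forward":
        hours.sort()
    elif order == "reverse":  # assumed option, not confirmed by the commit message
        hours.sort(reverse=True)
    else:
        raise ValueError("unknown order: {!r}".format(order))
    return hours


my_hours = ["one", "two", "three"]
result = ordered_hours(my_hours)
# result is a sorted copy; my_hours keeps its original order.
```

Because the sort happens on the copy, passing a generator works as well, and the mutation shown in the commit message's `my_hours` example no longer occurs.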

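Commit b0ded641c3 adds a logging handler that counts log records for prometheus stats. The sketch below illustrates the general technique using only the standard library: it swaps the prometheus counter for a `collections.Counter`, and the class name `CountingHandler`, the logger name, and the choice of labels (logger name and level) are all assumptions rather than the repository's actual implementation.

```python
import logging
from collections import Counter


class CountingHandler(logging.Handler):
    """Counts records by (logger name, level name) instead of writing them out."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def emit(self, record):
        # Called once per record that passes the logger's and handler's level checks.
        self.counts[(record.name, record.levelname)] += 1


handler = CountingHandler()
logger = logging.getLogger("example.downloader")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the example from printing via the root logger

logger.info("segment downloaded")
logger.warning("retrying")
logger.warning("retrying again")
logger.debug("below the logger's level, so never reaches the handler")
```

A sudden rise in the WARNING or ERROR counts for one logger is the kind of signal the commit message has in mind when it says this should suffice to know if anything funny is happening.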

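Commit d90f01b8ce describes a general timing function that labels each measurement based on a function's input and output and normalizes the elapsed time by the amount of work done. The sketch below shows one way such a decorator could look. It records into a plain dictionary rather than prometheus metrics, and every name in it (`timed`, `timings`, `get_segments`, the `labels` and `normalize` parameters) is hypothetical and not taken from the repository.

```python
import functools
import time
from collections import defaultdict

# Maps a label tuple to [number of calls, total normalized seconds].
timings = defaultdict(lambda: [0, 0.0])


def timed(labels=lambda args, kwargs, result: (), normalize=None):
    """Time a function, labelling by input/output and dividing by units of work."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            result = fn(*args, **kwargs)
            elapsed = time.monotonic() - start
            # Units of work done, e.g. the number of items returned.
            units = normalize(result) if normalize else 1
            if units:  # avoid dividing by zero when no work was done
                elapsed /= units
            entry = timings[(fn.__name__,) + tuple(labels(args, kwargs, result))]
            entry[0] += 1
            entry[1] += elapsed
            return result
        return wrapper
    return decorator


@timed(labels=lambda args, kwargs, result: (args[0],), normalize=len)
def get_segments(quality, count):
    # Stand-in for real work; returns `count` fake segment names.
    return ["segment-%d" % i for i in range(count)]


segments = get_segments("source", 4)
```

Here the measurement is keyed by function name and the first argument, and the elapsed time is divided by the number of segments returned, giving a per-segment cost rather than a per-call cost.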
101 Commits (mike/backdoor)