Commit Graph

1130 Commits (b46d9f30bbec40931f476358b3428582f91ac0c6)
 

Author SHA1 Message Date
Christopher Usher 48d11045d4 Change to backfiller.main to backfill the last 3 hours on start up before doing a full backfill 6 years ago
Christopher Usher 176633bf7d More messing around with backfill_node to allow finer grained control of order segments are fetched 6 years ago
Christopher Usher 3a7624b107 added a setup file for the backfiller 6 years ago
Christopher Usher ba499fe835 added more logging to backfiller 6 years ago
Mike Lang 7525b7c135 restreamer: Add basic prometheus stats to all endpoints
I had to go to some effort to get nice labelling,
which also meant none of the existing libs for this were any good,
but this works well enough.

Exposes the metrics on /metrics.
6 years ago
Mike Lang 17972b87aa Allow setting of log level via WUBLOADER_LOG_LEVEL env var
By using an env var, it is universal and happens prior to arg parsing,
at the same point we do other logging setup.
6 years ago
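A minimal sketch of the idea in this commit, not the project's actual code: read the level from `WUBLOADER_LOG_LEVEL` during logging setup, before any argument parsing. The function name `configure_logging` and the `INFO` fallback are assumptions; the commit names only the env var.

```python
import logging
import os


def configure_logging(environ=os.environ):
    """Set the root log level from WUBLOADER_LOG_LEVEL, before arg parsing."""
    # The "INFO" fallback is an assumption; the commit does not state a default.
    level_name = environ.get("WUBLOADER_LOG_LEVEL", "INFO").upper()
    # basicConfig accepts a level name such as "DEBUG" as well as a number.
    logging.basicConfig(level=level_name)
    return level_name
```

Calling this once at process start means every component picks up the same level without needing its own command-line flag.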
Mike Lang c0357680cf downloader: Use caller's logger inside soft_hard_timeout 6 years ago
Mike Lang a628676e74 downloader: Log to subloggers instead of the root logger
This gives us some context when logging, and is best practice.
6 years ago
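As an illustration of logging to subloggers, a component can derive its logger from its owner's logger with `Logger.getChild`. The `Worker` class and the logger names below are hypothetical, not taken from the downloader.

```python
import logging


class Worker:
    """Hypothetical component that logs through a child of its owner's logger."""

    def __init__(self, parent_logger, name):
        # getChild("source") on logger "downloader" yields "downloader.source".
        self.logger = parent_logger.getChild(name)

    def run(self):
        self.logger.info("starting")


root = logging.getLogger("downloader")
worker = Worker(root, "source")
```

Each record then carries the dotted logger name, which supplies the context the commit message mentions.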
Mike Lang 57e665df2e generate-docker-compose: Clean up the container afterwards
I'll never understand why this isn't the default, docker.
6 years ago
Mike Lang c8cc4a68a0 cutter: Fix bugs that meant things wouldn't actually be cut
The calculations were backwards, so instead of cutting a video by, say, 2 seconds,
it would cut by -2 seconds, which was clamped to 0. So it would never actually cut,
it would always use the closest segment.

Also, once we were actually cutting, we hit an issue where ffmpeg would finish and close
its input early, because we'd reached the end of the cut video, but not all input had been written yet.
This resulted in an EPIPE error (write to closed pipe) in the input feeder. We now ignore that.
6 years ago
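The second fix can be sketched roughly as follows, assuming the input feeder writes chunks to a pipe-like object such as a subprocess's stdin. `feed_input` is a hypothetical name, and this is only one way to ignore EPIPE.

```python
import errno


def feed_input(chunks, pipe):
    """Write each chunk to pipe; stop quietly if the reader closed its end."""
    try:
        for chunk in chunks:
            pipe.write(chunk)
        pipe.flush()
    except OSError as e:
        # EPIPE means the reader (for example ffmpeg) exited before consuming
        # everything, which is expected once the end of the cut is reached.
        if e.errno != errno.EPIPE:
            raise
    finally:
        try:
            pipe.close()
        except OSError:
            pass  # close may flush buffered data and hit the same closed pipe
```

Any other write error is still raised, so genuine failures are not hidden.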
Mike Lang 6bf709287a cutter: Introduce an alternate cutting approach that is much faster
This cutter works by only cutting the first and last segments to size,
then concatting them with the other segments, so we only ever process a few seconds
of video instead of the entire video duration.
However, to make this work, care must be taken that the cut segments use the same codecs
as the other segments.

The reason it's experimental is that we are not yet confident in its ability
to cut accurately and without sync issues. We have seen some minor issues when trying to play
back the raw output files, but youtube's re-encoding has consistently smoothed out those issues
and they seem to be highly player-specific. Vigorous testing is needed.

Also note that both methods right now (cat then cut, and cut then cat) only work if all the segments
are cattable, that is they all use the same codecs, have the same resolution, etc.
If a stream were to change its encoding settings, and we were cutting over that change,
both approaches would not work. We should add checks for that scenario (which can only happen
over a stream drop), and if so fall back to a slow method using ffmpeg's concat filter,
which will work even for disparate codecs, though reconciling mismatched resolutions or frame rates
may require further work.
6 years ago
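The "cut then cat" approach described above could be planned along these lines. This is a sketch under assumptions: the `Segment` tuple, the function name, and the use of plain seconds are all hypothetical, and the actual ffmpeg invocation is omitted.

```python
from collections import namedtuple

# Hypothetical segment record: file path, start time and duration in seconds.
Segment = namedtuple("Segment", ["path", "start", "duration"])


def plan_fast_cut(segments, start, end):
    """Return (path, cut_start, cut_end) per segment; None means "leave as is".

    Only the first and last segments can get a cut point. Everything in
    between is concatenated untouched, so very little video is processed.
    """
    plan = []
    last = len(segments) - 1
    for i, seg in enumerate(segments):
        seg_end = seg.start + seg.duration
        # Offsets are relative to the segment's own start (start - seg.start,
        # not the reverse, which would go negative and be clamped to zero).
        cut_start = start - seg.start if i == 0 and start > seg.start else None
        cut_end = end - seg.start if i == last and end < seg_end else None
        plan.append((seg.path, cut_start, cut_end))
    return plan
```

For three 2-second segments and a cut from 0.5s to 5.0s, only the first and last entries in the plan carry a cut point.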
Mike Lang 6815924097 Fix some bugs and linter errors introduced by backfiller
I ran `pyflakes` on the repo and found these bugs:

```
./common/common.py:289: undefined name 'random'
./downloader/downloader/main.py:7: 'random' imported but unused
./backfiller/backfiller/main.py:150: undefined name 'variant'
./backfiller/backfiller/main.py:158: undefined name 'timedelta'
./backfiller/backfiller/main.py:171: undefined name 'sort'
./backfiller/backfiller/main.py:173: undefined name 'sort'
```
(ok, the "imported but unused" one isn't a bug, but the rest are)

This fixes those, as well as a further issue I saw with sorting of hours.

Iterables are not sortable. As an obvious example, what if your iterable was infinite?
As a result, any attempt to sort an iterable that is not already a friendly type like a list
or tuple will result in an error. We avoid this by coercing to list, fully realising the iterable
and putting it into a form that python will let us sort. It also avoids the nasty side-effect
of mutating the list that gets passed into us, which the caller may not expect. Consider this example:

```
>>> my_hours = ["one", "two", "three"]
>>> print my_hours
["one", "two", "three"]
>>> backfill_node(base_dir, node, stream, variants, hours=my_hours, order='forward')
>>> print my_hours
["one", "three", "two"]
```

Also, one of the linter errors was non-trivial to fix - we were trying to get a list of hours
(which is an api call for a particular variant), but at a time when we weren't dealing with a single
variant. My solution was to get a list of hours for ALL variants, and take the union.
6 years ago
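The two fixes described in this message (coercing to a list before sorting, and taking the union of hours across variants) can be sketched as below. The function names, the `"reverse"` order value, and the `list_hours` callable are assumptions made for illustration; only `order='forward'` appears in the commit.

```python
def ordered_hours(hours, order="forward"):
    """Return a new sorted list of hours without touching the caller's object."""
    hours = list(hours)  # realises any finite iterable and copies a list
    hours.sort(reverse=(order == "reverse"))
    return hours


def hours_for_all_variants(list_hours, variants):
    """Union of the hours available for each variant.

    list_hours is a hypothetical callable standing in for the per-variant
    API call mentioned in the commit message.
    """
    all_hours = set()
    for variant in variants:
        all_hours.update(list_hours(variant))
    return all_hours
```

Because the sort happens on a copy, the list passed in by the caller keeps its original order.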
Mike Lang 78a9a4e525 Set up a docker compose file to run all images
For ease-of-use, we use a jsonnet file to generate the yaml.
Jsonnet is a language for generating JSON documents.

In this case it's useful to us because it lets us have comments,
references to settings defined at the top, and some basic logic
like converting qualities from a list of strings to a comma-separated string.

To avoid requiring jsonnet to be installed, we use the official jsonnet docker image
in the generate script.
6 years ago
Mike Lang 25185f8f1f travis.yml: Make script into individual lines
Setting -eu fucks up travis's scripts, so instead we should feed it everything
command-by-command so it can fail out using its own logic.
6 years ago
Mike Lang 4dc00052f6 Add .travis.yaml to set up CI
Nothing fancy, just build the images and push them,
and if it's a push to master then also build latest.
6 years ago
Mike Lang 18aadd6b82 restreamer: Also have an endpoint for generating cut videos on demand
This is mainly just for testing until we get the database and proper cutter up,
but it might prove useful to have in the long run too.

This code will probably end up being totally rewritten,
as it uses the most naive form of cutting and reencoding,
and it has a whole bunch of http-serving specifics intertwined with the cutting logic.
6 years ago
Christopher Usher b42202434f Minor Fixes as suggested by ekimekim 6 years ago
Christopher Usher 0b524a72cb docstrings and a few minor feature additions to the backfiller 6 years ago
Christopher Usher a59f6e1569 ignore temporary files 6 years ago
Christopher Usher 3b0342b872 added options to limit range of hours backfilled and to randomise hours backfilled 6 years ago
Christopher Usher fec0975d18 fixed white space and the like 6 years ago
Christopher Usher afd948576d Forgot to try to remove temporary file 6 years ago
Christopher Usher 3cdfaad664 moved rename, ensure_directory and jitter to common
Move a few useful functions in downloader used in the backfiller to common
6 years ago
Christopher Usher 7d26997b1f modifications to the backfiller in response to ekimekim's comments 6 years ago
Christopher Usher ba52bf7f5d hopefully more robust 6 years ago
Christopher Usher 50bcb84c0c Moving things around to make the backfiller a bit more like a proper package 6 years ago
Christopher Usher 494725fe34 Getting close to something I can show ekimekim 6 years ago
Christopher Usher 5615c1bdb0 Chipping away at backfiller
I'm going to have to learn to write better commit messages
6 years ago
Christopher Usher 2fb17fff59 much closer to being functional 6 years ago
Christopher Usher 05fed36ac8 a few ideas extra 6 years ago
Christopher Usher 0e7ba25b76 start of a rough prototype of the backfiller 6 years ago
Mike Lang 97d77e19d6 restreamer: Add CORS headers to all responses
TBH I'm not sure why this is needed (i'm completely clueless about browser stuff),
but apparently thrimbletrimmer needs it.
6 years ago
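One generic way to add a CORS header to every response is a small WSGI wrapper, sketched here with the standard library only. This is not necessarily how restreamer does it; the wrapper name and the `*` origin value are assumptions.

```python
def add_cors_headers(app):
    """Wrap a WSGI app so every response carries a permissive CORS header."""
    def wrapped(environ, start_response):
        def start_with_cors(status, headers, exc_info=None):
            # "*" allows any origin; a stricter value may be more appropriate.
            headers = list(headers) + [("Access-Control-Allow-Origin", "*")]
            return start_response(status, headers, exc_info)
        return app(environ, start_with_cors)
    return wrapped
```

Browsers refuse to let a page read cross-origin responses that lack this header, which is likely why a browser-based editor served from a different origin needs it.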
Mike Lang 941b9b017e build script: Add ability to push to remote repository after building 6 years ago
Mike Lang afe19ca33e restreamer: Implement graceful stop on SIGTERM 6 years ago
Mike Lang 7ffa90c7e6 restreamer: Make docker image work, fix missing dependencies
setup.py and Dockerfile were both totally out of whack
6 years ago
Mike Lang 1dce14bf77 downloader: Fix and improve the stop mechanism, stop on SIGTERM
Allows for graceful shutdown
6 years ago
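A graceful stop on SIGTERM generally follows the pattern sketched below: the handler only sets a flag, and the main loop notices it and returns. The sketch uses the standard library's `threading.Event`; the project's own concurrency primitives may differ, and `run` and `work_once` are hypothetical names.

```python
import signal
import threading


def make_stop_handler(stop_event):
    """Build a signal handler that only requests a stop."""
    def handler(signum, frame):
        stop_event.set()
    return handler


def run(work_once, stop_event=None):
    """Call work_once() repeatedly until SIGTERM arrives, then return cleanly."""
    if stop_event is None:
        stop_event = threading.Event()
    # signal.signal may only be called from the main thread.
    signal.signal(signal.SIGTERM, make_stop_handler(stop_event))
    while not stop_event.is_set():
        work_once()
        stop_event.wait(1)  # pause that ends early once a stop is requested
```

Keeping the handler trivial lets in-progress work finish instead of being interrupted at an arbitrary point.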
Mike Lang 7b10429846 downloader: Dockerfile fixes to make it work 6 years ago
Mike Lang 6c3501db6f downloader: Fix dateutil lib, which is actually called python-dateutil 6 years ago
Mike Lang 7257fb9b73 downloader: Include channel name in path, instead of assuming it's already in base_dir
Previously, downloader would put files under BASE_DIR/VARIANT/HOUR/FILE.ts
now, it will put files under BASE_DIR/STREAM/VARIANT/HOUR/FILE.ts

This brings downloader in line with restreamer's concept of base_dir
6 years ago
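The new layout amounts to one extra path component. A sketch, with a hypothetical helper name and made-up example values:

```python
import os


def segment_path(base_dir, stream, variant, hour, filename):
    """Build BASE_DIR/STREAM/VARIANT/HOUR/FILE.ts as described in the commit."""
    return os.path.join(base_dir, stream, variant, hour, filename)
```

The hour is treated here as an opaque directory name; its format is not specified in the commit message.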
Christopher Usher 84097f4bbb
Merge pull request #11 from ekimekim/chrusher/comment
Added a comment to highlight recursion
6 years ago
Christopher Usher efe30c1942 Added a comment to highlight recursion 6 years ago
Christopher Usher 62b184e333
Merge pull request #10 from ekimekim/add-license-1
Licence under MIT
6 years ago
Mike Lang d0caf79768
Licence under MIT
Closes #9
6 years ago
Christopher Usher 9782a3ebd1
Merge pull request #6 from ekimekim/mike/restreamer/improvements
restreamer: Multiple improvements and general "finishing"
6 years ago
Mike Lang b4e627f382 restreamer: When generating playlists, include discontinuities, timestamps and endlist
This fills out the incomplete playlist generation functionality to handle holes
and communicate extra information. See comments for details.
6 years ago
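To illustrate what "discontinuities, timestamps and endlist" mean in an HLS media playlist, here is a rough sketch. It is not the restreamer's implementation: the function name, the `(start, duration, uri)` tuple shape, and the choice to emit a timestamp only at the start and after each hole are assumptions. The tags themselves are standard HLS.

```python
import datetime
import math


def media_playlist(segments, finished=True):
    """Render (start, duration_seconds, uri) tuples, sorted by start time."""
    longest = max([duration for _, duration, _ in segments] or [0])
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",  # version 3 permits fractional EXTINF durations
        "#EXT-X-TARGETDURATION:{}".format(int(math.ceil(longest))),
    ]
    prev_end = None
    for start, duration, uri in segments:
        if prev_end is None or start != prev_end:
            if prev_end is not None:
                # A hole between segments: tell the player not to expect continuity.
                lines.append("#EXT-X-DISCONTINUITY")
            lines.append(
                "#EXT-X-PROGRAM-DATE-TIME:" + start.strftime("%Y-%m-%dT%H:%M:%S.%fZ")
            )
        lines.append("#EXTINF:{:.3f},".format(duration))
        lines.append(uri)
        prev_end = start + datetime.timedelta(seconds=duration)
    if finished:
        # No more segments will be added, so players can treat this as complete.
        lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"
```

The timestamps are assumed to be naive UTC datetimes, hence the literal `Z` suffix.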
Mike Lang 201959888a restreamer: More accurate target duration in playlist 6 years ago
Mike Lang e34f04cf57 restreamer: Harden generate_media_playlist to handle weird inputs and defaults 6 years ago
Mike Lang 6fa74608fb common: Improve some docs to note types of things that are ambiguous 6 years ago
Mike Lang 8f5a98a906 restreamer: Don't offer a variant on the master playlist if it's outside requested time range
This prevents clients from picking a variant that they then can't play any content for.
In general we expect the same content to be available on all variants being captured,
but if the set of captured variants changes we still want to handle that gracefully.
6 years ago
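The filtering described here might look roughly like the following sketch. The function name and the data shape (a mapping from variant name to hour strings that sort chronologically) are assumptions for illustration only.

```python
def variants_in_range(hours_by_variant, start_hour, end_hour):
    """Return variants with at least one captured hour in [start_hour, end_hour].

    hours_by_variant maps a variant name to an iterable of hour strings such
    as "2019-01-01T05"; this shape is an assumption, chosen because such
    strings compare in chronological order.
    """
    return sorted(
        variant
        for variant, hours in hours_by_variant.items()
        if any(start_hour <= hour <= end_hour for hour in hours)
    )
```

Only the variants returned would then be listed on the master playlist, so a client cannot pick one with nothing to play.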
Mike Lang 3bbe1ed32d Prefer longer duration on multiple segments 6 years ago