Commit Graph

171 Commits (fdd245a6d934283d8c1d18394d6dc1a21af3d290)
 

Author SHA1 Message Date
Mike Lang c8cc4a68a0 cutter: Fix bugs that meant things wouldn't actually be cut
The calculations were backwards, so instead of cutting a video by, say, 2 seconds,
it would cut by -2 seconds, which was clamped to 0. So it would never actually cut,
it would always use the closest segment.

Also, once we were actually cutting, we hit an issue where ffmpeg would finish and close
its input early, because we'd reached the end of the cut video, but not all input had been written yet.
This resulted in an EPIPE error (write to closed pipe) in the input feeder. We now ignore that.
6 years ago
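The EPIPE handling described in c8cc4a68a0 can be sketched roughly as follows. This is a minimal Python 3 illustration for POSIX systems, not the cutter's actual code: `feed_input` and `run_with_input` are hypothetical names, and the real feeder may be structured differently.

```python
import errno
import subprocess
import sys


def feed_input(proc, chunks):
    """Write chunks to proc.stdin, tolerating the child closing its input early.

    Returns True if everything was written, False if the child stopped reading.
    """
    try:
        for chunk in chunks:
            proc.stdin.write(chunk)
        proc.stdin.close()
    except OSError as e:
        # EPIPE means the reader (e.g. ffmpeg) has already finished and closed
        # its end of the pipe. Anything else is a real error.
        if e.errno != errno.EPIPE:
            raise
        return False
    return True


def run_with_input(cmd, chunks):
    """Hypothetical helper: spawn cmd, feed it chunks, and wait for it to exit."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    fully_written = feed_input(proc, chunks)
    return fully_written, proc.wait()
```

A child that exits without reading its input then no longer crashes the feeder; the caller simply learns that not all input was consumed.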
Mike Lang 6bf709287a cutter: Introduce an alternate cutting approach that is much faster
This cutter works by only cutting the first and last segments to size,
then concatting them with the other segments, so we only ever process a few seconds
of video instead of the entire video duration.
However, to make this work, care must be taken that the cut segments use the same codecs
as the other segments.

The reason it's experimental is that we are not yet confident in its ability
to cut accurately and without sync issues. We have seen some minor issues when trying to play
back the raw output files, but youtube's re-encoding has consistently smoothed out those issues
and they seem to be highly player-specific. Vigorous testing is needed.

Also note that both methods right now (cat then cut, and cut then cat) only work if all the segments
are cattable, that is they all use the same codecs, have the same resolution, etc.
If a stream were to change its encoding settings, and we were cutting over that change,
both approaches would not work. We should add checks for that scenario (which can only happen
over a stream drop), and if so fallback to a slow method using ffmpeg's concat filter,
which will work even for disparate codecs, though reconciling mismatched resolutions or frame rates
may require further work.
6 years ago
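The "cut then cat" idea from 6bf709287a can be illustrated with a small planning function. This is only a sketch under stated assumptions: the `Segment` tuple, `plan_fast_cut`, and the `"cut"`/`"copy"` labels are hypothetical, the segments are assumed to be sorted, contiguous and cattable, and the cut points are assumed to fall inside the first and last segments.

```python
from collections import namedtuple

# Hypothetical segment description: start time and duration in seconds.
Segment = namedtuple("Segment", ["start", "duration"])


def plan_fast_cut(segments, cut_start, cut_end):
    """Return a list of (action, segment, offset_from, offset_to) tuples.

    Only the first and last segments are re-processed ("cut"); everything in
    between is passed through untouched ("copy"). Offsets are relative to the
    start of each segment.
    """
    if not segments:
        return []
    first, last = segments[0], segments[-1]
    if len(segments) == 1:
        return [("cut", first, cut_start - first.start, cut_end - first.start)]
    plan = [("cut", first, cut_start - first.start, first.duration)]
    plan += [("copy", seg, 0, seg.duration) for seg in segments[1:-1]]
    plan.append(("cut", last, 0, cut_end - last.start))
    return plan
```

For three 2-second segments and a cut from 1s to 5s, only one second of the first segment and one second of the last need processing; the middle segment is copied as-is.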
Mike Lang 6815924097 Fix some bugs and linter errors introduced by backfiller
I ran `pyflakes` on the repo and found these bugs:

```
./common/common.py:289: undefined name 'random'
./downloader/downloader/main.py:7: 'random' imported but unused
./backfiller/backfiller/main.py:150: undefined name 'variant'
./backfiller/backfiller/main.py:158: undefined name 'timedelta'
./backfiller/backfiller/main.py:171: undefined name 'sort'
./backfiller/backfiller/main.py:173: undefined name 'sort'
```
(ok, the "imported but unused" one isn't a bug, but the rest are)

This fixes those, as well as a further issue I saw with sorting of hours.

Iterables in general cannot be sorted in place. As an obvious example, what if your iterable was infinite?
As a result, any attempt to sort in place an iterable that is not already a list
will result in an error. We avoid this by coercing to list, fully realising the iterable
and putting it into a form that python will let us sort. Sorting the copy also avoids the nasty side-effect
of mutating the list that gets passed into us, which the caller may not expect. Consider this example:

```
>>> my_hours = ["one", "two", "three"]
>>> print my_hours
["one", "two", "three"]
>>> backfill_node(base_dir, node, stream, variants, hours=my_hours, order='forward')
>>> print my_hours
["one", "three", "two"]
```

Also, one of the linter errors was non-trivial to fix - we were trying to get a list of hours
(which is an api call for a particular variant), but at a time when we weren't dealing with a single
variant. My solution was to get a list of hours for ALL variants, and take the union.
6 years ago
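The two fixes described in 6815924097 (coercing to a list before sorting, and taking the union of hours over all variants) could look roughly like this in Python 3. The function names, the `"reverse"` order value, and the `list_hours` callable are assumptions for illustration, not the backfiller's real API.

```python
def sorted_hours(hours, order="forward"):
    """Return hours as a new sorted list without touching the caller's object."""
    hours = list(hours)  # realises any iterable and makes a private copy
    hours.sort(reverse=(order == "reverse"))
    return hours


def hours_for_all_variants(list_hours, variants):
    """Union of the hours known for each variant.

    list_hours is a hypothetical callable taking a variant and returning an
    iterable of hour strings (standing in for the per-variant api call).
    """
    all_hours = set()
    for variant in variants:
        all_hours.update(list_hours(variant))
    return all_hours
```

With this shape, passing `my_hours` from the example above leaves the caller's list in its original order, and generators are accepted as well as lists.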
Mike Lang 78a9a4e525 Set up a docker compose file to run all images
For ease-of-use, we use a jsonnet file to generate the yaml.
Jsonnet is a language for generating JSON documents.

In this case it's useful to us because it lets us have comments,
references to settings defined at the top, and some basic logic
like converting qualities from a list of strings to a comma-separated string.

To avoid requiring jsonnet to be installed, we use the official jsonnet docker image
in the generate script.
6 years ago
Mike Lang 25185f8f1f travis.yml: Make script into individual lines
Setting -eu fucks up travis's scripts, so instead we should feed it everything
command-by-command so it can fail out using its own logic.
6 years ago
Mike Lang 4dc00052f6 Add .travis.yaml to set up CI
Nothing fancy, just build the images and push them,
and if it's a push to master then also build latest.
6 years ago
Mike Lang 18aadd6b82 restreamer: Also have an endpoint for generating cut videos on demand
This is mainly just for testing until we get the database and proper cutter up,
but it might prove useful to have in the long run too.

This code will probably end up being totally rewritten,
as it uses the most naive form of cutting and reencoding,
and it has a whole bunch of http-serving specifics intertwined with the cutting logic.
6 years ago
Christopher Usher b42202434f Minor Fixes as suggested by ekimekim 6 years ago
Christopher Usher 0b524a72cb docstrings and a few minor feature additions to the backfiller 6 years ago
Christopher Usher a59f6e1569 ignore temporary files 6 years ago
Christopher Usher 3b0342b872 added options to limit range of hours backfilled and to randomise hours backfilled 6 years ago
Christopher Usher fec0975d18 fixed white space and the like 6 years ago
Christopher Usher afd948576d Forgot to try to remove temporary file 6 years ago
Christopher Usher 3cdfaad664 moved rename, ensure_directory and jitter to common
Move a few useful functions in downloader used in the backfiller to common
6 years ago
Christopher Usher 7d26997b1f modifications to the backfiller in response to ekimekim's comments 6 years ago
Christopher Usher ba52bf7f5d hopefully more robust 6 years ago
Christopher Usher 50bcb84c0c Moving things around to make the backfiller a bit more like a proper package 6 years ago
Christopher Usher 494725fe34 Getting close to something I can show ekimekim 6 years ago
Christopher Usher 5615c1bdb0 Chipping away at backfiller
I'm going to have to learn to write better commit messages
6 years ago
Christopher Usher 2fb17fff59 much closer to being functional 6 years ago
Christopher Usher 05fed36ac8 a few ideas extra 6 years ago
Christopher Usher 0e7ba25b76 start of a rough prototype of the backfiller 6 years ago
Mike Lang 97d77e19d6 restreamer: Add CORS headers to all responses
TBH I'm not sure why this is needed (I'm completely clueless about browser stuff),
but apparently thrimbletrimmer needs it.
6 years ago
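For context on 97d77e19d6: browsers enforce a same-origin policy, so a page served from one origin can only read responses from a different origin if that server opts in with headers such as `Access-Control-Allow-Origin`. A minimal sketch of adding that header to every response, written as plain WSGI middleware in Python 3, is shown here. The name `add_cors_headers` and the wildcard default are assumptions; the restreamer may use a web framework's own mechanism instead.

```python
def add_cors_headers(app, allow_origin="*"):
    """Wrap a WSGI app so every response carries a CORS allow-origin header."""

    def wrapped(environ, start_response):
        def cors_start_response(status, headers, exc_info=None):
            # Copy the header list rather than mutating the app's own list.
            headers = list(headers) + [("Access-Control-Allow-Origin", allow_origin)]
            return start_response(status, headers, exc_info)

        return app(environ, cors_start_response)

    return wrapped
```

Wrapping at the WSGI layer means no individual endpoint has to remember to set the header.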
Mike Lang 941b9b017e build script: Add ability to push to remote repository after building 6 years ago
Mike Lang afe19ca33e restreamer: Implement graceful stop on SIGTERM 6 years ago
Mike Lang 7ffa90c7e6 restreamer: Make docker image work, fix missing dependencies
setup.py and Dockerfile were both totally out of whack
6 years ago
Mike Lang 1dce14bf77 downloader: Fix and improve the stop mechanism, stop on SIGTERM
Allows for graceful shutdown
6 years ago
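The graceful-stop behaviour added to the restreamer and downloader can be sketched with the standard library alone. This is an illustration in Python 3, not the services' real stop mechanism: `GracefulStopper` and `run_until_stopped` are hypothetical names, and the actual code may rely on a different concurrency library.

```python
import signal
import threading


class GracefulStopper:
    """Turns SIGTERM into a flag that the main loop can poll."""

    def __init__(self):
        self.stopping = threading.Event()

    def install(self):
        # Must be called from the main thread, as required by signal.signal().
        signal.signal(signal.SIGTERM, self._on_signal)

    def _on_signal(self, signum, frame):
        self.stopping.set()


def run_until_stopped(stopper, do_work, interval=1.0):
    """Call do_work repeatedly until a stop has been requested.

    Returns the number of completed iterations. The current unit of work is
    always allowed to finish before the loop exits.
    """
    iterations = 0
    while not stopper.stopping.is_set():
        do_work()
        iterations += 1
        # Sleeps for up to interval seconds, but wakes immediately on stop.
        stopper.stopping.wait(interval)
    return iterations
```

Waiting on the event instead of calling `time.sleep` keeps shutdown prompt even when the polling interval is long.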
Mike Lang 7b10429846 downloader: Dockerfile fixes to make it work 6 years ago
Mike Lang 6c3501db6f downloader: Fix dateutil lib, which is actually called python-dateutil 6 years ago
Mike Lang 7257fb9b73 downloader: Include channel name in path, instead of assuming it's already in base_dir
Previously, downloader would put files under BASE_DIR/VARIANT/HOUR/FILE.ts
now, it will put files under BASE_DIR/STREAM/VARIANT/HOUR/FILE.ts

This brings downloader in line with restreamer's concept of base_dir
6 years ago
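The new layout from 7257fb9b73, BASE_DIR/STREAM/VARIANT/HOUR/FILE.ts, amounts to the following path handling. Both helper names are hypothetical, and the example values in the usage note are placeholders rather than the project's real naming scheme.

```python
import os


def segment_path(base_dir, stream, variant, hour, filename):
    """Build BASE_DIR/STREAM/VARIANT/HOUR/FILE for a segment."""
    return os.path.join(base_dir, stream, variant, hour, filename)


def split_segment_path(path):
    """Inverse of segment_path: recover the five components from a full path."""
    rest, filename = os.path.split(path)
    rest, hour = os.path.split(rest)
    rest, variant = os.path.split(rest)
    base_dir, stream = os.path.split(rest)
    return base_dir, stream, variant, hour, filename
```

For example, `segment_path("/base", "somechannel", "source", "HOUR", "FILE.ts")` yields a path that `split_segment_path` breaks back into the same five parts, so downloader and restreamer can agree on what `base_dir` means.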
Christopher Usher 84097f4bbb
Merge pull request #11 from ekimekim/chrusher/comment
Added a comment to highlight recursion
6 years ago
Christopher Usher efe30c1942 Added a comment to highlight recursion 6 years ago
Christopher Usher 62b184e333
Merge pull request #10 from ekimekim/add-license-1
Licence under MIT
6 years ago
Mike Lang d0caf79768
Licence under MIT
Closes #9
6 years ago
Christopher Usher 9782a3ebd1
Merge pull request #6 from ekimekim/mike/restreamer/improvements
restreamer: Multiple improvements and general "finishing"
6 years ago
Mike Lang b4e627f382 restreamer: When generating playlists, include discontinuities, timestamps and endlist
This fills out the incomplete playlist generation functionality to handle holes
and communicate extra information. See comments for details.
6 years ago
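As a rough illustration of what b4e627f382 describes, here is a small Python 3 sketch that emits an HLS media playlist with discontinuity markers for holes, a timestamp per segment, and an end marker. The tags are standard HLS tags; the function name, the input shape (tuples of uri, start, duration, with `None` marking a hole), the UTC assumption and the rounding of the target duration are all assumptions, not the restreamer's actual implementation.

```python
import math
from datetime import datetime


def sketch_media_playlist(segments, ended=True):
    """Render a media playlist from (uri, start, duration) tuples.

    A None entry marks a hole; the next segment after it is preceded by a
    discontinuity tag. start is assumed to be a naive UTC datetime.
    """
    durations = [seg[2] for seg in segments if seg is not None]
    target = int(math.ceil(max(durations))) if durations else 0
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        "#EXT-X-TARGETDURATION:{}".format(target),
    ]
    seen_segment = False
    after_hole = False
    for seg in segments:
        if seg is None:
            # Only holes between segments need a marker in this sketch.
            after_hole = seen_segment
            continue
        uri, start, duration = seg
        if after_hole:
            lines.append("#EXT-X-DISCONTINUITY")
            after_hole = False
        stamp = start.strftime("%Y-%m-%dT%H:%M:%S.%fZ")
        lines.append("#EXT-X-PROGRAM-DATE-TIME:" + stamp)
        lines.append("#EXTINF:{:.3f},".format(duration))
        lines.append(uri)
        seen_segment = True
    if ended:
        lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"
```

Omitting the end marker (`ended=False`) tells players that more segments may still be appended, which suits a playlist for a time range that is still being recorded.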
Mike Lang 201959888a restreamer: More accurate target duration in playlist 6 years ago
Mike Lang e34f04cf57 restreamer: Harden generate_media_playlist to handle weird inputs and defaults 6 years ago
Mike Lang 6fa74608fb common: Improve some docs to note types of things that are ambiguous 6 years ago
Mike Lang 8f5a98a906 restreamer: Don't offer a variant on the master playlist if it's outside requested time range
This prevents clients from picking a variant that they then can't play any content for.
In general we expect the same content to be available on all variants being captured,
but if the set of captured variants changes we still want to handle that gracefully.
6 years ago
Mike Lang 3bbe1ed32d Prefer longer duration on multiple segments 6 years ago
Christopher Usher 4981c6521b
Merge pull request #5 from ekimekim/mike/restreamer/initial
Initial work on restreamer
6 years ago
Mike Lang 5942091d1a restreamer: Cleanup around argument processing 6 years ago
Mike Lang a1fa60828d Basic media playlist generation, missing special cases 6 years ago
Mike Lang 75c9793eac Remove central config file as it's more trouble than it's worth
Simpler and easier for testing to stick to configuration via CLI args.
We'll worry about deployment later.
6 years ago
Mike Lang 031dd60897 downloader: Fix some typos around the max age calculation 6 years ago
Mike Lang 9e115f8a42 restreamer: Also add ability to list known hours so we know where to start replicating from 6 years ago
Mike Lang bab2d15d6e Initial implementation of the restreamer
Supports serving segments, listing segments for an hour, and generating playlists so it can stream.
6 years ago
Mike Lang ee8f8f6571 restreamer: Initial skeleton 6 years ago
Mike Lang 0df8288013 common: Implement code for parsing paths and picking the best sequence of segments
This is needed by both the restreamer and the cutter, hence its inclusion in common.

The algorithm is pretty simple - it takes the 'best' segment per start time by full first,
then length of partial. All the other complexity is mainly just around detecting and reporting holes,
and being inclusive of start/end points.
6 years ago
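The selection rule in 0df8288013 (per start time, prefer a full segment, then the longest partial, and report holes) can be sketched as below. This is a simplified Python 3 illustration: `SegmentInfo` and `best_segments` are hypothetical names, times are plain numbers of seconds, `None` stands for a hole, and skipping segments that overlap the previously chosen one is a simplification of this sketch rather than something the commit states.

```python
from collections import namedtuple
from itertools import groupby

# Hypothetical segment record: start time, duration, and whether it is partial.
SegmentInfo = namedtuple("SegmentInfo", ["start", "duration", "is_partial"])


def best_segments(segments, start, end):
    """Pick the best segment per start time covering [start, end].

    Segments that merely overlap the start or end point are included.
    Returns a list of segments with None wherever a hole was detected.
    """
    result = []
    prev_end = None
    ordered = sorted(segments, key=lambda s: s.start)
    for _, group in groupby(ordered, key=lambda s: s.start):
        # Full beats partial; among equals, the longer duration wins.
        best = max(group, key=lambda s: (not s.is_partial, s.duration))
        seg_end = best.start + best.duration
        if seg_end <= start or best.start >= end:
            continue  # entirely outside the requested range
        if prev_end is None:
            if best.start > start:
                result.append(None)  # hole at the beginning
        elif best.start > prev_end:
            result.append(None)  # hole between segments
        elif best.start < prev_end:
            continue  # overlaps the previous choice; skipped in this sketch
        result.append(best)
        prev_end = seg_end
    if prev_end is None or prev_end < end:
        result.append(None)  # hole at the end (or nothing found at all)
    return result
```

Returning holes inline keeps the decision about how to present them (for instance as a discontinuity in a playlist) with the caller.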