Commit Graph

1430 Commits (1ba8597957432c1ee998ea1cf7c8f99de9816a55)
 

Author SHA1 Message Date
Christopher Usher b7a57d4766 reset is now its own method 6 years ago
Christopher Usher fe5b10f86b Fixes the state transitions of the trimbleshim to allow video links to be changed or removed 6 years ago
Mike Lang 5c84e8dfab restreamer: Fix wrong name for parse function
derp
6 years ago
Mike Lang cca4d52b7d Don't error when encountering a temp-type segment
These can happen if a downloader or backfiller dies suddenly.
We treat it similarly to partial but lacking any hash.

At some point in the future we should probably have something
to find any temp segments, hash them and rename them to partials.
6 years ago
Mike Lang f8d10dacdf Audit and fix all usage of dateutil
We wrap direct dateutil calls to handle two distinct cases:

* `common.dateutil.parse()`: We want to handle arbitrary timestamps including tz info,
then convert them to UTC.

This is used in HLS parsing, and for command line input for backfiller

* `common.dateutil.parse_utc_only()`: We want to only handle UTC timestamps,
but datetime.strptime isn't flexible enough (eg. can't handle missing fractional component).

This is used for restreamer request params.
6 years ago
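A minimal sketch of the two parsing behaviours described in the commit above. It uses only the standard library rather than dateutil, so it is an illustration of the intended contract and not the wrappers themselves. Returning naive UTC datetimes and the exact accepted formats are assumptions.

```python
from datetime import datetime, timezone


def parse(timestamp):
    """Parse an ISO 8601 timestamp with an optional UTC offset.

    Assumption: the result is a naive datetime expressed in UTC.
    """
    dt = datetime.fromisoformat(timestamp)
    if dt.tzinfo is not None:
        # Convert to UTC, then drop tzinfo so all results are comparable.
        dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
    return dt


def parse_utc_only(timestamp):
    """Parse a timestamp that must already be UTC (no offset allowed).

    A single strptime format cannot make the fractional part optional,
    so we try the format with fractional seconds first, then without.
    """
    for fmt in ("%Y-%m-%dT%H:%M:%S.%f", "%Y-%m-%dT%H:%M:%S"):
        try:
            return datetime.strptime(timestamp, fmt)
        except ValueError:
            continue
    raise ValueError("not a UTC-only timestamp: {!r}".format(timestamp))
```

Under these assumptions, `parse("2019-01-01T12:00:00+02:00")` yields 10:00 UTC, while `parse_utc_only` rejects the same string because it carries an offset.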
Mike Lang 5b2a1ef6b7 cutter: Implement actual cut methods
Each method is fairly complicated, but is self-contained and can be examined independently.

cut_jobs in particular contains several extra helpers and directs control flow
via some iterators. This is unfortunately necessary due to the requests interface.
6 years ago
Mike Lang ae809c696c cutter: Outline of how main cutter run loop works
This commit only lays out the main loop, showing the high-level flow
and defining shared utilities. This is for clarity.

The actual methods that do the work will be implemented separately.
6 years ago
Mike Lang 80c1a66aa0 cutter: Implement TranscodeChecker
It runs on an interval, fetching all videos in TRANSCODING from the DB,
checking them against youtube, and then updating any that are done.

It should be noted that youtube somewhat lies about what being "done" means,
but this is a better approximation than nothing.
6 years ago
Mike Lang 3ce8360a1e cutter: Add database manager and connections
One connection each for transcode checker and cutter.
We don't need more than one each since both workers only ever do one thing at once.
6 years ago
Mike Lang fdd245a6d9 cutter: Add lightweight youtube client
Provides basic youtube api calls, and gets passed into both transcode checker and cutter.

The official youtube client library is many orders of magnitude larger and more complicated,
and can't actually do what we want (stream an upload of unknown size).
6 years ago
Mike Lang dfc64481a6 Port existing cutting code from restreamer into common
Note this moves over the 'experimental' cutter and deletes the original cutter
that concatenates entire videos before cutting.
We may eventually want to revive that method if the experimental cutter turns out
to introduce too many issues.

We move most of the code over verbatim, but adjust it such that it acts
as a generic iterator that can be used in a variety of contexts.

Some other changes made during the move include telling ffmpeg to be quieter
(don't output version info and junk, only log if something goes wrong),
and avoiding errors during cleanup.
6 years ago
Mike Lang 3d9ba77745 common: add allow_holes option to get_best_segments() to abort early if holes found
This is a performance optimization, allowing us to fail out early (potentially avoiding a LOT
of work) if we know we're going to reject any result that contains holes.

We add a new exception ContainsHoles that is raised in this condition.
6 years ago
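The early-abort idea from the commit above can be sketched as follows. Only the names `get_best_segments`, `allow_holes` and `ContainsHoles` come from the commit message. The segment representation (sorted `(start, end)` tuples, with `None` marking a hole) and the gap check are simplifying assumptions for illustration.

```python
class ContainsHoles(Exception):
    """Raised when a hole is found and allow_holes is False."""


def get_best_segments(segments, allow_holes=True):
    """Return segments in order, with None inserted wherever there is a gap.

    Assumption: segments is a list of (start, end) tuples sorted by start.
    """
    result = []
    prev_end = None
    for start, end in segments:
        if prev_end is not None and start > prev_end:
            if not allow_holes:
                # Fail out early instead of processing the remaining segments.
                raise ContainsHoles
            result.append(None)
        result.append((start, end))
        prev_end = end
    return result
```

A caller that would reject any result with holes passes `allow_holes=False` and catches `ContainsHoles`, so no further work is spent once the first gap is seen.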
Mike Lang e4b6110fd7 cutter: Add initial outline
The cutter has two jobs:
* To cut videos, taking them through states EDITED -> TRANSCODING
* To monitor TRANSCODING videos for when they're complete

We run these as separate greenlets with their own DB connections,
and if either dies we gracefully shut down the other.
6 years ago
Christopher Usher f43c699e05 updated thrimshim to handle all non-null edit columns 6 years ago
Mike Lang e383613954 database: Add constraints on edit inputs that they must be non-NULL if state != UNEDITED
This should help prevent changing state to EDITED with any of these fields unset,
which would blow up the cutter.

We also fix up upload_location, which was set up as a sheet input (NOT NULL DEFAULT ''),
and add a similar constraint saying any DONE columns must have non-NULL video link.
6 years ago
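As a rough illustration of the constraints described in the commit above, the sketch below uses an in-memory SQLite database. The table name `events` and the columns `video_title` and `video_link` are hypothetical stand-ins, and the project's real schema and database engine may differ.

```python
import sqlite3

# Hypothetical, reduced schema: one edit input and one output column.
SCHEMA = """
CREATE TABLE events (
    id INTEGER PRIMARY KEY,
    state TEXT NOT NULL DEFAULT 'UNEDITED',
    video_title TEXT,
    video_link TEXT,
    -- Edit inputs may only be NULL while the row is still UNEDITED.
    CHECK (state = 'UNEDITED' OR video_title IS NOT NULL),
    -- A DONE row must carry a video link.
    CHECK (state != 'DONE' OR video_link IS NOT NULL)
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.execute("INSERT INTO events (id) VALUES (1)")


def try_update(sql):
    """Run an UPDATE; return False if a CHECK constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False
```

With this schema, moving a row to `EDITED` without setting `video_title` is rejected by the database itself, so a consumer such as the cutter never sees a half-filled row.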
Christopher Usher d23de10b3e a few small fixes to ekim's comments 6 years ago
Christopher Usher 1d09e28b1e fixes to ekimekim's suggestions 6 years ago
Christopher Usher c81d538a79 thrimshim seems to be working 6 years ago
Christopher Usher e4fc878577 logic of the post 6 years ago
Christopher Usher 4c5b6e4cda GET working 6 years ago
Christopher Usher 5faa70dfc2 getting thrimshim to build and run is a minor success 6 years ago
Christopher Usher 57597c94cd hopefully some progress on the thrimshim 6 years ago
Christopher Usher be41be7878 Initial thrimshim commit 6 years ago
Christopher Usher 072e51f287 Renaming a variable that should have been part of the last commit 6 years ago
Christopher Usher 61107346c8 Fixed backing off on exceptions and some more documentation 6 years ago
Christopher Usher 728adb7c1d improvements suggested by ekim 6 years ago
Christopher Usher 530b9f7d5e more improvements based on ekims comments 6 years ago
Chris Usher 332e03de80 started in on ekim's comments 6 years ago
Christopher Usher 2857d3fb9f comments and some whitespace handling 6 years ago
Christopher Usher 4e6dbe1c74 Added localhost option to backfill to avoid backfilling the local machine 6 years ago
Christopher Usher ade0ad3d18 rewrite of get_nodes to allow getting list of files from a file 6 years ago
Christopher Usher 23fea7b154 bug fixing after testing 6 years ago
Christopher Usher 149974ce54 added multiple streams by largely copy and pasting the code from the
downloader
6 years ago
Christopher Usher e4364b75b1 options to change where the node list is coming from 6 years ago
Christopher Usher baae0f1ac1 bug fix in arg list 6 years ago
Christopher Usher 65143a8ca2 more flexibility for start time 6 years ago
Christopher Usher a8cb1ff370 fixed start not propagating to list_hours plus some refactoring 6 years ago
Christopher Usher 57bb74632f I should test these changes soon 6 years ago
Christopher Usher 64bc76c48b error handling I guess 6 years ago
Christopher Usher 09368d92e1 fixes and improvements suggested by ekimekim
* simplified the backfiller local - now just a full backfill every couple minutes
6 years ago
Christopher Usher 4eac6189ce backfiller working in parallel 6 years ago
Christopher Usher f4385ad4e3 hopefully didn't break anything with this refactor 6 years ago
Christopher Usher 1f53fa8d29 Bug fixes and logging improvements to the backfiller 6 years ago
Christopher Usher c9f6ee95c5 clean up for new gevent based backfiller. 6 years ago
Christopher Usher 7d9a5b4626 added workers and a worker manager 6 years ago
Christopher Usher be8d40d1ba Move the code for calculating hours outside the code that backfills 6 years ago
Chris Usher ed58b6e44d reintroduced a start time for the backfiller; more logging 6 years ago
Mike Lang 292188ad7c database: Remove retry_on_conflict helper and default to autocommit
All our usage was of a single query anyway, so autocommit is easier to handle.
You can still opt into a longer transaction using the transaction() helper.
6 years ago
Mike Lang 73640ed4ab database: Add column video_id for storing upload-location-specific metadata for identifying video
ie. for youtube, the video id.
6 years ago
Mike Lang dc2eb6ed74 Add some common database code
This code manages the database connections, setting their isolation level correctly
and ensuring the idempotent schema is applied before they're used.

Applying the schema on startup means we don't need to deal with the database's state,
setting it up before running, running migrations etc. However, it does put constraints on
the changes we can safely make.

Our use of serializable isolation means that all transactions can be treated as fully
independent - the server must behave as though they'd been run separately in some valid order.
This will give us the least surprising results when multiple connections try to modify the same
data, though we'll need to deal with occasional transaction commit failures due to conflicts.
6 years ago
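One common way to deal with the commit failures mentioned above is to re-run the whole transaction when a conflict is reported. The sketch below shows that pattern in isolation. `SerializationConflict` and `run_with_retry` are hypothetical names, and the exception is a stand-in for whatever error the database driver actually raises.

```python
class SerializationConflict(Exception):
    """Hypothetical stand-in for the driver's serialization-failure error."""


def run_with_retry(transaction, max_attempts=5):
    """Call transaction() until it succeeds, retrying on conflicts.

    transaction must be safe to run again from the start, since a
    conflicted attempt is assumed to have been rolled back entirely.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except SerializationConflict:
            if attempt == max_attempts:
                # Give up and let the caller see the conflict.
                raise
```

Because serializable transactions behave as if run one at a time, retrying the whole unit of work is safe as long as it has no side effects outside the database.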