wubloader

Commit Graph

Author	SHA1	Message	Date
Christopher Usher	f43c699e05	updated thrimshim to handle all non-null edit columns	6 years ago
Mike Lang	e383613954	database: Add constraints on edit inputs that they must be non-NULL if state != UNEDITED This should help prevent changing state to EDITED with any of these fields unset, which would blow up the cutter. We also fix up upload_location, which was set up as a sheet input (NOT NULL DEFAULT ''), and add a similar constraint saying any DONE columns must have non-NULL video link.	6 years ago
Christopher Usher	d23de10b3e	a few small fixes to ekim's comments	6 years ago
Christopher Usher	1d09e28b1e	fixes to ekimekim's suggestions	6 years ago
Christopher Usher	c81d538a79	thrimshim seems to be working	6 years ago
Christopher Usher	e4fc878577	logic of the post	6 years ago
Christopher Usher	4c5b6e4cda	GET working	6 years ago
Christopher Usher	5faa70dfc2	getting thrimshim to build and run is a minor success	6 years ago
Christopher Usher	57597c94cd	hopefully some progress on the thrimshim	6 years ago
Christopher Usher	be41be7878	Initial thrimshim commit	6 years ago
Christopher Usher	072e51f287	Renaming a variable that should have been part of the last commit	6 years ago
Christopher Usher	61107346c8	Fixed backing off on exceptions and some more documentation	6 years ago
Christopher Usher	728adb7c1d	improvements suggested by ekim	6 years ago
Christopher Usher	530b9f7d5e	more improvements based on ekims comments	6 years ago
Chris Usher	332e03de80	started in on ekim's comments	6 years ago
Christopher Usher	2857d3fb9f	comments and some whitespace handling	6 years ago
Christopher Usher	4e6dbe1c74	Added localhost option to backfill to avoid backfilling the local machine	6 years ago
Christopher Usher	ade0ad3d18	rewrite of get_nodes to allow getting list of files from a file	6 years ago
Christopher Usher	23fea7b154	bug fixing after testing	6 years ago
Christopher Usher	149974ce54	added multiple streams by largely copy and pasting the code from the downloader	6 years ago
Christopher Usher	e4364b75b1	options to change where the node list is coming from	6 years ago
Christopher Usher	baae0f1ac1	bug fix in arg list	6 years ago
Christopher Usher	65143a8ca2	more flexibility for start time	6 years ago
Christopher Usher	a8cb1ff370	fixed start not propagating to list_hours plus some refactoring	6 years ago
Christopher Usher	57bb74632f	I should test these changes soon	6 years ago
Christopher Usher	64bc76c48b	error handling I guess	6 years ago
Christopher Usher	09368d92e1	fixes and improvements suggested by ekimekim * simplified the backfiller local - now just a full backfill every couple minutes	6 years ago
Christopher Usher	4eac6189ce	backfiller working in parallel	6 years ago
Christopher Usher	f4385ad4e3	hopefully did break anything with this refactor	6 years ago
Christopher Usher	1f53fa8d29	Bug fixes and logging improvements to the backfiller	6 years ago
Christopher Usher	c9f6ee95c5	clean up for new gevent based backfiller.	6 years ago
Christopher Usher	7d9a5b4626	added workers and a worker manager	6 years ago
Christopher Usher	be8d40d1ba	Move the code for calculating hours outside the code that backfills	6 years ago
Chris Usher	ed58b6e44d	reintroduced a start time for the backfiller; more logging	6 years ago
Mike Lang	292188ad7c	database: Remove retry_on_conflict helper and default to autocommit All our usage was of a single query anyway, so autocommit is easier to handle. You can still opt into a longer transaction using the transaction() helper.	6 years ago
Mike Lang	73640ed4ab	database: Add column video_id for storing upload-location-specific metadata for identifying video ie. for youtube, the video id.	6 years ago
Mike Lang	dc2eb6ed74	Add some common database code This code manages the database connections, setting their isolation level correctly and ensuring the idempotent schema is applied before they're used. Applying the schema on startup means we don't need to deal with the database's state, setting it up before running, running migrations etc. However, it does put constraints on the changes we can safely make. Our use of serializable isolation means that all transactions can be treated as fully independent - the server must behave as though they'd been run separately in some valid order. This will give us the least surprising results when multiple connections try to modify the same data, though we'll need to deal with occasional transaction commit failures due to conflicts.	6 years ago
Mike Lang	cea66a4bbf	database: Rename start/end to event_start/end, add channel and quality * Had to rename `end` as `end` is a reserved word in postgres SQL. `event_end` is more consistent with `video_end` anyway. Updated `start` to match. * Added ability to specify channel and stream quality in the editor, which may prove useful if we have issues with a particular stream quality, or if content needs to be captured from other channels.	6 years ago
MasterGunner	7423f8c4ef	Update DATABASE.md Changed upload_location to be edit input.	6 years ago
MasterGunner	d89458c27d	Update DATABASE.md Changed allow_holes and uploader_whitelist to be edit inputs - there's no need for them to come from the sheet; and we'll have an admin dashboard for modifying them if needed.	6 years ago
Mike Lang	437d38e646	DATABASE.md: Add image_links column This solves the problem of rows which don't need a full cut video, but we'd like to link to an image or a short gif or clip of it. It is a sheet input that is only used in the output sheet, so it doesn't affect the wubloader itself.	6 years ago
Mike Lang	df66553b38	downloader: Start backdoor later so workers is in locals	6 years ago
Mike Lang	86da9d9fe8	downloader: Support watching multiple channels This is useful eg. for watching db_admin or other testing channels in addition to the main channel.	6 years ago
Mike Lang	f0d9aa82c2	Ignore segments that are marked as ads * Checks for the SCTE35-OUT/SCTE35-IN marks in the HLS stream that indicate an ad start/end * Ignores those segments completely * Doesn't mark the StreamWorker as up until it sees the first non-ad segment Some other operational notes: * The main risk this adds is that re-connecting / refreshing master playlist takes longer. If all downloaders are doing this at the same time (ie. because the stream only just came up, or during a deployment rollout), all downloaders might be waiting for ads to finish and you'll miss segments. * We should run more downloaders to compensate. This also increases the chance at least one of them won't get any ads, so we get everything right from stream-up. * The other mitigation we can do is have geographically diverse downloaders. This decreases the risk that they all get served an ad, and at least at time of writing it seems that no in-stream ads are served outside of these regions: > US, Canada, Germany, France, Sweden, Belgium, Poland, Norway, Finland, Denmark, Netherlands, Italy, Spain, Switzerland, Austria, Portugal, UK, Australia, New Zealand	6 years ago
Mike Lang	ef0a78fce3	Updates to database states and columns Split UPLOADED into TRANSCODING and DONE, to represent the time after upload that youtube is transcoding the video and it's not viewable. Any cutter can poll for the state of a transcoding video and mark it as done. Add some extra sheet input columns.	7 years ago
Mike Lang	6b3a0fea9f	Add a doc covering how the database is used I fully expect the exact list of sheet inputs, edit inputs and outputs to change. The important thing I wanted to codify here was the state machine and the behaviour of the cutters.	7 years ago
Mike Lang	c0f94059aa	downloader: Stop retrying in SegmentGetter after a long timeout In resource contention scenarios, all calls can start failing due to not being able to read the response in a timely manner. This means SegmentGetters never stop retrying, leading to further contention and a feedback loop. We attempt to put at least some cap on this scenario by giving up if an amount of time has elapsed to the point that we know our URL couldn't be valid anymore. Since we don't actually know how long segment URLs are valid, we are very conservative about this time, for now setting it to 20min.	7 years ago
Mike Lang	81aee0ee1e	Increase hard timeout for getting segment headers When we're under CPU or disk contention, doing other work can become very slow. We want to avoid spurious errors in this situation as this causes further retries and further contention. One easy way to do this is to increase the time we have to finish fetching headers.	7 years ago
Mike Lang	787b9002ab	restreamer: Use correct name for dateutil	7 years ago
Mike Lang	3a1e4b0aef	restreamer: Fix missing dependency This was hidden because common included it	7 years ago
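
The ad-skipping behaviour described in commit f0d9aa82c2 can be illustrated with a minimal sketch. This is not the downloader's actual code: the function name `filter_ad_segments` is hypothetical, and it assumes only that the playlist marks an ad start with a tag containing `SCTE35-OUT` and an ad end with a tag containing `SCTE35-IN`, as the commit message states. The exact tag spelling in a real stream may differ.

```python
def filter_ad_segments(playlist_text):
    """Return the segment URIs that fall outside ad breaks.

    Hypothetical helper: assumes ad breaks are delimited by playlist tags
    containing "SCTE35-OUT" (ad start) and "SCTE35-IN" (ad end).
    """
    in_ad = False
    kept = []
    for raw_line in playlist_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Tag or comment line: only the ad markers change our state.
            if "SCTE35-OUT" in line:
                in_ad = True
            elif "SCTE35-IN" in line:
                in_ad = False
            continue
        # Any non-tag line is a segment URI; drop it while inside an ad break.
        if not in_ad:
            kept.append(line)
    return kept
```

Under the same assumptions, a worker would only report itself as up once this function returns at least one segment, which matches the commit's note about waiting for the first non-ad segment.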


167 Commits (f43c699e056f9d826b1eca9e2a89938fa154eae0)
