66 Commits (ee4a68af504d1215ce2992816225bfbe16456912)

Author SHA1 Message Date
Mike Lang ee4a68af50 clear up confusion with empty string vs None 3 months ago
Mike Lang 3e873ca5f6 wip: fixes 3 months ago
Mike Lang eebfa5885b sheetsync: pass in event id instead of event name 3 months ago
Mike Lang cf41f572f5 Fix streamlog formatting 3 months ago
Mike Lang 986a1db964 sheetsync: Change how options are specified to allow multiple backends / syncs 3 months ago
Mike Lang 74869de89d Implement reverse sync mode
This is a mode where all data flows one-way from the database to the sheet.
It is intended to be used to populate an empty sheet from database events,
possibly sourced from somewhere else.

To make this work, a few changes were required:
* Track which ids we've seen so we know what events were not matched with a row
* Allow `row` to be None in sync_rows
* When it is, call the middleware to create a new row with a new id
* In sheets, this is implemented by tracking the last empty rows we saw, and claiming them as needed.
3 months ago
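The reverse sync flow described in this commit could look roughly like the sketch below. It is an illustration only, not the repository's code: the function name `reverse_sync`, the row and event dict shapes, and the choice to reuse the event's id for a claimed row are all assumptions.

```python
def reverse_sync(events, rows):
    """One-way sync from database events into sheet rows (hypothetical sketch).

    events: dict mapping event id -> dict of column values (assumed shape)
    rows:   list of row dicts; an empty row has row["id"] set to None
    """
    seen_ids = set()   # ids that were matched with an existing row
    empty_rows = []    # empty rows we saw, available to be claimed

    for row in rows:
        if row["id"] is None:
            empty_rows.append(row)
            continue
        seen_ids.add(row["id"])
        if row["id"] in events:
            # Data flows one way: database values overwrite the row.
            row.update(events[row["id"]])

    # Any event that was not matched with a row claims an empty row.
    for event_id, data in events.items():
        if event_id in seen_ids:
            continue
        if not empty_rows:
            raise RuntimeError("no empty rows left to claim")
        row = empty_rows.pop(0)
        row["id"] = event_id
        row.update(data)

    return rows
```

Claiming rows in the order they were seen keeps new entries contiguous in this sketch; the real middleware may choose differently.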
Mike Lang 85de9757f7 sheetsync: Remove pick_worksheets() from middleware api
Instead, get_rows() makes that decision internally if needed.
3 months ago
Mike Lang 17463d70fe sheetsync: Remove worksheet from middleware apis
since it's now baked into the row dict
3 months ago
Mike Lang eec58f2651 sheetsync: Always have sheet name as part of row dict 3 months ago
Mike Lang fa9a4b70bb bugfix 3 months ago
Mike Lang ca3f92c0b6 sheetsync: Use streamlog section instead of deriving day from start time 3 months ago
Mike Lang 071cd29f4d sheetsync: Implement Streamlog middleware 3 months ago
Mike Lang d064522d60 sheetsync: Move edit url management into Sheets middleware
As streamlog doesn't require it.
3 months ago
Mike Lang be111ccb2a Change database primary key from UUID to TEXT
We still store uuids, but in text form.
This allows us to store non-UUID ids for systems that have other ids.
3 months ago
Mike Lang 72f7c59a77 Sheetsync: Split into the main loop logic + sheets-specific middleware
NOTE ON CONFLICTS

In master, we moved sheets.py to common as it only contained a generic client.
Now sheets.py also contains specific sheetsync stuff.

Our resolution:
- Keep the generic version in common
- Keep the old version verbatim (including the now-redundant generic client) in sheetsync

We will move the sheetsync implementation to the generic client after the rebase is complete.
3 months ago
Mike Lang 0e5bf1a0fe sheetsync: Split playlist runloop from normal sheets 3 months ago
Mike Lang a16259e892 sheetsync: Move id allocation out of sync_row() 3 months ago
Mike Lang 256e0f7ba1 sheetsync: Move row_index variable into row dict 3 months ago
Mike Lang c5c9075f9e Basic streamlog api 3 months ago
Mike Lang c2d2f4b85c Revert "sheetsync: Support archive sheet"
This reverts commit b93597c274.
3 months ago
Mike Lang 4c87ad6735 Revert "sheetsync: unmapped columns aren't a problem."
This reverts commit 5256577d00.
3 months ago
ZeldaZach 8bbc72184c Support hot reload of Zulip Schedule
- Move sheets API into common dir, since multi use
- Live download from Google Sheets using Config
- Falls back on old schedule if new one can't be downloaded for some reason
4 months ago
Mike Lang 5256577d00 sheetsync: unmapped columns aren't a problem. 1 year ago
Mike Lang b93597c274 sheetsync: Support archive sheet 1 year ago
Mike Lang 1596feef1f sheetsync: Treat end time "--" as same as start time
This is a common idiom, which we previously treated like a blank end time
(no end time set yet) but it makes more sense to treat as "same as start".
1 year ago
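A minimal sketch of the end time rule from this commit, assuming cells arrive as strings. The function name `parse_end_time` and the ISO timestamp format used by the default parser are assumptions for illustration; the sheet's real time format is not shown in the log.

```python
from datetime import datetime


def parse_end_time(start, end_cell, parse=datetime.fromisoformat):
    """Resolve the end time cell of a row (hypothetical helper).

    start:    already-parsed start time of the row
    end_cell: raw text of the end time cell
    parse:    parser for ordinary time values (assumed ISO format here)
    """
    text = end_cell.strip()
    if text == "":
        # Blank still means "no end time set yet".
        return None
    if text == "--":
        # The "--" idiom is treated as "same as start".
        return start
    return parse(text)
```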
Mike Lang 92ea0fbb77 sheetsync: even more hard-coded columns in database fetch 2 years ago
Mike Lang 29e6b9ead3 lists aren't sets 2 years ago
Mike Lang 546572a697 sheetsync: Don't pull the entire row from the database
only the columns you need.

This matters because the thumbnail columns are very large and
we're transferring GB of data every time.
2 years ago
Mike Lang db843c8f63 sheetsync: Report sync duration 2 years ago
Mike Lang 7dfb7b2544 sheetsync: Fix a bug where only show-in-description playlists were detected
Because a blank 5th column would make sheetsync ignore the row.
2 years ago
Mike Lang dd8385ccd8 sheetsync: Special case "<all>" in playlist tags to mean []
This avoids an empty string meaning [], which is dangerous since it's easy to write accidentally.
2 years ago
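The "<all>" special case might be handled along these lines. This is a sketch under assumptions: the name `parse_playlist_tags` is hypothetical, and returning None for a blank cell (so the caller can skip the row) is a guess at the intended behaviour, not something the commit states.

```python
def parse_playlist_tags(value):
    """Parse the tags cell of a playlist row (hypothetical helper)."""
    text = value.strip()
    if text == "<all>":
        # Explicit marker: an empty tag list.
        return []
    if text == "":
        # Assumption: a blank cell is never read as [], so an
        # accidentally empty cell cannot produce an empty tag list.
        return None
    return [tag.strip() for tag in text.split(",") if tag.strip()]
```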
Mike Lang e7d1212085 fix typo 2 years ago
Mike Lang 32c72d6eb7 sheetsync: correct parsing for updated playlists 2 years ago
Mike Lang 34a33fdeb6 partially implement playlist links in video descriptions
We make them conceptually "part of the footer" so they're updated only when the video
is otherwise updated (which would generally mean MODIFIED).
2 years ago
Mike Lang 36017aaccd sheetsync: Show unlisted videos in DONE state as UNLISTED instead
We don't actually want to represent them as a different state in the backend, but showing
them differently on the sheet is helpful to humans.
2 years ago
Mike Lang 467edf3d19 Read dynamic playlist manager config from sheet
The sheetsync loads playlist ids and tags into a new table `playlists`.
playlist manager reads this table and merges it with the playlists given on the command line.
3 years ago
Mike Lang aab8cf2f0f Set up plumbing for multi-range videos and implement no-transition fast cut videos only
This is the simplest case as we can just cut each range like we already do,
then concat the results.

We still allow for the full design in the database and cutter, but error out if transitions
is ever anything but hard cuts or if it's a full cut.

We also update the restreamer to allow accepting ranges, however for usability we still allow
the old "just one start and end" args.

Note this changes the thrimshim API to give and take the new "video_ranges" and "video_transitions" columns.
3 years ago
Mike Lang 8f24c2eae1 py3 fixes for sheetsync 3 years ago
HubbeKing 86f7823348 Replace calls to gevent.signal() with gevent.signal_handler()
gevent.signal() was removed in gevent 1.5a4, see http://www.gevent.org/api/gevent.signal.html
Removed on Feb 5th, see https://github.com/gevent/gevent/pull/1530
4 years ago
Mike Lang b9cd76b1a2 Add non-static implicit tags in sheetsync
In order for the upcoming playlist manager to be able to use the DB `tags` column to know
what tags a video has, all the tags it needs need to be present.

Previously, this was a problem because the day and category tags only get added at the cutter
and so wouldn't be listed.

This moves them so they are added when parsing the row in sheetsync.
It also adds the poster moment tag if poster moment is checked.

Note that fully static tags that go on all videos are still only added in cutter,
but the playlist manager doesn't need to care about those (since by definition
they will match every video).
4 years ago
Mike Lang 29571fb60b Add tags column to sheetsync
New tags column shunts all columns after it right by 1.

Note we parse tags by splitting on commas then discarding whitespace.
If this would create an empty string tag, it is ignored.
Example: "foo, bar baz,a,,bc " -> ["foo", "bar baz", "a", "bc"]
4 years ago
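The parsing rule in this commit can be sketched as below. The function name `parse_tags` is hypothetical; only the rule itself (split on commas, strip surrounding whitespace, ignore empty results) and the example input come from the commit message.

```python
def parse_tags(value):
    """Split a comma-separated cell into tags.

    Whitespace around each tag is discarded, and any tag that would
    end up as an empty string is ignored.
    """
    stripped = [part.strip() for part in value.split(",")]
    return [tag for tag in stripped if tag]


print(parse_tags("foo, bar baz,a,,bc "))  # ['foo', 'bar baz', 'a', 'bc']
```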
Mike Lang b85296a81e sheetsync: Move column indexes to match updated sheet
New tags column shunts all columns after it right by 1.
We will later want to parse that, but for now we ignore it.
4 years ago
Mike Lang 59d0fa3e40 sheetsync: Don't mis-parse blank as bad time 5 years ago
Mike Lang ab157afe20 sheetsync: Clear event counts before each update
Otherwise, no count of 0 ever gets set, and things are left showing
values when they shouldn't.
5 years ago
Mike Lang 89a9e5554c sheetsync: Record counts of rows in the DB, segmented by various columns
This lets us view a number of useful graphs in dashboards, eg. rows by state,
errored rows, rows by day, rows by category, meltdowns per day, fraction of
events that are poster moments by category.

Sheetsync was the natural place to do this since it was already periodically scanning
the entire events table.
5 years ago
Mike Lang 1c0f3a627b sheetsync: Log what worksheets got synced
it's kinda important
5 years ago
Mike Lang 8b25f8be95 sheetsync: Inject an error into the error column if we fail to parse an input column 5 years ago
Mike Lang 8dc7b80de9 sheetsync: Improve timing of main loop
Instead of always waiting 5 seconds between runs,
wait until 5 seconds after the previous run started.

This ensures we actually run every 5sec and not every 5sec + how long it takes to run
5 years ago
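The timing change can be illustrated with a small loop that sleeps only for what remains of the interval. This is a sketch using the standard library rather than the project's actual loop; `run_periodically` and its `iterations`, `clock`, and `sleep` parameters are hypothetical and exist mainly to make the behaviour easy to test.

```python
import time


def run_periodically(task, interval=5.0, iterations=None,
                     clock=time.monotonic, sleep=time.sleep):
    """Call task() so that runs start `interval` seconds apart.

    Rather than sleeping a fixed interval after each run, sleep only
    for the part of the interval the run did not already use.
    """
    count = 0
    while iterations is None or count < iterations:
        started = clock()
        task()
        count += 1
        remaining = interval - (clock() - started)
        if remaining > 0:
            sleep(remaining)
        # If the run took longer than the interval, start again at once.
```

With a run that takes 2 seconds and a 5 second interval, the loop sleeps 3 seconds, so runs start every 5 seconds instead of every 7.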
Mike Lang cda8078f64 sheetsync: Only check the most recently changed two sheets most times
Only check the other sheets every 4th time (20sec instead of 5sec).

This eliminates a huge source of unnecessary reads, which prevents us from going over
our API limit.
5 years ago
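One way to express this schedule is sketched below. The helper name `sheets_to_check` and its parameters are hypothetical, and it assumes the caller already has the worksheets ordered from most to least recently changed.

```python
def sheets_to_check(worksheets_by_recency, run_number,
                    active_count=2, full_every=4):
    """Choose which worksheets to read on a given run.

    worksheets_by_recency: names ordered most recently changed first
    run_number:            counter incremented once per sync run
    """
    if run_number % full_every == 0:
        # Every 4th run (every 20sec at a 5sec interval): check everything.
        return list(worksheets_by_recency)
    # Otherwise only the two most recently changed sheets.
    return list(worksheets_by_recency[:active_count])
```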
Mike Lang 4f6f4cad8b sheetsync: Fix typos with metrics 5 years ago