1044 Commits (c3d5405ebc12a51cf0475878e684d32354dcbdf1)
 

Author SHA1 Message Date
Christopher Usher 6858c2e2de starting on logging and monitoring 5 years ago
Mike Lang 1721fbd92e fix dashboards for channel/quality naming 5 years ago
Mike Lang 04ef0d3823 fix a few remaining usages of StreamWorker.stream instead of .quality 5 years ago
Christopher Usher 361e577474 fixes based on ekimekim's suggestions 5 years ago
Christopher Usher 732c56d502 typo in a comment 5 years ago
Christopher Usher 3564643613 refactoring downloader 5 years ago
Christopher Usher b959853593 refactored to channel and quality 5 years ago
Christopher Usher 720684a388 refactoring to have consistent terminology 5 years ago
Christopher Usher 6d38250674 starting to refactor stream to channel and variant to quality 5 years ago
Mike Lang a2b21966b9
Merge pull request #65 from ekimekim/mike/dashboards
Add grafana dashboards as jsonnet code
5 years ago
Mike Lang f7b591e78b sheetsync: Log more information on HTTPError
The api gives additional detail that we want to know when debugging.
5 years ago
Mike Lang 73d5941e05 downloader: Track timestamp of latest segment
This gives us a "stream delay" metric.

Prom doesn't have any native way to check the current value of a metric,
in order to take max(). It only offers increment and set.

We reach into some internals to do this in a hacky way,
but the cleaner way would be to track the value ourselves and have a prom callback
that gets the value.

Sigh, I hate this prom library. I might write my own that's less dumb.
5 years ago
Mike Lang e4d3e418c8 transcode checker: longer retry while waiting for videos to finish
but still check db often.
This prevents us from using too much api quota on these checks,
while still letting us spot new videos quickly.
5 years ago
Mike Lang 1f15900b6f cutter: At least for now, don't auto-retry errors
This leads to rapidly exhausting our upload limit since even a fast failed request
costs the same amount of usage quota as a 1-hour long video.
5 years ago
Mike Lang fbef4725d7 cutter: Handle case where we are told to stop while looking for candidates
Previously, it would return None and things would break. Now the None is handled
correctly, and is documented.
5 years ago
Mike Lang 5cec6ec96e cutter: Reconnect after any error that might be a database error
After certain kinds of DB error (e.g. lost conn), we need to make a new conn
to have things work again. To be safe, we just do it after every error where it might
be a problem.
5 years ago
Mike Lang fea9ff6c1d cutter: Fix dockerfile, which was missing ffmpeg dependency 5 years ago
Mike Lang f50276bd01 backfiller: Expose recent_cutoff as CLI arg and increase it to 120s default
In testing, GDQ's stream delay went up over 1min, which caused backfillers to backfill
segments at the same time they were downloaded. We increase the window for now,
and also make it configurable.
5 years ago
MasterGunner 6fa9d9d388
Merge pull request #64 from ekimekim/gunner/additional-thrimbletrimmer-integration
Gunner/additional thrimbletrimmer integration
5 years ago
MasterGunner 2e953eddde Cleanup from Ekim's comments, removed auth placeholder until I know what I'm doing. 5 years ago
Mike Lang ca925ae2e6 dashboard: Add some extra detail sections for backfiller and downloader 5 years ago
MasterGunner 6a171130e8 Updated Get All Rows route. 5 years ago
Mike Lang 39e7a5c2e6 Add overview dashboard 5 years ago
MasterGunner 736f0e0fe4 Adding get_all_row and auth function stubs 5 years ago
Mike Lang 41fffc2809
Merge pull request #62 from ekimekim/mike/monitoring
Scripts for running prometheus/grafana for monitoring
5 years ago
MasterGunner 4423ddee3c
Update SecurityModel.md
Simplified the document based on our discussions.
5 years ago
Mike Lang 612e34b88d
Merge pull request #61 from ekimekim/mike/backfiller/concurrent
backfiller: Allow multiple concurrent segment downloads
5 years ago
Mike Lang 29040a166c backfiller: Allow multiple concurrent segment downloads
This will significantly increase throughput when downloading
large ranges of segments.

The max concurrency is exposed as a cli arg.

We also slightly modify the logged info, so it reports segments downloaded,
not just number of missing segments (which we might skip downloading for various reasons).
5 years ago
Christopher Usher ec5a545fd2 Merge branch 'mike/sheetsync/fix-db-error' 5 years ago
Mike Lang 7273ee071e monitoring fixes 5 years ago
Mike Lang 5a6d443efd grafana: View-only anonymous access 5 years ago
Christopher Usher 980875b6f3 Merge branch 'mike/sheetsync/fix-db-error' of https://github.com/ekimekim/wubloader into mike/sheetsync/fix-db-error 5 years ago
Christopher Usher 37bad7d5ed Also reset database connection on error in the backfiller 5 years ago
Christopher Usher 28f350dd46 Also reset database connection on error in the backfiller 5 years ago
Mike Lang e048db0d94 cutter: Fix a failure mode where we never recover from a DB conn failure in TranscodeChecker
Since we never got a new conn after failure, we would just keep erroring with
"connection already closed" errors.

This isn't applicable to the main cutter loops since a DB failure there will restart the process.
5 years ago
Mike Lang fe68e98804 sheetsync: Fix a failure mode where we never recover from a DB conn failure
Since we never got a new conn after failure, we would just keep erroring with
"connection already closed" errors.
5 years ago
Mike Lang a767760f02 Add some existing scripts for setting up prometheus 5 years ago
Mike Lang 90eb2a4f13
Merge pull request #59 from ekimekim/mike/fixes
Some misc fixes from cutter and backfiller, see commits
5 years ago
Mike Lang 51efeb1f12
Merge pull request #58 from ekimekim/mike/nginx-dns-hack
Fix nginx when some services are disabled
5 years ago
Mike Lang 7179fcacec Backfiller: ignore temp segments
To make this work, we make type a proper segment field.

We also tell get_best_segments to ignore temp segments, since they might go away
before we can actually use them.
5 years ago
Mike Lang 85c110ccb4 cutter: Fix typo from when we moved to the client model instead of auth headers 5 years ago
Mike Lang 3fa3c73d0e Fix nginx when some services are disabled
nginx tries to resolve everything at startup, which doesn't work
if some of the services aren't present.

We instead generate the config file from a passed-in env var, so that only
enabled services are present.
5 years ago
Mike Lang 6d729fa5cc
Merge pull request #57 from ekimekim/mike/compose-bits
Some stuff for making the docker compose setup easier
5 years ago
Mike Lang 6071a2f18d docker_compose: Add a local postgres instance as an optional service
The node hosting the database can then easily run it as part of the stack.
5 years ago
Mike Lang 63eb324ba5 Add nginx service that provides a frontend to all the other services
This allows us to run all the different services and expose all their metrics,
all on one port.
5 years ago
Mike Lang a7a54db726 docker-compose: Restructure for some finer control
Allow enabling/disabling at top of file
Allow no port to be exposed for any service
5 years ago
Mike Lang 499e486b0b
Merge pull request #54 from ekimekim/mike/sheet-sync/initial
sheet sync
5 years ago
Mike Lang 018e920808 sheet-sync: Some fixes 5 years ago
Christopher Usher dd246e1343 ekimekim's suggestions 5 years ago
Christopher Usher 9b28765ff2 Bug fixes to get the database connection working 5 years ago