Christopher Usher
73541f852f
logging and monitoring for thrimshim
5 years ago
Christopher Usher
6c633df3ee
move restreamer.stats to common.stats
5 years ago
Christopher Usher
6858c2e2de
starting on logging and monitoring
5 years ago
Mike Lang
1721fbd92e
fix dashboards for channel/quality naming
5 years ago
Mike Lang
04ef0d3823
fix a few remaining usages of StreamWorker.stream instead of .quality
5 years ago
Christopher Usher
361e577474
fixes based on ekimekim's suggestions
5 years ago
Christopher Usher
732c56d502
typo in a comment
5 years ago
Christopher Usher
3564643613
refactoring downloader
5 years ago
Christopher Usher
b959853593
refactored to channel and quality
5 years ago
Christopher Usher
720684a388
refactoring to have consistent terminology
5 years ago
Christopher Usher
6d38250674
starting to refactor stream to channel and variant to quality
5 years ago
Mike Lang
a2b21966b9
Merge pull request #65 from ekimekim/mike/dashboards
...
Add grafana dashboards as jsonnet code
5 years ago
Mike Lang
f7b591e78b
sheetsync: Log more information on HTTPError
...
The api gives additional detail that we want to know when debugging.
5 years ago
Mike Lang
73d5941e05
downloader: Track timestamp of latest segment
...
This gives us a "stream delay" metric.
Prom doesn't have any native way to check the current value of a metric,
in order to take max(). It only offers increment and set.
We reach into some internals to do this in a hacky way,
but the cleaner way would be to track the value ourselves and have a prom callback
that gets the value.
Sigh, I hate this prom library. I might write my own that's less dumb.
5 years ago
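The "cleaner way" described in this commit message (track the value ourselves, let a callback read it) could look roughly like the sketch below. `MaxTracker` and `latest_segment` are hypothetical names, not taken from the repository. Wiring it to a metric is left out; with the `prometheus_client` library, a gauge callback such as `Gauge.set_function(latest_segment.get)` would be one way to do it.

```python
import threading


class MaxTracker:
    """Keeps the largest value observed so far (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = None

    def observe(self, value):
        # Only ever move forward: a late, older timestamp is ignored.
        with self._lock:
            if self._value is None or value > self._value:
                self._value = value

    def get(self):
        # Report 0.0 until something has been observed, so a metrics
        # callback always gets a number back.
        with self._lock:
            return 0.0 if self._value is None else self._value


latest_segment = MaxTracker()
latest_segment.observe(100.0)
latest_segment.observe(90.0)   # older segment arriving late
latest_segment.observe(105.5)
```

Keeping the value in our own object avoids reaching into the metric library's internals to read the current value back.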
Mike Lang
e4d3e418c8
transcode checker: longer retry while waiting for videos to finish
...
but still check db often.
This prevents us from using too much api quota on these checks,
while still letting us spot new videos quickly.
5 years ago
Mike Lang
1f15900b6f
cutter: At least for now, don't auto-retry errors
...
This leads to rapidly exhausting our upload limit since even a fast failed request
costs the same amount of usage quota as a 1-hour long video.
5 years ago
Mike Lang
fbef4725d7
cutter: Handle case where we are told to stop while looking for candidates
...
Previously, it would return None and things would break. Now the None is handled
correctly, and is documented.
5 years ago
Mike Lang
5cec6ec96e
cutter: Reconnect after any error that might be a database error
...
After certain kinds of DB error (e.g. lost conn), we need to make a new conn
to have things work again. To be safe, we just do it after every error where it might
be a problem.
5 years ago
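A minimal sketch of the "reconnect after any error" approach from this commit, assuming a hypothetical `connect` factory and `work` callable that stand in for the real database setup and loop body (neither name comes from the repository):

```python
import logging


def run_with_reconnect(connect, work, max_iterations):
    """Call work(conn) repeatedly; after any error, replace the connection.

    connect and work are hypothetical callables used for illustration.
    Returns the number of errors seen.
    """
    conn = connect()
    errors = 0
    for _ in range(max_iterations):
        try:
            work(conn)
        except Exception:
            logging.exception("Error in work loop, reconnecting to be safe")
            errors += 1
            try:
                conn.close()
            except Exception:
                pass  # the old connection may already be dead
            # We cannot always tell a DB error from any other error,
            # so a fresh connection is made unconditionally.
            conn = connect()
    return errors
```

Reconnecting unconditionally costs one extra connection per failure, and it avoids getting stuck retrying on a connection that is already closed.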
Mike Lang
fea9ff6c1d
cutter: Fix dockerfile, which was missing ffmpeg dependency
5 years ago
Mike Lang
f50276bd01
backfiller: Expose recent_cutoff as CLI arg and increase it to 120s default
...
In testing, GDQ's stream delay went up over 1min, which caused backfillers to backfill
segments at the same time they were downloaded. We increase the window for now,
and also make it configurable.
5 years ago
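To illustrate why the cutoff matters, here is a small sketch of a recency check. Only the `recent_cutoff` name and its 120s default come from the commit message; the function name and signature are assumptions.

```python
import datetime


def is_too_recent(segment_time, now, recent_cutoff=120):
    """True if the segment is newer than recent_cutoff seconds.

    Such segments may still be in the process of being downloaded live,
    so a backfiller would leave them alone for now.
    """
    return (now - segment_time).total_seconds() < recent_cutoff


now = datetime.datetime(2019, 6, 1, 12, 0, 0)
delayed = now - datetime.timedelta(seconds=70)   # stream delay over 1 min
old = now - datetime.timedelta(seconds=300)
```

With a 60 second window, the `delayed` segment above would already count as "missing" and be backfilled while the downloader is still fetching it. With 120 seconds it is skipped.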
MasterGunner
6fa9d9d388
Merge pull request #64 from ekimekim/gunner/additional-thrimbletrimmer-integration
...
Gunner/additional thrimbletrimmer integration
5 years ago
MasterGunner
2e953eddde
Cleanup from Ekim's comments, removed auth placeholder until I know what I'm doing.
5 years ago
Mike Lang
ca925ae2e6
dashboard: Add some extra detail sections for backfiller and downloader
5 years ago
MasterGunner
6a171130e8
Updated Get All Rows route.
5 years ago
Mike Lang
39e7a5c2e6
Add overview dashboard
5 years ago
MasterGunner
736f0e0fe4
Adding get_all_row and auth function stubs
5 years ago
Mike Lang
41fffc2809
Merge pull request #62 from ekimekim/mike/monitoring
...
Scripts for running prometheus/grafana for monitoring
5 years ago
MasterGunner
4423ddee3c
Update SecurityModel.md
...
Simplified the document based on our discussions.
5 years ago
Mike Lang
612e34b88d
Merge pull request #61 from ekimekim/mike/backfiller/concurrent
...
backfiller: Allow multiple concurrent segment downloads
5 years ago
Mike Lang
29040a166c
backfiller: Allow multiple concurrent segment downloads
...
This will significantly increase throughput when downloading
large ranges of segments.
The max concurrency is exposed as a cli arg.
We also slightly modify the logged info, so it reports segments downloaded,
not just number of missing segments (which we might skip downloading for various reasons).
5 years ago
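The commit does not say which concurrency mechanism it uses. As a stand-in, the sketch below uses a standard-library thread pool to show bounded concurrency and the "segments downloaded" count. `download_all`, `download_one`, and the default of 5 are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(segments, download_one, concurrency=5):
    """Download segments with at most `concurrency` requests in flight.

    download_one(segment) is a hypothetical callable that returns True if
    it downloaded the segment and False if it decided to skip it.
    Returns the number of segments actually downloaded.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(download_one, segments))
    return sum(1 for downloaded in results if downloaded)
```

Counting the `True` results, not the length of the input list, matches the commit's point that missing segments and downloaded segments are not the same number.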
Christopher Usher
ec5a545fd2
Merge branch 'mike/sheetsync/fix-db-error'
5 years ago
Mike Lang
7273ee071e
monitoring fixes
5 years ago
Mike Lang
5a6d443efd
grafana: View-only anonymous access
5 years ago
Christopher Usher
980875b6f3
Merge branch 'mike/sheetsync/fix-db-error' of https://github.com/ekimekim/wubloader into mike/sheetsync/fix-db-error
5 years ago
Christopher Usher
37bad7d5ed
Also reset database connection on error in the backfiller
5 years ago
Christopher Usher
28f350dd46
Also reset database connection on error in the backfiller
5 years ago
Mike Lang
e048db0d94
cutter: Fix a failure mode where we never recover from a DB conn failure in TranscodeChecker
...
Since we never got a new conn after failure, we would just keep erroring with
"connection already closed" errors.
This isn't applicable to the main cutter loops since a DB failure there will restart the process.
5 years ago
Mike Lang
fe68e98804
sheetsync: Fix a failure mode where we never recover from a DB conn failure
...
Since we never got a new conn after failure, we would just keep erroring with
"connection already closed" errors.
5 years ago
Mike Lang
a767760f02
Add some existing scripts for setting up prometheus
5 years ago
Mike Lang
90eb2a4f13
Merge pull request #59 from ekimekim/mike/fixes
...
Some misc fixes from cutter and backfiller, see commits
5 years ago
Mike Lang
51efeb1f12
Merge pull request #58 from ekimekim/mike/nginx-dns-hack
...
Fix nginx when some services are disabled
5 years ago
Mike Lang
7179fcacec
Backfiller: ignore temp segments
...
To make this work, we make type a proper segment field.
We also tell get_best_segments to ignore temp segments, since they might go away
before we can actually use them.
5 years ago
Mike Lang
85c110ccb4
cutter: Fix typo from when we moved to the client model instead of auth headers
5 years ago
Mike Lang
3fa3c73d0e
Fix nginx when some services are disabled
...
nginx tries to resolve everything at startup, which doesn't work
if some of the services aren't present.
We instead generate the config file from a passed-in env var, so that only
enabled services are present.
5 years ago
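One way to generate config for only the enabled services is sketched below. The `name:port` list format, the function name, and the service names and ports in the usage are all assumptions. The commit only says the config is generated from an env var.

```python
def render_locations(services_env):
    """Render one nginx location block per 'name:port' entry.

    services_env is a comma-separated string such as would be read from
    an environment variable (the format is assumed for illustration).
    """
    blocks = []
    for entry in services_env.split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, port = entry.split(":")
        blocks.append(
            "location /%s/ { proxy_pass http://%s:%s/; }" % (name, name, port)
        )
    return "\n".join(blocks)


config = render_locations("restreamer:8000, cutter:8003")
```

Because disabled services never appear in the generated file, nginx has no unresolvable hostnames to trip over at startup.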
Mike Lang
6d729fa5cc
Merge pull request #57 from ekimekim/mike/compose-bits
...
Some stuff for making the docker compose setup easier
5 years ago
Mike Lang
6071a2f18d
docker_compose: Add a local postgres instance as an optional service
...
The node hosting the database can then easily run it as part of the stack.
5 years ago
Mike Lang
63eb324ba5
Add nginx service that provides a frontend to all the other services
...
This allows us to run all the different services and expose all their metrics,
all on one port.
5 years ago
Mike Lang
a7a54db726
docker-compose: Restructure for some finer control
...
Allow enabling/disabling at top of file
Allow no port to be exposed for any service
5 years ago
Mike Lang
499e486b0b
Merge pull request #54 from ekimekim/mike/sheet-sync/initial
...
sheet sync
5 years ago
Mike Lang
018e920808
sheet-sync: Some fixes
5 years ago