Commit Graph

11 Commits (ae808eedde018ab971162f5ae96d8278692a37d6)

Author SHA1 Message Date
Mike Lang 3e69000058 py3 fixes for common 3 years ago
Mike Lang b029250c1c Disable stacksampler by default
It causes problems due to the sheer number of unique metrics emitted, which makes
the prometheus endpoint be very expensive / fail a lot.

The data is not useful enough to justify the cost.
4 years ago
Mike Lang 6b602592f5 Allow disabling of stacksampling with an env var
This gives an easy way to do so across all services without adding new options.

Reasons to do so might be to avoid overheads or because your prometheus metrics grow too large.
5 years ago
Mike Lang 4d5157cdb5 Fix a mistake with allowing reuse of name in @timed() 5 years ago
Mike Lang a7f5d1c545 Fix issues with metrics gathering for cut functions
* Need to allow timed() to have multiple callers with same name
* "type" label is reserved, use "cut_type" instead
5 years ago
Christopher Usher 76bc629720 moved flask monitoring to its own module 5 years ago
Christopher Usher 6c633df3ee move restreamer.stats to common.stats 5 years ago
Mike Lang bcdb268ce8 Also need to replace locks on the counter float values to prevent deadlocks
See comment for full details
6 years ago
Mike Lang 10cca18922 Fix a deadlock due to signal interactions with prometheus client
The prometheus client uses a threading.Lock() to prevent shared access to
certain metric state. This lock is taken as part of doing collection, as well
as during metric.labels().

We hit a deadlock where our stack sampler signal arrived during a collection,
when the lock was held. This meant that flamegraph.labels() blocked forever,
and the lock was never released, hanging all metrics collection.

Our solution is a hack, which is to reach into the internals of our metric object
and replace its lock with a dummy one. This is reasonably safe, but only as long as
the prometheus_client internal structure doesn't change signfigiantly.
6 years ago
Mike Lang b9c2921242 common.stats: Add a stacksampler that records sampled stacks to prometheus
This can then be used to generate flamegraphs
6 years ago
Mike Lang 6f84a23ba6 common: Split stats-related stuff into its own module
We still import them into __init__.py so they're accessible externally just the same
6 years ago