Commit Graph

21 Commits (10cca18922c9bc84009c393124413ad11f1ee15b)

Author SHA1 Message Date
Mike Lang 10cca18922 Fix a deadlock due to signal interactions with prometheus client
The prometheus client uses a threading.Lock() to prevent shared access to
certain metric state. This lock is taken as part of doing collection, as well
as during metric.labels().

We hit a deadlock where our stack sampler signal arrived during a collection,
when the lock was held. This meant that flamegraph.labels() blocked forever,
and the lock was never released, hanging all metrics collection.

Our solution is a hack, which is to reach into the internals of our metric object
and replace its lock with a dummy one. This is reasonably safe, but only as long as
the prometheus_client internal structure doesn't change signfigiantly.
6 years ago
Mike Lang b9c2921242 common.stats: Add a stacksampler that records sampled stacks to prometheus
This can then be used to generate flamegraphs
6 years ago
Mike Lang 5175b099af common: Split segment-related stuff into its own module
We still import them into __init__.py so they're accessible externally just the same
6 years ago
Mike Lang 6f84a23ba6 common: Split stats-related stuff into its own module
We still import them into __init__.py so they're accessible externally just the same
6 years ago
Mike Lang 8fe2fec958 common: convert from module to package 6 years ago
Mike Lang d90f01b8ce common: Create general function for timing things, and use it to time get_best_segments
The function is quite customizable and therefore quite complex, but it allows us to
easily annotate a function to be timed with labels based on input and output,
as well as normalize results based on amount of work done to get a better
picture of the actual amount of time taken per unit of work.
This will help us monitor for performance issues.
6 years ago
Mike Lang b0ded641c3 Add a logging handler which counts logs for prometheus stats
This isn't as good as having a full centralised logging system, but should
suffice to know if anything funny is happening.
6 years ago
Christopher Usher 3fcd374449 Moved encode_strings to common 6 years ago
Mike Lang 6815924097 Fix some bugs and linter errors introduced by backfiller
I ran `pyflakes` on the repo and found these bugs:

```
./common/common.py:289: undefined name 'random'
./downloader/downloader/main.py:7: 'random' imported but unused
./backfiller/backfiller/main.py:150: undefined name 'variant'
./backfiller/backfiller/main.py:158: undefined name 'timedelta'
./backfiller/backfiller/main.py:171: undefined name 'sort'
./backfiller/backfiller/main.py:173: undefined name 'sort'
```
(ok, the "imported but unused" one isn't a bug, but the rest are)

This fixes those, as well as a further issue I saw with sorting of hours.

Iterables are not sortable. As an obvious example, what if your iterable was infinite?
As a result, any attempt to sort an iterable that is not already a friendly type like a list
or tuple will result in an error. We avoid this by coercing to list, fully realising the iterable
and putting it into a form that python will let us sort. It also avoids the nasty side-effect
of mutating the list that gets passed into us, which the caller may not expect. Consider this example:

```
>>> my_hours = ["one", "two", "three"]
>>> print my_hours
["one", "two", "three"]
>>> backfill_node(base_dir, node, stream, variants, hours=my_hours, order='forward')
>>> print my_hours
["one", "three", "two"]
```

Also, one of the linter errors was non-trivial to fix - we were trying to get a list of hours
(which is an api call for a particular variant), but at a time when we weren't dealing with a single
variant. My solution was to get a list of hours for ALL variants, and take the union.
6 years ago
Christopher Usher fec0975d18 fixed white space and the like 6 years ago
Christopher Usher 3cdfaad664 moved rename, ensure_directory and jitter to common
Move a few useful functions in downloader used in the backfiller to common
6 years ago
Mike Lang 6fa74608fb common: Improve some docs to note types of things that are ambiguous 6 years ago
Mike Lang 3bbe1ed32d Prefer longer duration on multiple segments 6 years ago
Christopher Usher 4981c6521b
Merge pull request #5 from ekimekim/mike/restreamer/initial
Initial work on restreamer
6 years ago
Mike Lang 75c9793eac Remove central config file as it's more trouble than it's worth
Simpler and easier for testing to stick to configuration via CLI args.
We'll worry about deployment later.
6 years ago
Mike Lang 0df8288013 common: Implement code for parsing paths and picking the best sequence of segments
This is needed by both the restreamer and the cutter, hence its inclusion in common.

The algorithm is pretty simple - it takes the 'best' segment per start time by full first,
then length of partial. All the other complexity is mainly just around detecting and reporting holes,
and being inclusive of start/end points.
6 years ago
Christopher Usher 8f462f5926 Fixed format_bustime docsting 6 years ago
Christopher Usher 4c22edf2e6 Fixed negative times in format_bustime 6 years ago
Mike Lang d7641aecf5 common: Fix bugs and issues with bustime utils 6 years ago
Mike Lang 048277b003 common: Basic config and bustime code 6 years ago
Mike Lang a361ab7a63 Add a common package for common bits in multiple components 6 years ago