Commit Graph

11 Commits (4e6dbe1c7469dc104add8ad99040b93c33e073d9)

Author SHA1 Message Date
Mike Lang 292188ad7c database: Remove retry_on_conflict helper and default to autocommit
All our usage was of a single query anyway, so autocommit is easier to handle.
You can still opt into a longer transaction using the transaction() helper.
6 years ago
Mike Lang 73640ed4ab database: Add column video_id for storing upload-location-specific metadata for identifying video
ie. for youtube, the video id.
6 years ago
Mike Lang dc2eb6ed74 Add some common database code
This code manages the database connections, setting their isolation level correctly
and ensuring the idempotent schema is applied before they're used.

Applying the schema on startup means we don't need to deal with the database's state,
setting it up before running, running migrations etc. However, it does put constraints on
the changes we can safely make.

Our use of seralizable isolation means that all transactions can be treated as fully
independent - the server must behave as though they'd been run seperately in some valid order.
This will give us the least surprising results when multiple connections try to modify the same
data, though we'll need to deal with occasional transaction commit failures due to conficts.
6 years ago
Mike Lang 997c1242b2 get_best_segments: Let other things run
get_best_segments can sometimes take a very long time,
we don't want to stop other work from happening while it's ongoing.
So we ask gevent to run other things until there's no other work to do,
then we do one hour, then check back with gevent again.

In combination with the performance improvements, this should mean we don't block
other things from running for more than a few hundred ms at most.
6 years ago
Mike Lang bf08aa29b8 parse_segment_path: Use datetime.strptime instead of dateutil.parser
strptime is much faster but can't handle as varied formats.
But in this case we fully control the format, so there's no reason not to use it.

Profiling suggests we spend about 80% of our time in get_best_segments just parsing dates,
so this is a signifigant performance gain.
6 years ago
Mike Lang bcdb268ce8 Also need to replace locks on the counter float values to prevent deadlocks
See comment for full details
6 years ago
Mike Lang 10cca18922 Fix a deadlock due to signal interactions with prometheus client
The prometheus client uses a threading.Lock() to prevent shared access to
certain metric state. This lock is taken as part of doing collection, as well
as during metric.labels().

We hit a deadlock where our stack sampler signal arrived during a collection,
when the lock was held. This meant that flamegraph.labels() blocked forever,
and the lock was never released, hanging all metrics collection.

Our solution is a hack, which is to reach into the internals of our metric object
and replace its lock with a dummy one. This is reasonably safe, but only as long as
the prometheus_client internal structure doesn't change signfigiantly.
6 years ago
Mike Lang b9c2921242 common.stats: Add a stacksampler that records sampled stacks to prometheus
This can then be used to generate flamegraphs
6 years ago
Mike Lang 5175b099af common: Split segment-related stuff into its own module
We still import them into __init__.py so they're accessible externally just the same
6 years ago
Mike Lang 6f84a23ba6 common: Split stats-related stuff into its own module
We still import them into __init__.py so they're accessible externally just the same
6 years ago
Mike Lang 8fe2fec958 common: convert from module to package 6 years ago