Despite our best efforts, this was causing chunked responses to be fully
buffered into memory as a side effect.
This is really bad because responses can be VERY large.
This is intended mainly for travis CI, because by default it doesn't cache any layers
between builds.
By pulling likely-reusable builds (all parents of the current commit),
we take a fixed cost slowdown but in many cases should see a dramatic speed increase
overall, since we won't need to re-build anything that hasn't changed.
This isn't needed for local builds, where docker will do this on its own
with any previously-built images.
Since these tend to happen around stream endings, etc,
we don't want them to be crazy noisy and cause us to disregard real problems.
We can use the segment coverage to see in metrics if there are overlaps.
This lets us view a number of useful graphs in dashboards, eg. rows by state,
errored rows, rows by day, rows by category, meltdowns per day, fraction of
events that are poster moments by category.
Sheetsync was the natural place to do this since it was already periodically scanning
the entire events table.
This gives an easy way to do so across all services without adding new options.
Reasons to do so might be to avoid overheads or because your prometheus metrics grow too large.
Otherwise if the containers get restarted and change ip, nginx hits the wrong ip.
We do this via a hack where we make all references indirect through a variable.
Since nginx only resolves this at request time, it always does a dns request.
To be clear, this is an awful hack.
It means that any implicit str/unicode coersion will use the utf-8 encoding,
which is basically always what you want.
However, it is possible that some badly-written libraries might be relying
on the default encoding being ascii, and will do weird things as a result.
Finally, it's especially hacky to be doing this as part of importing a library.
Normally you're meant to do this as part of a sitecustomize.py in your python system directory,
and the function is deleted before passing control to normal code (this is why we need
to reload() to get it back).
Instead of always waiting 5 seconds between runs,
wait until 5 seconds after the previous run started.
This ensures we actually run every 5sec and not every 5sec + how long it takes to run