121 Commits (3ffab22f1cf3dc4478f8052cce9b50d6fac1c19d)

Author SHA1 Message Date
Christopher Usher 013ad65c68 added a Dockerfile for the backfiller 6 years ago
Christopher Usher 48d11045d4 Change to backfiller.main to backfill the last 3 hours on start up before doing a full backfill 6 years ago
Christopher Usher 176633bf7d More messing around with backfill_node to allow finer-grained control of the order in which segments are fetched 6 years ago
Christopher Usher 3a7624b107 added a setup file for the backfiller 6 years ago
Christopher Usher ba499fe835 added more logging to backfiller 6 years ago
Mike Lang 6815924097 Fix some bugs and linter errors introduced by backfiller
I ran `pyflakes` on the repo and found these bugs:

```
./common/common.py:289: undefined name 'random'
./downloader/downloader/main.py:7: 'random' imported but unused
./backfiller/backfiller/main.py:150: undefined name 'variant'
./backfiller/backfiller/main.py:158: undefined name 'timedelta'
./backfiller/backfiller/main.py:171: undefined name 'sort'
./backfiller/backfiller/main.py:173: undefined name 'sort'
```
(ok, the "imported but unused" one isn't a bug, but the rest are)

This fixes those, as well as a further issue I saw with sorting of hours.

Iterables in general are not sortable in place. As an obvious example, what if your iterable was infinite?
An in-place sort only works on a list, so attempting one on any other kind of iterable (a generator,
say, or even a tuple) will result in an error. We avoid this by coercing to list, fully realising the iterable
and putting it into a form that Python will let us sort. Because the coercion makes a copy, it also avoids
the nasty side effect of mutating the list that gets passed into us, which the caller may not expect.
Consider this example of the old behaviour:

```
>>> my_hours = ["one", "two", "three"]
>>> print my_hours
['one', 'two', 'three']
>>> backfill_node(base_dir, node, stream, variants, hours=my_hours, order='forward')
>>> print my_hours
['one', 'three', 'two']
```
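
The coercion described above can be sketched roughly as follows. This is a minimal illustration, not the actual backfiller code: `order_hours` is a hypothetical helper, and the `'reverse'` ordering is an assumption (only `order='forward'` appears in the example above).

```python
def order_hours(hours, order='forward'):
    """Return a new, ordered list of hours without touching the caller's object.

    Hypothetical helper for illustration; not part of the backfiller.
    """
    # list() fully realises any finite iterable (list, tuple, generator, ...)
    # and gives us a private copy that is safe to sort in place.
    hours = list(hours)
    if order == 'forward':
        hours.sort()
    elif order == 'reverse':  # assumed option, not confirmed by the source
        hours.sort(reverse=True)
    return hours


my_hours = ["one", "two", "three"]

# Works on a list, and leaves the caller's list as it was.
ordered = order_hours(my_hours, order='forward')

# Also works on something that has no .sort() method at all, such as a generator.
ordered_from_generator = order_hours(h for h in my_hours)
```

After this runs, `ordered` is `['one', 'three', 'two']` while `my_hours` is still `['one', 'two', 'three']`.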

Also, one of the linter errors was non-trivial to fix: we were trying to get a list of hours
(which is an API call for a particular variant) at a point where we weren't dealing with a single
variant. My solution was to get a list of hours for ALL variants and take the union.
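
Taking the union could look something like the sketch below. The names are assumptions for illustration: `hours_for_all_variants` is hypothetical, `list_hours` stands in for whatever call lists the hours of one variant, and the variant names and hour strings are made up.

```python
def hours_for_all_variants(list_hours, variants):
    """Return the sorted union of the hours available for each variant.

    list_hours is any callable that takes one variant and returns an
    iterable of hours (a stand-in for the per-variant API call).
    """
    hours = set()
    for variant in variants:
        # set.update() accepts any iterable and silently drops duplicates.
        hours.update(list_hours(variant))
    return sorted(hours)


# Made-up listing used in place of a real API call.
fake_listing = {
    "source": ["2019-01-01T00", "2019-01-01T01"],
    "480p": ["2019-01-01T01", "2019-01-01T02"],
}

all_hours = hours_for_all_variants(lambda v: fake_listing[v], ["source", "480p"])
```

With the made-up listing, `all_hours` is `['2019-01-01T00', '2019-01-01T01', '2019-01-01T02']`: the hour shared by both variants appears once.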
6 years ago
Christopher Usher b42202434f Minor fixes as suggested by ekimekim 6 years ago
Christopher Usher 0b524a72cb docstrings and a few minor feature additions to the backfiller 6 years ago
Christopher Usher a59f6e1569 ignore temporary files 6 years ago
Christopher Usher 3b0342b872 added options to limit range of hours backfilled and to randomise hours backfilled 6 years ago
Christopher Usher fec0975d18 fixed white space and the like 6 years ago
Christopher Usher afd948576d Forgot to try to remove temporary file 6 years ago
Christopher Usher 3cdfaad664 moved rename, ensure_directory and jitter to common
Move a few useful functions in downloader used in the backfiller to common
6 years ago
Christopher Usher 7d26997b1f modifications to the backfiller in response to ekimekim's comments 6 years ago
Christopher Usher ba52bf7f5d hopefully more robust 6 years ago
Christopher Usher 50bcb84c0c Moving things around to make the backfiller a bit more like a proper package 6 years ago
Christopher Usher 494725fe34 Getting close to something I can show ekimekim 6 years ago
Christopher Usher 5615c1bdb0 Chipping away at backfiller
I'm going to have to learn to write better commit messages
6 years ago
Christopher Usher 2fb17fff59 much closer to being functional 6 years ago
Christopher Usher 05fed36ac8 a few extra ideas 6 years ago
Christopher Usher 0e7ba25b76 start of a rough prototype of the backfiller 6 years ago