Fix some bugs and linter errors introduced by backfiller

I ran `pyflakes` on the repo and found these bugs:

```
./common/common.py:289: undefined name 'random'
./downloader/downloader/main.py:7: 'random' imported but unused
./backfiller/backfiller/main.py:150: undefined name 'variant'
./backfiller/backfiller/main.py:158: undefined name 'timedelta'
./backfiller/backfiller/main.py:171: undefined name 'sort'
./backfiller/backfiller/main.py:173: undefined name 'sort'
```
(ok, the "imported but unused" one isn't a bug, but the rest are)

This fixes those, as well as a further issue I saw with sorting of hours.

Arbitrary iterables can't be sorted in place. As an obvious example, what if your iterable was infinite?
`.sort()` only exists on lists, so calling it on any other iterable (a generator, a set, even a tuple)
raises an AttributeError. We avoid this by coercing to list, fully realising the iterable
and putting it into a form that python will let us sort. Copying into a new list also avoids the nasty
side-effect of mutating the list that gets passed into us, which the caller may not expect. Consider
what an in-place sort on the caller's own list would do:

```
>>> my_hours = ["one", "two", "three"]
>>> print my_hours
['one', 'two', 'three']
>>> backfill_node(base_dir, node, stream, variants, hours=my_hours, order='forward')
>>> print my_hours
['one', 'three', 'two']
```
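
As a rough sketch of the coerce-then-sort idea described above: `ordered_hours` is a hypothetical helper written for illustration, not a function in the backfiller. It copies whatever iterable it is given into a fresh list before sorting, so the caller's object is never mutated and generators work too.

```python
def ordered_hours(hours, order='forward'):
    """Return a new sorted list of hours without touching the caller's object.

    Hypothetical helper for illustration; not part of the backfiller.
    """
    hours = list(hours)  # realise the iterable and take a private copy
    if order == 'forward':
        hours.sort()
    elif order == 'reverse':
        hours.sort(reverse=True)
    return hours


my_hours = ["one", "two", "three"]
result = ordered_hours(my_hours)
# my_hours keeps its original order; result is a separate, sorted list.

# A generator has no .sort() method, but works once coerced to a list.
from_generator = ordered_hours(h for h in my_hours)
```

The copy costs one extra list allocation, which should be negligible for a list of hour strings.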

Also, one of the linter errors was non-trivial to fix - we were trying to get a list of hours
(which is an API call for a particular variant), but at a point where we weren't dealing with a single
variant. My solution was to get a list of hours for ALL variants, and take the union.
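
A minimal sketch of the union-over-variants approach, under loud assumptions: `list_remote_hours` here is a stand-in that returns made-up hour strings rather than making the real API call, and `all_remote_hours` is a hypothetical wrapper name. Note that `set().union(...)` yields a set, which has no `.sort()` method, so this sketch converts the result to a list before anything tries to order it.

```python
def list_remote_hours(node, stream, variant):
    """Stand-in for the real API call; returns made-up hours per variant."""
    fake = {
        'source': ['2019-01-01T00', '2019-01-01T01'],
        '480p': ['2019-01-01T01', '2019-01-01T02'],
    }
    return fake[variant]


def all_remote_hours(node, stream, variants):
    """Union of the hours available for any of the given variants, as a list."""
    hours = set().union(*[
        list_remote_hours(node, stream, variant)
        for variant in variants
    ])
    # A set has no .sort() and cannot be shuffled in place, so hand back a list.
    return list(hours)


# The node URL and stream name below are placeholders.
hours = all_remote_hours('http://example.invalid', 'stream', ['source', '480p'])
hours.sort()
```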
pull/21/head
Mike Lang 6 years ago committed by Christopher Usher
parent 78a9a4e525
commit 6815924097

@@ -147,15 +147,19 @@ def backfill_node(base_dir, node, stream, variants, hours=None, start=None,
     seconds to prioritise letting the downloader grab these segments."""
     if hours is None:
-        hours = list_remote_hours(node, stream, variant)
+        # gather all available hours from all variants and take the union
+        hours = set().union(*[
+            list_remote_hours(node, stream, variant)
+            for variant in variants
+        ])
     elif is_iterable(hours):
-        pass # hours already in desired format
+        hours = list(hours) # coerce to list so it can be sorted
     else:
         n_hours = hours
         if n_hours < 1:
             raise ValueError('Number of hours has to be 1 or greater')
         now = datetime.datetime.utcnow()
-        hours = [(now - i * timedelta(hours=1)).strftime(HOUR_FMT) for i in range(n_hours)]
+        hours = [(now - i * datetime.timedelta(hours=1)).strftime(HOUR_FMT) for i in range(n_hours)]
     if start is not None:
         hours = [hour for hour in hours if hour >= start]
@@ -168,9 +172,9 @@ def backfill_node(base_dir, node, stream, variants, hours=None, start=None,
     if order == 'random':
         random.shuffle(hours)
     elif order == 'forward':
-        sort(hours)
+        hours.sort()
     elif order == 'reverse':
-        sort(hours, reverse=True)
+        hours.sort(reverse=True)
     for variant in variants:

@@ -8,6 +8,7 @@ import errno
 import itertools
 import logging
 import os
+import random
 import sys
 from collections import namedtuple

@@ -4,7 +4,6 @@ import errno
 import hashlib
 import logging
 import os
-import random
 import signal
 import sys
 import uuid
