In some formats, most notably DASH, there is "initialization data" that is required
in order to play the segment. The data is common to all segments, so it is served from a separate URL
under EXT-X-MAP. However, redundant copies of this data are benign and it's very small, so
we just put it in front of EVERY segment so that we can play every one independently (but
concatenating them still works).
We use a very simple cache to avoid downloading it again for every segment.
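As a rough illustration of the idea above (not the actual implementation), the cache and the prepend step might look like the sketch below. The names InitDataCache, with_init_data and the fetch callable are hypothetical; fetch stands in for whatever HTTP GET the downloader really uses, and the eviction limit is an arbitrary choice.

```python
from typing import Callable, Dict, Optional


class InitDataCache:
    """Hypothetical tiny cache of initialization data, keyed by EXT-X-MAP URL."""

    def __init__(self, fetch: Callable[[str], bytes], max_entries: int = 4):
        # fetch is an assumed callable that downloads a URL and returns its body.
        self._fetch = fetch
        self._max_entries = max_entries
        self._cache: Dict[str, bytes] = {}

    def get(self, url: str) -> bytes:
        if url not in self._cache:
            if len(self._cache) >= self._max_entries:
                # Evict the oldest entry (dicts preserve insertion order).
                self._cache.pop(next(iter(self._cache)))
            self._cache[url] = self._fetch(url)
        return self._cache[url]


def with_init_data(cache: InitDataCache, map_url: Optional[str], segment_body: bytes) -> bytes:
    """Return the segment with its initialization data prepended, if it has any."""
    if map_url is None:
        # No EXT-X-MAP applies to this segment, so it is playable as-is.
        return segment_body
    return cache.get(map_url) + segment_body
```

Since the map URL rarely changes within a stream, even a cache of a few entries means the initialization data is normally downloaded once per playlist rather than once per segment.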
* Checks for the SCTE35-OUT/SCTE35-IN marks in the HLS stream that indicate an ad start/end
* Ignores those segments completely
* Doesn't mark the StreamWorker as up until it sees the first non-ad segment
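The behaviour listed above can be sketched roughly as follows. This is a simplified line scan, not the real playlist parser: it assumes the ad marks appear as SCTE35-OUT / SCTE35-IN attributes on EXT-X-DATERANGE tags, and the function names non_ad_segment_uris and stream_is_up are made up for illustration.

```python
from typing import Iterable, List


def non_ad_segment_uris(playlist_lines: Iterable[str]) -> List[str]:
    """Return segment URIs from a media playlist, skipping those inside an ad break."""
    in_ad = False
    uris: List[str] = []
    for raw in playlist_lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#EXT-X-DATERANGE:"):
            # Assumption: SCTE35-OUT opens an ad break and SCTE35-IN closes it.
            if "SCTE35-OUT=" in line:
                in_ad = True
            elif "SCTE35-IN=" in line:
                in_ad = False
            continue
        if line.startswith("#"):
            # Any other tag or comment line is irrelevant to this sketch.
            continue
        # Every remaining non-empty line is a segment URI.
        if not in_ad:
            uris.append(line)
    return uris


def stream_is_up(playlist_lines: Iterable[str]) -> bool:
    """Only consider the stream up once at least one non-ad segment has been seen."""
    return bool(non_ad_segment_uris(playlist_lines))
```

A playlist that so far contains only ad segments yields an empty list, so under these assumptions the worker would stay marked as not-up until real content arrives.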
Some other operational notes:
* The main risk this adds is that re-connecting / refreshing the master playlist takes longer.
If all downloaders are doing this at the same time (i.e. because the stream only just came up,
or during a deployment rollout), all downloaders might be waiting for ads to finish at once and
you'll miss segments.
* We should run more downloaders to compensate. This also increases the chance that at least one of
them won't get any ads, so we get everything right from stream-up.
* The other mitigation is to run geographically diverse downloaders. This decreases the risk
that they all get served an ad, and at least at time of writing it seems that no in-stream ads
are served outside of these regions:
> US, Canada, Germany, France, Sweden, Belgium, Poland, Norway, Finland, Denmark, Netherlands, Italy, Spain, Switzerland, Austria, Portugal, UK, Australia, New Zealand
This is a useful library and we might as well use it.
Copying it over and slightly modifying it to work was easier than importing all of streamlink.
The original version may be found at 30043408c7/src/streamlink/stream/hls_playlist.py