Merge branch 'yt-dlp:master' into playsuisse-handle-locale-parameters-from-urls

pull/12466/head
v3DJG6GL 1 week ago committed by GitHub
commit b022b7a23e
GPG Key ID: B5690EEEBB952194

@ -758,3 +758,5 @@ somini
thedenv
vallovic
arabcoders
mireq
mlabeeb03

@ -4,6 +4,27 @@
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
-->
### 2025.03.31
#### Core changes
- [Add `--compat-options 2024`](https://github.com/yt-dlp/yt-dlp/commit/22e34adbd741e1c7072015debd615dc3fb71c401) ([#12789](https://github.com/yt-dlp/yt-dlp/issues/12789)) by [seproDev](https://github.com/seproDev)
#### Extractor changes
- **francaisfacile**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/bb321cfdc3fd4400598ddb12a15862bc2ac8fc10) ([#12787](https://github.com/yt-dlp/yt-dlp/issues/12787)) by [mlabeeb03](https://github.com/mlabeeb03)
- **generic**: [Validate response before checking m3u8 live status](https://github.com/yt-dlp/yt-dlp/commit/9a1ec1d36e172d252714cef712a6d091e0a0c4f2) ([#12784](https://github.com/yt-dlp/yt-dlp/issues/12784)) by [bashonly](https://github.com/bashonly)
- **microsoftlearnepisode**: [Extract more formats](https://github.com/yt-dlp/yt-dlp/commit/d63696f23a341ee36a3237ccb5d5e14b34c2c579) ([#12799](https://github.com/yt-dlp/yt-dlp/issues/12799)) by [bashonly](https://github.com/bashonly)
- **mlbtv**: [Fix radio-only extraction](https://github.com/yt-dlp/yt-dlp/commit/f033d86b96b36f8c5289dd7c3304f42d4d9f6ff4) ([#12792](https://github.com/yt-dlp/yt-dlp/issues/12792)) by [bashonly](https://github.com/bashonly)
- **on24**: [Support `mainEvent` URLs](https://github.com/yt-dlp/yt-dlp/commit/e465b078ead75472fcb7b86f6ccaf2b5d3bc4c21) ([#12800](https://github.com/yt-dlp/yt-dlp/issues/12800)) by [bashonly](https://github.com/bashonly)
- **sbs**: [Fix subtitles extraction](https://github.com/yt-dlp/yt-dlp/commit/29560359120f28adaaac67c86fa8442eb72daa0d) ([#12785](https://github.com/yt-dlp/yt-dlp/issues/12785)) by [bashonly](https://github.com/bashonly)
- **stvr**: [Rename extractor from RTVS to STVR](https://github.com/yt-dlp/yt-dlp/commit/5fc521cbd0ce7b2410d0935369558838728e205d) ([#12788](https://github.com/yt-dlp/yt-dlp/issues/12788)) by [mireq](https://github.com/mireq)
- **twitch**: clips: [Extract portrait formats](https://github.com/yt-dlp/yt-dlp/commit/61046c31612b30c749cbdae934b7fe26abe659d7) ([#12763](https://github.com/yt-dlp/yt-dlp/issues/12763)) by [DmitryScaletta](https://github.com/DmitryScaletta)
- **youtube**
- [Add `player_js_variant` extractor-arg](https://github.com/yt-dlp/yt-dlp/commit/07f04005e40ebdb368920c511e36e98af0077ed3) ([#12767](https://github.com/yt-dlp/yt-dlp/issues/12767)) by [bashonly](https://github.com/bashonly)
- tab: [Fix playlist continuation extraction](https://github.com/yt-dlp/yt-dlp/commit/6a6d97b2cbc78f818de05cc96edcdcfd52caa259) ([#12777](https://github.com/yt-dlp/yt-dlp/issues/12777)) by [coletdjnz](https://github.com/coletdjnz)
#### Misc. changes
- **cleanup**: Miscellaneous: [5e457af](https://github.com/yt-dlp/yt-dlp/commit/5e457af57fae9645b1b8fa0ed689229c8fb9656b) by [bashonly](https://github.com/bashonly)
### 2025.03.27
#### Core changes

@ -1782,6 +1782,7 @@ The following extractors use this feature:
* `data_sync_id`: Overrides the account Data Sync ID used in Innertube API requests. This may be needed if you are using an account with `youtube:player_skip=webpage,configs` or `youtubetab:skip=webpage`
* `visitor_data`: Overrides the Visitor Data used in Innertube API requests. This should be used with `player_skip=webpage,configs` and without cookies. Note: this may have adverse effects if used improperly. If a session from a browser is wanted, you should pass cookies instead (which contain the Visitor ID)
* `po_token`: Proof of Origin (PO) Token(s) to use. Comma-separated list of PO Tokens in the format `CLIENT.CONTEXT+PO_TOKEN`, e.g. `youtube:po_token=web.gvs+XXX,web.player+XXX,web_safari.gvs+YYY`. Context can be either `gvs` (Google Video Server URLs) or `player` (Innertube player request)
* `player_js_variant`: The player javascript variant to use for signature and nsig deciphering. The known variants are: `main`, `tce`, `tv`, `tv_es6`, `phone`, `tablet`. Only `main` is recommended as a possible workaround; the others are for debugging purposes. The default is to use what is prescribed by the site, and can be selected with `actual`
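As a rough illustration of what `player_js_variant` does, the sketch below rewrites a player URL to a requested variant. The variant map is copied from this diff; the helper names and the player ID regex are assumptions made for the example, not yt-dlp's actual helpers.

```python
import re
from urllib.parse import urljoin

# Variant map copied from the diff; everything else here is hypothetical.
PLAYER_JS_VARIANT_MAP = {
    'main': 'player_ias.vflset/en_US/base.js',
    'tce': 'player_ias_tce.vflset/en_US/base.js',
    'tv': 'tv-player-ias.vflset/tv-player-ias.js',
    'tv_es6': 'tv-player-es6.vflset/tv-player-es6.js',
    'phone': 'player-plasma-ias-phone-en_US.vflset/base.js',
    'tablet': 'player-plasma-ias-tablet-en_US.vflset/base.js',
}


def extract_player_id(player_url):
    # Assumption: the player ID is the path segment right after /s/player/
    mobj = re.search(r'/s/player/(?P<id>[a-zA-Z0-9_-]+)/', player_url)
    return mobj.group('id') if mobj else None


def apply_js_variant(player_url, variant='actual'):
    """Return the absolute player URL, swapping in the requested variant if it is known."""
    if variant in PLAYER_JS_VARIANT_MAP:
        player_id = extract_player_id(player_url)
        if player_id:
            player_url = f'/s/player/{player_id}/{PLAYER_JS_VARIANT_MAP[variant]}'
    # 'actual' (and any unknown name) leaves the site-prescribed URL untouched
    return urljoin('https://www.youtube.com', player_url)
```

On the command line the option would be passed as an extractor argument, e.g. `--extractor-args "youtube:player_js_variant=main"`.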
#### youtubetab (YouTube playlists, channels, feeds, etc.)
* `skip`: One or more of `webpage` (skip initial webpage download), `authcheck` (allow the download of playlists requiring authentication when no initial webpage is downloaded. This may cause unwanted behavior, see [#1122](https://github.com/yt-dlp/yt-dlp/pull/1122) for more details)
@ -2218,7 +2219,7 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
* Live chats (if available) are considered as subtitles. Use `--sub-langs all,-live_chat` to download all subtitles except live chat. You can also use `--compat-options no-live-chat` to prevent any live chat/danmaku from downloading
* YouTube channel URLs download all uploads of the channel. To download only the videos in a specific tab, pass the tab's URL. If the channel does not show the requested tab, an error will be raised. Also, `/live` URLs raise an error if there are no live videos instead of silently downloading the entire channel. You may use `--compat-options no-youtube-channel-redirect` to revert all these redirections
* Unavailable videos are also listed for YouTube playlists. Use `--compat-options no-youtube-unavailable-videos` to remove this
* The upload dates extracted from YouTube are in UTC [when available](https://github.com/yt-dlp/yt-dlp/blob/89e4d86171c7b7c997c77d4714542e0383bf0db0/yt_dlp/extractor/youtube.py#L3898-L3900). Use `--compat-options no-youtube-prefer-utc-upload-date` to prefer the non-UTC upload date.
* The upload dates extracted from YouTube are in UTC.
* If `ffmpeg` is used as the downloader, the downloading and merging of formats happen in a single step when possible. Use `--compat-options no-direct-merge` to revert this
* Thumbnail embedding in `mp4` is done with mutagen if possible. Use `--compat-options embed-thumbnail-atomicparsley` to force the use of AtomicParsley instead
* Some internal metadata such as filenames are removed by default from the infojson. Use `--no-clean-infojson` or `--compat-options no-clean-infojson` to revert this
@ -2237,9 +2238,10 @@ For ease of use, a few more compat options are available:
* `--compat-options all`: Use all compat options (**Do NOT use this!**)
* `--compat-options youtube-dl`: Same as `--compat-options all,-multistreams,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort`
* `--compat-options youtube-dlc`: Same as `--compat-options all,-no-live-chat,-no-youtube-channel-redirect,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort`
* `--compat-options 2021`: Same as `--compat-options 2022,no-certifi,filename-sanitization,no-youtube-prefer-utc-upload-date`
* `--compat-options 2021`: Same as `--compat-options 2022,no-certifi,filename-sanitization`
* `--compat-options 2022`: Same as `--compat-options 2023,playlist-match-filter,no-external-downloader-progress,prefer-legacy-http-handler,manifest-filesize-approx`
* `--compat-options 2023`: Same as `--compat-options prefer-vp9-sort`. Use this to enable all future compat options
* `--compat-options 2023`: Same as `--compat-options 2024,prefer-vp9-sort`
* `--compat-options 2024`: Currently does nothing. Use this to enable all future compat options
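The year aliases chain into one another (`2021` includes `2022`, which includes `2023`, which now includes `2024`). A minimal sketch of that expansion, using the updated alias definitions from this hunk; the table and function names are hypothetical and this is not how yt-dlp implements the option internally:

```python
# Alias table transcribed from the updated README lines in this diff.
COMPAT_ALIASES = {
    '2021': ['2022', 'no-certifi', 'filename-sanitization'],
    '2022': ['2023', 'playlist-match-filter', 'no-external-downloader-progress',
             'prefer-legacy-http-handler', 'manifest-filesize-approx'],
    '2023': ['2024', 'prefer-vp9-sort'],
    '2024': [],  # currently does nothing
}


def expand_compat(option):
    """Recursively expand a year alias into the concrete compat options it implies."""
    if option not in COMPAT_ALIASES:
        return [option]
    expanded = []
    for item in COMPAT_ALIASES[option]:
        expanded.extend(expand_compat(item))
    return expanded


print(expand_compat('2023'))
```

Because `2024` is empty for now, `2023` still expands to `prefer-vp9-sort` alone, so existing configurations that pin `--compat-options 2023` keep their behavior.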
The following compat options restore vulnerable behavior from before security patches:

@ -472,6 +472,7 @@ The only reliable way to check if a site is supported is to try it.
- **FoxNewsVideo**
- **FoxSports**
- **fptplay**: fptplay.vn
- **FrancaisFacile**
- **FranceCulture**
- **FranceInter**
- **francetv**
@ -1251,7 +1252,6 @@ The only reliable way to check if a site is supported is to try it.
- **rtve.es:infantil**: RTVE infantil
- **rtve.es:live**: RTVE.es live streams
- **rtve.es:television**
- **RTVS**
- **rtvslo.si**
- **rtvslo.si:show**
- **RudoVideo**
@ -1407,6 +1407,7 @@ The only reliable way to check if a site is supported is to try it.
- **StretchInternet**
- **Stripchat**
- **stv:player**
- **stvr**: Slovak Television and Radio (formerly RTVS)
- **Subsplash**
- **subsplash:playlist**
- **Substack**

@ -683,6 +683,7 @@ from .foxnews import (
)
from .foxsports import FoxSportsIE
from .fptplay import FptplayIE
from .francaisfacile import FrancaisFacileIE
from .franceinter import FranceInterIE
from .francetv import (
FranceTVIE,
@ -1738,6 +1739,7 @@ from .roosterteeth import (
RoosterTeethSeriesIE,
)
from .rottentomatoes import RottenTomatoesIE
from .roya import RoyaLiveIE
from .rozhlas import (
MujRozhlasIE,
RozhlasIE,

@ -0,0 +1,87 @@
import urllib.parse
from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
float_or_none,
url_or_none,
)
from ..utils.traversal import traverse_obj
class FrancaisFacileIE(InfoExtractor):
_VALID_URL = r'https?://francaisfacile\.rfi\.fr/[a-z]{2}/(?:actualit%C3%A9|podcasts/[^/#?]+)/(?P<id>[^/#?]+)'
_TESTS = [{
'url': 'https://francaisfacile.rfi.fr/fr/actualit%C3%A9/20250305-r%C3%A9concilier-les-jeunes-avec-la-lecture-gr%C3%A2ce-aux-r%C3%A9seaux-sociaux',
'md5': '4f33674cb205744345cc835991100afa',
'info_dict': {
'id': 'WBMZ58952-FLE-FR-20250305',
'display_id': '20250305-réconcilier-les-jeunes-avec-la-lecture-grâce-aux-réseaux-sociaux',
'title': 'Réconcilier les jeunes avec la lecture grâce aux réseaux sociaux',
'url': 'https://aod-fle.akamaized.net/fle/sounds/fr/2025/03/05/6b6af52a-f9ba-11ef-a1f8-005056a97652.mp3',
'ext': 'mp3',
'description': 'md5:b903c63d8585bd59e8cc4d5f80c4272d',
'duration': 103.15,
'timestamp': 1741177984,
'upload_date': '20250305',
},
}, {
'url': 'https://francaisfacile.rfi.fr/fr/actualit%C3%A9/20250307-argentine-le-sac-d-un-alpiniste-retrouv%C3%A9-40-ans-apr%C3%A8s-sa-mort',
'md5': 'b8c3a63652d4ae8e8092dda5700c1cd9',
'info_dict': {
'id': 'WBMZ59102-FLE-FR-20250307',
'display_id': '20250307-argentine-le-sac-d-un-alpiniste-retrouvé-40-ans-après-sa-mort',
'title': 'Argentine: le sac d\'un alpiniste retrouvé 40 ans après sa mort',
'url': 'https://aod-fle.akamaized.net/fle/sounds/fr/2025/03/07/8edf4082-fb46-11ef-8a37-005056bf762b.mp3',
'ext': 'mp3',
'description': 'md5:7fd088fbdf4a943bb68cf82462160dca',
'duration': 117.74,
'timestamp': 1741352789,
'upload_date': '20250307',
},
}, {
'url': 'https://francaisfacile.rfi.fr/fr/podcasts/un-mot-une-histoire/20250317-le-mot-de-david-foenkinos-peut-%C3%AAtre',
'md5': 'db83c2cc2589b4c24571c6b6cf14f5f1',
'info_dict': {
'id': 'WBMZ59441-FLE-FR-20250317',
'display_id': '20250317-le-mot-de-david-foenkinos-peut-être',
'title': 'Le mot de David Foenkinos: «peut-être» - Un mot, une histoire',
'url': 'https://aod-fle.akamaized.net/fle/sounds/fr/2025/03/17/4ca6cbbe-0315-11f0-a85b-005056a97652.mp3',
'ext': 'mp3',
'description': 'md5:3fe35fae035803df696bfa7af2496e49',
'duration': 198.96,
'timestamp': 1742210897,
'upload_date': '20250317',
},
}]
def _real_extract(self, url):
display_id = urllib.parse.unquote(self._match_id(url))
try: # yt-dlp's default user-agents are too old and blocked by the site
webpage = self._download_webpage(url, display_id, headers={
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:136.0) Gecko/20100101 Firefox/136.0',
})
except ExtractorError as e:
if not isinstance(e.cause, HTTPError) or e.cause.status != 403:
raise
# Retry with impersonation if hardcoded UA is insufficient
webpage = self._download_webpage(url, display_id, impersonate=True)
data = self._search_json(
r'<script[^>]+\bdata-media-id=[^>]+\btype="application/json"[^>]*>',
webpage, 'audio data', display_id)
return {
'id': data['mediaId'],
'display_id': display_id,
'vcodec': 'none',
'title': self._html_extract_title(webpage),
**self._search_json_ld(webpage, display_id, fatal=False),
**traverse_obj(data, {
'title': ('title', {str}),
'url': ('sources', ..., 'url', {url_or_none}, any),
'duration': ('sources', ..., 'duration', {float_or_none}, any),
}),
}
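To see how the `display_id` values in the tests above come out of the URLs, here is a small standalone sketch that applies the same `_VALID_URL` pattern and percent-decoding outside of yt-dlp. The function name is hypothetical.

```python
import re
import urllib.parse

# Pattern copied from the extractor's _VALID_URL in this diff
VALID_URL = (r'https?://francaisfacile\.rfi\.fr/[a-z]{2}/'
             r'(?:actualit%C3%A9|podcasts/[^/#?]+)/(?P<id>[^/#?]+)')


def display_id_from_url(url):
    """Return the percent-decoded slug, or None if the URL is not supported."""
    mobj = re.match(VALID_URL, url)
    if not mobj:
        return None
    return urllib.parse.unquote(mobj.group('id'))
```

Note that the pattern only accepts the percent-encoded form `actualit%C3%A9` in the path; a URL carrying a literal `é` there would not match.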

@ -2214,10 +2214,21 @@ class GenericIE(InfoExtractor):
if is_live is not None:
info['live_status'] = 'not_live' if is_live == 'false' else 'is_live'
return
headers = m3u8_format.get('http_headers') or info.get('http_headers')
duration = self._extract_m3u8_vod_duration(
m3u8_format['url'], info.get('id'), note='Checking m3u8 live status',
errnote='Failed to download m3u8 media playlist', headers=headers)
headers = m3u8_format.get('http_headers') or info.get('http_headers') or {}
display_id = info.get('id')
urlh = self._request_webpage(
m3u8_format['url'], display_id, 'Checking m3u8 live status', errnote=False,
headers={**headers, 'Accept-Encoding': 'identity'}, fatal=False)
if urlh is False:
return
first_bytes = urlh.read(512)
if not first_bytes.startswith(b'#EXTM3U'):
return
m3u8_doc = self._webpage_read_content(
urlh, urlh.url, display_id, prefix=first_bytes, fatal=False, errnote=False)
if not m3u8_doc:
return
duration = self._parse_m3u8_vod_duration(m3u8_doc, display_id)
if not duration:
info['live_status'] = 'is_live'
info['duration'] = info.get('duration') or duration
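The idea behind the new code path is to sniff the first bytes of the response before treating it as a playlist. A self-contained sketch of that check follows; the function name and the simplified duration parsing are assumptions for illustration, not the actual `_parse_m3u8_vod_duration` implementation.

```python
import io


def check_m3u8(stream, sniff_size=512):
    """Return (is_m3u8, vod_duration); vod_duration is None when no VOD duration is found."""
    first_bytes = stream.read(sniff_size)
    if not first_bytes.startswith(b'#EXTM3U'):
        # Not a playlist at all (e.g. an HTML error page): draw no conclusion
        return False, None
    doc = (first_bytes + stream.read()).decode('utf-8', errors='replace')
    # Assumption: a media playlist without #EXT-X-ENDLIST is still live
    if '#EXT-X-ENDLIST' not in doc:
        return True, None
    duration = sum(
        float(line[len('#EXTINF:'):].split(',')[0])
        for line in doc.splitlines() if line.startswith('#EXTINF:'))
    return True, duration or None


vod = b'#EXTM3U\n#EXTINF:4.0,\na.ts\n#EXTINF:6.5,\nb.ts\n#EXT-X-ENDLIST\n'
print(check_m3u8(io.BytesIO(vod)))
```

Only the `(True, None)` case would correspond to marking the stream as live; a non-playlist response returns early without changing the live status.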

@ -4,6 +4,7 @@ from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_iso8601,
parse_resolution,
traverse_obj,
unified_timestamp,
url_basename,
@ -83,8 +84,8 @@ class MicrosoftMediusBaseIE(InfoExtractor):
subtitles.setdefault(sub.pop('tag', 'und'), []).append(sub)
return subtitles
def _extract_ism(self, ism_url, video_id):
formats = self._extract_ism_formats(ism_url, video_id)
def _extract_ism(self, ism_url, video_id, fatal=True):
formats = self._extract_ism_formats(ism_url, video_id, fatal=fatal)
for fmt in formats:
if fmt['language'] != 'eng' and 'English' not in fmt['format_id']:
fmt['language_preference'] = -10
@ -218,9 +219,21 @@ class MicrosoftLearnEpisodeIE(MicrosoftMediusBaseIE):
'description': 'md5:7bbbfb593d21c2cf2babc3715ade6b88',
'timestamp': 1676339547,
'upload_date': '20230214',
'thumbnail': r're:https://learn\.microsoft\.com/video/media/.*\.png',
'thumbnail': r're:https://learn\.microsoft\.com/video/media/.+\.png',
'subtitles': 'count:14',
},
}, {
'url': 'https://learn.microsoft.com/en-gb/shows/on-demand-instructor-led-training-series/az-900-module-1',
'info_dict': {
'id': '4fe10f7c-d83c-463b-ac0e-c30a8195e01b',
'ext': 'mp4',
'title': 'AZ-900 Cloud fundamentals (1 of 6)',
'description': 'md5:3c2212ce865e9142f402c766441bd5c9',
'thumbnail': r're:https://.+/.+\.jpg',
'timestamp': 1706605184,
'upload_date': '20240130',
},
'params': {'format': 'bv[protocol=https]'},
}]
def _real_extract(self, url):
@ -230,9 +243,32 @@ class MicrosoftLearnEpisodeIE(MicrosoftMediusBaseIE):
entry_id = self._html_search_meta('entryId', webpage, 'entryId', fatal=True)
video_info = self._download_json(
f'https://learn.microsoft.com/api/video/public/v1/entries/{entry_id}', video_id)
formats = []
if ism_url := traverse_obj(video_info, ('publicVideo', 'adaptiveVideoUrl', {url_or_none})):
formats.extend(self._extract_ism(ism_url, video_id, fatal=False))
if hls_url := traverse_obj(video_info, ('publicVideo', 'adaptiveVideoHLSUrl', {url_or_none})):
formats.extend(self._extract_m3u8_formats(hls_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
if mpd_url := traverse_obj(video_info, ('publicVideo', 'adaptiveVideoDashUrl', {url_or_none})):
formats.extend(self._extract_mpd_formats(mpd_url, video_id, mpd_id='dash', fatal=False))
for key in ('low', 'medium', 'high'):
if video_url := traverse_obj(video_info, ('publicVideo', f'{key}QualityVideoUrl', {url_or_none})):
formats.append({
'url': video_url,
'format_id': f'video-http-{key}',
'acodec': 'none',
**parse_resolution(video_url),
})
if audio_url := traverse_obj(video_info, ('publicVideo', 'audioUrl', {url_or_none})):
formats.append({
'url': audio_url,
'format_id': 'audio-http',
'vcodec': 'none',
})
return {
'id': entry_id,
'formats': self._extract_ism(video_info['publicVideo']['adaptiveVideoUrl'], video_id),
'formats': formats,
'subtitles': self._sub_to_dict(traverse_obj(video_info, (
'publicVideo', 'captions', lambda _, v: url_or_none(v['url']), {
'tag': ('language', {str}),

@ -449,9 +449,7 @@ mutation initPlaybackSession(
if not (m3u8_url and token):
errors = '; '.join(traverse_obj(response, ('errors', ..., 'message', {str})))
if 'not entitled' in errors:
raise ExtractorError(errors, expected=True)
elif errors: # Only warn when 'blacked out' since radio formats are available
if errors: # Only warn when 'blacked out' or 'not entitled'; radio formats may be available
self.report_warning(f'API returned errors for {format_id}: {errors}')
else:
self.report_warning(f'No formats available for {format_id} broadcast; skipping')

@ -11,12 +11,15 @@ class On24IE(InfoExtractor):
IE_NAME = 'on24'
IE_DESC = 'ON24'
_VALID_URL = r'''(?x)
https?://event\.on24\.com/(?:
wcc/r/(?P<id_1>\d{7})/(?P<key_1>[0-9A-F]{32})|
eventRegistration/(?:console/EventConsoleApollo|EventLobbyServlet\?target=lobby30)
\.jsp\?(?:[^/#?]*&)?eventid=(?P<id_2>\d{7})[^/#?]*&key=(?P<key_2>[0-9A-F]{32})
)'''
_ID_RE = r'(?P<id>\d{7})'
_KEY_RE = r'(?P<key>[0-9A-F]{32})'
_URL_BASE_RE = r'https?://event\.on24\.com'
_URL_QUERY_RE = rf'(?:[^#]*&)?eventid={_ID_RE}&(?:[^#]+&)?key={_KEY_RE}'
_VALID_URL = [
rf'{_URL_BASE_RE}/wcc/r/{_ID_RE}/{_KEY_RE}',
rf'{_URL_BASE_RE}/eventRegistration/console/(?:EventConsoleApollo\.jsp|apollox/mainEvent/?)\?{_URL_QUERY_RE}',
rf'{_URL_BASE_RE}/eventRegistration/EventLobbyServlet/?\?{_URL_QUERY_RE}',
]
_TESTS = [{
'url': 'https://event.on24.com/eventRegistration/console/EventConsoleApollo.jsp?uimode=nextgeneration&eventid=2197467&sessionid=1&key=5DF57BE53237F36A43B478DD36277A84&contenttype=A&eventuserid=305999&playerwidth=1000&playerheight=650&caller=previewLobby&text_language_id=en&format=fhaudio&newConsole=false',
@ -34,12 +37,16 @@ class On24IE(InfoExtractor):
}, {
'url': 'https://event.on24.com/eventRegistration/console/EventConsoleApollo.jsp?&eventid=2639291&sessionid=1&username=&partnerref=&format=fhvideo1&mobile=&flashsupportedmobiledevice=&helpcenter=&key=82829018E813065A122363877975752E&newConsole=true&nxChe=true&newTabCon=true&text_language_id=en&playerwidth=748&playerheight=526&eventuserid=338788762&contenttype=A&mediametricsessionid=384764716&mediametricid=3558192&usercd=369267058&mode=launch',
'only_matching': True,
}, {
'url': 'https://event.on24.com/eventRegistration/EventLobbyServlet?target=reg20.jsp&eventid=3543176&key=BC0F6B968B67C34B50D461D40FDB3E18&groupId=3143628',
'only_matching': True,
}, {
'url': 'https://event.on24.com/eventRegistration/console/apollox/mainEvent?&eventid=4843671&sessionid=1&username=&partnerref=&format=fhvideo1&mobile=&flashsupportedmobiledevice=&helpcenter=&key=4EAC9B5C564CC98FF29E619B06A2F743&newConsole=true&nxChe=true&newTabCon=true&consoleEarEventConsole=false&consoleEarCloudApi=false&text_language_id=en&playerwidth=748&playerheight=526&referrer=https%3A%2F%2Fevent.on24.com%2Finterface%2Fregistration%2Fautoreg%2Findex.html%3Fsessionid%3D1%26eventid%3D4843671%26key%3D4EAC9B5C564CC98FF29E619B06A2F743%26email%3D000a3e42-7952-4dd6-8f8a-34c38ea3cf02%2540platform%26firstname%3Ds%26lastname%3Ds%26deletecookie%3Dtrue%26event_email%3DN%26marketing_email%3DN%26std1%3D0642572014177%26std2%3D0642572014179%26std3%3D550165f7-a44e-4725-9fe6-716f89908c2b%26std4%3D0&eventuserid=745776448&contenttype=A&mediametricsessionid=640613707&mediametricid=6810717&usercd=745776448&mode=launch',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = self._match_valid_url(url)
event_id = mobj.group('id_1') or mobj.group('id_2')
event_key = mobj.group('key_1') or mobj.group('key_2')
event_id, event_key = self._match_valid_url(url).group('id', 'key')
event_data = self._download_json(
'https://event.on24.com/apic/utilApp/EventConsoleCachedServlet',
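The refactored `_VALID_URL` is a list of patterns that share the same `id` and `key` group names, which is what lets `_real_extract` read both groups in one call. The sketch below reproduces the patterns from this diff and tries them in order with plain `re`; the `match_event` helper is hypothetical.

```python
import re

# Building blocks copied from the diff
ID_RE = r'(?P<id>\d{7})'
KEY_RE = r'(?P<key>[0-9A-F]{32})'
URL_BASE_RE = r'https?://event\.on24\.com'
URL_QUERY_RE = rf'(?:[^#]*&)?eventid={ID_RE}&(?:[^#]+&)?key={KEY_RE}'
VALID_URLS = [
    rf'{URL_BASE_RE}/wcc/r/{ID_RE}/{KEY_RE}',
    rf'{URL_BASE_RE}/eventRegistration/console/(?:EventConsoleApollo\.jsp|apollox/mainEvent/?)\?{URL_QUERY_RE}',
    rf'{URL_BASE_RE}/eventRegistration/EventLobbyServlet/?\?{URL_QUERY_RE}',
]


def match_event(url):
    """Return (event_id, event_key) from the first pattern that matches, else None."""
    for pattern in VALID_URLS:
        mobj = re.match(pattern, url)
        if mobj:
            return mobj.group('id', 'key')
    return None
```

Since each pattern defines `id` and `key` exactly once, there is no need for the numbered `id_1`/`id_2` groups that the old single verbose regex required.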

@ -0,0 +1,43 @@
from .common import InfoExtractor
from ..utils.traversal import traverse_obj
class RoyaLiveIE(InfoExtractor):
_VALID_URL = r'https?://roya\.tv/live-stream/(?P<id>\d+)'
_TESTS = [{
'url': 'https://roya.tv/live-stream/1',
'info_dict': {
'id': '1',
'title': r're:Roya TV \d{4}-\d{2}-\d{2} \d{2}:\d{2}',
'ext': 'mp4',
'live_status': 'is_live',
},
}, {
'url': 'https://roya.tv/live-stream/21',
'info_dict': {
'id': '21',
'title': r're:Roya News \d{4}-\d{2}-\d{2} \d{2}:\d{2}',
'ext': 'mp4',
'live_status': 'is_live',
},
}, {
'url': 'https://roya.tv/live-stream/10000',
'only_matching': True,
}]
def _real_extract(self, url):
media_id = self._match_id(url)
stream_url = self._download_json(
f'https://ticket.roya-tv.com/api/v5/fastchannel/{media_id}', media_id)['data']['secured_url']
title = traverse_obj(
self._download_json('https://backend.roya.tv/api/v01/channels/schedule-pagination', media_id, fatal=False),
('data', 0, 'channel', lambda _, v: str(v['id']) == media_id, 'title', {str}, any))
return {
'id': media_id,
'formats': self._extract_m3u8_formats(stream_url, media_id, 'mp4', m3u8_id='hls', live=True),
'title': title,
'is_live': True,
}

@ -9,7 +9,9 @@ from ..utils import (
class RTVSIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?rtvs\.sk/(?:radio|televizia)/archiv(?:/\d+)?/(?P<id>\d+)/?(?:[#?]|$)'
IE_NAME = 'stvr'
IE_DESC = 'Slovak Television and Radio (formerly RTVS)'
_VALID_URL = r'https?://(?:www\.)?(?:rtvs|stvr)\.sk/(?:radio|televizia)/archiv(?:/\d+)?/(?P<id>\d+)/?(?:[#?]|$)'
_TESTS = [{
# radio archive
'url': 'http://www.rtvs.sk/radio/archiv/11224/414872',
@ -19,7 +21,7 @@ class RTVSIE(InfoExtractor):
'ext': 'mp3',
'title': 'Ostrov pokladov 1 časť.mp3',
'duration': 2854,
'thumbnail': 'https://www.rtvs.sk/media/a501/image/file/2/0000/b1R8.rtvs.jpg',
'thumbnail': 'https://www.stvr.sk/media/a501/image/file/2/0000/rtvs-00009383.png',
'display_id': '135331',
},
}, {
@ -30,7 +32,7 @@ class RTVSIE(InfoExtractor):
'ext': 'mp4',
'title': 'Amaro Džives - Náš deň',
'description': 'Galavečer pri príležitosti Medzinárodného dňa Rómov.',
'thumbnail': 'https://www.rtvs.sk/media/a501/image/file/2/0031/L7Qm.amaro_dzives_png.jpg',
'thumbnail': 'https://www.stvr.sk/media/a501/image/file/2/0031/L7Qm.amaro_dzives_png.jpg',
'timestamp': 1428555900,
'upload_date': '20150409',
'duration': 4986,
@ -47,8 +49,11 @@ class RTVSIE(InfoExtractor):
'display_id': '307655',
'duration': 831,
'upload_date': '20211111',
'thumbnail': 'https://www.rtvs.sk/media/a501/image/file/2/0916/robin.jpg',
'thumbnail': 'https://www.stvr.sk/media/a501/image/file/2/0916/robin.jpg',
},
}, {
'url': 'https://www.stvr.sk/radio/archiv/11224/414872',
'only_matching': True,
}]
def _real_extract(self, url):

@ -122,6 +122,15 @@ class SBSIE(InfoExtractor):
if traverse_obj(media, ('partOfSeries', {dict})):
media['epName'] = traverse_obj(media, ('title', {str}))
# Need to set different language for forced subs or else they have priority over full subs
fixed_subtitles = {}
for lang, subs in subtitles.items():
for sub in subs:
fixed_lang = lang
if sub['url'].lower().endswith('_fe.vtt'):
fixed_lang += '-forced'
fixed_subtitles.setdefault(fixed_lang, []).append(sub)
return {
'id': video_id,
**traverse_obj(media, {
@ -151,6 +160,6 @@ class SBSIE(InfoExtractor):
}),
}),
'formats': formats,
'subtitles': subtitles,
'subtitles': fixed_subtitles,
'uploader': 'SBSC',
}
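The subtitle regrouping added above can be tried in isolation. This sketch wraps the same loop in a standalone function and runs it on made-up sample data; the function name and the example URLs are hypothetical.

```python
def split_forced_subtitles(subtitles):
    """Move forced-subtitle tracks (URLs ending in _fe.vtt) under a '<lang>-forced' key."""
    fixed = {}
    for lang, subs in subtitles.items():
        for sub in subs:
            fixed_lang = lang
            if sub['url'].lower().endswith('_fe.vtt'):
                fixed_lang += '-forced'
            fixed.setdefault(fixed_lang, []).append(sub)
    return fixed


# Made-up sample input: one full track and one forced track for the same language
sample = {'en': [
    {'url': 'https://example.com/subs/episode_en.vtt'},
    {'url': 'https://example.com/subs/episode_FE.vtt'},
]}
print(split_forced_subtitles(sample))
```

Keeping the forced track under its own language key means a plain `en` selection resolves to the full subtitles rather than the forced ones.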

@ -14,19 +14,20 @@ from ..utils import (
dict_get,
float_or_none,
int_or_none,
join_nonempty,
make_archive_id,
parse_duration,
parse_iso8601,
parse_qs,
qualities,
str_or_none,
traverse_obj,
try_get,
unified_timestamp,
update_url_query,
url_or_none,
urljoin,
)
from ..utils.traversal import traverse_obj, value
class TwitchBaseIE(InfoExtractor):
@ -42,10 +43,10 @@ class TwitchBaseIE(InfoExtractor):
'CollectionSideBar': '27111f1b382effad0b6def325caef1909c733fe6a4fbabf54f8d491ef2cf2f14',
'FilterableVideoTower_Videos': 'a937f1d22e269e39a03b509f65a7490f9fc247d7f83d6ac1421523e3b68042cb',
'ClipsCards__User': 'b73ad2bfaecfd30a9e6c28fada15bd97032c83ec77a0440766a56fe0bd632777',
'ShareClipRenderStatus': 'f130048a462a0ac86bb54d653c968c514e9ab9ca94db52368c1179e97b0f16eb',
'ChannelCollectionsContent': '447aec6a0cc1e8d0a8d7732d47eb0762c336a2294fdb009e9c9d854e49d484b9',
'StreamMetadata': 'a647c2a13599e5991e175155f798ca7f1ecddde73f7f341f39009c14dbf59962',
'ComscoreStreamingQuery': 'e1edae8122517d013405f237ffcc124515dc6ded82480a88daef69c83b53ac01',
'VideoAccessToken_Clip': '36b89d2507fce29e5ca551df756d27c1cfe079e2609642b4390aa4c35796eb11',
'VideoPreviewOverlay': '3006e77e51b128d838fa4e835723ca4dc9a05c5efd4466c1085215c6e437e65c',
'VideoMetadata': '49b5b8f268cdeb259d75b58dcb0c1a748e3b575003448a2333dc5cdafd49adad',
'VideoPlayer_ChapterSelectButtonVideo': '8d2793384aac3773beab5e59bd5d6f585aedb923d292800119e03d40cd0f9b41',
@ -1083,16 +1084,44 @@ class TwitchClipsIE(TwitchBaseIE):
'url': 'https://clips.twitch.tv/FaintLightGullWholeWheat',
'md5': '761769e1eafce0ffebfb4089cb3847cd',
'info_dict': {
'id': '42850523',
'id': '396245304',
'display_id': 'FaintLightGullWholeWheat',
'ext': 'mp4',
'title': 'EA Play 2016 Live from the Novo Theatre',
'duration': 32,
'view_count': int,
'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1465767393,
'upload_date': '20160612',
'creator': 'EA',
'uploader': 'stereotype_',
'uploader_id': '43566419',
'creators': ['EA'],
'channel': 'EA',
'channel_id': '25163635',
'channel_is_verified': False,
'channel_follower_count': int,
'uploader': 'EA',
'uploader_id': '25163635',
},
}, {
'url': 'https://www.twitch.tv/xqc/clip/CulturedAmazingKuduDatSheffy-TiZ_-ixAGYR3y2Uy',
'md5': 'e90fe616b36e722a8cfa562547c543f0',
'info_dict': {
'id': '3207364882',
'display_id': 'CulturedAmazingKuduDatSheffy-TiZ_-ixAGYR3y2Uy',
'ext': 'mp4',
'title': 'A day in the life of xQc',
'duration': 60,
'view_count': int,
'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1742869615,
'upload_date': '20250325',
'creators': ['xQc'],
'channel': 'xQc',
'channel_id': '71092938',
'channel_is_verified': True,
'channel_follower_count': int,
'uploader': 'xQc',
'uploader_id': '71092938',
'categories': ['Just Chatting'],
},
}, {
# multiple formats
@ -1116,16 +1145,14 @@ class TwitchClipsIE(TwitchBaseIE):
}]
def _real_extract(self, url):
video_id = self._match_id(url)
slug = self._match_id(url)
clip = self._download_gql(
video_id, [{
'operationName': 'VideoAccessToken_Clip',
'variables': {
'slug': video_id,
},
slug, [{
'operationName': 'ShareClipRenderStatus',
'variables': {'slug': slug},
}],
'Downloading clip access token GraphQL')[0]['data']['clip']
'Downloading clip GraphQL')[0]['data']['clip']
if not clip:
raise ExtractorError(
@ -1135,81 +1162,71 @@ class TwitchClipsIE(TwitchBaseIE):
'sig': clip['playbackAccessToken']['signature'],
'token': clip['playbackAccessToken']['value'],
}
data = self._download_base_gql(
video_id, {
'query': '''{
clip(slug: "%s") {
broadcaster {
displayName
}
createdAt
curator {
displayName
id
}
durationSeconds
id
tiny: thumbnailURL(width: 86, height: 45)
small: thumbnailURL(width: 260, height: 147)
medium: thumbnailURL(width: 480, height: 272)
title
videoQualities {
frameRate
quality
sourceURL
}
viewCount
}
}''' % video_id}, 'Downloading clip GraphQL', fatal=False) # noqa: UP031
if data:
clip = try_get(data, lambda x: x['data']['clip'], dict) or clip
asset_default = traverse_obj(clip, ('assets', 0, {dict})) or {}
asset_portrait = traverse_obj(clip, ('assets', 1, {dict})) or {}
formats = []
for option in clip.get('videoQualities', []):
if not isinstance(option, dict):
continue
source = url_or_none(option.get('sourceURL'))
if not source:
continue
default_aspect_ratio = float_or_none(asset_default.get('aspectRatio'))
formats.extend(traverse_obj(asset_default, ('videoQualities', lambda _, v: url_or_none(v['sourceURL']), {
'url': ('sourceURL', {update_url_query(query=access_query)}),
'format_id': ('quality', {str}),
'height': ('quality', {int_or_none}),
'fps': ('frameRate', {float_or_none}),
'aspect_ratio': {value(default_aspect_ratio)},
})))
portrait_aspect_ratio = float_or_none(asset_portrait.get('aspectRatio'))
for source in traverse_obj(asset_portrait, ('videoQualities', lambda _, v: url_or_none(v['sourceURL']))):
formats.append({
'url': update_url_query(source, access_query),
'format_id': option.get('quality'),
'height': int_or_none(option.get('quality')),
'fps': int_or_none(option.get('frameRate')),
'url': update_url_query(source['sourceURL'], access_query),
'format_id': join_nonempty('portrait', source.get('quality')),
'height': int_or_none(source.get('quality')),
'fps': float_or_none(source.get('frameRate')),
'aspect_ratio': portrait_aspect_ratio,
'quality': -2,
})
thumbnails = []
for thumbnail_id in ('tiny', 'small', 'medium'):
thumbnail_url = clip.get(thumbnail_id)
if not thumbnail_url:
continue
thumb = {
'id': thumbnail_id,
'url': thumbnail_url,
}
mobj = re.search(r'-(\d+)x(\d+)\.', thumbnail_url)
if mobj:
thumb.update({
'height': int(mobj.group(2)),
'width': int(mobj.group(1)),
})
thumbnails.append(thumb)
thumb_asset_default_url = url_or_none(asset_default.get('thumbnailURL'))
if thumb_asset_default_url:
thumbnails.append({
'id': 'default',
'url': thumb_asset_default_url,
'preference': 0,
})
if thumb_asset_portrait_url := url_or_none(asset_portrait.get('thumbnailURL')):
thumbnails.append({
'id': 'portrait',
'url': thumb_asset_portrait_url,
'preference': -1,
})
thumb_default_url = url_or_none(clip.get('thumbnailURL'))
if thumb_default_url and thumb_default_url != thumb_asset_default_url:
thumbnails.append({
'id': 'small',
'url': thumb_default_url,
'preference': -2,
})
old_id = self._search_regex(r'%7C(\d+)(?:-\d+)?.mp4', formats[-1]['url'], 'old id', default=None)
return {
'id': clip.get('id') or video_id,
'id': clip.get('id') or slug,
'_old_archive_ids': [make_archive_id(self, old_id)] if old_id else None,
'display_id': video_id,
'title': clip.get('title'),
'display_id': slug,
'formats': formats,
'duration': int_or_none(clip.get('durationSeconds')),
'view_count': int_or_none(clip.get('viewCount')),
'timestamp': unified_timestamp(clip.get('createdAt')),
'thumbnails': thumbnails,
'creator': try_get(clip, lambda x: x['broadcaster']['displayName'], str),
'uploader': try_get(clip, lambda x: x['curator']['displayName'], str),
'uploader_id': try_get(clip, lambda x: x['curator']['id'], str),
**traverse_obj(clip, {
'title': ('title', {str}),
'duration': ('durationSeconds', {int_or_none}),
'view_count': ('viewCount', {int_or_none}),
'timestamp': ('createdAt', {parse_iso8601}),
'creators': ('broadcaster', 'displayName', {str}, filter, all),
'channel': ('broadcaster', 'displayName', {str}),
'channel_id': ('broadcaster', 'id', {str}),
'channel_follower_count': ('broadcaster', 'followers', 'totalCount', {int_or_none}),
'channel_is_verified': ('broadcaster', 'isPartner', {bool}),
'uploader': ('broadcaster', 'displayName', {str}),
'uploader_id': ('broadcaster', 'id', {str}),
'categories': ('game', 'displayName', {str}, filter, all, filter),
}),
}

@ -544,7 +544,7 @@ class VKIE(VKBaseIE):
'uploader_id': (('author_id', 'authorId'), {str_or_none}, any),
'duration': ('duration', {int_or_none}),
'chapters': ('time_codes', lambda _, v: isinstance(v['time'], int), {
'title': ('text', {str}),
'title': ('text', {unescapeHTML}),
'start_time': 'time',
}),
}),

@ -1761,6 +1761,16 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
},
]
_PLAYER_JS_VARIANT_MAP = {
'main': 'player_ias.vflset/en_US/base.js',
'tce': 'player_ias_tce.vflset/en_US/base.js',
'tv': 'tv-player-ias.vflset/tv-player-ias.js',
'tv_es6': 'tv-player-es6.vflset/tv-player-es6.js',
'phone': 'player-plasma-ias-phone-en_US.vflset/base.js',
'tablet': 'player-plasma-ias-tablet-en_US.vflset/base.js',
}
_INVERSE_PLAYER_JS_VARIANT_MAP = {v: k for k, v in _PLAYER_JS_VARIANT_MAP.items()}
@classmethod
def suitable(cls, url):
from yt_dlp.utils import parse_qs
@@ -1940,6 +1950,21 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
get_all=False, expected_type=str)
if not player_url:
return
+requested_js_variant = self._configuration_arg('player_js_variant', [''])[0] or 'actual'
+if requested_js_variant in self._PLAYER_JS_VARIANT_MAP:
+player_id = self._extract_player_info(player_url)
+original_url = player_url
+player_url = f'/s/player/{player_id}/{self._PLAYER_JS_VARIANT_MAP[requested_js_variant]}'
+if original_url != player_url:
+self.write_debug(
+f'Forcing "{requested_js_variant}" player JS variant for player {player_id}\n'
+f' original url = {original_url}', only_once=True)
+elif requested_js_variant != 'actual':
+self.report_warning(
+f'Invalid player JS variant name "{requested_js_variant}" requested. '
+f'Valid choices are: {", ".join(self._PLAYER_JS_VARIANT_MAP)}', only_once=True)
return urljoin('https://www.youtube.com', player_url)
def _download_player_url(self, video_id, fatal=False):
@@ -1954,6 +1979,17 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
if player_version:
return f'https://www.youtube.com/s/player/{player_version}/player_ias.vflset/en_US/base.js'
+def _player_js_cache_key(self, player_url):
+player_id = self._extract_player_info(player_url)
+player_path = remove_start(urllib.parse.urlparse(player_url).path, f'/s/player/{player_id}/')
+variant = self._INVERSE_PLAYER_JS_VARIANT_MAP.get(player_path)
+if not variant:
+self.write_debug(
+f'Unable to determine player JS variant\n'
+f' player = {player_url}', only_once=True)
+variant = re.sub(r'[^a-zA-Z0-9]', '_', remove_end(player_path, '.js'))
+return join_nonempty(player_id, variant)
def _signature_cache_id(self, example_sig):
""" Return a string representation of a signature """
return '.'.join(str(len(part)) for part in example_sig.split('.'))
@@ -1969,25 +2005,24 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
return id_m.group('id')
def _load_player(self, video_id, player_url, fatal=True):
-player_id = self._extract_player_info(player_url)
-if player_id not in self._code_cache:
+player_js_key = self._player_js_cache_key(player_url)
+if player_js_key not in self._code_cache:
code = self._download_webpage(
player_url, video_id, fatal=fatal,
-note='Downloading player ' + player_id,
-errnote=f'Download of {player_url} failed')
+note=f'Downloading player {player_js_key}',
+errnote=f'Download of {player_js_key} failed')
if code:
-self._code_cache[player_id] = code
-return self._code_cache.get(player_id)
+self._code_cache[player_js_key] = code
+return self._code_cache.get(player_js_key)
def _extract_signature_function(self, video_id, player_url, example_sig):
-player_id = self._extract_player_info(player_url)
# Read from filesystem cache
-func_id = f'js_{player_id}_{self._signature_cache_id(example_sig)}'
+func_id = join_nonempty(
+self._player_js_cache_key(player_url), self._signature_cache_id(example_sig))
assert os.path.basename(func_id) == func_id
self.write_debug(f'Extracting signature function {func_id}')
-cache_spec, code = self.cache.load('youtube-sigfuncs', func_id, min_ver='2025.03.27'), None
+cache_spec, code = self.cache.load('youtube-sigfuncs', func_id, min_ver='2025.03.31'), None
if not cache_spec:
code = self._load_player(video_id, player_url)
@@ -2085,22 +2120,22 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
return ret
return inner
-def _load_nsig_code_from_cache(self, player_id):
-cache_id = ('nsig code', player_id)
+def _load_nsig_code_from_cache(self, player_url):
+cache_id = ('youtube-nsig', self._player_js_cache_key(player_url))
if func_code := self._player_cache.get(cache_id):
return func_code
-func_code = self.cache.load('youtube-nsig', player_id, min_ver='2025.03.27')
+func_code = self.cache.load(*cache_id, min_ver='2025.03.31')
if func_code:
self._player_cache[cache_id] = func_code
return func_code
-def _store_nsig_code_to_cache(self, player_id, func_code):
-cache_id = ('nsig code', player_id)
+def _store_nsig_code_to_cache(self, player_url, func_code):
+cache_id = ('youtube-nsig', self._player_js_cache_key(player_url))
if cache_id not in self._player_cache:
-self.cache.store('youtube-nsig', player_id, func_code)
+self.cache.store(*cache_id, func_code)
self._player_cache[cache_id] = func_code
def _decrypt_signature(self, s, video_id, player_url):
@@ -2144,7 +2179,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
self.write_debug(f'Decrypted nsig {s} => {ret}')
# Only cache nsig func JS code to disk if successful, and only once
-self._store_nsig_code_to_cache(player_id, func_code)
+self._store_nsig_code_to_cache(player_url, func_code)
return ret
def _extract_n_function_name(self, jscode, player_url=None):
@@ -2263,7 +2298,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
def _extract_n_function_code(self, video_id, player_url):
player_id = self._extract_player_info(player_url)
-func_code = self._load_nsig_code_from_cache(player_id)
+func_code = self._load_nsig_code_from_cache(player_url)
jscode = func_code or self._load_player(video_id, player_url)
jsi = JSInterpreter(jscode)
@@ -3226,7 +3261,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
if player_url:
self.report_warning(
f'nsig extraction failed: Some formats may be missing\n'
-f' n = {query["n"][0]} ; player = {player_url}',
+f' n = {query["n"][0]} ; player = {player_url}\n'
+f' {bug_reports_message(before="")}',
video_id=video_id, only_once=True)
self.write_debug(e, only_once=True)
else:
@@ -3244,7 +3280,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
is_damaged = try_call(lambda: format_duration < duration // 2)
if is_damaged:
self.report_warning(
-f'{video_id}: Some formats are possibly damaged. They will be deprioritized', only_once=True)
+'Some formats are possibly damaged. They will be deprioritized', video_id, only_once=True)
po_token = fmt.get(STREAMING_DATA_INITIAL_PO_TOKEN)
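
The cache-key derivation added in `_player_js_cache_key` above can be sketched outside yt-dlp roughly as follows. This is a standalone approximation under stated assumptions, not the project's code: `extract_player_id` and its regex are hypothetical stand-ins for `_extract_player_info`, plain string slicing stands in for `remove_start` and `remove_end`, and joining with `-` assumes that is the default delimiter of `join_nonempty`. The player ID `0123abcd` is made up for illustration.

```python
import re
import urllib.parse

# Copied from the diff: variant name -> path below /s/player/<id>/
PLAYER_JS_VARIANT_MAP = {
    'main': 'player_ias.vflset/en_US/base.js',
    'tce': 'player_ias_tce.vflset/en_US/base.js',
    'tv': 'tv-player-ias.vflset/tv-player-ias.js',
    'tv_es6': 'tv-player-es6.vflset/tv-player-es6.js',
    'phone': 'player-plasma-ias-phone-en_US.vflset/base.js',
    'tablet': 'player-plasma-ias-tablet-en_US.vflset/base.js',
}
INVERSE_PLAYER_JS_VARIANT_MAP = {v: k for k, v in PLAYER_JS_VARIANT_MAP.items()}


def extract_player_id(player_url):
    # Hypothetical helper: assumes the ID is the path segment after /s/player/
    match = re.search(r'/s/player/([^/]+)/', player_url)
    if not match:
        raise ValueError(f'Cannot find a player ID in {player_url!r}')
    return match.group(1)


def player_js_cache_key(player_url):
    player_id = extract_player_id(player_url)
    path = urllib.parse.urlparse(player_url).path
    prefix = f'/s/player/{player_id}/'
    # Strip the common prefix so only the variant-specific path remains
    player_path = path[len(prefix):] if path.startswith(prefix) else path
    variant = INVERSE_PLAYER_JS_VARIANT_MAP.get(player_path)
    if not variant:
        # Unknown variant: fall back to a filesystem-safe form of the path
        stem = player_path[:-len('.js')] if player_path.endswith('.js') else player_path
        variant = re.sub(r'[^a-zA-Z0-9]', '_', stem)
    # Assumed delimiter; empty parts are dropped before joining
    return '-'.join(part for part in (player_id, variant) if part)


if __name__ == '__main__':
    base = 'https://www.youtube.com/s/player/0123abcd/'
    for path in PLAYER_JS_VARIANT_MAP.values():
        print(player_js_cache_key(base + path))
```

Because the variant is part of the key, two JS variants of the same player ID no longer share one entry in the code, signature, or nsig caches.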

@@ -500,7 +500,8 @@ def create_parser():
'youtube-dlc': ['all', '-no-youtube-channel-redirect', '-no-live-chat', '-playlist-match-filter', '-manifest-filesize-approx', '-allow-unsafe-ext', '-prefer-vp9-sort'],
'2021': ['2022', 'no-certifi', 'filename-sanitization'],
'2022': ['2023', 'no-external-downloader-progress', 'playlist-match-filter', 'prefer-legacy-http-handler', 'manifest-filesize-approx'],
-'2023': ['prefer-vp9-sort'],
+'2023': ['2024', 'prefer-vp9-sort'],
+'2024': [],
},
}, help=(
'Options that can help keep compatibility with youtube-dl or youtube-dlc '
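
The year aliases in the hunk above chain into each other: `2021` includes `2022`, which includes `2023`, which now includes `2024`. A minimal sketch of how such a chain resolves to concrete option names is shown below. The `expand` function is a hypothetical illustration, not yt-dlp's option-parsing code, and it only covers the year entries visible in this hunk.

```python
# Year aliases as they appear after this change (other aliases omitted)
ALIASES = {
    '2021': ['2022', 'no-certifi', 'filename-sanitization'],
    '2022': ['2023', 'no-external-downloader-progress', 'playlist-match-filter',
             'prefer-legacy-http-handler', 'manifest-filesize-approx'],
    '2023': ['2024', 'prefer-vp9-sort'],
    '2024': [],
}


def expand(name, aliases=ALIASES):
    """Recursively resolve an alias into the set of concrete option names."""
    if name not in aliases:
        return {name}  # already a concrete option
    resolved = set()
    for item in aliases[name]:
        resolved |= expand(item, aliases)
    return resolved


if __name__ == '__main__':
    for year in ALIASES:
        print(year, sorted(expand(year)))
```

In this sketch `2024` resolves to an empty set, so it changes nothing yet; it exists so that later additions to `2024` are picked up automatically by every earlier year.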

@@ -1,8 +1,8 @@
# Autogenerated by devscripts/update-version.py
-__version__ = '2025.03.27'
+__version__ = '2025.03.31'
-RELEASE_GIT_HEAD = '48be862b32648bff5b3e553e40fca4dcc6e88b28'
+RELEASE_GIT_HEAD = '5e457af57fae9645b1b8fa0ed689229c8fb9656b'
VARIANT = None
@@ -12,4 +12,4 @@ CHANNEL = 'stable'
ORIGIN = 'yt-dlp/yt-dlp'
-_pkg_version = '2025.03.27'
+_pkg_version = '2025.03.31'
