Merge branch 'yt-dlp:master' into telemb

doe1080 3 months ago committed by GitHub
commit ce2fb298f3

@@ -775,3 +775,7 @@ GeoffreyFrogeye
Pawka
v3DJG6GL
yozel
brian6932
iednod55
maxbin123
nullpos

@@ -4,6 +4,61 @@
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
-->
### 2025.06.09
#### Extractor changes
- [Improve JSON LD thumbnails extraction](https://github.com/yt-dlp/yt-dlp/commit/85c8a405e3651dc041b758f4744d4fb3c4c55e01) ([#13368](https://github.com/yt-dlp/yt-dlp/issues/13368)) by [bashonly](https://github.com/bashonly), [doe1080](https://github.com/doe1080)
- **10play**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/6d265388c6e943419ac99e9151cf75a3265f980f) ([#13349](https://github.com/yt-dlp/yt-dlp/issues/13349)) by [bashonly](https://github.com/bashonly)
- **adobepass**
- [Add Fubo MSO](https://github.com/yt-dlp/yt-dlp/commit/eee90acc47d7f8de24afaa8b0271ccaefdf6e88c) ([#13131](https://github.com/yt-dlp/yt-dlp/issues/13131)) by [maxbin123](https://github.com/maxbin123)
- [Always add newer user-agent when required](https://github.com/yt-dlp/yt-dlp/commit/0ee1102268cf31b07f8a8318a47424c66b2f7378) ([#13131](https://github.com/yt-dlp/yt-dlp/issues/13131)) by [bashonly](https://github.com/bashonly)
- [Fix Philo MSO authentication](https://github.com/yt-dlp/yt-dlp/commit/943083edcd3df45aaa597a6967bc6c95b720f54c) ([#13335](https://github.com/yt-dlp/yt-dlp/issues/13335)) by [Sipherdrakon](https://github.com/Sipherdrakon)
- [Rework to require software statement](https://github.com/yt-dlp/yt-dlp/commit/711c5d5d098fee2992a1a624b1c4b30364b91426) ([#13131](https://github.com/yt-dlp/yt-dlp/issues/13131)) by [bashonly](https://github.com/bashonly), [maxbin123](https://github.com/maxbin123)
- [Validate login URL before sending credentials](https://github.com/yt-dlp/yt-dlp/commit/89c1b349ad81318d9d3bea76c01c891696e58d38) ([#13131](https://github.com/yt-dlp/yt-dlp/issues/13131)) by [bashonly](https://github.com/bashonly)
- **aenetworks**
- [Fix playlist extractors](https://github.com/yt-dlp/yt-dlp/commit/f37d599a697e82fe68b423865897d55bae34f373) ([#13408](https://github.com/yt-dlp/yt-dlp/issues/13408)) by [Sipherdrakon](https://github.com/Sipherdrakon)
- [Fix provider-locked content extraction](https://github.com/yt-dlp/yt-dlp/commit/6693d6603358ae6beca834dbd822a7917498b813) ([#13131](https://github.com/yt-dlp/yt-dlp/issues/13131)) by [maxbin123](https://github.com/maxbin123)
- **bilibilibangumi**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/13e55162719528d42d2133e16b65ff59a667a6e4) ([#13416](https://github.com/yt-dlp/yt-dlp/issues/13416)) by [c-basalt](https://github.com/c-basalt)
- **brightcove**: new: [Adapt to new AdobePass requirement](https://github.com/yt-dlp/yt-dlp/commit/98f8eec956e3b16cb66a3d49cc71af3807db795e) ([#13131](https://github.com/yt-dlp/yt-dlp/issues/13131)) by [bashonly](https://github.com/bashonly)
- **cu.ntv.co.jp**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/aa863ddab9b1d104678e9cf39bb76f5b14fca660) ([#13302](https://github.com/yt-dlp/yt-dlp/issues/13302)) by [doe1080](https://github.com/doe1080), [nullpos](https://github.com/nullpos)
- **go**: [Fix provider-locked content extraction](https://github.com/yt-dlp/yt-dlp/commit/2e5bf002dad16f5ce35aa2023d392c9e518fcd8f) ([#13131](https://github.com/yt-dlp/yt-dlp/issues/13131)) by [bashonly](https://github.com/bashonly), [maxbin123](https://github.com/maxbin123)
- **nbc**: [Rework and adapt extractors to new AdobePass flow](https://github.com/yt-dlp/yt-dlp/commit/2d7949d5642bc37d1e71bf00c9a55260e5505d58) ([#13131](https://github.com/yt-dlp/yt-dlp/issues/13131)) by [bashonly](https://github.com/bashonly)
- **nobelprize**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/97ddfefeb4faba6e61cd80996c16952b8eab16f3) ([#13205](https://github.com/yt-dlp/yt-dlp/issues/13205)) by [doe1080](https://github.com/doe1080)
- **odnoklassniki**: [Detect and raise when login is required](https://github.com/yt-dlp/yt-dlp/commit/148a1eb4c59e127965396c7a6e6acf1979de459e) ([#13361](https://github.com/yt-dlp/yt-dlp/issues/13361)) by [bashonly](https://github.com/bashonly)
- **patreon**: [Fix m3u8 formats extraction](https://github.com/yt-dlp/yt-dlp/commit/e0d6c0822930f6e63f574d46d946a58b73ecd10c) ([#13266](https://github.com/yt-dlp/yt-dlp/issues/13266)) by [bashonly](https://github.com/bashonly) (With fixes in [1a8a03e](https://github.com/yt-dlp/yt-dlp/commit/1a8a03ea8d827107319a18076ee3505090667c5a))
- **podchaser**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/538eb305673c26bff6a2b12f1c96375fe02ce41a) ([#13271](https://github.com/yt-dlp/yt-dlp/issues/13271)) by [bashonly](https://github.com/bashonly)
- **sr**: mediathek: [Improve metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/e3c605a61f4cc2de9059f37434fa108c3c20f58e) ([#13294](https://github.com/yt-dlp/yt-dlp/issues/13294)) by [doe1080](https://github.com/doe1080)
- **stacommu**: [Avoid partial stream formats](https://github.com/yt-dlp/yt-dlp/commit/5d96527be80dc1ed1702d9cd548ff86de570ad70) ([#13412](https://github.com/yt-dlp/yt-dlp/issues/13412)) by [bashonly](https://github.com/bashonly)
- **startrek**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/a8bf0011bde92b3f1324a98bfbd38932fd3ebe18) ([#13188](https://github.com/yt-dlp/yt-dlp/issues/13188)) by [doe1080](https://github.com/doe1080)
- **svt**: play: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/e1b6062f8c4a3fa33c65269d48d09ec78de765a2) ([#13329](https://github.com/yt-dlp/yt-dlp/issues/13329)) by [barsnick](https://github.com/barsnick), [bashonly](https://github.com/bashonly)
- **telecinco**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/03dba2012d9bd3f402fa8c2f122afba89bbd22a4) ([#13379](https://github.com/yt-dlp/yt-dlp/issues/13379)) by [bashonly](https://github.com/bashonly)
- **theplatform**: [Improve metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/ed108b3ea481c6a4b5215a9302ba92d74baa2425) ([#13131](https://github.com/yt-dlp/yt-dlp/issues/13131)) by [bashonly](https://github.com/bashonly)
- **toutiao**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/f8051e3a61686c5db1de5f5746366ecfbc3ad20c) ([#13246](https://github.com/yt-dlp/yt-dlp/issues/13246)) by [doe1080](https://github.com/doe1080)
- **turner**: [Adapt extractors to new AdobePass flow](https://github.com/yt-dlp/yt-dlp/commit/0daddc780d3ac5bebc3a3ec5b884d9243cbc0745) ([#13131](https://github.com/yt-dlp/yt-dlp/issues/13131)) by [bashonly](https://github.com/bashonly)
- **twitcasting**: [Fix password-protected livestream support](https://github.com/yt-dlp/yt-dlp/commit/52f9729c9a92ad4656d746ff0b1acecb87b3e96d) ([#13097](https://github.com/yt-dlp/yt-dlp/issues/13097)) by [bashonly](https://github.com/bashonly)
- **twitter**: broadcast: [Support events URLs](https://github.com/yt-dlp/yt-dlp/commit/7794374de8afb20499b023107e2abfd4e6b93ee4) ([#13248](https://github.com/yt-dlp/yt-dlp/issues/13248)) by [doe1080](https://github.com/doe1080)
- **umg**: de: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/4e7c1ea346b510280218b47e8653dbbca3a69870) ([#13373](https://github.com/yt-dlp/yt-dlp/issues/13373)) by [doe1080](https://github.com/doe1080)
- **vice**: [Mark extractors as broken](https://github.com/yt-dlp/yt-dlp/commit/6121559e027a04574690799c1776bc42bb51af31) ([#13131](https://github.com/yt-dlp/yt-dlp/issues/13131)) by [bashonly](https://github.com/bashonly)
- **vimeo**: [Extract subtitles from player subdomain](https://github.com/yt-dlp/yt-dlp/commit/c723c4e5e78263df178dbe69844a3d05f3ef9e35) ([#13350](https://github.com/yt-dlp/yt-dlp/issues/13350)) by [bashonly](https://github.com/bashonly)
- **watchespn**: [Fix provider-locked content extraction](https://github.com/yt-dlp/yt-dlp/commit/b094747e93cfb0a2c53007120e37d0d84d41f030) ([#13131](https://github.com/yt-dlp/yt-dlp/issues/13131)) by [maxbin123](https://github.com/maxbin123)
- **weverse**: [Support login with oauth refresh tokens](https://github.com/yt-dlp/yt-dlp/commit/3fe72e9eea38d9a58211cde42cfaa577ce020e2c) ([#13284](https://github.com/yt-dlp/yt-dlp/issues/13284)) by [bashonly](https://github.com/bashonly)
- **youtube**
- [Add `tv_simply` player client](https://github.com/yt-dlp/yt-dlp/commit/1fd0e88b67db53ad163393d6965f68e908fa70e3) ([#13389](https://github.com/yt-dlp/yt-dlp/issues/13389)) by [gamer191](https://github.com/gamer191)
- [Extract srt subtitles](https://github.com/yt-dlp/yt-dlp/commit/231349786e8c42089c2e079ec94c0ea866c37999) ([#13411](https://github.com/yt-dlp/yt-dlp/issues/13411)) by [gamer191](https://github.com/gamer191)
- [Fix `--mark-watched` support](https://github.com/yt-dlp/yt-dlp/commit/b5be29fa58ec98226e11621fd9c58585bcff6879) ([#13222](https://github.com/yt-dlp/yt-dlp/issues/13222)) by [brian6932](https://github.com/brian6932), [iednod55](https://github.com/iednod55)
- [Fix automatic captions for some client combinations](https://github.com/yt-dlp/yt-dlp/commit/53ea743a9c158f8ca2d75a09ca44ba68606042d8) ([#13268](https://github.com/yt-dlp/yt-dlp/issues/13268)) by [bashonly](https://github.com/bashonly)
- [Improve signature extraction debug output](https://github.com/yt-dlp/yt-dlp/commit/d30a49742cfa22e61c47df4ac0e7334d648fb85d) ([#13327](https://github.com/yt-dlp/yt-dlp/issues/13327)) by [bashonly](https://github.com/bashonly)
- [Rework nsig function name extraction](https://github.com/yt-dlp/yt-dlp/commit/9e38b273b7ac942e7e9fc05a651ed810ab7d30ba) ([#13403](https://github.com/yt-dlp/yt-dlp/issues/13403)) by [Grub4K](https://github.com/Grub4K)
- [nsig code improvements and cleanup](https://github.com/yt-dlp/yt-dlp/commit/f7bbf5a617f9ab54ef51eaef99be36e175b5e9c3) ([#13280](https://github.com/yt-dlp/yt-dlp/issues/13280)) by [bashonly](https://github.com/bashonly)
- **zdf**: [Fix language extraction and format sorting](https://github.com/yt-dlp/yt-dlp/commit/db162b76f6bdece50babe2e0cacfe56888c2e125) ([#13313](https://github.com/yt-dlp/yt-dlp/issues/13313)) by [InvalidUsernameException](https://github.com/InvalidUsernameException)
#### Misc. changes
- **build**
- [Exclude `pkg_resources` from being collected](https://github.com/yt-dlp/yt-dlp/commit/cc749a8a3b8b6e5c05318868c72a403f376a1b38) ([#13320](https://github.com/yt-dlp/yt-dlp/issues/13320)) by [bashonly](https://github.com/bashonly)
- [Fix macOS requirements caching](https://github.com/yt-dlp/yt-dlp/commit/201812100f315c6727a4418698d5b4e8a79863d4) ([#13328](https://github.com/yt-dlp/yt-dlp/issues/13328)) by [bashonly](https://github.com/bashonly)
- **cleanup**: Miscellaneous: [339614a](https://github.com/yt-dlp/yt-dlp/commit/339614a173c74b42d63e858c446a9cae262a13af) by [bashonly](https://github.com/bashonly)
- **test**: postprocessors: [Remove binary thumbnail test data](https://github.com/yt-dlp/yt-dlp/commit/a9b370069838e84d44ac7ad095d657003665885a) ([#13341](https://github.com/yt-dlp/yt-dlp/issues/13341)) by [bashonly](https://github.com/bashonly)
### 2025.05.22
#### Core changes

@@ -1795,9 +1795,9 @@ Note: In CLI, `ARG` can use `-` instead of `_`; e.g. `youtube:player-client` be
The following extractors use this feature:
#### youtube
* `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube/_base.py](https://github.com/yt-dlp/yt-dlp/blob/415b4c9f955b1a0391204bd24a7132590e7b3bdb/yt_dlp/extractor/youtube/_base.py#L402-L409) for the list of supported content language codes
* `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
* `player_client`: Clients to extract video data from. The currently available clients are `web`, `web_safari`, `web_embedded`, `web_music`, `web_creator`, `mweb`, `ios`, `android`, `android_vr`, `tv`, `tv_simply` and `tv_embedded`. By default, `tv,ios,web` is used, or `tv,web` is used when authenticating with cookies. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `web_embedded` client is added for age-restricted videos but only works if the video is embeddable. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as `web_creator`, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player), `initial_data` (skip initial data/next ep request). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause issues such as missing formats or metadata. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) and [#12826](https://github.com/yt-dlp/yt-dlp/issues/12826) for more details
* `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp.
* `player_js_variant`: The player javascript variant to use for signature and nsig deciphering. The known variants are: `main`, `tce`, `tv`, `tv_es6`, `phone`, `tablet`. Only `main` is recommended as a possible workaround; the others are for debugging purposes. The default is to use what is prescribed by the site, and can be selected with `actual`
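Several of these arguments can be combined in a single `--extractor-args` option, separated by `;`. A hypothetical `yt-dlp.conf` fragment (the values here are illustrative choices, not defaults):

```
# Prefer German translated metadata, skip DASH manifest extraction,
# and use the default player clients minus ios
--extractor-args "youtube:lang=de;skip=dash;player_client=default,-ios"
```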

@@ -5,6 +5,8 @@ If a site is not listed here, it might still be supported by yt-dlp's embed extr
Not all sites listed here are guaranteed to work; websites are constantly changing and sometimes this breaks yt-dlp's support for them.
The only reliable way to check if a site is supported is to try it.
- **10play**: [*10play*](## "netrc machine")
- **10play:season**
- **17live**
- **17live:clip**
- **17live:vod**
@@ -295,7 +297,7 @@ The only reliable way to check if a site is supported is to try it.
- **CNNIndonesia**
- **ComedyCentral**
- **ComedyCentralTV**
- **ConanClassic**: (**Currently broken**)
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
- **CONtv**
- **CookingChannel**
@@ -317,7 +319,7 @@ The only reliable way to check if a site is supported is to try it.
- **CtsNews**: 華視新聞
- **CTV**
- **CTVNews**
- **cu.ntv.co.jp**: 日テレ無料TADA!
- **CultureUnplugged**
- **curiositystream**: [*curiositystream*](## "netrc machine")
- **curiositystream:collections**: [*curiositystream*](## "netrc machine")
@@ -882,19 +884,19 @@ The only reliable way to check if a site is supported is to try it.
- **Naver**
- **Naver:live**
- **navernow**
- **nba**: (**Currently broken**)
- **nba:channel**: (**Currently broken**)
- **nba:embed**: (**Currently broken**)
- **nba:watch**: (**Currently broken**)
- **nba:watch:collection**: (**Currently broken**)
- **nba:watch:embed**: (**Currently broken**)
- **NBC**
- **NBCNews**
- **nbcolympics**
- **nbcolympics:stream**: (**Currently broken**)
- **NBCSports**: (**Currently broken**)
- **NBCSportsStream**: (**Currently broken**)
- **NBCSportsVPlayer**: (**Currently broken**)
- **NBCStations**
- **ndr**: NDR.de - Norddeutscher Rundfunk
- **ndr:embed**
@@ -970,7 +972,7 @@ The only reliable way to check if a site is supported is to try it.
- **Nitter**
- **njoy**: N-JOY
- **njoy:embed**
- **NobelPrize**
- **NoicePodcast**
- **NonkTube**
- **NoodleMagazine**
@@ -1393,14 +1395,14 @@ The only reliable way to check if a site is supported is to try it.
- **SpreakerShow**
- **SpringboardPlatform**
- **SproutVideo**
- **sr:mediathek**: Saarländischer Rundfunk
- **SRGSSR**
- **SRGSSRPlay**: srf.ch, rts.ch, rsi.ch, rtr.ch and swissinfo.ch play sites
- **StacommuLive**: [*stacommu*](## "netrc machine")
- **StacommuVOD**: [*stacommu*](## "netrc machine")
- **StagePlusVODConcert**: [*stageplus*](## "netrc machine")
- **stanfordoc**: Stanford Open ClassRoom
- **startrek**: STAR TREK
- **startv**
- **Steam**
- **SteamCommunityBroadcast**
@@ -1423,12 +1425,11 @@ The only reliable way to check if a site is supported is to try it.
- **SunPorno**
- **sverigesradio:episode**
- **sverigesradio:publication**
- **svt:page**
- **svt:play**: SVT Play and Öppet arkiv
- **svt:play:series**
- **SwearnetEpisode**
- **Syfy**
- **SYVDK**
- **SztvHu**
- **t-online.de**: (**Currently broken**)
@@ -1472,8 +1473,6 @@ The only reliable way to check if a site is supported is to try it.
- **Telewebion**: (**Currently broken**)
- **Tempo**
- **TennisTV**: [*tennistv*](## "netrc machine")
- **TF1**
- **TFO**
- **theatercomplextown:ppv**: [*theatercomplextown*](## "netrc machine")
@@ -1511,6 +1510,7 @@ The only reliable way to check if a site is supported is to try it.
- **tokfm:podcast**
- **ToonGoggles**
- **tou.tv**: [*toutv*](## "netrc machine")
- **toutiao**: 今日头条
- **Toypics**: Toypics video (**Currently broken**)
- **ToypicsUser**: Toypics user profile (**Currently broken**)
- **TrailerAddict**: (**Currently broken**)
@@ -1600,7 +1600,7 @@ The only reliable way to check if a site is supported is to try it.
- **UKTVPlay**
- **UlizaPlayer**
- **UlizaPortal**: ulizaportal.jp
- **umg:de**: Universal Music Deutschland
- **Unistra**
- **Unity**: (**Currently broken**)
- **uol.com.br**
@@ -1623,9 +1623,9 @@ The only reliable way to check if a site is supported is to try it.
- **VGTV**: VGTV, BTTV, FTV, Aftenposten and Aftonbladet
- **vh1.com**
- **vhx:embed**: [*vimeo*](## "netrc machine")
- **vice**: (**Currently broken**)
- **vice:article**: (**Currently broken**)
- **vice:show**: (**Currently broken**)
- **Viddler**
- **Videa**
- **video.arnes.si**: Arnes Video

@@ -1947,6 +1947,137 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
        with self.assertWarns(DeprecationWarning):
            self.assertEqual(self.ie._search_nextjs_data('', None, default='{}'), {})

    def test_search_nuxt_json(self):
        HTML_TMPL = '<script data-ssr="true" id="__NUXT_DATA__" type="application/json">[{}]</script>'
        VALID_DATA = '''
            ["ShallowReactive",1],
            {"data":2,"state":21,"once":25,"_errors":28,"_server_errors":30},
            ["ShallowReactive",3],
            {"$abcdef123456":4},
            {"podcast":5,"activeEpisodeData":7},
            {"podcast":6,"seasons":14},
            {"title":10,"id":11},
            ["Reactive",8],
            {"episode":9,"creators":18,"empty_list":20},
            {"title":12,"id":13,"refs":34,"empty_refs":35},
            "Series Title",
            "podcast-id-01",
            "Episode Title",
            "episode-id-99",
            [15,16,17],
            1,
            2,
            3,
            [19],
            "Podcast Creator",
            [],
            {"$ssite-config":22},
            {"env":23,"name":24,"map":26,"numbers":14},
            "production",
            "podcast-website",
            ["Set"],
            ["Reactive",27],
            ["Map"],
            ["ShallowReactive",29],
            {},
            ["NuxtError",31],
            {"status":32,"message":33},
            503,
            "Service Unavailable",
            [36,37],
            [38,39],
            ["Ref",40],
            ["ShallowRef",41],
            ["EmptyRef",42],
            ["EmptyShallowRef",43],
            "ref",
            "shallow_ref",
            "{\\"ref\\":1}",
            "{\\"shallow_ref\\":2}"
        '''
        PAYLOAD = {
            'data': {
                '$abcdef123456': {
                    'podcast': {
                        'podcast': {
                            'title': 'Series Title',
                            'id': 'podcast-id-01',
                        },
                        'seasons': [1, 2, 3],
                    },
                    'activeEpisodeData': {
                        'episode': {
                            'title': 'Episode Title',
                            'id': 'episode-id-99',
                            'refs': ['ref', 'shallow_ref'],
                            'empty_refs': [{'ref': 1}, {'shallow_ref': 2}],
                        },
                        'creators': ['Podcast Creator'],
                        'empty_list': [],
                    },
                },
            },
            'state': {
                '$ssite-config': {
                    'env': 'production',
                    'name': 'podcast-website',
                    'map': [],
                    'numbers': [1, 2, 3],
                },
            },
            'once': [],
            '_errors': {},
            '_server_errors': {
                'status': 503,
                'message': 'Service Unavailable',
            },
        }
        PARTIALLY_INVALID = [(
            '''
            {"data":1},
            {"invalid_raw_list":2},
            [15,16,17]
            ''',
            {'data': {'invalid_raw_list': [None, None, None]}},
        ), (
            '''
            {"data":1},
            ["EmptyRef",2],
            "not valid JSON"
            ''',
            {'data': None},
        ), (
            '''
            {"data":1},
            ["EmptyShallowRef",2],
            "not valid JSON"
            ''',
            {'data': None},
        )]
        INVALID = [
            '''
            []
            ''',
            '''
            ["unsupported",1],
            {"data":2},
            {}
            ''',
        ]
        DEFAULT = object()
        self.assertEqual(self.ie._search_nuxt_json(HTML_TMPL.format(VALID_DATA), None), PAYLOAD)
        self.assertEqual(self.ie._search_nuxt_json('', None, fatal=False), {})
        self.assertIs(self.ie._search_nuxt_json('', None, default=DEFAULT), DEFAULT)

        for data, expected in PARTIALLY_INVALID:
            self.assertEqual(
                self.ie._search_nuxt_json(HTML_TMPL.format(data), None, fatal=False), expected)

        for data in INVALID:
            self.assertIs(
                self.ie._search_nuxt_json(HTML_TMPL.format(data), None, default=DEFAULT), DEFAULT)
if __name__ == '__main__':
    unittest.main()
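The `__NUXT_DATA__` payload being tested is located by scanning the page HTML for a dedicated script tag before the devalue array inside it is hydrated. As a rough illustration of that first step, here is a hypothetical helper (not yt-dlp's actual `_search_nuxt_json`, which also hydrates the array and handles `fatal`/`default`); the regex and structure mirror the `HTML_TMPL` used in the test above:

```python
import json
import re


def search_nuxt_json(html):
    """Simplified sketch: extract the raw Nuxt 3 devalue array from a page.

    Assumes the payload sits in a <script id="__NUXT_DATA__"> tag, as in the
    test template above; returns the decoded JSON list, or None if absent.
    """
    mobj = re.search(
        r'<script[^>]*\bid="__NUXT_DATA__"[^>]*>(?P<json>.+?)</script>',
        html, re.DOTALL)
    return json.loads(mobj.group('json')) if mobj else None


html = '<script data-ssr="true" id="__NUXT_DATA__" type="application/json">[{"a":1},"b"]</script>'
print(search_nuxt_json(html))  # [{'a': 1}, 'b']
```

The real implementation then resolves the integer indices in that list into nested Python objects, which is what the `PAYLOAD` expectation above verifies.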

@@ -0,0 +1,235 @@
#!/usr/bin/env python3

# Allow direct execution
import os
import sys

sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

import datetime as dt
import json
import math
import re
import unittest

from yt_dlp.utils.jslib import devalue
TEST_CASES_EQUALS = [{
    'name': 'int',
    'unparsed': [-42],
    'parsed': -42,
}, {
    'name': 'str',
    'unparsed': ['woo!!!'],
    'parsed': 'woo!!!',
}, {
    'name': 'Number',
    'unparsed': [['Object', 42]],
    'parsed': 42,
}, {
    'name': 'String',
    'unparsed': [['Object', 'yar']],
    'parsed': 'yar',
}, {
    'name': 'Infinity',
    'unparsed': -4,
    'parsed': math.inf,
}, {
    'name': 'negative Infinity',
    'unparsed': -5,
    'parsed': -math.inf,
}, {
    'name': 'negative zero',
    'unparsed': -6,
    'parsed': -0.0,
}, {
    'name': 'RegExp',
    'unparsed': [['RegExp', 'regexp', 'gim']],  # XXX: flags are ignored
    'parsed': re.compile('regexp'),
}, {
    'name': 'Date',
    'unparsed': [['Date', '2001-09-09T01:46:40.000Z']],
    'parsed': dt.datetime.fromtimestamp(1e9, tz=dt.timezone.utc),
}, {
    'name': 'Array',
    'unparsed': [[1, 2, 3], 'a', 'b', 'c'],
    'parsed': ['a', 'b', 'c'],
}, {
    'name': 'Array (empty)',
    'unparsed': [[]],
    'parsed': [],
}, {
    'name': 'Array (sparse)',
    'unparsed': [[-2, 1, -2], 'b'],
    'parsed': [None, 'b', None],
}, {
    'name': 'Object',
    'unparsed': [{'foo': 1, 'x-y': 2}, 'bar', 'z'],
    'parsed': {'foo': 'bar', 'x-y': 'z'},
}, {
    'name': 'Set',
    'unparsed': [['Set', 1, 2, 3], 1, 2, 3],
    'parsed': [1, 2, 3],
}, {
    'name': 'Map',
    'unparsed': [['Map', 1, 2], 'a', 'b'],
    'parsed': [['a', 'b']],
}, {
    'name': 'BigInt',
    'unparsed': [['BigInt', '1']],
    'parsed': 1,
}, {
    'name': 'Uint8Array',
    'unparsed': [['Uint8Array', 'AQID']],
    'parsed': [1, 2, 3],
}, {
    'name': 'ArrayBuffer',
    'unparsed': [['ArrayBuffer', 'AQID']],
    'parsed': [1, 2, 3],
}, {
    'name': 'str (repetition)',
    'unparsed': [[1, 1], 'a string'],
    'parsed': ['a string', 'a string'],
}, {
    'name': 'None (repetition)',
    'unparsed': [[1, 1], None],
    'parsed': [None, None],
}, {
    'name': 'dict (repetition)',
    'unparsed': [[1, 1], {}],
    'parsed': [{}, {}],
}, {
    'name': 'Object without prototype',
    'unparsed': [['null']],
    'parsed': {},
}, {
    'name': 'cross-realm POJO',
    'unparsed': [{}],
    'parsed': {},
}]
TEST_CASES_IS = [{
    'name': 'bool',
    'unparsed': [True],
    'parsed': True,
}, {
    'name': 'Boolean',
    'unparsed': [['Object', False]],
    'parsed': False,
}, {
    'name': 'undefined',
    'unparsed': -1,
    'parsed': None,
}, {
    'name': 'null',
    'unparsed': [None],
    'parsed': None,
}, {
    'name': 'NaN',
    'unparsed': -3,
    'parsed': math.nan,
}]
TEST_CASES_INVALID = [{
    'name': 'empty string',
    'unparsed': '',
    'error': ValueError,
    'pattern': r'expected int or list as input',
}, {
    'name': 'hole',
    'unparsed': -2,
    'error': ValueError,
    'pattern': r'invalid integer input',
}, {
    'name': 'string',
    'unparsed': 'hello',
    'error': ValueError,
    'pattern': r'expected int or list as input',
}, {
    'name': 'number',
    'unparsed': 42,
    'error': ValueError,
    'pattern': r'invalid integer input',
}, {
    'name': 'boolean',
    'unparsed': True,
    'error': ValueError,
    'pattern': r'expected int or list as input',
}, {
    'name': 'null',
    'unparsed': None,
    'error': ValueError,
    'pattern': r'expected int or list as input',
}, {
    'name': 'object',
    'unparsed': {},
    'error': ValueError,
    'pattern': r'expected int or list as input',
}, {
    'name': 'empty array',
    'unparsed': [],
    'error': ValueError,
    'pattern': r'expected a non-empty list as input',
}, {
    'name': 'Python negative indexing',
    'unparsed': [[1, 2, 3, 4, 5, 6, 7, -7], 1, 2, 3, 4, 5, 6, 7],
    'error': IndexError,
    'pattern': r'invalid index: -7',
}]
class TestDevalue(unittest.TestCase):
def test_devalue_parse_equals(self):
for tc in TEST_CASES_EQUALS:
self.assertEqual(devalue.parse(tc['unparsed']), tc['parsed'], tc['name'])
def test_devalue_parse_is(self):
for tc in TEST_CASES_IS:
self.assertIs(devalue.parse(tc['unparsed']), tc['parsed'], tc['name'])
def test_devalue_parse_invalid(self):
for tc in TEST_CASES_INVALID:
with self.assertRaisesRegex(tc['error'], tc['pattern'], msg=tc['name']):
devalue.parse(tc['unparsed'])
def test_devalue_parse_cyclical(self):
name = 'Map (cyclical)'
result = devalue.parse([['Map', 1, 0], 'self'])
self.assertEqual(result[0][0], 'self', name)
self.assertIs(result, result[0][1], name)
name = 'Set (cyclical)'
result = devalue.parse([['Set', 0, 1], 42])
self.assertEqual(result[1], 42, name)
self.assertIs(result, result[0], name)
result = devalue.parse([[0]])
self.assertIs(result, result[0], 'Array (cyclical)')
name = 'Object (cyclical)'
result = devalue.parse([{'self': 0}])
self.assertIs(result, result['self'], name)
name = 'Object with null prototype (cyclical)'
result = devalue.parse([['null', 'self', 0]])
self.assertIs(result, result['self'], name)
name = 'Objects (cyclical)'
result = devalue.parse([[1, 2], {'second': 2}, {'first': 1}])
self.assertIs(result[0], result[1]['first'], name)
self.assertIs(result[1], result[0]['second'], name)
def test_devalue_parse_revivers(self):
self.assertEqual(
devalue.parse([['indirect', 1], {'a': 2}, 'b'], revivers={'indirect': lambda x: x}),
{'a': 'b'}, 'revivers (indirect)')
self.assertEqual(
devalue.parse([['parse', 1], '{"a":0}'], revivers={'parse': lambda x: json.loads(x)}),
{'a': 0}, 'revivers (parse)')
if __name__ == '__main__':
unittest.main()
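For readers unfamiliar with the format the tests above exercise: devalue flattens values into a single array, where index 0 is the root and containers reference other slots by index. A minimal standalone resolver can illustrate the encoding — a sketch only, not yt-dlp's `devalue.parse`, which additionally handles cyclic references, typed tags like `Map`/`Set`, and precise error reporting:

```python
import json


def parse_flat(values, revivers=None):
    """Resolve a devalue-style flattened array: index 0 is the root and
    containers reference other slots by index.  A toy resolver only --
    unlike the real parser it has no cycle support, and it collapses the
    format's special negative indices (undefined, holes, NaN, ...) to None."""
    revivers = revivers or {}

    def resolve(index):
        if index < 0:  # special sentinel values in the real format
            return None
        node = values[index]
        if isinstance(node, dict):
            return {key: resolve(value) for key, value in node.items()}
        if isinstance(node, list):
            # a list starting with a known tag string is a reviver call
            if node and isinstance(node[0], str) and node[0] in revivers:
                return revivers[node[0]](resolve(node[1]))
            return [resolve(i) for i in node]
        return node  # plain scalar: str, int, float, bool, None

    return resolve(0)
```

This mirrors the shape of the reviver test cases above: `[['parse', 1], '{"a":0}']` resolves slot 1 and feeds it through the `parse` reviver.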

@@ -11,7 +11,7 @@ class TestGetWebPoContentBinding:
     @pytest.mark.parametrize('client_name, context, is_authenticated, expected', [
         *[(client, context, is_authenticated, expected) for client in [
-            'WEB', 'MWEB', 'TVHTML5', 'WEB_EMBEDDED_PLAYER', 'WEB_CREATOR', 'TVHTML5_SIMPLY_EMBEDDED_PLAYER']
+            'WEB', 'MWEB', 'TVHTML5', 'WEB_EMBEDDED_PLAYER', 'WEB_CREATOR', 'TVHTML5_SIMPLY_EMBEDDED_PLAYER', 'TVHTML5_SIMPLY']
             for context, is_authenticated, expected in [
                 (PoTokenContext.GVS, False, ('example-visitor-data', ContentBindingType.VISITOR_DATA)),
                 (PoTokenContext.PLAYER, False, ('example-video-id', ContentBindingType.VIDEO_ID)),

@@ -49,7 +49,7 @@ class TestWebPoPCSP:
     @pytest.mark.parametrize('client_name, context, is_authenticated, remote_host, source_address, request_proxy, expected', [
         *[(client, context, is_authenticated, remote_host, source_address, request_proxy, expected) for client in [
-            'WEB', 'MWEB', 'TVHTML5', 'WEB_EMBEDDED_PLAYER', 'WEB_CREATOR', 'TVHTML5_SIMPLY_EMBEDDED_PLAYER']
+            'WEB', 'MWEB', 'TVHTML5', 'WEB_EMBEDDED_PLAYER', 'WEB_CREATOR', 'TVHTML5_SIMPLY_EMBEDDED_PLAYER', 'TVHTML5_SIMPLY']
             for context, is_authenticated, remote_host, source_address, request_proxy, expected in [
                 (PoTokenContext.GVS, False, 'example-remote-host', 'example-source-address', 'example-request-proxy', {'t': 'webpo', 'ip': 'example-remote-host', 'sa': 'example-source-address', 'px': 'example-request-proxy', 'cb': '123abcXYZ_-', 'cbt': 'visitor_id'}),
                 (PoTokenContext.PLAYER, False, 'example-remote-host', 'example-source-address', 'example-request-proxy', {'t': 'webpo', 'ip': 'example-remote-host', 'sa': 'example-source-address', 'px': 'example-request-proxy', 'cb': '123abcXYZ_-', 'cbt': 'video_id'}),

@@ -416,18 +416,8 @@ class TestTraversal:
             '`any` should allow further branching'

     def test_traversal_morsel(self):
-        values = {
-            'expires': 'a',
-            'path': 'b',
-            'comment': 'c',
-            'domain': 'd',
-            'max-age': 'e',
-            'secure': 'f',
-            'httponly': 'g',
-            'version': 'h',
-            'samesite': 'i',
-        }
         morsel = http.cookies.Morsel()
+        values = dict(zip(morsel, 'abcdefghijklmnop'))
         morsel.set('item_key', 'item_value', 'coded_value')
         morsel.update(values)
         values['key'] = 'item_key'

@@ -320,6 +320,14 @@ _NSIG_TESTS = [
         'https://www.youtube.com/s/player/59b252b9/player_ias.vflset/en_US/base.js',
         'D3XWVpYgwhLLKNK4AGX', 'aZrQ1qWJ5yv5h',
     ),
+    (
+        'https://www.youtube.com/s/player/fc2a56a5/player_ias.vflset/en_US/base.js',
+        'qTKWg_Il804jd2kAC', 'OtUAm2W6gyzJjB9u',
+    ),
+    (
+        'https://www.youtube.com/s/player/fc2a56a5/tv-player-ias.vflset/tv-player-ias.js',
+        'qTKWg_Il804jd2kAC', 'OtUAm2W6gyzJjB9u',
+    ),
 ]

@@ -490,7 +490,7 @@ class YoutubeDL:
                        The template is mapped on a dictionary with keys 'progress' and 'info'
    retry_sleep_functions: Dictionary of functions that takes the number of attempts
                        as argument and returns the time to sleep in seconds.
-                       Allowed keys are 'http', 'fragment', 'file_access'
+                       Allowed keys are 'http', 'fragment', 'file_access', 'extractor'
    download_ranges:    A callback function that gets called for every video with
                        the signature (info_dict, ydl) -> Iterable[Section].
                        Only the returned sections will be downloaded.
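The docstring change above adds `'extractor'` to the allowed `retry_sleep_functions` keys. As a usage sketch (the particular backoff curves and caps here are arbitrary illustrative choices, not library defaults), each key maps to a callable that receives the attempt number and returns seconds to sleep:

```python
# Illustrative yt-dlp options dict; the sleep functions below are
# arbitrary example policies, not defaults shipped with yt-dlp.
ydl_opts = {
    'retry_sleep_functions': {
        'http': lambda n: min(2 ** n, 30),  # exponential backoff, capped at 30s
        'fragment': lambda n: 1,            # flat 1s between fragment retries
        'extractor': lambda n: 5 * n,       # linear backoff for extractor retries
    },
}
```

Passing this dict to `yt_dlp.YoutubeDL(ydl_opts)` would apply the per-category sleep policies during retries.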

@@ -1,3 +1,5 @@
+import json
+
 from .theplatform import ThePlatformIE
 from ..utils import (
     ExtractorError,
@@ -6,7 +8,6 @@ from ..utils import (
     remove_start,
     traverse_obj,
     update_url_query,
-    urlencode_postdata,
 )
@@ -204,18 +205,19 @@ class AENetworksIE(AENetworksBaseIE):
 class AENetworksListBaseIE(AENetworksBaseIE):
     def _call_api(self, resource, slug, brand, fields):
         return self._download_json(
-            'https://yoga.appsvcs.aetnd.com/graphql',
-            slug, query={'brand': brand}, data=urlencode_postdata({
+            'https://yoga.appsvcs.aetnd.com/graphql', slug,
+            query={'brand': brand}, headers={'Content-Type': 'application/json'},
+            data=json.dumps({
                 'query': '''{
     %s(slug: "%s") {
         %s
     }
 }''' % (resource, slug, fields),  # noqa: UP031
-            }))['data'][resource]
+            }).encode())['data'][resource]

     def _real_extract(self, url):
         domain, slug = self._match_valid_url(url).groups()
-        _, brand = self._DOMAIN_MAP[domain]
+        _, brand, _ = self._DOMAIN_MAP[domain]
         playlist = self._call_api(self._RESOURCE, slug, brand, self._FIELDS)
         base_url = f'http://watch.{domain}'

@@ -816,6 +816,26 @@ class BiliBiliBangumiIE(BilibiliBaseIE):
             'upload_date': '20111104',
             'thumbnail': r're:^https?://.*\.(jpg|jpeg|png)$',
         },
+    }, {
+        'note': 'new playurlSSRData scheme',
+        'url': 'https://www.bilibili.com/bangumi/play/ep678060',
+        'info_dict': {
+            'id': '678060',
+            'ext': 'mp4',
+            'series': '去你家吃饭好吗',
+            'series_id': '6198',
+            'season': '第二季',
+            'season_id': '42542',
+            'season_number': 2,
+            'episode': '吴老二:你家大公鸡养不熟,能煮熟吗…',
+            'episode_id': '678060',
+            'episode_number': 61,
+            'title': '一只小九九丫 吴老二:你家大公鸡养不熟,能煮熟吗…',
+            'duration': 266.123,
+            'timestamp': 1663315904,
+            'upload_date': '20220916',
+            'thumbnail': r're:^https?://.*\.(jpg|jpeg|png)$',
+        },
     }, {
         'url': 'https://www.bilibili.com/bangumi/play/ep267851',
         'info_dict': {
@@ -879,13 +899,27 @@ class BiliBiliBangumiIE(BilibiliBaseIE):
             'Extracting episode', query={'fnval': 12240, 'ep_id': episode_id},
             headers=headers))

+        geo_blocked = traverse_obj(play_info, (
+            'raw', 'data', 'plugins', lambda _, v: v['name'] == 'AreaLimitPanel',
+            'config', 'is_block', {bool}, any))
         premium_only = play_info.get('code') == -10403
-        play_info = traverse_obj(play_info, ('result', 'video_info', {dict})) or {}

-        formats = self.extract_formats(play_info)
-        if not formats and (premium_only or '成为大会员抢先看' in webpage or '开通大会员观看' in webpage):
-            self.raise_login_required('This video is for premium members only')
+        video_info = traverse_obj(play_info, (('result', ('raw', 'data')), 'video_info', {dict}, any)) or {}
+        formats = self.extract_formats(video_info)
+        if not formats:
+            if geo_blocked:
+                self.raise_geo_restricted()
+            elif premium_only or '成为大会员抢先看' in webpage or '开通大会员观看' in webpage:
+                self.raise_login_required('This video is for premium members only')
+
+        if traverse_obj(play_info, ((
+            ('result', 'play_check', 'play_detail'),  # 'PLAY_PREVIEW' vs 'PLAY_WHOLE'
+            ('raw', 'data', 'play_video_type'),  # 'preview' vs 'whole'
+        ), any, {lambda x: x in ('PLAY_PREVIEW', 'preview')})):
+            self.report_warning(
+                'Only preview format is available, '
+                f'you have to become a premium member to access full video. {self._login_hint()}')

         bangumi_info = self._download_json(
             'https://api.bilibili.com/pgc/view/web/season', episode_id, 'Get episode details',
             query={'ep_id': episode_id}, headers=headers)['result']
@@ -922,7 +956,7 @@ class BiliBiliBangumiIE(BilibiliBaseIE):
             'season': str_or_none(season_title),
             'season_id': str_or_none(season_id),
             'season_number': season_number,
-            'duration': float_or_none(play_info.get('timelength'), scale=1000),
+            'duration': float_or_none(video_info.get('timelength'), scale=1000),
             'subtitles': self.extract_subtitles(episode_id, episode_info.get('cid'), aid=aid),
             '__post_extractor': self.extract_comments(aid),
             'http_headers': {'Referer': url},

@@ -495,8 +495,6 @@ class BrightcoveLegacyIE(InfoExtractor):
 class BrightcoveNewBaseIE(AdobePassIE):
     def _parse_brightcove_metadata(self, json_data, video_id, headers={}):
-        title = json_data['name'].strip()
-
         formats, subtitles = [], {}
         sources = json_data.get('sources') or []
         for source in sources:
@@ -600,16 +598,18 @@ class BrightcoveNewBaseIE(AdobePassIE):
         return {
             'id': video_id,
-            'title': title,
-            'description': clean_html(json_data.get('description')),
             'thumbnails': thumbnails,
             'duration': duration,
-            'timestamp': parse_iso8601(json_data.get('published_at')),
-            'uploader_id': json_data.get('account_id'),
             'formats': formats,
             'subtitles': subtitles,
-            'tags': json_data.get('tags', []),
             'is_live': is_live,
+            **traverse_obj(json_data, {
+                'title': ('name', {clean_html}),
+                'description': ('description', {clean_html}),
+                'tags': ('tags', ..., {str}, filter, all, filter),
+                'timestamp': ('published_at', {parse_iso8601}),
+                'uploader_id': ('account_id', {str}),
+            }),
         }
@@ -645,10 +645,7 @@ class BrightcoveNewIE(BrightcoveNewBaseIE):
             'uploader_id': '4036320279001',
             'formats': 'mincount:39',
         },
-        'params': {
-            # m3u8 download
-            'skip_download': True,
-        },
+        'skip': '404 Not Found',
     }, {
         # playlist stream
         'url': 'https://players.brightcove.net/1752604059001/S13cJdUBz_default/index.html?playlistId=5718313430001',
@@ -709,7 +706,6 @@ class BrightcoveNewIE(BrightcoveNewBaseIE):
             'ext': 'mp4',
             'title': 'TGD_01-032_5',
             'thumbnail': r're:^https?://.*\.jpg$',
-            'tags': [],
             'timestamp': 1646078943,
             'uploader_id': '1569565978001',
             'upload_date': '20220228',
@@ -721,7 +717,6 @@ class BrightcoveNewIE(BrightcoveNewBaseIE):
             'ext': 'mp4',
             'title': 'TGD 01-087 (Airs 05.25.22)_Segment 5',
             'thumbnail': r're:^https?://.*\.jpg$',
-            'tags': [],
             'timestamp': 1651604591,
             'uploader_id': '1569565978001',
             'upload_date': '20220503',
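The `traverse_obj` mapping introduced above replaces the hand-rolled field lookups; its densest path is `('tags', ..., {str}, filter, all, filter)` — iterate the tags, keep only string values, drop falsy ones, collect into a list, then drop the key entirely if the list is empty. A plain-Python equivalent of just that path (a sketch for illustration; `clean_tags` is not a yt-dlp helper):

```python
def clean_tags(tags):
    """Pure-Python equivalent of the traversal path
    ('tags', ..., {str}, filter, all, filter): keep only truthy strings,
    and drop the field entirely when nothing survives."""
    kept = [tag for tag in (tags or []) if isinstance(tag, str) and tag]
    return kept or None  # None means "omit the field", like the trailing filter


print(clean_tags(['news', '', None, 42]))  # ['news']
```

Returning `None` here models why the diff also deletes the `'tags': []` expectations from the tests: an empty tag list no longer produces a field at all.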

@@ -101,6 +101,7 @@ from ..utils import (
     xpath_with_ns,
 )
 from ..utils._utils import _request_dump_filename
+from ..utils.jslib import devalue


 class InfoExtractor:
@@ -1795,6 +1796,63 @@ class InfoExtractor:
         ret = self._parse_json(js, video_id, transform_source=functools.partial(js_to_json, vars=args), fatal=fatal)
         return traverse_obj(ret, traverse) or {}

+    def _resolve_nuxt_array(self, array, video_id, *, fatal=True, default=NO_DEFAULT):
+        """Resolves Nuxt rich JSON payload arrays"""
+        # Ref: https://github.com/nuxt/nuxt/commit/9e503be0f2a24f4df72a3ccab2db4d3e63511f57
+        #      https://github.com/nuxt/nuxt/pull/19205
+        if default is not NO_DEFAULT:
+            fatal = False
+
+        if not isinstance(array, list) or not array:
+            error_msg = 'Unable to resolve Nuxt JSON data: invalid input'
+            if fatal:
+                raise ExtractorError(error_msg, video_id=video_id)
+            elif default is NO_DEFAULT:
+                self.report_warning(error_msg, video_id=video_id)
+            return {} if default is NO_DEFAULT else default
+
+        def indirect_reviver(data):
+            return data
+
+        def json_reviver(data):
+            return json.loads(data)
+
+        gen = devalue.parse_iter(array, revivers={
+            'NuxtError': indirect_reviver,
+            'EmptyShallowRef': json_reviver,
+            'EmptyRef': json_reviver,
+            'ShallowRef': indirect_reviver,
+            'ShallowReactive': indirect_reviver,
+            'Ref': indirect_reviver,
+            'Reactive': indirect_reviver,
+        })
+
+        while True:
+            try:
+                error_msg = f'Error resolving Nuxt JSON: {gen.send(None)}'
+                if fatal:
+                    raise ExtractorError(error_msg, video_id=video_id)
+                elif default is NO_DEFAULT:
+                    self.report_warning(error_msg, video_id=video_id, only_once=True)
+                else:
+                    self.write_debug(f'{video_id}: {error_msg}', only_once=True)
+            except StopIteration as error:
+                return error.value or ({} if default is NO_DEFAULT else default)
+
+    def _search_nuxt_json(self, webpage, video_id, *, fatal=True, default=NO_DEFAULT):
+        """Parses metadata from Nuxt rich JSON payloads embedded in HTML"""
+        passed_default = default is not NO_DEFAULT
+
+        array = self._search_json(
+            r'<script\b[^>]+\bid="__NUXT_DATA__"[^>]*>', webpage,
+            'Nuxt JSON data', video_id, contains_pattern=r'\[(?s:.+)\]',
+            fatal=fatal, default=NO_DEFAULT if not passed_default else None)
+        if not array:
+            return default if passed_default else {}
+
+        return self._resolve_nuxt_array(array, video_id, fatal=fatal, default=default)
+
     @staticmethod
     def _hidden_inputs(html):
         html = re.sub(r'<!--(?:(?!<!--).)*-->', '', html)
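The `__NUXT_DATA__` lookup that `_search_nuxt_json` performs can be sketched standalone: find the tagged `<script>` element, pull out the JSON array, and hand it to the devalue resolver. This simplified stand-in omits the `fatal`/`default` handling, and `extract_nuxt_data` is a hypothetical name for illustration, not part of the codebase:

```python
import json
import re


def extract_nuxt_data(webpage):
    """Pull the raw __NUXT_DATA__ payload out of a page -- a simplified
    stand-in for what _search_nuxt_json does before handing the flattened
    array to the devalue resolver (error handling omitted)."""
    match = re.search(
        r'<script\b[^>]+\bid="__NUXT_DATA__"[^>]*>(\[.+\])</script>',
        webpage, re.DOTALL)
    return json.loads(match.group(1)) if match else None


html = '<script type="application/json" id="__NUXT_DATA__">[{"title":1},"Example"]</script>'
```

In the real method the captured array is then resolved by `_resolve_nuxt_array`, which rebuilds the indexed references and applies the `Ref`/`ShallowRef`/`NuxtError` revivers shown above.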

@@ -206,7 +206,7 @@ class DouyuTVIE(DouyuBaseIE):
             'is_live': True,
             **traverse_obj(room, {
                 'display_id': ('url', {str}, {lambda i: i[1:]}),
-                'title': ('room_name', {unescapeHTML}),
+                'title': ('room_name', {str}, {unescapeHTML}),
                 'description': ('show_details', {str}),
                 'uploader': ('nickname', {str}),
                 'thumbnail': ('room_src', {url_or_none}),

@@ -64,7 +64,7 @@ class DreiSatIE(ZDFBaseIE):
             'title': 'dein buch - Das Beste von der Leipziger Buchmesse 2025 - Teil 1',
             'description': 'md5:bae51bfc22f15563ce3acbf97d2e8844',
             'duration': 5399.0,
-            'thumbnail': 'https://www.3sat.de/assets/buchmesse-kerkeling-100~original?cb=1743329640903',
+            'thumbnail': 'https://www.3sat.de/assets/buchmesse-kerkeling-100~original?cb=1747256996338',
             'chapters': 'count:24',
             'episode': 'dein buch - Das Beste von der Leipziger Buchmesse 2025 - Teil 1',
             'episode_id': 'POS_1ef236cc-b390-401e-acd0-4fb4b04315fb',

@@ -1,32 +1,66 @@
 from .common import InfoExtractor
-from ..utils import js_to_json, traverse_obj
+from ..utils import (
+    ExtractorError,
+    clean_html,
+    url_or_none,
+)
+from ..utils.traversal import subs_list_to_dict, traverse_obj


 class MonsterSirenHypergryphMusicIE(InfoExtractor):
+    IE_NAME = 'monstersiren'
+    IE_DESC = '塞壬唱片'
+    _API_BASE = 'https://monster-siren.hypergryph.com/api'
     _VALID_URL = r'https?://monster-siren\.hypergryph\.com/music/(?P<id>\d+)'
     _TESTS = [{
         'url': 'https://monster-siren.hypergryph.com/music/514562',
         'info_dict': {
             'id': '514562',
             'ext': 'wav',
-            'artists': ['塞壬唱片-MSR'],
-            'album': 'Flame Shadow',
             'title': 'Flame Shadow',
+            'album': 'Flame Shadow',
+            'artists': ['塞壬唱片-MSR'],
+            'description': 'md5:19e2acfcd1b65b41b29e8079ab948053',
+            'thumbnail': r're:https?://web\.hycdn\.cn/siren/pic/.+\.jpg',
+        },
+    }, {
+        'url': 'https://monster-siren.hypergryph.com/music/514518',
+        'info_dict': {
+            'id': '514518',
+            'ext': 'wav',
+            'title': 'Heavenly Me (Instrumental)',
+            'album': 'Heavenly Me',
+            'artists': ['塞壬唱片-MSR', 'AIYUE blessed : 理名'],
+            'description': 'md5:ce790b41c932d1ad72eb791d1d8ae598',
+            'thumbnail': r're:https?://web\.hycdn\.cn/siren/pic/.+\.jpg',
         },
     }]

     def _real_extract(self, url):
         audio_id = self._match_id(url)
-        webpage = self._download_webpage(url, audio_id)
-        json_data = self._search_json(
-            r'window\.g_initialProps\s*=', webpage, 'data', audio_id, transform_source=js_to_json)
+        song = self._download_json(f'{self._API_BASE}/song/{audio_id}', audio_id)
+        if traverse_obj(song, 'code') != 0:
+            msg = traverse_obj(song, ('msg', {str}, filter))
+            raise ExtractorError(
+                msg or 'API returned an error response', expected=bool(msg))
+
+        album = None
+        if album_id := traverse_obj(song, ('data', 'albumCid', {str})):
+            album = self._download_json(
+                f'{self._API_BASE}/album/{album_id}/detail', album_id, fatal=False)

         return {
             'id': audio_id,
-            'title': traverse_obj(json_data, ('player', 'songDetail', 'name')),
-            'url': traverse_obj(json_data, ('player', 'songDetail', 'sourceUrl')),
-            'ext': 'wav',
             'vcodec': 'none',
-            'artists': traverse_obj(json_data, ('player', 'songDetail', 'artists', ...)),
-            'album': traverse_obj(json_data, ('musicPlay', 'albumDetail', 'name')),
+            **traverse_obj(song, ('data', {
+                'title': ('name', {str}),
+                'artists': ('artists', ..., {str}),
+                'subtitles': ({'url': 'lyricUrl'}, all, {subs_list_to_dict(lang='en')}),
+                'url': ('sourceUrl', {url_or_none}),
+            })),
+            **traverse_obj(album, ('data', {
+                'album': ('name', {str}),
+                'description': ('intro', {clean_html}),
+                'thumbnail': ('coverUrl', {url_or_none}),
+            })),
         }

@@ -1,7 +1,5 @@
 from .telecinco import TelecincoBaseIE
-from ..networking.exceptions import HTTPError
 from ..utils import (
-    ExtractorError,
     int_or_none,
     parse_iso8601,
 )
@@ -81,17 +79,7 @@ class MiTeleIE(TelecincoBaseIE):
     def _real_extract(self, url):
         display_id = self._match_id(url)
-
-        try:  # yt-dlp's default user-agents are too old and blocked by akamai
-            webpage = self._download_webpage(url, display_id, headers={
-                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:136.0) Gecko/20100101 Firefox/136.0',
-            })
-        except ExtractorError as e:
-            if not isinstance(e.cause, HTTPError) or e.cause.status != 403:
-                raise
-            # Retry with impersonation if hardcoded UA is insufficient to bypass akamai
-            webpage = self._download_webpage(url, display_id, impersonate=True)
+        webpage = self._download_akamai_webpage(url, display_id)

         pre_player = self._search_json(
             r'window\.\$REACTBASE_STATE\.prePlayer_mtweb\s*=',
             webpage, 'Pre Player', display_id)['prePlayer']

@@ -1,59 +1,57 @@
 from .common import InfoExtractor
 from ..utils import (
-    determine_ext,
-    get_element_by_attribute,
+    UnsupportedError,
+    clean_html,
     int_or_none,
-    js_to_json,
-    mimetype2ext,
-    update_url_query,
+    parse_duration,
+    parse_qs,
+    str_or_none,
+    update_url,
 )
+from ..utils.traversal import find_element, traverse_obj


 class NobelPrizeIE(InfoExtractor):
-    _WORKING = False
-    _VALID_URL = r'https?://(?:www\.)?nobelprize\.org/mediaplayer.*?\bid=(?P<id>\d+)'
-    _TEST = {
-        'url': 'http://www.nobelprize.org/mediaplayer/?id=2636',
-        'md5': '04c81e5714bb36cc4e2232fee1d8157f',
+    _VALID_URL = r'https?://(?:(?:mediaplayer|www)\.)?nobelprize\.org/mediaplayer/'
+    _TESTS = [{
+        'url': 'https://www.nobelprize.org/mediaplayer/?id=2636',
         'info_dict': {
             'id': '2636',
             'ext': 'mp4',
             'title': 'Announcement of the 2016 Nobel Prize in Physics',
-            'description': 'md5:05beba57f4f5a4bbd4cf2ef28fcff739',
+            'description': 'md5:1a2d8a6ca80c88fb3b9a326e0b0e8e43',
+            'duration': 1560.0,
+            'thumbnail': r're:https?://www\.nobelprize\.org/images/.+\.jpg',
+            'timestamp': 1504883793,
+            'upload_date': '20170908',
         },
-    }
+    }, {
+        'url': 'https://mediaplayer.nobelprize.org/mediaplayer/?qid=12693',
+        'info_dict': {
+            'id': '12693',
+            'ext': 'mp4',
+            'title': 'Nobel Lecture by Peter Higgs',
+            'description': 'md5:9b12e275dbe3a8138484e70e00673a05',
+            'duration': 1800.0,
+            'thumbnail': r're:https?://www\.nobelprize\.org/images/.+\.jpg',
+            'timestamp': 1504883793,
+            'upload_date': '20170908',
+        },
+    }]

     def _real_extract(self, url):
-        video_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
-        media = self._parse_json(self._search_regex(
-            r'(?s)var\s*config\s*=\s*({.+?});', webpage,
-            'config'), video_id, js_to_json)['media']
-        title = media['title']
-
-        formats = []
-        for source in media.get('source', []):
-            source_src = source.get('src')
-            if not source_src:
-                continue
-            ext = mimetype2ext(source.get('type')) or determine_ext(source_src)
-            if ext == 'm3u8':
-                formats.extend(self._extract_m3u8_formats(
-                    source_src, video_id, 'mp4', 'm3u8_native',
-                    m3u8_id='hls', fatal=False))
-            elif ext == 'f4m':
-                formats.extend(self._extract_f4m_formats(
-                    update_url_query(source_src, {'hdcore': '3.7.0'}),
-                    video_id, f4m_id='hds', fatal=False))
-            else:
-                formats.append({
-                    'url': source_src,
-                })
+        video_id = traverse_obj(parse_qs(url), (
+            ('id', 'qid'), -1, {int_or_none}, {str_or_none}, any))
+        if not video_id:
+            raise UnsupportedError(url)
+        webpage = self._download_webpage(
+            update_url(url, netloc='mediaplayer.nobelprize.org'), video_id)

         return {
+            **self._search_json_ld(webpage, video_id),
             'id': video_id,
-            'title': title,
-            'description': get_element_by_attribute('itemprop', 'description', webpage),
-            'duration': int_or_none(media.get('duration')),
-            'formats': formats,
+            'title': self._html_search_meta('caption', webpage),
+            'description': traverse_obj(webpage, (
+                {find_element(tag='span', attr='itemprop', value='description')}, {clean_html})),
+            'duration': parse_duration(self._html_search_meta('duration', webpage)),
         }

@@ -1,55 +1,82 @@
-from .common import InfoExtractor
+from .streaks import StreaksBaseIE
 from ..utils import (
-    ExtractorError,
-    smuggle_url,
-    traverse_obj,
+    int_or_none,
+    parse_iso8601,
+    str_or_none,
+    url_or_none,
 )
+from ..utils.traversal import require, traverse_obj


-class NTVCoJpCUIE(InfoExtractor):
+class NTVCoJpCUIE(StreaksBaseIE):
     IE_NAME = 'cu.ntv.co.jp'
-    IE_DESC = 'Nippon Television Network'
-    _VALID_URL = r'https?://cu\.ntv\.co\.jp/(?!program)(?P<id>[^/?&#]+)'
-    _TEST = {
-        'url': 'https://cu.ntv.co.jp/televiva-chill-gohan_181031/',
+    IE_DESC = '日テレ無料TADA!'
+    _VALID_URL = r'https?://cu\.ntv\.co\.jp/(?!program-list|search)(?P<id>[\w-]+)/?(?:[?#]|$)'
+    _TESTS = [{
+        'url': 'https://cu.ntv.co.jp/gaki_20250525/',
         'info_dict': {
-            'id': '5978891207001',
+            'id': 'gaki_20250525',
             'ext': 'mp4',
-            'title': '桜エビと炒り卵がポイント! 「中華風 エビチリおにぎり」──『美虎』五十嵐美幸',
-            'upload_date': '20181213',
-            'description': 'md5:1985b51a9abc285df0104d982a325f2a',
-            'uploader_id': '3855502814001',
-            'timestamp': 1544669941,
+            'title': '放送開始36年!方正ココリコが選ぶ神回&地獄回!',
+            'cast': 'count:2',
+            'description': 'md5:1e1db556224d627d4d2f74370c650927',
+            'display_id': 'ref:gaki_20250525',
+            'duration': 1450,
+            'episode': '放送開始36年!方正ココリコが選ぶ神回&地獄回!',
+            'episode_id': '000000010172808',
+            'episode_number': 255,
+            'genres': ['variety'],
+            'live_status': 'not_live',
+            'modified_date': '20250525',
+            'modified_timestamp': 1748145537,
+            'release_date': '20250525',
+            'release_timestamp': 1748145539,
+            'series': 'ダウンタウンのガキの使いやあらへんで!',
+            'series_id': 'gaki',
+            'thumbnail': r're:https?://.+\.jpg',
+            'timestamp': 1748145197,
+            'upload_date': '20250525',
+            'uploader': '日本テレビ放送網',
+            'uploader_id': '0x7FE2',
         },
-        'params': {
-            # m3u8 download
-            'skip_download': True,
-        },
-    }
-
-    BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/default_default/index.html?videoId=%s'
+    }]

     def _real_extract(self, url):
         display_id = self._match_id(url)
         webpage = self._download_webpage(url, display_id)
-        player_config = self._search_nuxt_data(webpage, display_id)
-        video_id = traverse_obj(player_config, ('movie', 'video_id'))
-        if not video_id:
-            raise ExtractorError('Failed to extract video ID for Brightcove')
-        account_id = traverse_obj(player_config, ('player', 'account')) or '3855502814001'
-        title = traverse_obj(player_config, ('movie', 'name'))
-        if not title:
-            og_title = self._og_search_title(webpage, fatal=False) or traverse_obj(player_config, ('player', 'title'))
-            if og_title:
-                title = og_title.split('(', 1)[0].strip()
-        description = (traverse_obj(player_config, ('movie', 'description'))
-                       or self._html_search_meta(['description', 'og:description'], webpage))
+
+        info = self._search_json(
+            r'window\.app\s*=', webpage, 'video info',
+            display_id)['falcorCache']['catalog']['episode'][display_id]['value']
+        media_id = traverse_obj(info, (
+            'streaks_data', 'mediaid', {str_or_none}, {require('Streaks media ID')}))
+        non_phonetic = (lambda _, v: v['is_phonetic'] is False, 'value', {str})
+
         return {
-            '_type': 'url_transparent',
-            'id': video_id,
-            'display_id': display_id,
-            'title': title,
-            'description': description,
-            'url': smuggle_url(self.BRIGHTCOVE_URL_TEMPLATE % (account_id, video_id), {'geo_countries': ['JP']}),
-            'ie_key': 'BrightcoveNew',
+            **self._extract_from_streaks_api('ntv-tada', media_id, headers={
+                'X-Streaks-Api-Key': 'df497719056b44059a0483b8faad1f4a',
+            }),
+            **traverse_obj(info, {
+                'id': ('content_id', {str_or_none}),
+                'title': ('title', *non_phonetic, any),
+                'age_limit': ('is_adult_only_content', {lambda x: 18 if x else None}),
+                'cast': ('credit', ..., 'name', *non_phonetic),
+                'genres': ('genre', ..., {str}),
+                'release_timestamp': ('pub_date', {parse_iso8601}),
+                'tags': ('tags', ..., {str}),
+                'thumbnail': ('artwork', ..., 'url', any, {url_or_none}),
+            }),
+            **traverse_obj(info, ('tv_episode_info', {
+                'duration': ('duration', {int_or_none}),
+                'episode_number': ('episode_number', {int}),
+                'series': ('parent_show_title', *non_phonetic, any),
+                'series_id': ('show_content_id', {str}),
+            })),
+            **traverse_obj(info, ('custom_data', {
+                'description': ('program_detail', {str}),
+                'episode': ('episode_title', {str}),
+                'episode_id': ('episode_id', {str_or_none}),
+                'uploader': ('network_name', {str}),
+                'uploader_id': ('network_id', {str}),
+            })),
         }

@@ -15,7 +15,6 @@ from ..utils import (
     str_or_none,
     strip_jsonp,
     traverse_obj,
-    unescapeHTML,
     url_or_none,
     urljoin,
 )
@@ -425,7 +424,7 @@ class QQMusicPlaylistIE(QQPlaylistBaseIE):
         return self.playlist_result(entries, list_id, **traverse_obj(list_json, ('cdlist', 0, {
             'title': ('dissname', {str}),
-            'description': ('desc', {unescapeHTML}, {clean_html}),
+            'description': ('desc', {clean_html}),
         })))

@@ -1,57 +1,102 @@
 from .ard import ARDMediathekBaseIE
 from ..utils import (
     ExtractorError,
-    get_element_by_attribute,
+    clean_html,
+    extract_attributes,
+    parse_duration,
+    parse_qs,
+    unified_strdate,
+)
+from ..utils.traversal import (
+    find_element,
+    require,
+    traverse_obj,
 )


 class SRMediathekIE(ARDMediathekBaseIE):
-    _WORKING = False
     IE_NAME = 'sr:mediathek'
     IE_DESC = 'Saarländischer Rundfunk'
-    _VALID_URL = r'https?://sr-mediathek(?:\.sr-online)?\.de/index\.php\?.*?&id=(?P<id>[0-9]+)'
+    _CLS_COMMON = 'teaser__image__caption__text teaser__image__caption__text--'
+    _VALID_URL = r'https?://(?:www\.)?sr-mediathek\.de/index\.php\?.*?&id=(?P<id>\d+)'
     _TESTS = [{
-        'url': 'http://sr-mediathek.sr-online.de/index.php?seite=7&id=28455',
+        'url': 'https://www.sr-mediathek.de/index.php?seite=7&id=141317',
         'info_dict': {
-            'id': '28455',
+            'id': '141317',
             'ext': 'mp4',
-            'title': 'sportarena (26.10.2014)',
-            'description': 'Ringen: KSV Köllerbach gegen Aachen-Walheim; Frauen-Fußball: 1. FC Saarbrücken gegen Sindelfingen; Motorsport: Rallye in Losheim; dazu: Interview mit Timo Bernhard; Turnen: TG Saar; Reitsport: Deutscher Voltigier-Pokal; Badminton: Interview mit Michael Fuchs ',
-            'thumbnail': r're:^https?://.*\.jpg$',
+            'title': 'Kärnten, da will ich hin!',
+            'channel': 'SR Fernsehen',
+            'description': 'md5:7732e71e803379a499732864a572a456',
+            'duration': 1788.0,
+            'release_date': '20250525',
+            'series': 'da will ich hin!',
+            'series_id': 'DWIH',
+            'thumbnail': r're:https?://.+\.jpg',
         },
-        'skip': 'no longer available',
     }, {
-        'url': 'http://sr-mediathek.sr-online.de/index.php?seite=7&id=37682',
+        'url': 'https://www.sr-mediathek.de/index.php?seite=7&id=153853',
         'info_dict': {
-            'id': '37682',
-            'ext': 'mp4',
-            'title': 'Love, Cakes and Rock\'n\'Roll',
-            'description': 'md5:18bf9763631c7d326c22603681e1123d',
-        },
-        'params': {
-            # m3u8 download
-            'skip_download': True,
+            'id': '153853',
+            'ext': 'mp3',
+            'title': 'Kappes, Klöße, Kokosmilch: Bruschetta mit Nduja',
+            'channel': 'SR 3',
+            'description': 'md5:3935798de3562b10c4070b408a15e225',
+            'duration': 139.0,
+            'release_date': '20250523',
+            'series': 'Kappes, Klöße, Kokosmilch',
+            'series_id': 'SR3_KKK_A',
+            'thumbnail': r're:https?://.+\.jpg',
         },
     }, {
-        'url': 'http://sr-mediathek.de/index.php?seite=7&id=7480',
-        'only_matching': True,
+        'url': 'https://www.sr-mediathek.de/index.php?seite=7&id=31406&pnr=&tbl=pf',
+        'info_dict': {
+            'id': '31406',
+            'ext': 'mp3',
+            'title': 'Das Leben schwer nehmen, ist einfach zu anstrengend',
+            'channel': 'SR 1',
+            'description': 'md5:3e03fd556af831ad984d0add7175fb0c',
+            'duration': 1769.0,
+            'release_date': '20230717',
+            'series': 'Abendrot',
+            'series_id': 'SR1_AB_P',
+            'thumbnail': r're:https?://.+\.jpg',
+        },
     }]

     def _real_extract(self, url):
         video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)
+        description = self._og_search_description(webpage)

-        if '>Der gew&uuml;nschte Beitrag ist leider nicht mehr verf&uuml;gbar.<' in webpage:
+        if description == 'Der gewünschte Beitrag ist leider nicht mehr vorhanden.':
             raise ExtractorError(f'Video {video_id} is no longer available', expected=True)

-        media_collection_url = self._search_regex(
-            r'data-mediacollection-ardplayer="([^"]+)"', webpage, 'media collection url')
-        info = self._extract_media_info(media_collection_url, webpage, video_id)
-        info.update({
+        player_url = traverse_obj(webpage, (
+            {find_element(tag='div', id=f'player{video_id}', html=True)},
+            {extract_attributes}, 'data-mediacollection-ardplayer',
+            {self._proto_relative_url}, {require('player URL')}))
+        article = traverse_obj(webpage, (
+            {find_element(cls='article__content')},
+            {find_element(tag='p')}, {clean_html}))
+
+        return {
+            **self._extract_media_info(player_url, webpage, video_id),
             'id': video_id,
-            'title': get_element_by_attribute('class', 'ardplayer-title', webpage),
-            'description': self._og_search_description(webpage),
+            'title': traverse_obj(webpage, (
+                {find_element(cls='ardplayer-title')}, {clean_html})),
+            'channel': traverse_obj(webpage, (
+                {find_element(cls=f'{self._CLS_COMMON}subheadline')},
+                {lambda x: x.split('|')[0]}, {clean_html})),
+            'description': description,
+            'duration': parse_duration(self._search_regex(
+                r'(\d{2}:\d{2}:\d{2})', article, 'duration')),
+            'release_date': unified_strdate(self._search_regex(
+                r'(\d{2}\.\d{2}\.\d{4})', article, 'release_date')),
+            'series': traverse_obj(webpage, (
+                {find_element(cls=f'{self._CLS_COMMON}headline')}, {clean_html})),
+            'series_id': traverse_obj(webpage, (
+                {find_element(cls='teaser__link', html=True)},
{extract_attributes}, 'href', {parse_qs}, 'sen', ..., {str}, any)),
'thumbnail': self._og_search_thumbnail(webpage), 'thumbnail': self._og_search_thumbnail(webpage),
}) }
return info

@@ -4,6 +4,7 @@ from .wrestleuniverse import WrestleUniverseBaseIE
 from ..utils import (
     int_or_none,
     traverse_obj,
+    url_basename,
     url_or_none,
 )
@@ -65,9 +66,19 @@ class StacommuBaseIE(WrestleUniverseBaseIE):
         hls_info, decrypt = self._call_encrypted_api(
             video_id, ':watchArchive', 'stream information', data={'method': 1})
+        formats = self._get_formats(hls_info, ('hls', 'urls', ..., {url_or_none}), video_id)
+        for f in formats:
+            # bitrates are exaggerated in PPV playlists, so avoid wrong/huge filesize_approx values
+            if f.get('tbr'):
+                f['tbr'] = int(f['tbr'] / 2.5)
+            # prefer variants with the same basename as the master playlist to avoid partial streams
+            f['format_id'] = url_basename(f['url']).partition('.')[0]
+            if not f['format_id'].startswith(url_basename(f['manifest_url']).partition('.')[0]):
+                f['preference'] = -10
+
         return {
             'id': video_id,
-            'formats': self._get_formats(hls_info, ('hls', 'urls', ..., {url_or_none}), video_id),
+            'formats': formats,
             'hls_aes': self._extract_hls_key(hls_info, 'hls', decrypt),
             **traverse_obj(video_info, {
                 'title': ('displayName', {str}),
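The preference tweak above keys on the playlist filename stem: variants whose basename does not share the master playlist's stem are demoted. A rough standalone sketch of that check (the helper names here are illustrative; yt-dlp's own `url_basename` behaves like the stem helper below):

```python
from posixpath import basename
from urllib.parse import urlparse

def stream_stem(url):
    # filename without extension, e.g. ".../abc_1080.m3u8" -> "abc_1080"
    return basename(urlparse(url).path).partition('.')[0]

def rank_variant(variant_url, master_url):
    # variants sharing the master playlist's stem are full streams;
    # anything else is likely a partial stream and gets demoted
    return 0 if stream_stem(variant_url).startswith(stream_stem(master_url)) else -10
```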

@@ -1,76 +1,76 @@
 from .common import InfoExtractor
-from ..utils import int_or_none, urljoin
+from .youtube import YoutubeIE
+from ..utils import (
+    clean_html,
+    parse_iso8601,
+    update_url,
+    url_or_none,
+)
+from ..utils.traversal import subs_list_to_dict, traverse_obj


 class StarTrekIE(InfoExtractor):
-    _WORKING = False
-    _VALID_URL = r'(?P<base>https?://(?:intl|www)\.startrek\.com)/videos/(?P<id>[^/]+)'
+    IE_NAME = 'startrek'
+    IE_DESC = 'STAR TREK'
+    _VALID_URL = r'https?://(?:www\.)?startrek\.com(?:/en-(?:ca|un))?/videos/(?P<id>[^/?#]+)'
     _TESTS = [{
-        'url': 'https://intl.startrek.com/videos/watch-welcoming-jess-bush-to-the-ready-room',
+        'url': 'https://www.startrek.com/en-un/videos/official-trailer-star-trek-lower-decks-season-4',
+        'md5': '491df5035c9d4dc7f63c79caaf9c839e',
         'info_dict': {
-            'id': 'watch-welcoming-jess-bush-to-the-ready-room',
+            'id': 'official-trailer-star-trek-lower-decks-season-4',
             'ext': 'mp4',
-            'title': 'WATCH: Welcoming Jess Bush to The Ready Room',
-            'duration': 1888,
-            'timestamp': 1655388000,
-            'upload_date': '20220616',
-            'description': 'md5:1ffee884e3920afbdd6dd04e926a1221',
-            'thumbnail': r're:https://(?:intl|www)\.startrek\.com/sites/default/files/styles/video_1920x1080/public/images/2022-06/pp_14794_rr_thumb_107_yt_16x9\.jpg(?:\?.+)?',
-            'subtitles': {'en-US': [{
-                'url': r're:https://(?:intl|www)\.startrek\.com/sites/default/files/video/captions/2022-06/TRR_SNW_107_v4\.vtt',
-            }, {
-                'url': 'https://media.startrek.com/2022/06/16/2043801155561/1069981_hls/trr_snw_107_v4-c4bfc25d/stream_vtt.m3u8',
-            }]},
+            'title': 'Official Trailer | Star Trek: Lower Decks - Season 4',
+            'alt_title': 'md5:dd7e3191aaaf9e95db16fc3abd5ef68b',
+            'categories': ['TRAILERS'],
+            'description': 'md5:563d7856ddab99bee7a5e50f45531757',
+            'release_date': '20230722',
+            'release_timestamp': 1690033200,
+            'series': 'Star Trek: Lower Decks',
+            'series_id': 'star-trek-lower-decks',
+            'thumbnail': r're:https?://.+\.(?:jpg|png)',
         },
     }, {
-        'url': 'https://www.startrek.com/videos/watch-ethan-peck-and-gia-sandhu-beam-down-to-the-ready-room',
+        'url': 'https://www.startrek.com/en-ca/videos/my-first-contact-senator-cory-booker',
+        'md5': 'f5ad74fbb86e91e0882fc0a333178d1d',
         'info_dict': {
-            'id': 'watch-ethan-peck-and-gia-sandhu-beam-down-to-the-ready-room',
+            'id': 'my-first-contact-senator-cory-booker',
             'ext': 'mp4',
-            'title': 'WATCH: Ethan Peck and Gia Sandhu Beam Down to The Ready Room',
-            'duration': 1986,
-            'timestamp': 1654221600,
-            'upload_date': '20220603',
-            'description': 'md5:b3aa0edacfe119386567362dec8ed51b',
-            'thumbnail': r're:https://www\.startrek\.com/sites/default/files/styles/video_1920x1080/public/images/2022-06/pp_14792_rr_thumb_105_yt_16x9_1.jpg(?:\?.+)?',
-            'subtitles': {'en-US': [{
-                'url': r're:https://(?:intl|www)\.startrek\.com/sites/default/files/video/captions/2022-06/TRR_SNW_105_v5\.vtt',
-            }]},
+            'title': 'My First Contact: Senator Cory Booker',
+            'alt_title': 'md5:fe74a8bdb0afab421c6e159a7680db4d',
+            'categories': ['MY FIRST CONTACT'],
+            'description': 'md5:a3992ab3b3e0395925d71156bbc018ce',
+            'release_date': '20250401',
+            'release_timestamp': 1743512400,
+            'series': 'Star Trek: The Original Series',
+            'series_id': 'star-trek-the-original-series',
+            'thumbnail': r're:https?://.+\.(?:jpg|png)',
         },
     }]

     def _real_extract(self, url):
-        urlbase, video_id = self._match_valid_url(url).group('base', 'id')
+        video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)

-        player = self._search_regex(
-            r'(<\s*div\s+id\s*=\s*"cvp-player-[^<]+<\s*/div\s*>)', webpage, 'player')
-
-        hls = self._html_search_regex(r'\bdata-hls\s*=\s*"([^"]+)"', player, 'HLS URL')
-        formats, subtitles = self._extract_m3u8_formats_and_subtitles(hls, video_id, 'mp4')
-
-        captions = self._html_search_regex(
-            r'\bdata-captions-url\s*=\s*"([^"]+)"', player, 'captions URL', fatal=False)
-        if captions:
-            subtitles.setdefault('en-US', [])[:0] = [{'url': urljoin(urlbase, captions)}]
+        page_props = self._search_nextjs_data(webpage, video_id)['props']['pageProps']
+        video_data = page_props['video']['data']
+        if youtube_id := video_data.get('youtube_video_id'):
+            return self.url_result(youtube_id, YoutubeIE)

-        # NB: Most of the data in the json_ld is undesirable
-        json_ld = self._search_json_ld(webpage, video_id, fatal=False)
+        series_id = traverse_obj(video_data, (
+            'series_and_movies', ..., 'series_or_movie', 'slug', {str}, any))

         return {
             'id': video_id,
-            'title': self._html_search_regex(
-                r'\bdata-title\s*=\s*"([^"]+)"', player, 'title', json_ld.get('title')),
-            'description': self._html_search_regex(
-                r'(?s)<\s*div\s+class\s*=\s*"header-body"\s*>(.+?)<\s*/div\s*>',
-                webpage, 'description', fatal=False),
-            'duration': int_or_none(self._html_search_regex(
-                r'\bdata-duration\s*=\s*"(\d+)"', player, 'duration', fatal=False)),
-            'formats': formats,
-            'subtitles': subtitles,
-            'thumbnail': urljoin(urlbase, self._html_search_regex(
-                r'\bdata-poster-url\s*=\s*"([^"]+)"', player, 'thumbnail', fatal=False)),
-            'timestamp': json_ld.get('timestamp'),
+            'series': traverse_obj(page_props, (
+                'queried', 'header', 'tab3', 'slices', ..., 'items',
+                lambda _, v: v['link']['slug'] == series_id, 'link_copy', {str}, any)),
+            'series_id': series_id,
+            **traverse_obj(video_data, {
+                'title': ('title', ..., 'text', {clean_html}, any),
+                'alt_title': ('subhead', ..., 'text', {clean_html}, any),
+                'categories': ('category', 'data', 'category_name', {str.upper}, filter, all),
+                'description': ('slices', ..., 'primary', 'content', ..., 'text', {clean_html}, any),
+                'release_timestamp': ('published', {parse_iso8601}),
+                'subtitles': ({'url': 'legacy_subtitle_file'}, all, {subs_list_to_dict(lang='en')}),
+                'thumbnail': ('poster_frame', 'url', {url_or_none}, {update_url(query=None)}),
+                'url': ('legacy_video_url', {url_or_none}),
+            }),
         }

@@ -63,6 +63,17 @@ class TelecincoBaseIE(InfoExtractor):
             'http_headers': headers,
         }

+    def _download_akamai_webpage(self, url, display_id):
+        try:  # yt-dlp's default user-agents are too old and blocked by akamai
+            return self._download_webpage(url, display_id, headers={
+                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:136.0) Gecko/20100101 Firefox/136.0',
+            })
+        except ExtractorError as e:
+            if not isinstance(e.cause, HTTPError) or e.cause.status != 403:
+                raise
+            # Retry with impersonation if hardcoded UA is insufficient to bypass akamai
+            return self._download_webpage(url, display_id, impersonate=True)
+

 class TelecincoIE(TelecincoBaseIE):
     IE_DESC = 'telecinco.es, cuatro.com and mediaset.es'

@@ -140,7 +151,7 @@ class TelecincoIE(TelecincoBaseIE):
     def _real_extract(self, url):
         display_id = self._match_id(url)
-        webpage = self._download_webpage(url, display_id)
+        webpage = self._download_akamai_webpage(url, display_id)
         article = self._search_json(
             r'window\.\$REACTBASE_STATE\.article(?:_multisite)?\s*=',
             webpage, 'article', display_id)['article']
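The fallback in `_download_akamai_webpage` follows a simple protocol: try once with a modern browser User-Agent, and only fall back to an impersonating client on HTTP 403. A minimal generic sketch, with the two fetchers injected for illustration (this is not yt-dlp's networking API):

```python
from urllib.error import HTTPError

MODERN_UA = 'Mozilla/5.0 (Windows NT 10.0; rv:136.0) Gecko/20100101 Firefox/136.0'

def fetch_with_fallback(url, fetch, impersonated_fetch):
    # first attempt: plain request with a modern browser user-agent
    try:
        return fetch(url, headers={'User-Agent': MODERN_UA})
    except HTTPError as e:
        if e.code != 403:  # only a 403 suggests the akamai anti-bot wall
            raise
        # second attempt: a caller-supplied impersonating client
        return impersonated_fetch(url)
```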

@@ -548,21 +548,21 @@ class VKIE(VKBaseIE):
             'formats': formats,
             'subtitles': subtitles,
             **traverse_obj(mv_data, {
-                'title': ('title', {unescapeHTML}),
+                'title': ('title', {str}, {unescapeHTML}),
                 'description': ('desc', {clean_html}, filter),
                 'duration': ('duration', {int_or_none}),
                 'like_count': ('likes', {int_or_none}),
                 'comment_count': ('commcount', {int_or_none}),
             }),
             **traverse_obj(data, {
-                'title': ('md_title', {unescapeHTML}),
+                'title': ('md_title', {str}, {unescapeHTML}),
                 'description': ('description', {clean_html}, filter),
                 'thumbnail': ('jpg', {url_or_none}),
-                'uploader': ('md_author', {unescapeHTML}),
+                'uploader': ('md_author', {str}, {unescapeHTML}),
                 'uploader_id': (('author_id', 'authorId'), {str_or_none}, any),
                 'duration': ('duration', {int_or_none}),
                 'chapters': ('time_codes', lambda _, v: isinstance(v['time'], int), {
-                    'title': ('text', {unescapeHTML}),
+                    'title': ('text', {str}, {unescapeHTML}),
                     'start_time': 'time',
                 }),
             }),
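Adding `{str}` before `{unescapeHTML}` in these traversals acts as a type guard: non-string values are dropped instead of being handed to the unescaper. The equivalent behavior in plain Python, with the stdlib's `html.unescape` standing in for yt-dlp's `unescapeHTML`:

```python
from html import unescape

def safe_unescape(value):
    # mirrors adding {str} before {unescapeHTML} in a traversal:
    # only actual strings get HTML-unescaped; anything else becomes None
    return unescape(value) if isinstance(value, str) else None
```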

@@ -175,6 +175,15 @@ INNERTUBE_CLIENTS = {
         'INNERTUBE_CONTEXT_CLIENT_NAME': 7,
         'SUPPORTS_COOKIES': True,
     },
+    'tv_simply': {
+        'INNERTUBE_CONTEXT': {
+            'client': {
+                'clientName': 'TVHTML5_SIMPLY',
+                'clientVersion': '1.0',
+            },
+        },
+        'INNERTUBE_CONTEXT_CLIENT_NAME': 75,
+    },
     # This client now requires sign-in for every video
     # It was previously an age-gate workaround for videos that were `playable_in_embed`
     # It may still be useful if signed into an EU account that is not age-verified

@@ -250,7 +250,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
         '400': {'ext': 'mp4', 'height': 1440, 'format_note': 'DASH video', 'vcodec': 'av01.0.12M.08'},
         '401': {'ext': 'mp4', 'height': 2160, 'format_note': 'DASH video', 'vcodec': 'av01.0.12M.08'},
     }
-    _SUBTITLE_FORMATS = ('json3', 'srv1', 'srv2', 'srv3', 'ttml', 'vtt')
+    _SUBTITLE_FORMATS = ('json3', 'srv1', 'srv2', 'srv3', 'ttml', 'srt', 'vtt')

     _DEFAULT_CLIENTS = ('tv', 'ios', 'web')
     _DEFAULT_AUTHED_CLIENTS = ('tv', 'web')
@@ -2229,20 +2229,20 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
     def _extract_n_function_name(self, jscode, player_url=None):
         varname, global_list = self._interpret_player_js_global_var(jscode, player_url)
         if debug_str := traverse_obj(global_list, (lambda _, v: v.endswith('-_w8_'), any)):
-            funcname = self._search_regex(
-                r'''(?xs)
-                    [;\n](?:
-                        (?P<f>function\s+)|
-                        (?:var\s+)?
-                    )(?P<funcname>[a-zA-Z0-9_$]+)\s*(?(f)|=\s*function\s*)
-                    \((?P<argname>[a-zA-Z0-9_$]+)\)\s*\{
-                    (?:(?!\}[;\n]).)+
-                    \}\s*catch\(\s*[a-zA-Z0-9_$]+\s*\)\s*
-                    \{\s*return\s+%s\[%d\]\s*\+\s*(?P=argname)\s*\}\s*return\s+[^}]+\}[;\n]
-                ''' % (re.escape(varname), global_list.index(debug_str)),
-                jscode, 'nsig function name', group='funcname', default=None)
-            if funcname:
-                return funcname
+            pattern = r'''(?x)
+                \{\s*return\s+%s\[%d\]\s*\+\s*(?P<argname>[a-zA-Z0-9_$]+)\s*\}
+            ''' % (re.escape(varname), global_list.index(debug_str))
+            if match := re.search(pattern, jscode):
+                pattern = r'''(?x)
+                    \{\s*\)%s\(\s*
+                    (?:
+                        (?P<funcname_a>[a-zA-Z0-9_$]+)\s*noitcnuf\s*
+                        |noitcnuf\s*=\s*(?P<funcname_b>[a-zA-Z0-9_$]+)(?:\s+rav)?
+                    )[;\n]
+                ''' % re.escape(match.group('argname')[::-1])
+                if match := re.search(pattern, jscode[match.start()::-1]):
+                    a, b = match.group('funcname_a', 'funcname_b')
+                    return (a or b)[::-1]
         self.write_debug(join_nonempty(
             'Initial search was unable to find nsig function name',
             player_url and f' player = {player_url}', delim='\n'), only_once=True)
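The rewritten lookup anchors on the `catch` body first, then scans *backwards* for the nearest enclosing function name by reversing both the source text and the pattern (`noitcnuf` is `function` reversed, so the first match in the reversed string is the closest preceding declaration). A simplified sketch of that reversed-search trick, handling only the `function NAME(` form (the pattern and names here are illustrative, not the extractor's actual regex):

```python
import re

def nearest_preceding_funcname(source, pos):
    # Reverse the text from `pos` back to the start; 'function NAME('
    # reversed reads '(EMAN noitcnuf', so the first match in the reversed
    # string is the declaration closest before `pos`.
    rev = source[pos::-1]
    m = re.search(r'\(\s*(?P<name>[a-zA-Z0-9_$]+)\s+noitcnuf', rev)
    return m.group('name')[::-1] if m else None
```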

@@ -20,6 +20,7 @@ WEBPO_CLIENTS = (
     'WEB_EMBEDDED_PLAYER',
     'WEB_CREATOR',
     'WEB_REMIX',
+    'TVHTML5_SIMPLY',
     'TVHTML5_SIMPLY_EMBEDDED_PLAYER',
 )

@@ -6,6 +6,7 @@ import time
 from .common import InfoExtractor
 from ..utils import (
     ExtractorError,
+    ISO639Utils,
     determine_ext,
     filter_dict,
     float_or_none,

@@ -118,10 +119,7 @@ class ZDFBaseIE(InfoExtractor):
             if ext == 'm3u8':
                 fmts = self._extract_m3u8_formats(
                     format_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
-            elif ext == 'mpd':
-                fmts = self._extract_mpd_formats(
-                    format_url, video_id, mpd_id='dash', fatal=False)
-            else:
+            elif ext in ('mp4', 'webm'):
                 height = int_or_none(quality.get('highestVerticalResolution'))
                 width = round(aspect_ratio * height) if aspect_ratio and height else None
                 fmts = [{

@@ -132,16 +130,31 @@ class ZDFBaseIE(InfoExtractor):
                     'format_id': join_nonempty('http', stream.get('type')),
                     'tbr': int_or_none(self._search_regex(r'_(\d+)k_', format_url, 'tbr', default=None)),
                 }]
+            else:
+                self.report_warning(f'Skipping unsupported extension "{ext}"', video_id=video_id)
+                fmts = []

             f_class = variant.get('class')
             for f in fmts:
+                f_lang = ISO639Utils.short2long(
+                    (f.get('language') or variant.get('language') or '').lower())
+                is_audio_only = f.get('vcodec') == 'none'
                 formats.append({
                     **f,
-                    'format_id': join_nonempty(f.get('format_id'), is_dgs and 'dgs'),
+                    'format_id': join_nonempty(f['format_id'], is_dgs and 'dgs'),
                     'format_note': join_nonempty(
-                        f_class, is_dgs and 'German Sign Language', f.get('format_note'), delim=', '),
-                    'language': variant.get('language') or f.get('language'),
+                        not is_audio_only and f_class,
+                        is_dgs and 'German Sign Language',
+                        f.get('format_note'), delim=', '),
                     'preference': -2 if is_dgs else -1,
-                    'language_preference': 10 if f_class == 'main' else -10 if f_class == 'ad' else -1,
+                    'language': f_lang,
+                    'language_preference': (
+                        -10 if ((is_audio_only and f.get('format_note') == 'Audiodeskription')
+                                or (not is_audio_only and f_class == 'ad'))
+                        else 10 if f_lang == 'deu' and f_class == 'main'
+                        else 5 if f_lang == 'deu'
+                        else 1 if f_class == 'main'
+                        else -1),
                 })
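The new `language_preference` expression encodes a fixed ranking: audio-description tracks lowest, German main audio highest, then other German tracks, then other `main` tracks, then everything else. Written as a standalone function (a sketch mirroring the conditional chain above, not a yt-dlp API):

```python
def language_preference(lang, f_class, is_audio_only, format_note=None):
    # ranking used for ZDF variants: audio description last, German main
    # audio first, then other German tracks, then other 'main' tracks
    if (is_audio_only and format_note == 'Audiodeskription') or (
            not is_audio_only and f_class == 'ad'):
        return -10
    if lang == 'deu' and f_class == 'main':
        return 10
    if lang == 'deu':
        return 5
    if f_class == 'main':
        return 1
    return -1
```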
@@ -333,12 +346,13 @@ class ZDFIE(ZDFBaseIE):
             'title': 'Dobrindt schließt Steuererhöhungen aus',
             'description': 'md5:9a117646d7b8df6bc902eb543a9c9023',
             'duration': 325,
-            'thumbnail': 'https://www.zdf.de/assets/dobrindt-csu-berlin-direkt-100~1920x1080?cb=1743357653736',
+            'thumbnail': 'https://www.zdfheute.de/assets/dobrindt-csu-berlin-direkt-100~1920x1080?cb=1743357653736',
             'timestamp': 1743374520,
             'upload_date': '20250330',
             '_old_archive_ids': ['zdf 250330_clip_2_bdi'],
         },
     }, {
+        # FUNK video (hosted on a different CDN, has atypical PTMD and HLS files)
         'url': 'https://www.zdf.de/funk/druck-11790/funk-alles-ist-verzaubert-102.html',
         'md5': '57af4423db0455a3975d2dc4578536bc',
         'info_dict': {

@@ -651,6 +665,7 @@ class ZDFChannelIE(ZDFBaseIE):
             'description': 'md5:6edad39189abf8431795d3d6d7f986b3',
         },
         'playlist_count': 242,
+        'skip': 'Video count changes daily, needs support for playlist_maxcount',
     }]

     _PAGE_SIZE = 24

@@ -0,0 +1 @@
# Utility functions for handling web input based on commonly used JavaScript libraries

@@ -0,0 +1,167 @@
from __future__ import annotations
import array
import base64
import datetime as dt
import math
import re
from .._utils import parse_iso8601
TYPE_CHECKING = False
if TYPE_CHECKING:
import collections.abc
import typing
T = typing.TypeVar('T')
_ARRAY_TYPE_LOOKUP = {
'Int8Array': 'b',
'Uint8Array': 'B',
'Uint8ClampedArray': 'B',
'Int16Array': 'h',
'Uint16Array': 'H',
'Int32Array': 'i',
'Uint32Array': 'I',
'Float32Array': 'f',
'Float64Array': 'd',
'BigInt64Array': 'l',
'BigUint64Array': 'L',
'ArrayBuffer': 'B',
}
def parse_iter(parsed: typing.Any, /, *, revivers: dict[str, collections.abc.Callable[[list], typing.Any]] | None = None):
# based on https://github.com/Rich-Harris/devalue/blob/f3fd2aa93d79f21746555671f955a897335edb1b/src/parse.js
resolved = {
-1: None,
-2: None,
-3: math.nan,
-4: math.inf,
-5: -math.inf,
-6: -0.0,
}
if isinstance(parsed, int) and not isinstance(parsed, bool):
if parsed not in resolved or parsed == -2:
raise ValueError('invalid integer input')
return resolved[parsed]
elif not isinstance(parsed, list):
raise ValueError('expected int or list as input')
elif not parsed:
raise ValueError('expected a non-empty list as input')
if revivers is None:
revivers = {}
return_value = [None]
stack: list[tuple] = [(return_value, 0, 0)]
while stack:
target, index, source = stack.pop()
if isinstance(source, tuple):
name, source, reviver = source
try:
resolved[source] = target[index] = reviver(target[index])
except Exception as error:
yield TypeError(f'failed to parse {source} as {name!r}: {error}')
resolved[source] = target[index] = None
continue
if source in resolved:
target[index] = resolved[source]
continue
# guard against Python negative indexing
if source < 0:
yield IndexError(f'invalid index: {source!r}')
continue
try:
value = parsed[source]
except IndexError as error:
yield error
continue
if isinstance(value, list):
if value and isinstance(value[0], str):
# TODO: use zip(..., strict=True) once the minimum Python version allows it
if reviver := revivers.get(value[0]):
if value[1] == source:
# XXX: avoid infinite loop
yield IndexError(f'{value[0]!r} cannot point to itself (index: {source})')
continue
# inverse order: resolve index, revive value
stack.append((target, index, (value[0], value[1], reviver)))
stack.append((target, index, value[1]))
continue
elif value[0] == 'Date':
try:
result = dt.datetime.fromtimestamp(parse_iso8601(value[1]), tz=dt.timezone.utc)
except Exception:
yield ValueError(f'invalid date: {value[1]!r}')
result = None
elif value[0] == 'Set':
result = [None] * (len(value) - 1)
for offset, new_source in enumerate(value[1:]):
stack.append((result, offset, new_source))
elif value[0] == 'Map':
result = []
for key, new_source in zip(*(iter(value[1:]),) * 2):
pair = [None, None]
stack.append((pair, 0, key))
stack.append((pair, 1, new_source))
result.append(pair)
elif value[0] == 'RegExp':
# XXX: use jsinterp to translate regex flags
# currently ignores `value[2]`
result = re.compile(value[1])
elif value[0] == 'Object':
result = value[1]
elif value[0] == 'BigInt':
result = int(value[1])
elif value[0] == 'null':
result = {}
for key, new_source in zip(*(iter(value[1:]),) * 2):
stack.append((result, key, new_source))
elif value[0] in _ARRAY_TYPE_LOOKUP:
typecode = _ARRAY_TYPE_LOOKUP[value[0]]
data = base64.b64decode(value[1])
result = array.array(typecode, data).tolist()
else:
yield TypeError(f'invalid type at {source}: {value[0]!r}')
result = None
else:
result = len(value) * [None]
for offset, new_source in enumerate(value):
stack.append((result, offset, new_source))
elif isinstance(value, dict):
result = {}
for key, new_source in value.items():
stack.append((result, key, new_source))
else:
result = value
target[index] = resolved[source] = result
return return_value[0]
def parse(parsed: typing.Any, /, *, revivers: dict[str, collections.abc.Callable[[typing.Any], typing.Any]] | None = None):
generator = parse_iter(parsed, revivers=revivers)
while True:
try:
raise generator.send(None)
except StopIteration as error:
return error.value
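Two idioms in the module above are worth illustrating: collecting non-fatal errors by `yield`ing them from a generator while returning the final result through `StopIteration.value` (the contract between `parse_iter` and `parse`), and pairing consecutive list items with `zip(*(iter(x),) * 2)` (used to walk `Map` and `null` key/value entries). A small self-contained sketch of both, using a toy int parser in place of the real reviver logic:

```python
def tolerant_parse(items):
    # parse_iter-style generator: yields errors as values and returns
    # the final result via StopIteration.value
    out = []
    for item in items:
        try:
            out.append(int(item))
        except (TypeError, ValueError) as error:
            yield error
    return out

def strict(items):
    # parse()-style wrapper: re-raise the first yielded error, or unwrap
    # the generator's return value from StopIteration
    gen = tolerant_parse(items)
    while True:
        try:
            raise gen.send(None)
        except StopIteration as stop:
            return stop.value

def pairs(flat):
    # zip(*(iter(x),) * 2) pairs consecutive items, as done for Map/null keys
    return list(zip(*(iter(flat),) * 2))
```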

@@ -1,8 +1,8 @@
 # Autogenerated by devscripts/update-version.py

-__version__ = '2025.05.22'
+__version__ = '2025.06.09'

-RELEASE_GIT_HEAD = '7977b329ed97b216e37bd402f4935f28c00eac9e'
+RELEASE_GIT_HEAD = '339614a173c74b42d63e858c446a9cae262a13af'

 VARIANT = None

@@ -12,4 +12,4 @@ CHANNEL = 'stable'
 ORIGIN = 'yt-dlp/yt-dlp'

-_pkg_version = '2025.05.22'
+_pkg_version = '2025.06.09'
