1044 Commits (e64eaaa97dd00b15ff0ebde17d6d6e99e6a7394e)

Author SHA1 Message Date
Philipp Hagemeister 677c18092d [podomatic] Add extractor
Jaime Marquínez Ferrándiz 3862402ff3 Add an extractor for Clipsyndicate (closes )
Jaime Marquínez Ferrándiz b03d0d064c [imdb] Fix extraction in python 2.6
Using a regular expression because the html cannot be parsed.
Jaime Marquínez Ferrándiz d8d6148628 Add an extractor for Internet Movie Database trailers (closes )
Philipp Hagemeister fc9e1cc697 [clipfish] Use FIFA trailer as testcase ()
Philipp Hagemeister f8f60d2793 [clipfish] Fix imports ()
Philipp Hagemeister 2a275ab007 [zdf] Use _download_xml
Philipp Hagemeister a2e6db365c [zdf] add a pseudo-testcase and fix URL matching
Philipp Hagemeister 9d93e7da6c Merge branch 'master' of github.com:rg3/youtube-dl
Jaime Marquínez Ferrándiz 0e44d8381a [youtube:feeds] Use the 'paging' value from the downloaded json information (fixes )
Jaime Marquínez Ferrándiz 35907e23ec [yahoo] Fix video extraction and use the new format system exclusively
Jaime Marquínez Ferrándiz 76d1700b28 [youtube:playlist] Fix the extraction of the title for some mixes ()
Like https://www.youtube.com/watch?v=g8jDB5xOiuE&list=RDIh2gxLqR7HM
Philipp Hagemeister dcca796ce4 [clipfish] Effect a better error message ()
Filippo Valsorda 4b19e38954 [videopremium] support new .me domain
Jaime Marquínez Ferrándiz 652cdaa269 [youtube:playlist] Add support for YouTube mixes (fixes )
Jaime Marquínez Ferrándiz e26f871228 Use the new '_download_xml' helper in more extractors
Jaime Marquínez Ferrándiz 6e47b51eef [youtube:playlist] Remove the link with index 0
It's not the first video of the playlist, it appears in the 'Play all' button (see the test course for an example)
Philipp Hagemeister fb04e40396 [soundcloud] Support for listing of audio-only files
Philipp Hagemeister 8b134b1062 Merge branch 'master' of github.com:rg3/youtube-dl
Jaime Marquínez Ferrándiz 1a62c18f65 [bambuser] Skip the download in the test
It doesn't respect the 'Range' header.
Philipp Hagemeister 2a15e7063b [soundcloud] Prefer HTTP over RTMP ()
Philipp Hagemeister ea36cbac5e Merge remote-tracking branch 'rbrito/swap-dimensions'
Philipp Hagemeister de79c46c8f [viki] Fix subtitle extraction
Philipp Hagemeister 94ccb6fa2e [viki] Fix subtitles extraction
Philipp Hagemeister 07e4035879 [viki] Fix uploader extraction
Philipp Hagemeister 113577e155 [generic] Improve detection
Allow download of http://goo.gl/7X5tOk
Fixes 
Philipp Hagemeister 79d09f47c2 Merge branch 'opener-to-ydl'
Philipp Hagemeister c059bdd432 Remove quality_name field and improve zdf extractor
Philipp Hagemeister 02dbf93f0e [zdf/common] Use API in ZDF extractor.
This also comes with a lot of extra format fields
Fixes 
Philipp Hagemeister 1fb2bcbbf7 [viki] Make uploader field optional ()
Jaime Marquínez Ferrándiz 66cfab4226 [comedycentral] Add support for comedycentral.com videos (closes )
It's a subclass of MTVIE

The extractor for colbertnation.com and thedailyshow.com is now called ComedyCentralShowsIE
Philipp Hagemeister 6d88bc37a3 [viki] Skip travis test
Also provide a better error message for geoblocked videos.
Philipp Hagemeister b7553b2554 [viki] Clarify output
Philipp Hagemeister e03db0a077 Merge branch 'master' into opener-to-ydl
Jaime Marquínez Ferrándiz 267ed0c5d3 [collegehumor] Encode the xml before calling xml.etree.ElementTree.fromstring (fixes )
Uses a new helper method in InfoExtractor: _download_xml
Jaime Marquínez Ferrándiz f459d17018 [youtube] Add an extractor for downloading the watch history (closes )
Jaime Marquínez Ferrándiz dc65dcbb6d [mixcloud] The description field may be missing (fixes )
Jaime Marquínez Ferrándiz d214fdb8fe [brightcove] Don't use 'or' with the xml nodes, use the 'value' attribute instead
Philipp Hagemeister 0c7c19d6bc [clipfish] Add extractor (Fixes )
Philipp Hagemeister 382ed50e0e [viki] Add extractor (fixes )
Philipp Hagemeister 66ec019240 [youtube] do not use variable name twice
Philipp Hagemeister bd49928f7a [niconico] Clarify download
Philipp Hagemeister 23e6d50d73 [bandcamp] Remove unused variable
Philipp Hagemeister 13ebea791f [niconico] Simplify and make work with old Python versions
The website requires SSLv3, otherwise it just times out during SSL negotiation.
Philipp Hagemeister 4c9c57428f Merge remote-tracking branch 'takuya0301/niconico'
Jaime Marquínez Ferrándiz 36de0a0e1a [brightcove] Set the 'videoPlayer' value to the 'videoId' if it's missing in the parameters (fixes )
Philipp Hagemeister e5c146d586 [streamcloud] skip test on travis
Takuya Tsuchida 52ad14aeb0 Add support for niconico
Philipp Hagemeister 081640940e Merge branch 'master' of github.com:rg3/youtube-dl
Philipp Hagemeister 7012b23c94 Match --download-archive during playlist processing (Fixes )
Jaime Marquínez Ferrándiz 9f79463803 [howcast] update test's checksum
Jaime Marquínez Ferrándiz d35dc6d3b5 [bandcamp] move the album test to the album extractor and return a single track instead of a playlist
Philipp Hagemeister 3f8ced5144 Merge remote-tracking branch 'jaimeMF/yt-playlists'
Philipp Hagemeister dca0872056 Move the opener to the YoutubeDL object.
This is the first step towards being able to just import youtube_dl and start using it.
Apart from removing global state, this would fix problems like .
Philipp Hagemeister 15c3adbb16 Merge branch 'master' of github.com:rg3/youtube-dl
Philipp Hagemeister f143a42fe6 [bandcamp] Skip album test
Jaime Marquínez Ferrándiz 241650c7ff [vimeo] Fix the extraction of vimeo pro and player.vimeo.com videos
Philipp Hagemeister cffa6aa107 [bandcamp] Support trackinfo-style songs (Fixes )
Philipp Hagemeister 02e4ebbbad [streamcloud] Add IE (Fixes )
Philipp Hagemeister ab009f59ef [toutv] Fix a typo
Jaime Marquínez Ferrándiz 0980426559 [bandcamp] add support for albums (reported in )
Jaime Marquínez Ferrándiz 64bb5187f5 [soundcloud] Retrieve the file url using the client_id for the iPhone (fixes )
The desktop's client_id always gives the rtmp url, but the iPhone one returns the http url if it's available.
Philipp Hagemeister 9e4f50a8ae [sztv] skip test, site is undergoing mid-term maintenance
Philipp Hagemeister 0190eecc00 [nhl] Make NHLVideocenter IE_DESC fit with other descriptions
Philipp Hagemeister ca872a4c0b [spankwire] Fix description search
Philipp Hagemeister f2e87ef4fa [anitube] Skip test (on travis)
Philipp Hagemeister 0ad97bbc05 [spankwire] fix check for description
Philipp Hagemeister c4864091a1 [videopremium] Support new crazy redirect scheme
Philipp Hagemeister 9a98a466b3 [toutv] really skip test
Philipp Hagemeister da6a795fdb [escapist] Fix title search
Philipp Hagemeister c5edcde21f [escapist] upper-case URL
Philipp Hagemeister 15ff3c831e [escapist] Fix syntax error
Philipp Hagemeister 100959a6d9 [escapist] Add support for HD format (Closes )
Philipp Hagemeister 8f05351984 [anitube] Minor fixes ()
Philipp Hagemeister 71791f414c Merge remote-tracking branch 'diffycat/master'
Philipp Hagemeister f3682997d7 Clean up unused imports and other minor mistakes
Philipp Hagemeister cc13cc0251 [teamcoco] Correct error
Philipp Hagemeister 5904088811 Add support for tou.tv (Fixes )
Jaime Marquínez Ferrándiz 69545c2aff [d8] inherit from CanalplusIE
it reuses the same extraction process
Jaime Marquínez Ferrándiz 495da337ae Merge pull request from migbac/master
Add support for d8.tv
Philipp Hagemeister cb7dfeeac4 [youtube] only allow domain name to be upper-case ()
Philipp Hagemeister 4113e6ab56 [auengine] Do not return unnecessary ext
Philipp Hagemeister 9906d397a0 [auengine] Simplify
Philipp Hagemeister 887c6acdf2 Support multiple embedded YouTube URLs (Fixes )
Philipp Hagemeister 83aa529330 Support protocol-independent URLs ()
Philipp Hagemeister fccd377198 Support embed-only videos (Fixes )
Philipp Hagemeister 63b7b7224a [MTVIE] Try with RTMP URL if download fails
This fixes youtube-dl http://www.southpark.de/clips/155251/cartman-vs-the-dog-whisperer
rzhxeo 746f491f82 Add support for southpark.de
rzhxeo 1672647ade [SouthParkStudiosIE] Move from _TEST to _TESTS
rzhxeo 90b6bbc38c [SouthParkStudiosIE] Also detect urls without http:// or www
Philipp Hagemeister 1d699755e0 [youtube] Add view_count (Fixes )
Philipp Hagemeister ddf49c6344 [arte] remove two typos
Anton Larionov ba3881dffd Add support for anitube.se ()
Philipp Hagemeister d1c252048b [redtube] Do not test md5, seems to vary
Philipp Hagemeister eab2724138 [gamekings] Do not test md5 sum, precise file changes regularly
Philipp Hagemeister 21ea3e06c9 [gamekings] remove unnecessary import
Philipp Hagemeister 52d703d3d1 [tvp] Skip tests
Philipp Hagemeister ce152341a1 [bambuser] Do not test for MD5, seems to be flaky
Philipp Hagemeister f058e34011 [dailymotion] Fix playlists
Philipp Hagemeister 7150858d49 [spiegel] Implement format selection
Philipp Hagemeister 91c7271aab Add automatic generation of format note based on bitrate and codecs
Philipp Hagemeister fc2ef392be [ted] Fix playlists (Fixes )
Philipp Hagemeister 463a908705 [ted] simplify
Jaime Marquínez Ferrándiz d24ffe1cfa [rtlnow] Remove the test for nitro
The videos expire.
Jaime Marquínez Ferrándiz 78fb87b283 Don't accept '>' inside the content attribute in OpenGraph regexes
Jaime Marquínez Ferrándiz ab2d524780 Improve the OpenGraph regex
* Do not accept '>' between the property and content attributes.
* Recognize the properties if the content attribute is before the property attribute using two regexes (fixes the extraction of the description for SlideshareIE).
Jaime Marquínez Ferrándiz 85d61685f1 [tvp] Update the title and the description of the test video
Jaime Marquínez Ferrándiz b9643eed7c [youtube:channel] Fix the extraction of autogenerated channels
The ajax pages are empty, now it looks directly in the channel's /videos page
Philipp Hagemeister 0e145dd541 Merge branch 'master' of github.com:rg3/youtube-dl
Philipp Hagemeister 9f9be844fc [youtube] Fix protocol-independent URLs (Fixes )
Jaime Marquínez Ferrándiz e3b9ab5e18 [soundcloud] Set the correct extension for the tracks (fixes )
Some tracks are not in mp3 format, they can be wav files.
Jaime Marquínez Ferrándiz c66d2baa9c [livestream] Add an extractor for the original version of livestream (closes )
The two versions use different systems.
Jaime Marquínez Ferrándiz ca715127a2 Don't assume the 'subtitlesformat' is set in the params dict (fixes )
Jaime Marquínez Ferrándiz ea7a7af1d4 [gamekings] Fix the test video checksum
Jaime Marquínez Ferrándiz 880e1c529d [youtube:playlist] Login into youtube if requested (fixes )
Allows downloading private playlists
Jaime Marquínez Ferrándiz dcbb45803f [youtube:playlist] Don't use the gdata api (closes )
Parse the playlist pages instead
Philipp Hagemeister c3a3028f9f [tvp] Minor improvements ()
Philipp Hagemeister 6c5ad80cdc Merge remote-tracking branch 'saper/tvp'
Philipp Hagemeister 384b98cd8f [gamekings] Minor fixes ()
Jelle van der Waa eb9b5bffef Add extractor for gamekings.tv
migbac 0bd59f3723 Add support for d8.tv
Jaime Marquínez Ferrándiz 8b8cbd8f6d [vine] Fix uploader extraction
Philipp Hagemeister eb0a839866 [common] Simplify og_search_property
Jaime Marquínez Ferrándiz 0ed05a1d2d Use the 'rtmp_live' field for the live parameter of rtmpdump
Jaime Marquínez Ferrándiz 1008bebade Merge remote-tracking branch 'rzhxeo/rtmpdump_live'
Jaime Marquínez Ferrándiz be6dfd1b49 [ted] Return a single info_dict for talks urls
It failed with the --list-subs option
Jaime Marquínez Ferrándiz 231516b6c9 Merge pull request from iemejia/master
[ted] support for subtitles
Jaime Marquínez Ferrándiz fb53d58dcf Merge pull request from saper/escaped
Fix AssertionError when og property not found
Jaime Marquínez Ferrándiz f470c6c812 [arte] Improve the format sorting
Also use the bitrate.
Prefer normal version and sourds/mal version over original version with subtitles.
Jaime Marquínez Ferrándiz 566d4e0425 [arte] Make sure the format_id is unique (closes )
Include the bitrate and use the height instead of the quality field.
Jaime Marquínez Ferrándiz 81be02d2f9 [cnn] Accept www.cnn.com urls (fixes )
Jaime Marquínez Ferrándiz c2b6a482d5 [brightcove] the format function requires specifying the index in python 2.6
Jaime Marquínez Ferrándiz 12c167c881 [soundcloud] Allow to download tracks marked as not 'streamable'
They use the rtmp protocol, but if they are marked as 'downloadable' the direct download link can be used.
Jaime Marquínez Ferrándiz 20aafee7fa [kankan] Fix the video url
It now requires two additional parameters, one is a timestamp we get from the getCdnresource_flv page and the other is a key we have to build.
Jaime Marquínez Ferrándiz dd5bcdc4c9 [brightcove] Set the 'Referer' header if the url has the 'linkBaseUrl' parameter (fixes )
Jaime Marquínez Ferrándiz b1a80ec1a9 [xnxx] Accept urls that start with 'www' (fixes )
Jaime Marquínez Ferrándiz 51040b72ed [brightcove] Support redirected urls from bcove.me (fixes )
'bctid' needs to be changed to '@videoPlayer', and 'bckey' to 'playerKey'.
Jaime Marquínez Ferrándiz 4f045eef8f [youtube:channel] Fix the extraction
The page doesn't include the 'load more' button anymore; now we directly get the 'c4_browse_ajax' pages.
Jaime Marquínez Ferrándiz 5d7b253ea0 Add an extractor for eitb.tv (fixes )
The BrightcoveExperience object doesn't contain the video id, so the extractor adds it and passes the url to BrightcoveIE.
Jaime Marquínez Ferrándiz b0759f0c19 [brightcove] Extract all the available formats
Jaime Marquínez Ferrándiz 065472936a Add an extractor for space.com (fixes )
It uses Brightcove, but requires some special process for getting a url with the playerKey field in some videos
Jaime Marquínez Ferrándiz fc4a0c2aec [brightcove] Change the 'videoId' or 'videoID' field to '@videoPlayer' (fixes )
It seems to be needed when using the htmlFederated page
Jaime Marquínez Ferrándiz eeb165e674 [brightcove] Add the extraction of the url from generic
Jaime Marquínez Ferrándiz 9ee2b5f6f2 tests: don't run the test if any of the extractors listed in the 'add_ie' field is marked as not working
Marcin Cieślak 5137ebac0b [tvp] Telewizja Polska: new extractor for tvp.pl, fixes
Thanks-To: mplonski

https://github.com/mplonski/linux/blob/master/tvp-dl.py
Marcin Cieślak a8eeb0597b Fix AssertionError when og property not found
On tvp.pl some webpages contain OpenGraph
metadata and some don't.

If og property is not found, _og_search_description
fails with

WARNING: unable to extract OpenGraph description; please report this issue on http://yt-dl.org/bug
Traceback (most recent call last):
  File "/usr/home/saper/bin/youtube-dl", line 18, in <module>
    youtube_dl.main()
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/__init__.py", line 766, in main
    _real_main(argv)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/__init__.py", line 719, in _real_main
    retcode = ydl.download(all_urls)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/YoutubeDL.py", line 715, in download
    videos = self.extract_info(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/YoutubeDL.py", line 348, in extract_info
    ie_result = ie.extract(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 125, in extract
    return self._real_extract(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/tvp.py", line 56, in _real_extract
    info['description'] = self._og_search_description(webpage)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 331, in _og_search_description
    return self._og_search_property('description', html, fatal=False, **kargs)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 325, in _og_search_property
    return unescapeHTML(escaped)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/utils.py", line 494, in unescapeHTML
    assert type(s) == type(u'')
AssertionError

The patch allows me to use:

  try:
    info['description'] = self._og_search_description(webpage)
    info['thumbnail'] = self._og_search_thumbnail(webpage)
  except RegexNotFoundError:
    pass
Ismaël Mejía 4ed3e51080 [ted] fixed error in case of no subtitles present
I created a test, but I left it commented out since TED videos get
new subtitles frequently.
rzhxeo 2dcf7d8f99 [GenericIE] Also detect youtube if src url of iframe is embedded in ' instead of "
Jaime Marquínez Ferrándiz 19b0668251 [canal2c] Accept more urls (fixes )
The url only needs to have the 'idVideo' field in the query, in any position.
We have to set the 'void=oui' in the webpage url, so that we get the file name.
Jaime Marquínez Ferrándiz e7e6b54d8a [teamcoco] Parse the xml file and extract all the formats