Commit Graph

218 Commits (799756a3b3c794284ca52b9af482e1f03fc46833)

Author SHA1 Message
Sergey M․ 6562d34a8c [utils] Improve mimetype2ext
Yen Chi Hsuan 70852b47ca [utils] Recognize units with full names in parse_filename
Reference: https://en.wikipedia.org/wiki/Template:Quantities_of_bytes
Yen Chi Hsuan e4659b4547 [utils] Correct octal/hexadecimal number detection in js_to_json
Sergey M․ 13585d7682 [utils] Recognize lowercase units in parse_filesize
Remita Amine 5f2c2b7936 [test_utils] add test for option with not str value
Sergey M․ a8795327ca [utils] Add support for TV Parental Guidelines ratings in parse_age_limit
Yen Chi Hsuan 7dc2a74e0a [utils] Fix unified_timestamp for formats parsed by parsedate_tz()
Yen Chi Hsuan 0b68de3cc1 Merge pull request from remitamine/html5_media
[extractor/common] add helper method to extract html5 media entries
Yen Chi Hsuan 84c237fb8a [utils] Add get_element_by_class
For 
Remita Amine dfaa86b75e [test_utils] add test for smuggling a smuggled url
remitamine 4f3c5e0627 [utils] add helper function for parsing codecs
Yen Chi Hsuan 1143535d76 [utils] Add urshift()
Used in IqiyiIE and LeIE
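For readers unfamiliar with the name, urshift() presumably refers to an unsigned (zero-fill) right shift, as in JavaScript's `>>>` operator. The following is a minimal sketch under that assumption, not the code from this commit; the 32-bit width and the requirement that 0 <= n < 32 are assumptions made for illustration.

```python
def urshift(val, n):
    """Zero-fill right shift of a value interpreted as a 32-bit integer.

    Assumption: 0 <= n < 32, and the result should mimic JavaScript's `>>>`.
    """
    # Reduce modulo 2**32 so negative inputs map to their unsigned
    # two's-complement representation before shifting.
    return (val % 0x100000000) >> n


print(urshift(3, 1))   # 1
print(urshift(-3, 1))  # 2147483646
```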
Sergey M․ 46f59e89ea [utils] Add unified_timestamp
Yen Chi Hsuan 47212f7bcb [utils] Don't transform numbers not starting with a zero
Fix test_Viidea and maybe others
Yen Chi Hsuan 55b2f099c0 [utils] Decode HTML5 entities
Used in test_Vporn_1. Also related to 
bzc6p b96f007eeb Added sanitization support for Hungarian letters Ő and Ű
Sergey M․ 46bc9b7d7c [utils] Allow None in remove_{start,end}
Sergey M․ 364cf465dd [test_utils] PEP 8
Sergey M․ 89ac4a19e6 [utils] Process non-base 10 integers in js_to_json
felix bd1e484448 [utils] js_to_json: various improvements
Now JS object literals like { /* " */ 0: ",]\xaa<\/p>", } will be correctly converted to JSON.
Yen Chi Hsuan 778a1ccca7 [utils] Add Œ and œ found in French to ACCENT_CHARS
Fixes 
Yen Chi Hsuan dab0daeeb0 [utils,compat] Move struct_pack and struct_unpack to compat.py
Adam Thalhammer 31c4448f6e Instead of replacing accented characters with an underscore when sanitizing file names in restricted mode, replace them with their non-accented equivalents fixes
Adam Thalhammer 79a2e94e79 Instead of replacing accented characters with an underscore when sanitizing file names in restricted mode, replace them with their non-accented equivalents fixes
Sergey M b6c0d4f431 Merge pull request from remitamine/parse_duration
[utils] improve parse_duration to handle more formats
remitamine acaff49575 [utils] improve parse_duration to handle more formats
Jaime Marquínez Ferrándiz eb9c3edd5e [test/utils] Add test for date_from_str
Yen Chi Hsuan 81f36eba88 [test/test_utils] Update for escape_url change (again)
Yen Chi Hsuan 2d60465e44 [test/test_utils] Update for escape_url change
Jaime Marquínez Ferrándiz 782b1b5bd1 [utils] lookup_unit_table: Match word boundary instead of end of string
Sergey M․ c5229f3926 [utils] PEP 8
remitamine 83548824c2 Merge pull request from bpfoley/twitter-thumbnail
[utils] Add extract_attributes for extracting html tag attributes
Sergey M․ fb47597b09 [bbc] Generalize unit table lookup and add parse_count
remitamine 3201a67f61 [test/test_utils] add more tests for update_url_query
remitamine fb640d0a3d [test/test_utils] add tests for update_url_query
Brian Foley 8bb56eeeea [utils] Add extract_attributes for extracting html tag attributes
This is much more robust than just using regexps, and handles all
the common scenarios, such as empty/no values, repeated attributes,
entity decoding, mixed case names, and the different possible value
quoting schemes.
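The scenarios listed in this commit body (empty or missing values, repeated attributes, entity decoding, mixed-case names, different quoting schemes) can be illustrated with the standard library's HTML parser. This is a sketch of the general approach only, assuming Python 3 and `html.parser.HTMLParser`; the class name `_AttributeCollector` is hypothetical and the code is not taken from the commit itself.

```python
from html.parser import HTMLParser


class _AttributeCollector(HTMLParser):
    """Hypothetical helper: remembers the attributes of the last start tag seen."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased and
        # values entity-decoded. dict() keeps the last of any repeated name.
        self.attrs = dict(attrs)


def extract_attributes(html_element):
    """Return the attributes of a single HTML start tag as a dict."""
    parser = _AttributeCollector()
    parser.feed(html_element)
    parser.close()
    return parser.attrs


print(extract_attributes('<e x="y">'))      # {'x': 'y'}
print(extract_attributes("<e x='&amp;'>"))  # {'x': '&'}
print(extract_attributes('<e X=1 x=2>'))    # {'x': '2'}
print(extract_attributes('<e x>'))          # {'x': None}
```

Delegating to a real parser rather than a regular expression is what makes the quoting and entity cases fall out without special handling.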
Yen Chi Hsuan 5eb6bdced4 [utils] Multiple changes to base_n()
1. Renamed to encode_base_n()
2. Allow tables longer than 62 characters
3. Raise ValueError instead of AssertionError for invalid input data
4. Return the first character in the table instead of '0' for number 0
5. Add tests
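The behaviours enumerated above can be sketched as follows. This is an illustration written from the commit message alone, not the committed code: the default 62-character table, the parameter names, and the rejection of negative numbers are assumptions.

```python
def encode_base_n(num, n, table=None):
    """Encode a non-negative integer in base n using a character table.

    Assumption: without an explicit table, digits come from 0-9, a-z, A-Z.
    """
    default_table = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    if table is None:
        table = default_table[:n]

    # Invalid input raises ValueError rather than tripping an assertion.
    if n < 2 or n > len(table):
        raise ValueError('base %d is not usable with a table of length %d' % (n, len(table)))
    if num < 0:
        raise ValueError('negative numbers are not supported in this sketch')

    # Zero maps to the first table character, which need not be '0'.
    if num == 0:
        return table[0]

    digits = []
    while num:
        num, remainder = divmod(num, n)
        digits.append(table[remainder])
    return ''.join(reversed(digits))


print(encode_base_n(80, 30))      # 2k
print(encode_base_n(0, 3, 'abc')) # a
```

Passing a custom table longer than 62 characters works here because only the table length bounds the base.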
Sergey M․ f160785c5c [utils] Remove AM/PM from unified_strdate patterns
Yen Chi Hsuan 5bc880b988 [utils] Add OHDave's RSA encryption function
Sergey M․ 8411229bd5 [utils] Allow dot in strip_jsonp
Sergey M․ 86296ad2cd [utils] Add ability to control skipping false values in dict_get
Sergey M․ cbecc9b903 [utils] Add dict_get convenience method
Sergey M․ 6b77d52b1f [test_utils] Add tests for encode_compat_str
Yen Chi Hsuan db2fe38b55 [utils] Support alternative timestamp format in TTML
Fixes 
Yen Chi Hsuan d631d5f9f2 [utils] Fix TTML conversion
Tolerate invalid timestamps (closes )
Sergey M․ 31b2051e21 [utils] Add remove_quotes
Sergey M․ 9cb9a5df77 [utils] Check ext with trailing slash against the list of known extensions
Sergey M․ 5035536e3f [test_utils] Add tests for determine_ext
Sergey M․ 7aefc49c40 [utils] Skip invalid/non HTML entities (Closes )
Jaime Marquínez Ferrándiz 6a75040278 [utils] unified_strdate: Return None if the date format can't be recognized (fixes )
This issue was introduced with ae12bc3ebb, it returned 'None'.
Sergey M 30eecc6a04 Merge pull request from jaimeMF/xml_attrib_unicode
Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x (…
Sergey M․ 578c074575 [utils] Support list of xpath in xpath_element
Sergey M․ 52c3a6e49d [utils] Improve parse_iso8601
Jaime Marquínez Ferrándiz 36e6f62cd0 Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x ()
Attributes aren't unicode objects, so they couldn't be directly used in info_dict fields (for example '--write-description' doesn't work with bytes).
Sergey M․ d01949dc89 [utils:js_to_json] Fix bad escape in double quoted strings
Sergey M․ f71264490c [test_utils] Add tests for cli option converters
Sergey M․ 87f70ab39d [test_utils] Add more tests for xpath
Sergey M․ ee114368ad [utils] Make value optional for find_xpath_attr
This allows selecting particular attributes by name but without specifying the value and similar to xpath syntax `[@attrib]`
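A rough sketch of what an optional value could look like, assuming a helper built on `xml.etree.ElementTree`; the signature and the loop-based matching are assumptions for illustration, not the code from this commit.

```python
import xml.etree.ElementTree as ET


def find_xpath_attr(node, xpath, key, val=None):
    """Return the first element matching xpath that carries attribute `key`.

    With val=None any value is accepted (like `[@key]`); otherwise the
    attribute must equal val (like `[@key='val']`).
    """
    for element in node.findall(xpath):
        if key not in element.attrib:
            continue
        if val is None or element.attrib[key] == val:
            return element
    return None


doc = ET.fromstring('<root><n/><n x="a"/><n y="c"/><n y="d" x="a"/></root>')
print(find_xpath_attr(doc, './/n', 'x') is doc[1])       # True
print(find_xpath_attr(doc, './/n', 'y', 'd') is doc[3])  # True
print(find_xpath_attr(doc, './/n', 'z'))                 # None
```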
Yen Chi Hsuan 9c29bc69f7 [utils] Improve parse_duration
Now dots are parsed. For example '87 Min.'
Yen Chi Hsuan 1b0427e6c4 [utils] Support TTML without default namespace
In a strict sense such TTML is invalid, but Yahoo uses it.
Yen Chi Hsuan 7dff03636a [utils] Support 'dur' field in TTML
Yen Chi Hsuan d39e0f05db [utils] Remove sanitize_url_path_consecutive_slashes()
This function is used only in SohuIE, which is updated to use a new
extraction logic.
Yen Chi Hsuan 0fe2ff78e6 [NBC] Enhance embedURL extraction (closes )
Sergey M․ b3ed15b760 [utils] Add replace_extension
Sergey M․ a4bcaad773 [test_utils] Add tests for prepend_extension
Yen Chi Hsuan bf6427d2fb [ffmpeg] Add dfxp (TTML) subtitles support (, )
Yen Chi Hsuan 0a1603634b [utils] Remove url_infer_protocol
Yen Chi Hsuan 418c5cc3fc [udn] Add new extractor
Sergey M․ 8cf70de428 [test_utils] Add test for unified_strdate
Sergey M․ ba9e68f402 [utils] Drop trailing comma before closing brace
Naglis Jonaitis 91757b0f37 [utils] Escape all HTML entities written in hexadecimal form
Jaime Marquínez Ferrándiz 5379a2d40d [test/utils] Test xpath_text
Sergey M․ 92a4793b3c [utils] Place sanitize url function near other sanitizing functions
Sergey M․ dc03a42537 Merge branch 'sohu_fix' of https://github.com/yan12125/youtube-dl into yan12125-sohu_fix
Sergey M․ 2ebfeacabc [utils] Keep dot and dotdot unmodified (Closes )
Sergey M․ f18ef2d144 [utils] Disallow trailing dot in sanitize_path for a path part
Sergey M․ a2aaf4dbc6 [utils] Add sanitize_path
Yen Chi Hsuan 55969016e9 [utils] Add a function to sanitize consecutive slashes in URLs
Philipp Hagemeister a7440261c5 [utils] Strip leading dots
Fixes , closes 
Philipp Hagemeister 3e675fabe0 [airmozilla] Be more tolerant when nonessential items are missing ()
Philipp Hagemeister 5a42414b9c [utils] Prevent hyphen at beginning of filename (Fixes )
Philipp Hagemeister d305dd73a3 [utils] Fix js_to_json
Previously, the runtime could be atrocious for longer inputs.
Philipp Hagemeister 347de4931c [YoutubeDL] Add generic video filtering (Fixes )
This functionality is intended to eventually encompass the current format filtering.
Philipp Hagemeister 9bb8e0a3f9 [wsj] Add new extractor (Fixes )
Philipp Hagemeister 8f4b58d70e [ntvde] Add new extractor (Fixes )
Philipp Hagemeister cfb56d1af3 Add --list-thumbnails
Philipp Hagemeister 61ca9a80b3 [generic] Add support for BOMs (Fixes )
Naglis Jonaitis a69801e2c6 [utils] Add additional format to unified_strdate
Sergey M․ a5fb718c50 [test_utils] Add more tests for parse_duration
Philipp Hagemeister 2aeb06d6dc [utils] Improve colon handling (Fixes )
Philipp Hagemeister 0590062925 Respect age_limit when listing extractors (Fixes )
Philipp Hagemeister cae97f6521 Improve and test ffmpeg version detection
Philipp Hagemeister 42bdd9d051 [cinchcast] Add new extractor (Fixes )
Philipp Hagemeister 47d7c64274 [test_utils] Make test more realistic ()
Philipp Hagemeister 5f9b83944d [ffmpeg] Improve version check and call it from hls (Fixes )
Philipp Hagemeister e8df5cee12 [minhateca] Fix duration parsing
Philipp Hagemeister 4349c07dd7 [minhateca] Add extractor (Fixes )
Philipp Hagemeister e075a44afb [tests] Remove useless u prefixes
Philipp Hagemeister be64b5b098 [xminus] Simplify and extend ()
Jouke Waleson 8bcc875676 PEP8: more applied