Commit Graph

146 Commits (20cfdcc910d0bc2ee4b0ee38bdf5e6ecb67e5731)

Author SHA1 Message Date
Adam Thalhammer 31c4448f6e Instead of replacing accented characters with an underscore when sanitizing file names in restricted mode, replace them with their non-accented equivalents fixes
Adam Thalhammer 79a2e94e79 Instead of replacing accented characters with an underscore when sanitizing file names in restricted mode, replace them with their non-accented equivalents fixes
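The idea in the two commits above, mapping accented characters to unaccented equivalents instead of an underscore, can be sketched as follows. This is only an illustration: the name `strip_accents_sketch` is hypothetical, and the Unicode-decomposition approach is an assumption swapped in for demonstration, not necessarily how the commits implement it.

```python
import unicodedata


def strip_accents_sketch(text):
    """Hypothetical helper: drop combining marks after Unicode decomposition."""
    # NFKD splits a character such as 'é' into 'e' plus a combining accent.
    decomposed = unicodedata.normalize('NFKD', text)
    # Keep only the characters that are not combining marks.
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents_sketch('Crème brûlée'))
```

Letters that have no decomposition (for example 'ø') pass through unchanged with this approach, so a real implementation would probably need an explicit mapping for them.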
Sergey M b6c0d4f431 Merge pull request from remitamine/parse_duration
[utils] improve parse_duration to handle more formats
remitamine acaff49575 [utils] improve parse_duration to handle more formats
Jaime Marquínez Ferrándiz eb9c3edd5e [test/utils] Add test for date_from_str
Yen Chi Hsuan 81f36eba88 [test/test_utils] Update for escape_url change (again)
Yen Chi Hsuan 2d60465e44 [test/test_utils] Update for escape_url change
Jaime Marquínez Ferrándiz 782b1b5bd1 [utils] lookup_unit_table: Match word boundary instead of end of string
Sergey M․ c5229f3926 [utils] PEP 8
remitamine 83548824c2 Merge pull request from bpfoley/twitter-thumbnail
[utils] Add extract_attributes for extracting html tag attributes
Sergey M․ fb47597b09 [bbc] Generalize unit table lookup and add parse_count
remitamine 3201a67f61 [test/test_utils] add more tests for update_url_query
remitamine fb640d0a3d [test/test_utils] add tests for update_url_query
Brian Foley 8bb56eeeea [utils] Add extract_attributes for extracting html tag attributes
This is much more robust than just using regexps, and handles all
the common scenarios, such as empty/no values, repeated attributes,
entity decoding, mixed case names, and the different possible value
quoting schemes.
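A minimal sketch of parser-based attribute extraction, as opposed to regexps, might look like the following. It assumes Python 3's standard `html.parser` module; the names `_AttributeCollector` and `extract_attributes_sketch` are hypothetical and are not taken from the commit itself.

```python
from html.parser import HTMLParser


class _AttributeCollector(HTMLParser):
    """Hypothetical helper: records the attributes of the first start tag fed to it."""

    def __init__(self):
        super().__init__()
        self.attrs = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases attribute names, strips the quotes, decodes
        # entity references in values, and reports value-less attributes as None.
        if self.attrs is None:
            # dict() keeps the last value when an attribute is repeated.
            self.attrs = dict(attrs)


def extract_attributes_sketch(html_element):
    """Return the attributes of a single HTML start tag as a dict."""
    parser = _AttributeCollector()
    parser.feed(html_element)
    parser.close()
    return parser.attrs or {}


print(extract_attributes_sketch('<el A="1" b=\'x &amp; y\' c=z empty="" flag a="2">'))
```

Letting the parser do the tokenizing covers the scenarios listed in the commit message (empty or missing values, repeated attributes, entity decoding, mixed-case names, different quoting schemes) without any hand-written pattern matching.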
Yen Chi Hsuan 5eb6bdced4 [utils] Multiple changes to base_n()
1. Renamed to encode_base_n()
2. Allow tables longer than 62 characters
3. Raise ValueError instead of AssertionError for invalid input data
4. Return the first character in the table instead of '0' for number 0
5. Add tests
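The behaviour described in points 2 to 4 can be illustrated with a small sketch. The function name `encode_base_n_sketch`, the default table, and the rejection of negative numbers are assumptions made for this example; the commit's actual code may differ.

```python
# Assumed default digit table: 62 characters.
DEFAULT_TABLE = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'


def encode_base_n_sketch(num, n, table=None):
    """Hypothetical sketch: encode a non-negative integer in base n."""
    if table is None:
        table = DEFAULT_TABLE[:n]
    # A custom table may be longer than 62 characters, but the base
    # must never exceed the number of available digits.
    if n > len(table):
        raise ValueError('base %d exceeds table length %d' % (n, len(table)))
    # Assumption: negative input is treated as invalid data.
    if num < 0:
        raise ValueError('cannot encode a negative number')
    # Zero maps to the first table character, which need not be '0'.
    if num == 0:
        return table[0]
    digits = []
    while num:
        digits.append(table[num % n])
        num //= n
    return ''.join(reversed(digits))


print(encode_base_n_sketch(80, 30))
print(encode_base_n_sketch(0, 30, 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcd'))
```

Raising ValueError rather than relying on an assertion means the check still runs when Python is started with optimizations enabled, which is one plausible motivation for point 3.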
Sergey M․ f160785c5c [utils] Remove AM/PM from unified_strdate patterns
Yen Chi Hsuan 5bc880b988 [utils] Add OHDave's RSA encryption function
Sergey M․ 8411229bd5 [utils] Allow dot in strip_jsonp
Sergey M․ 86296ad2cd [utils] Add ability to control skipping false values in dict_get
Sergey M․ cbecc9b903 [utils] Add dict_get convenience method
Sergey M․ 6b77d52b1f [test_utils] Add tests for encode_compat_str
Yen Chi Hsuan db2fe38b55 [utils] Support alternative timestamp format in TTML
Fixes 
Yen Chi Hsuan d631d5f9f2 [utils] Fix TTML conversion
Tolerate invalid timestamps (closes )
Sergey M․ 31b2051e21 [utils] Add remove_quotes
Sergey M․ 9cb9a5df77 [utils] Check ext with trailing slash against the list of known extensions
Sergey M․ 5035536e3f [test_utils] Add tests for determine_ext
Sergey M․ 7aefc49c40 [utils] Skip invalid/non HTML entities (Closes )
Jaime Marquínez Ferrándiz 6a75040278 [utils] unified_strdate: Return None if the date format can't be recognized (fixes )
This issue was introduced with ae12bc3ebb, it returned 'None'.
Sergey M 30eecc6a04 Merge pull request from jaimeMF/xml_attrib_unicode
Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x (…
Sergey M․ 578c074575 [utils] Support list of xpath in xpath_element
Sergey M․ 52c3a6e49d [utils] Improve parse_iso8601
Jaime Marquínez Ferrándiz 36e6f62cd0 Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x ()
Attributes aren't unicode objects, so they couldn't be directly used in info_dict fields (for example '--write-description' doesn't work with bytes).
Sergey M․ d01949dc89 [utils:js_to_json] Fix bad escape in double quoted strings
Sergey M․ f71264490c [test_utils] Add tests for cli option converters
Sergey M․ 87f70ab39d [test_utils] Add more tests for xpath
Sergey M․ ee114368ad [utils] Make value optional for find_xpath_attr
This allows selecting particular attributes by name but without specifying the value and similar to xpath syntax `[@attrib]`
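As a rough illustration of an optional attribute value, the sketch below builds either an `[@attrib]` or an `[@attrib='value']` predicate on top of the standard `xml.etree.ElementTree` module. The function name and signature are hypothetical and only mirror the commit message, not the actual code.

```python
import xml.etree.ElementTree as ET


def find_xpath_attr_sketch(node, xpath, key, val=None):
    """Hypothetical sketch: find the first element matching xpath that has attribute key."""
    if val is None:
        # Select on the attribute's presence only, like xpath's [@attrib].
        expr = '%s[@%s]' % (xpath, key)
    else:
        # Select on the attribute having exactly this value.
        expr = "%s[@%s='%s']" % (xpath, key, val)
    return node.find(expr)


doc = ET.fromstring('<root><n/><n x="a"/><n y="c"/><n y="d"/></root>')
print(find_xpath_attr_sketch(doc, './/n', 'x').attrib)
print(find_xpath_attr_sketch(doc, './/n', 'y', 'd').attrib)
```

This simple string formatting would break on values that contain a single quote, so it is only suitable as a demonstration.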
Yen Chi Hsuan 9c29bc69f7 [utils] Improve parse_duration
Now dots are parsed. For example '87 Min.'
Yen Chi Hsuan 1b0427e6c4 [utils] Support TTML without default namespace
In a strict sense such TTML is invalid, but Yahoo uses it.
Yen Chi Hsuan 7dff03636a [utils] Support 'dur' field in TTML
Yen Chi Hsuan d39e0f05db [utils] Remove sanitize_url_path_consecutive_slashes()
This function is used only in SohuIE, which is updated to use a new
extraction logic.
Yen Chi Hsuan 0fe2ff78e6 [NBC] Enhance embedURL extraction (closes )
Sergey M․ b3ed15b760 [utils] Add replace_extension
Sergey M․ a4bcaad773 [test_utils] Add tests for prepend_extension
Yen Chi Hsuan bf6427d2fb [ffmpeg] Add dfxp (TTML) subtitles support (, )
Yen Chi Hsuan 0a1603634b [utils] Remove url_infer_protocol
Yen Chi Hsuan 418c5cc3fc [udn] Add new extractor
Sergey M․ 8cf70de428 [test_utils] Add test for unified_strdate
Sergey M․ ba9e68f402 [utils] Drop trailing comma before closing brace
Naglis Jonaitis 91757b0f37 [utils] Escape all HTML entities written in hexadecimal form
Jaime Marquínez Ferrándiz 5379a2d40d [test/utils] Test xpath_text