Yen Chi Hsuan
55b2f099c0
[utils] Decode HTML5 entities
Used in test_Vporn_1. Also related to #9270
9 years ago
bzc6p
b96f007eeb
Added sanitization support for Hungarian letters Ő and Ű
9 years ago
Sergey M․
46bc9b7d7c
[utils] Allow None in remove_{start,end}
9 years ago
Sergey M․
364cf465dd
[test_utils] PEP 8
9 years ago
Sergey M․
89ac4a19e6
[utils] Process non-base 10 integers in js_to_json
9 years ago
felix
bd1e484448
[utils] js_to_json: various improvements
now JS object literals like { /* " */ 0: ",]\xaa<\/p>", } will be correctly converted to JSON.
9 years ago
Yen Chi Hsuan
778a1ccca7
[utils] Add Œ and œ found in French to ACCENT_CHARS
Fixes #9463
9 years ago
Yen Chi Hsuan
dab0daeeb0
[utils,compat] Move struct_pack and struct_unpack to compat.py
9 years ago
Adam Thalhammer
31c4448f6e
Instead of replacing accented characters with an underscore when sanitizing file names in restricted mode, replace them with their non-accented equivalents. Fixes #9347
9 years ago
Adam Thalhammer
79a2e94e79
Instead of replacing accented characters with an underscore when sanitizing file names in restricted mode, replace them with their non-accented equivalents. Fixes #9347
9 years ago
Sergey M
b6c0d4f431
Merge pull request #9110 from remitamine/parse_duration
[utils] improve parse_duration to handle more formats
9 years ago
remitamine
acaff49575
[utils] improve parse_duration to handle more formats
9 years ago
Jaime Marquínez Ferrándiz
eb9c3edd5e
[test/utils] Add test for date_from_str
9 years ago
Yen Chi Hsuan
81f36eba88
[test/test_utils] Update for escape_url change (again)
9 years ago
Yen Chi Hsuan
2d60465e44
[test/test_utils] Update for escape_url change
9 years ago
Jaime Marquínez Ferrándiz
782b1b5bd1
[utils] lookup_unit_table: Match word boundary instead of end of string
9 years ago
Sergey M․
c5229f3926
[utils] PEP 8
9 years ago
remitamine
83548824c2
Merge pull request #8092 from bpfoley/twitter-thumbnail
[utils] Add extract_attributes for extracting html tag attributes
9 years ago
Sergey M․
fb47597b09
[bbc] Generalize unit table lookup and add parse_count
9 years ago
remitamine
3201a67f61
[test/test_utils] add more tests for update_url_query
9 years ago
remitamine
fb640d0a3d
[test/test_utils] add tests for update_url_query
9 years ago
Brian Foley
8bb56eeeea
[utils] Add extract_attributes for extracting html tag attributes
This is much more robust than just using regexps, and handles all
the common scenarios, such as empty/no values, repeated attributes,
entity decoding, mixed case names, and the different possible value
quoting schemes.
9 years ago
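The scenarios named in this commit message (empty or missing values, repeated attributes, entity decoding, mixed-case names, different quoting schemes) can be illustrated with the standard-library `HTMLParser`. This is a minimal sketch under that assumption, not the code from the commit; the helper class `_AttrCollector` is a hypothetical name introduced here.

```python
from html.parser import HTMLParser


class _AttrCollector(HTMLParser):
    """Hypothetical helper: records the attributes of the first start tag."""

    def __init__(self):
        super().__init__()
        self.attrs = {}
        self._seen = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases attribute names, strips the quotes and
        # decodes entities in values; dict() keeps the last of any
        # repeated attribute.
        if not self._seen:
            self.attrs = dict(attrs)
            self._seen = True


def extract_attributes(html_element):
    """Return a dict of attributes for a single HTML start tag (sketch)."""
    parser = _AttrCollector()
    parser.feed(html_element)
    parser.close()
    return parser.attrs
```

In this sketch an attribute written without a value maps to `None`, while an attribute with an empty quoted value maps to an empty string.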
Yen Chi Hsuan
5eb6bdced4
[utils] Multiple changes to base_n()
1. Renamed to encode_base_n()
2. Allow tables longer than 62 characters
3. Raise ValueError instead of AssertionError for invalid input data
4. Return the first character in the table instead of '0' for number 0
5. Add tests
9 years ago
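The behaviour described in points 2 to 4 above can be sketched as follows. This is an illustration written from the commit message alone, not the committed code; the constant `DEFAULT_TABLE` and the lower bound on `n` are assumptions made for the example.

```python
import string

# Assumed default alphabet for the sketch: 0-9, a-z, A-Z (62 characters).
DEFAULT_TABLE = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base_n(num, n, table=None):
    """Encode a non-negative integer in base n using the given digit table."""
    if table is None:
        table = DEFAULT_TABLE
    # Invalid input raises ValueError rather than tripping an assertion.
    # The n < 2 check is an assumption added so the loop always terminates.
    if n < 2 or n > len(table):
        raise ValueError('base %d is not usable with a table of %d characters'
                         % (n, len(table)))
    if num < 0:
        raise ValueError('num must be non-negative')
    if num == 0:
        # Zero is the first character of the table, not a literal '0'.
        return table[0]
    digits = ''
    while num:
        digits = table[num % n] + digits
        num //= n
    return digits
```

Because the base is only checked against the length of the table, a caller can pass a table longer than 62 characters to go beyond base 62.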
Sergey M․
f160785c5c
[utils] Remove AM/PM from unified_strdate patterns
9 years ago
Yen Chi Hsuan
5bc880b988
[utils] Add OHDave's RSA encryption function
9 years ago
Sergey M․
8411229bd5
[utils] Allow dot in strip_jsonp
9 years ago
Sergey M․
86296ad2cd
[utils] Add ability to control skipping false values in dict_get
9 years ago
Sergey M․
cbecc9b903
[utils] Add dict_get convenience method
9 years ago
Sergey M․
6b77d52b1f
[test_utils] Add tests for encode_compat_str
9 years ago
Yen Chi Hsuan
db2fe38b55
[utils] Support alternative timestamp format in TTML
Fixes #7608
9 years ago
Yen Chi Hsuan
d631d5f9f2
[utils] Fix TTML conversion
Tolerate invalid timestamps (closes #7909)
9 years ago
Sergey M․
31b2051e21
[utils] Add remove_quotes
9 years ago
Sergey M․
9cb9a5df77
[utils] Check ext with trailing slash against the list of known extensions
9 years ago
Sergey M․
5035536e3f
[test_utils] Add tests for determine_ext
9 years ago
Sergey M․
7aefc49c40
[utils] Skip invalid/non HTML entities (Closes #7518)
9 years ago
Jaime Marquínez Ferrándiz
6a75040278
[utils] unified_strdate: Return None if the date format can't be recognized (fixes #7340)
This issue was introduced with ae12bc3ebb, it returned 'None'.
9 years ago
Sergey M
30eecc6a04
Merge pull request #7296 from jaimeMF/xml_attrib_unicode
Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x (…
9 years ago
Sergey M․
578c074575
[utils] Support list of xpath in xpath_element
9 years ago
Sergey M․
52c3a6e49d
[utils] Improve parse_iso8601
9 years ago
Jaime Marquínez Ferrándiz
36e6f62cd0
Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x (#7178)
Attributes aren't unicode objects, so they couldn't be directly used in info_dict fields (for example '--write-description' doesn't work with bytes).
9 years ago
Sergey M․
d01949dc89
[utils:js_to_json] Fix bad escape in double quoted strings
9 years ago
Sergey M․
f71264490c
[test_utils] Add tests for cli option converters
9 years ago
Sergey M․
87f70ab39d
[test_utils] Add more tests for xpath
9 years ago
Sergey M․
ee114368ad
[utils] Make value optional for find_xpath_attr
This allows selecting particular attributes by name but without specifying the value and similar to xpath syntax `[@attrib]`
9 years ago
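The optional-value lookup described in this commit can be sketched with the predicate syntax that `xml.etree.ElementTree` supports. This is an illustration under that assumption, not the committed implementation; the parameter names are chosen for the example.

```python
import xml.etree.ElementTree as ET


def find_xpath_attr(node, xpath, key, val=None):
    """Find the first element under xpath that carries attribute `key`.

    With val=None any value matches (the `[@attrib]` form); otherwise the
    attribute must equal val. Sketch only: val is assumed not to contain
    a single quote.
    """
    if val is None:
        predicate = '[@%s]' % key
    else:
        predicate = "[@%s='%s']" % (key, val)
    return node.find(xpath + predicate)
```

A possible use, with a small hypothetical document:

```python
doc = ET.fromstring('<root><node/><node x="a"/><node y="c"/></root>')
by_name = find_xpath_attr(doc, './/node', 'x')        # matches on name only
by_value = find_xpath_attr(doc, './/node', 'y', 'c')  # matches name and value
```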
Yen Chi Hsuan
9c29bc69f7
[utils] Improve parse_duration
Now dots are parsed. For example '87 Min.'
9 years ago
Yen Chi Hsuan
1b0427e6c4
[utils] Support TTML without default namespace
In a strict sense such TTML is invalid, but Yahoo uses it.
10 years ago
Yen Chi Hsuan
7dff03636a
[utils] Support 'dur' field in TTML
10 years ago
Yen Chi Hsuan
d39e0f05db
[utils] Remove sanitize_url_path_consecutive_slashes()
This function is used only in SohuIE, which is updated to use a new
extraction logic.
10 years ago
Yen Chi Hsuan
0fe2ff78e6
[NBC] Enhance embedURL extraction (closes #2549)
10 years ago
Sergey M․
b3ed15b760
[utils] Add replace_extension
10 years ago