[3.9] gh-102153: Start stripping C0 control and space chars in `urlsplit` (GH-102508...
author Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Mon, 22 May 2023 10:42:37 +0000 (03:42 -0700)
committer Adrian Bunk <bunk@debian.org>
Sun, 1 Dec 2024 12:12:57 +0000 (14:12 +0200)
commit 9d0f52e38927a8eedc433f2c4246db10b1557310
tree 0d933999fc8cf110fbe0db82476cb2a9fdcc279b
parent 189e6f302ac4046d32c658588e2a07365a768405
[3.9] gh-102153: Start stripping C0 control and space chars in `urlsplit` (GH-102508) (GH-104575) (GH-104592) (#104593)

gh-102153: Start stripping C0 control and space chars in `urlsplit` (GH-102508)

`urllib.parse.urlsplit` already partially follows the WHATWG URL spec (see GH-25595).

This adds more sanitizing to respect the "Remove any leading C0 control or space from input" [rule](https://url.spec.whatwg.org/#url-parsing:~:text=Remove%20any%20leading%20and%20trailing%20C0%20control%20or%20space%20from%20input.) in response to [CVE-2023-24329](https://nvd.nist.gov/vuln/detail/CVE-2023-24329).
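
A minimal sketch of the rule quoted above, for illustration only. The constant `C0_CONTROL_OR_SPACE` and the helper `strip_leading_c0` are hypothetical names and are not taken from the patch. The sketch strips the characters itself before calling `urlsplit`, so it does not depend on whether the running interpreter already includes this fix.

```python
from urllib.parse import urlsplit

# Hypothetical constant for illustration: the WHATWG "C0 control or space"
# set, i.e. code points U+0000 through U+0020 inclusive.
C0_CONTROL_OR_SPACE = "".join(chr(c) for c in range(0x21))


def strip_leading_c0(url: str) -> str:
    """Remove leading C0 control and space characters from a URL string."""
    return url.lstrip(C0_CONTROL_OR_SPACE)


# A URL with a leading space and a control character in front of the scheme.
raw = " \x01https://example.com/path?q=1"
cleaned = strip_leading_c0(raw)
parts = urlsplit(cleaned)

print(repr(cleaned))
print(parts.scheme, parts.netloc)
```

Without the stripping step, the leading characters can keep the scheme from being recognized, which is how scheme-based blocklists were bypassed in CVE-2023-24329.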

I simplified the docs by eliding the state-of-the-world explanatory
paragraph in this security-release-only backport (people will see
it in the mainline /3/ docs).

(cherry picked from commit 2f630e1ce18ad2e07428296532a68b11dc66ad10)
(cherry picked from commit 610cc0ab1b760b2abaac92bd256b96191c46b941)
(cherry picked from commit f48a96a28012d28ae37a2f4587a780a5eb779946)

Co-authored-by: Illia Volochii <illia.volochii@gmail.com>
Co-authored-by: Gregory P. Smith [Google] <greg@krypto.org>
Gbp-Pq: Name 0013-3.9-gh-102153-Start-stripping-C0-control-and-space-c.patch
Lib/test/test_urlparse.py
Lib/urllib/parse.py