CVE-2024-38428: GNU Wget url_skip_credentials mishandles ';' in userinfo, enabling hostname confusion
e4570f17-07f4-4910-9f35-a3c2b9a2248c
GNU Wget <= 1.24.5 mishandles the ';' character inside the userinfo subcomponent of a URI. In src/url.c, url_skip_credentials() calls strpbrk(url, "@/?#;") to find the '@' that ends the userinfo. Because ';' is wrongly listed as a terminator (RFC 3986 explicitly allows ';' in userinfo as a sub-delim), the search on any URL of the form scheme://X;Y@host/path stops at ';' first. The function then sees that *p != '@' and returns the original pointer unchanged, so wget treats the URL as having no userinfo. Host parsing therefore begins at X instead of after the '@', and the userinfo bytes are interpreted as part of the host string, producing 'insufficient separation between the userinfo subcomponent and the host subcomponent' (CVE-2024-38428). An attacker can craft URLs such as http://trusted.example;@evil.example/, for which an RFC 3986-conformant parser takes evil.example as the host and 'trusted.example;' as userinfo, whereas a vulnerable wget begins parsing the host at trusted.example. Any component that inspects, logs, or filters the URL may thus act on a different host than the one wget derives, which breaks host-based trust, logging, and filtering.
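The lookup described above can be illustrated with a minimal C sketch. It is reconstructed from this description, not copied from src/url.c. The names skip_credentials_buggy, skip_credentials_strict, and demo_checks are hypothetical, and the strict variant is only one possible approach under the RFC 3986 userinfo grammar, not necessarily the upstream patch. Both functions assume `url` points just past "scheme://".

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Behaviour as described in this advisory: find the first of "@/?#;"
   and recognise userinfo only if that character is '@'. */
const char *skip_credentials_buggy(const char *url)
{
    const char *p = strpbrk(url, "@/?#;");
    if (!p || *p != '@')
        return url;      /* no userinfo recognised */
    return p + 1;        /* host starts after '@' */
}

/* Hypothetical stricter variant (an assumption, not wget's code):
   consume only characters RFC 3986 permits in userinfo
   (unreserved, pct-encoded, sub-delims, ':'), then require '@'. */
const char *skip_credentials_strict(const char *url)
{
    static const char allowed[] = "-._~!$&'()*+,;=:";
    const char *p;

    for (p = url; *p; p++) {
        unsigned char c = (unsigned char)*p;
        if (isalnum(c) || strchr(allowed, c))
            continue;
        if (c == '%' && isxdigit((unsigned char)p[1])
                     && isxdigit((unsigned char)p[2])) {
            p += 2;      /* skip the two hex digits of %XX */
            continue;
        }
        if (c == '@')
            return p + 1;
        break;           /* '/', '?', '#', '[' etc.: no userinfo */
    }
    return url;
}

/* Illustrative checks on the example URL from this advisory. */
void demo_checks(void)
{
    const char *u = "trusted.example;@evil.example/";
    /* Buggy lookup stops at ';' and returns the input unchanged. */
    assert(skip_credentials_buggy(u) == u);
    /* Strict lookup treats "trusted.example;" as userinfo. */
    assert(strcmp(skip_credentials_strict(u), "evil.example/") == 0);
}
```

Calling demo_checks() from a main() exercises both paths. The two functions agree on inputs without ';' in the userinfo (for example "user:pw@host/") and diverge only when ';' appears before the '@'.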