-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Description
Summary
It appears logic introduced in #2695 to work-around the itms-services: schema oddity has an overly broad assumption that is results in invalid URIs due to not percent-encoding relative IRIs as intended.
Ultimately, this can trick browsers into behaving badly when faced with invalid relative URL locations. But I think it is ours to fix in werkzeug because the current code is emitting an invalid location header per RFCs (and the presumed intention of the code.)
RFC 3986 for URIs has a section 2 that essentially says URI components should be percent-encoded if they aren't part of the "reserved" or explicitly "unreserved" lists of characters. As an case-study: backslashes should be percent-encoded. Indeed, werkzeug seems to intent to encode them, but this issue describes the case where the typical encoding logic is not being applied.
Replication
As an example consider an app expecting to use redirect(relative_uri) and expecting werkzeug to properly encode the Location: header (as it historically would before the mentioned work-around)
from werkzeug.utils import redirect
problem_relative_uri = '/\\\\github.com?extra=data&blank=#fragment'
response = redirect(problem_relative_uri)
response.location
# '/\\\\github.com?extra=data&blank=#fragment'Note the backslashes (\) in this URI are /not/ being encoded as %5C .
As a smaller test case:
from werkzeug.urls import iri_to_uri, _invalid_iri_to_uri
problem_relative_uri = '/\\\\github.com?extra=data&blank=#fragment'
_invalid_iri_to_uri(problem_relative_uri)
# '/\\\\github.com?extra=data&blank=#fragment'
# :-(Expected Behavior
The previous encoding behavior before the wrapper was desirable
iri_to_uri(problem_relative_uri)
# '/%5C%5Cgithub.com?extra=data&blank=#fragment'
# :-)Environment:
- Python version: 3.12.1
- Werkzeug version: 3.0.1
Relevant Code
werkzeug/src/werkzeug/wrappers/response.py
Lines 482 to 483 in d3dd65a
| if location is not None: | |
| location = _invalid_iri_to_uri(location) |
Lines 167 to 185 in d3dd65a
| def _invalid_iri_to_uri(iri: str) -> str: | |
| """The URL scheme ``itms-services://`` must contain the ``//`` even though it does | |
| not have a host component. There may be other invalid schemes as well. Currently, | |
| responses will always call ``iri_to_uri`` on the redirect ``Location`` header, which | |
| removes the ``//``. For now, if the IRI only contains ASCII and does not contain | |
| spaces, pass it on as-is. In Werkzeug 3.0, this should become a | |
| ``response.process_location`` flag. | |
| :meta private: | |
| """ | |
| try: | |
| iri.encode("ascii") | |
| except UnicodeError: | |
| pass | |
| else: | |
| if len(iri.split(None, 1)) == 1: | |
| return iri | |
| return iri_to_uri(iri) |