@marychatte
Member

Subsystem
Client, DefaultRequest, CIO, Darwin

Motivation
KTOR-6837

Solution
DefaultRequest.host accepts values that are not plain hosts and may include a path, query, or fragment (e.g., "httpbin.org/status"). This produces an invalid URL whose host field contains path segments. Most engines recover by parsing the full URL string and sending a correct request, but CIO and Darwin use the host field directly and therefore fail.
Since existing code may rely on this behaviour, the solution is to log a warning now and to throw an exception for an invalid host in DefaultRequest starting with 4.0.0. In addition, CIO and Darwin now normalize URLs by reparsing url.toString() only when the host is invalid, which aligns their behavior with that of the other client engines.

@marychatte marychatte self-assigned this Nov 19, 2025
@coderabbitai
Contributor

coderabbitai bot commented Nov 19, 2025

Walkthrough

This PR introduces URL normalization for invalid hosts containing special characters ('/', '?', '#'), adds validation warnings in the default request configuration, and improves URL-to-NSURL conversion on Darwin platforms. The changes ensure consistent URL handling across CIO, core, and Darwin engines.

Changes

Cohort / File(s) Summary
Core URL Normalization Pattern
ktor-client/ktor-client-cio/common/src/io/ktor/client/engine/cio/CIOEngine.kt, ktor-client/ktor-client-cio/common/src/io/ktor/client/engine/cio/utils.kt
Introduces new internal extension Url.rebuildIfNeeded() that reconstructs URLs when host contains invalid characters. Applied in CIOEngine.selectEndpoint() and writeHeaders() to normalize URLs before endpoint resolution and header processing.
Request Validation
ktor-client/ktor-client-core/common/src/io/ktor/client/plugins/DefaultRequest.kt
Adds validation warning in host setter to detect and log when host contains '/', '?', or '#', directing users to use url(...) or url{ ... } instead.
Darwin URL Component Handling
ktor-client/ktor-client-darwin-legacy/darwin/src/io/ktor/client/engine/darwin/internal/legacy/DarwinLegacyUrlUtils.kt, ktor-client/ktor-client-darwin/darwin/src/io/ktor/client/engine/darwin/internal/DarwinUrlUtils.kt
Reworks Url.toNSUrl() to comprehensively map all URL components (password, host, port, path, query, fragment) with proper encoding handling. Uses unified with(Url(...)) scope to build NSURLComponents with encoded and raw value considerations.
Integration Test
ktor-client/ktor-client-tests/common/test/io/ktor/client/tests/ConnectionTest.kt
Adds testInvalidHostInDefaultRequest() to validate URL normalization behavior when an invalid host is set via default request configuration.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Darwin URL utilities: The refactored toNSUrl() methods involve complex mapping of multiple URL components with conditional encoding logic and require careful verification of encoding correctness across password, host, port, path, query, and fragment handling.
  • URL normalization consistency: The reparse-based normalization appears in more than one engine (rebuildIfNeeded() in CIO, Url(this.toString()) inside toNSUrl() in Darwin); verify the normalization logic is consistent and handles edge cases uniformly.
  • Validation warning placement: Confirm the DefaultRequest host validation doesn't introduce unexpected logging noise or performance impact in common usage patterns.
  • Test coverage: Ensure the new test adequately covers the invalid host normalization path and integrates properly with existing test infrastructure.

Suggested reviewers

  • osipxd

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'KTOR-6837 Fix client URL handling for invalid DefaultRequest.host' directly aligns with the main objective: fixing URL handling issues with invalid DefaultRequest.host values in CIO and Darwin engines.
Description check ✅ Passed The PR description includes all required template sections: Subsystem (Client, DefaultRequest, CIO, Darwin), Motivation (KTOR-6837 issue reference), and Solution (detailed explanation of the problem and fix approach).
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
@marychatte marychatte requested a review from osipxd November 19, 2025 23:25
@marychatte
Member Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Nov 20, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ktor-client/ktor-client-cio/common/src/io/ktor/client/engine/cio/utils.kt (1)

41-68: Normalizing the URL in writeHeaders is good; consider doing the same in startTunnel

Using request.url.rebuildIfNeeded() in writeHeaders ensures both the request line and Host header are based on a normalized URL, fixing the CIO behavior when host incorrectly contains path/query/fragment. That part looks solid.

However, startTunnel still uses request.url.hostWithPort directly, so an invalid host can still leak into the CONNECT request line and Host header when going through an HTTP proxy.

You can reuse the same normalization there with minimal risk:

 internal suspend fun startTunnel(
     request: HttpRequestData,
     output: ByteWriteChannel,
     input: ByteReadChannel
 ) {
     val builder = RequestResponseBuilder()

     try {
-        val hostWithPort = request.url.hostWithPort
+        val url = request.url.rebuildIfNeeded()
+        val hostWithPort = url.hostWithPort
         builder.requestLine(HttpMethod("CONNECT"), hostWithPort, HttpProtocolVersion.HTTP_1_1.toString())
         builder.headerLine(HttpHeaders.Host, hostWithPort)

This would make the proxy CONNECT path consistent with the rest of CIO’s URL handling for the same class of misconfigurations.

Also applies to: 217-256
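
To make the CONNECT concern concrete, here is a minimal Kotlin sketch that uses only the standard library. The helpers `hostWithPort`, `connectLine`, and `hostOnly` are hypothetical stand-ins rather than Ktor API, and the `"$host:$port"` format is an assumption made for illustration. `hostOnly` simply truncates at the first '/', '?' or '#', which is a simplification of the reparse-based normalization discussed in this PR.

```kotlin
// Hypothetical stand-ins for illustration only; none of these are Ktor functions.

// Assumed format: host and port joined by a colon.
fun hostWithPort(host: String, port: Int): String = "$host:$port"

// Builds the request line a CONNECT tunnel would send for the given target.
fun connectLine(target: String): String = "CONNECT $target HTTP/1.1"

// Simplified normalization: keep only the part before the first '/', '?' or '#'.
fun hostOnly(host: String): String {
    val cut = host.indexOfAny(charArrayOf('/', '?', '#'))
    return if (cut < 0) host else host.substring(0, cut)
}
```

With a misconfigured host such as "httpbin.org/status" and port 80, `connectLine(hostWithPort(host, 80))` yields a target with the path fragment embedded in it, whereas passing `hostOnly(host)` first yields a clean `httpbin.org:80` target.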

🧹 Nitpick comments (3)
ktor-client/ktor-client-cio/common/src/io/ktor/client/engine/cio/CIOEngine.kt (1)

109-143: URL normalization in selectEndpoint and Url.rebuildIfNeeded looks correct

Normalizing the Url before endpoint selection via rebuildIfNeeded() ensures CIO no longer tries to open connections to hosts that accidentally include path/query/fragment, while remaining a no-op for valid hosts. The extension’s Url(this.toString()) strategy is appropriate here given Url.toString() produces a full URL that can be reparsed into clean host/port/path components.

If you want to slightly improve readability, you could rename the local in selectEndpoint (e.g., val normalizedUrl = url.rebuildIfNeeded()) and use that below, to avoid shadowing the parameter name, but it’s not functionally necessary.

Also applies to: 146-152
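
The reparse strategy can be illustrated with a small standalone model. This is a sketch under stated assumptions, not the PR's code: `Parts`, `render`, and `parse` are hypothetical types and helpers that stand in for Ktor's `Url`, `toString()`, and the `Url(String)` parser, and the model ignores ports, credentials, and encoding. Only the shape of `rebuildIfNeeded` (reparse the rendered string when the host contains '/', '?' or '#', otherwise return the same instance) follows the description above.

```kotlin
// Hypothetical model of a URL; NOT io.ktor.http.Url.
data class Parts(val scheme: String, val host: String, val rest: String)

private val INVALID_HOST_CHARS = charArrayOf('/', '?', '#')

// Renders the model the way a full URL string would look.
fun Parts.render(): String = "$scheme://$host$rest"

// Very small parser: the host ends at the first '/', '?' or '#'.
fun parse(url: String): Parts {
    val scheme = url.substringBefore("://")
    val afterScheme = url.substringAfter("://")
    val cut = afterScheme.indexOfAny(INVALID_HOST_CHARS)
    return if (cut < 0) {
        Parts(scheme, afterScheme, "")
    } else {
        Parts(scheme, afterScheme.substring(0, cut), afterScheme.substring(cut))
    }
}

// Reparse only when the host field is polluted; valid hosts are returned as-is.
fun Parts.rebuildIfNeeded(): Parts =
    if (host.indexOfAny(INVALID_HOST_CHARS) >= 0) parse(render()) else this
```

In this model, `Parts("http", "httpbin.org/status", "/200").rebuildIfNeeded()` moves the stray "/status" out of the host and into the remainder, while a value with a clean host is returned unchanged without any reparsing.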

ktor-client/ktor-client-darwin/darwin/src/io/ktor/client/engine/darwin/internal/DarwinUrlUtils.kt (1)

22-64: Darwin URL reconstruction is comprehensive and aligns with the normalization strategy

The new with(Url(this.toString())) block correctly rebuilds NSURLComponents from a normalized Url, covering user/password, host, port, path, query (including multi-valued parameters), and fragment, while preserving a cheap fast-path when everything is already encoded. This also implicitly fixes cases where the original host contained path/query/fragment, bringing Darwin behavior in line with the CIO normalization.

Given the very similar logic in the legacy Darwin file, you might later consider factoring this into a shared helper to avoid drift, but the current change is functionally sound.

ktor-client/ktor-client-darwin-legacy/darwin/src/io/ktor/client/engine/darwin/internal/legacy/DarwinLegacyUrlUtils.kt (1)

22-64: Legacy Darwin URL handling now matches the main Darwin implementation

The legacy Url.toNSUrl() implementation mirrors the updated non-legacy version, including the reparsing step via Url(this.toString()) and full population of NSURLComponents (user, password, host, port, path, query items, fragment). That gives the legacy Darwin engine the same robustness against malformed host values and partial encoding.

As with the main Darwin file, this duplication could be reduced via a shared helper, but functionally this looks correct.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a66b47 and 917c066.

📒 Files selected for processing (6)
  • ktor-client/ktor-client-cio/common/src/io/ktor/client/engine/cio/CIOEngine.kt (2 hunks)
  • ktor-client/ktor-client-cio/common/src/io/ktor/client/engine/cio/utils.kt (1 hunks)
  • ktor-client/ktor-client-core/common/src/io/ktor/client/plugins/DefaultRequest.kt (1 hunks)
  • ktor-client/ktor-client-darwin-legacy/darwin/src/io/ktor/client/engine/darwin/internal/legacy/DarwinLegacyUrlUtils.kt (1 hunks)
  • ktor-client/ktor-client-darwin/darwin/src/io/ktor/client/engine/darwin/internal/DarwinUrlUtils.kt (1 hunks)
  • ktor-client/ktor-client-tests/common/test/io/ktor/client/tests/ConnectionTest.kt (2 hunks)
🧰 Additional context used
🧠 Learnings (7)
📚 Learning: 2025-09-30T07:52:14.769Z
Learnt from: yschimke
Repo: ktorio/ktor PR: 4013
File: ktor-client/ktor-client-android/jvm/src/io/ktor/client/engine/android/Android14URLConnectionFactory.kt:24-26
Timestamp: 2025-09-30T07:52:14.769Z
Learning: In the Ktor Android HTTP client engine (ktor-client-android), prefer using `URI.create(urlString).toURL()` over the `URL(urlString)` constructor when opening connections with Android's HttpEngine, as it avoids deprecated APIs and the different exception behavior (IllegalArgumentException vs MalformedURLException) is acceptable.

Applied to files:

  • ktor-client/ktor-client-cio/common/src/io/ktor/client/engine/cio/CIOEngine.kt
  • ktor-client/ktor-client-darwin/darwin/src/io/ktor/client/engine/darwin/internal/DarwinUrlUtils.kt
  • ktor-client/ktor-client-tests/common/test/io/ktor/client/tests/ConnectionTest.kt
  • ktor-client/ktor-client-cio/common/src/io/ktor/client/engine/cio/utils.kt
  • ktor-client/ktor-client-darwin-legacy/darwin/src/io/ktor/client/engine/darwin/internal/legacy/DarwinLegacyUrlUtils.kt
📚 Learning: 2025-05-30T06:45:52.309Z
Learnt from: rururux
Repo: ktorio/ktor PR: 4896
File: ktor-client/ktor-client-core/jvm/test/FileStorageTest.kt:1-12
Timestamp: 2025-05-30T06:45:52.309Z
Learning: The headersOf() function from io.ktor.http package is available through wildcard imports like `import io.ktor.http.*`, so no explicit import statement is needed when using wildcard imports from that package.

Applied to files:

  • ktor-client/ktor-client-tests/common/test/io/ktor/client/tests/ConnectionTest.kt
📚 Learning: 2025-08-14T15:17:11.466Z
Learnt from: zibet27
Repo: ktorio/ktor PR: 5044
File: ktor-client/ktor-client-webrtc/ktor-client-webrtc-rs/build.gradle.kts:12-12
Timestamp: 2025-08-14T15:17:11.466Z
Learning: The Gobley Cargo plugin (dev.gobley.cargo) used in the Ktor WebRTC RS module requires the kotlin("plugin.atomicfu") plugin as a dependency, so the atomicfu plugin should not be removed even if there are no direct kotlinx.atomicfu imports in the module's source code.

Applied to files:

  • ktor-client/ktor-client-tests/common/test/io/ktor/client/tests/ConnectionTest.kt
📚 Learning: 2025-06-23T12:49:56.883Z
Learnt from: CR
Repo: ktorio/ktor PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-06-23T12:49:56.883Z
Learning: Error handling should follow Kotlin conventions and use specific Ktor exceptions.

Applied to files:

  • ktor-client/ktor-client-tests/common/test/io/ktor/client/tests/ConnectionTest.kt
📚 Learning: 2025-05-30T06:45:52.309Z
Learnt from: rururux
Repo: ktorio/ktor PR: 4896
File: ktor-client/ktor-client-core/jvm/test/FileStorageTest.kt:1-12
Timestamp: 2025-05-30T06:45:52.309Z
Learning: In Ktor test files, particularly in the ktor-client/ktor-client-core/jvm/test/ directory, test files follow the convention of not including explicit package declarations. This is consistent across test files like CachingCacheStorageTest.kt and should be maintained for consistency.

Applied to files:

  • ktor-client/ktor-client-tests/common/test/io/ktor/client/tests/ConnectionTest.kt
📚 Learning: 2025-08-14T15:17:11.466Z
Learnt from: zibet27
Repo: ktorio/ktor PR: 5044
File: ktor-client/ktor-client-webrtc/ktor-client-webrtc-rs/build.gradle.kts:12-12
Timestamp: 2025-08-14T15:17:11.466Z
Learning: The Cargo plugin (dev.gobley.cargo) used in the Ktor WebRTC RS module depends on the kotlin("plugin.atomicfu") plugin, so atomicfu should not be removed even if there are no direct kotlinx.atomicfu imports in the module's source code.

Applied to files:

  • ktor-client/ktor-client-tests/common/test/io/ktor/client/tests/ConnectionTest.kt
🔇 Additional comments (2)
ktor-client/ktor-client-core/common/src/io/ktor/client/plugins/DefaultRequest.kt (1)

210-221: Host validation warning behavior looks correct

The additional warning on host when it includes /, ?, or # matches the documented misuse (host-with-path) and keeps behavior backward compatible by still assigning the value. The message is clear and points users to url(...) / url { ... } instead. No functional issues spotted here.
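
A warn-but-still-assign setter of the kind described here could look roughly like the following. This is a hypothetical model, not the actual DefaultRequest code: the `HostHolder` class, its `warn` callback, and the exact message text are assumptions introduced for illustration.

```kotlin
// Hypothetical model of the described behavior; not the real DefaultRequest builder.
class HostHolder(private val warn: (String) -> Unit) {
    var host: String = "localhost"
        set(value) {
            // Warn when the value looks like it carries a path, query, or fragment.
            if (value.indexOfAny(charArrayOf('/', '?', '#')) >= 0) {
                warn("Host '$value' contains '/', '?' or '#'; use url(...) or url { ... } instead")
            }
            // The value is still assigned, so existing behavior is preserved.
            field = value
        }
}
```

Passing a callback rather than calling a logger directly keeps the sketch free of dependencies and makes the warning observable by the caller.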

ktor-client/ktor-client-tests/common/test/io/ktor/client/tests/ConnectionTest.kt (1)

9-9: New test effectively covers the invalid DefaultRequest.host scenario

The test accurately simulates DefaultRequest.host containing a path segment and verifies that requests still succeed after the engine-side normalization. The use of defaultRequest from the plugins package and clientTests scaffolding is consistent with the rest of the suite. Relying on TEST_SERVER.removePrefix("http://") is acceptable given existing usage of TEST_SERVER in this file.

Also applies to: 52-65
