Skip to content

Conversation

jmartinesp
Copy link
Contributor

@jmartinesp jmartinesp commented Jul 23, 2025

Having the read_timeout value set allows us to have long network requests that will get cancelled automatically when no data is received for a certain amount of time.

If we combine it with being able to remove the existing default timeout value for some requests, it would fix the issue with downloading large files in poor network connectivity scenarios where the download would just cancel automatically after the timeout (30s) was reached.

  • Public API changes documented in changelogs (optional)

Signed-off-by:

@jmartinesp jmartinesp requested a review from a team as a code owner July 23, 2025 09:51
@jmartinesp jmartinesp requested review from bnjbvr and removed request for a team July 23, 2025 09:51
Copy link
Contributor

@poljar poljar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this has one small problem.

The rest looks fine.

let mut request_config = self.request_config();
if let Some(timeout) = sync_settings.timeout {
request_config.timeout += timeout;
request_config.timeout = Some(timeout);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not quite the same. If we set the poll timeout of the sync to 30 seconds and the timeout itself was 30, then we would have had 60 here.

Instead now the connection aborts right when the poll timeout expires. I think this potentially breaks syncs for everybody.

You'll have to unwrap_or_default() the timeout here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, true. For some reason I didn't realise what this was doing was just adding the 2 durations.

Copy link

codecov bot commented Jul 23, 2025

Codecov Report

Attention: Patch coverage is 80.64516% with 6 lines in your changes missing coverage. Please review.

Project coverage is 88.85%. Comparing base (da946e5) to head (ac10d4a).
Report is 5 commits behind head on main.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
crates/matrix-sdk/src/http_client/native.rs 60.00% 1 Missing and 3 partials ⚠️
crates/matrix-sdk/src/config/request.rs 85.71% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5437      +/-   ##
==========================================
- Coverage   88.86%   88.85%   -0.01%     
==========================================
  Files         333      333              
  Lines       91452    91470      +18     
  Branches    91452    91470      +18     
==========================================
+ Hits        81270    81277       +7     
- Misses       6360     6369       +9     
- Partials     3822     3824       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jmartinesp jmartinesp requested a review from poljar July 23, 2025 11:40
Copy link
Contributor

@poljar poljar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think we need a better default yet 😅

Copy link
Contributor

@poljar poljar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@poljar
Copy link
Contributor

poljar commented Jul 23, 2025

Would you like to get rid of the fixup commit or should I just squash everything?

@poljar
Copy link
Contributor

poljar commented Jul 23, 2025

Also, would you like to mention this as a bugfix in the changelog? Might make sense to mention this.

This allows us to remove the default timeout on a per-request basis
…ions that have been stale for longer than the specified timeout
…`read_timeout` instead

Users with poor network connectivity complained their downloads were cancelled for no good reason after 30s before (the default `timeout` value).
@jmartinesp jmartinesp force-pushed the refactor/add-read-timeout-for-http-requests branch from d8ecab0 to ac10d4a Compare July 23, 2025 13:42
@jmartinesp jmartinesp enabled auto-merge (rebase) July 23, 2025 13:43
@jmartinesp
Copy link
Contributor Author

I've already added a changelog entry and squashed the fixup 👍 .

@frebib
Copy link
Contributor

frebib commented Jul 23, 2025

Would this apply to sync too? I ask because if you've ever tried to use EX on an airplane, you'd understand what "unusable network" means 😂 Sync is also affected by this too in extreme cases

@jmartinesp
Copy link
Contributor Author

Would this apply to sync too? I ask because if you've ever tried to use EX on an airplane, you'd understand what "unusable network" means 😂 Sync is also affected by this too in extreme cases

For /sync we keep the existing timeout strategy, with the extra read timeout.

@poljar
Copy link
Contributor

poljar commented Jul 23, 2025

Would this apply to sync too? I ask because if you've ever tried to use EX on an airplane, you'd understand what "unusable network" means 😂 Sync is also affected by this too in extreme cases

You mean that 30 seconds isn't enough to get a sync response?

This read_timeout applies to every request now, if no data is received in 60 seconds we abort and possibly retry the request.

Though the /sync request has an additional shorter timeout. We might want to get rid of this shorter timeout but I think it would be better if we could configure read_timeout on a per-request basis before this happens.

In other words we'd like: seanmonstar/reqwest#2764.

@poljar poljar disabled auto-merge July 23, 2025 13:53
@poljar poljar enabled auto-merge (rebase) July 23, 2025 13:53
@poljar poljar merged commit 6d562ef into main Jul 23, 2025
43 checks passed
@poljar poljar deleted the refactor/add-read-timeout-for-http-requests branch July 23, 2025 13:55
@frebib
Copy link
Contributor

frebib commented Jul 23, 2025

Would this apply to sync too? I ask because if you've ever tried to use EX on an airplane, you'd understand what "unusable network" means 😂 Sync is also affected by this too in extreme cases

You mean that 30 seconds isn't enough to get a sync response?

In my experience, no. Provided the read_timeout is never hit, I don't see a good reason to timeout the sync request to allow for slow networks. Several times I've been on extremely unusably slow networks (think bytes per second with big pauses of no data at all) where EX has never been able to sync for the duration of the flight, but both Signal and WhatsApp are able to sync eventually after several minutes.

See element-hq/element-x-android#1841

@poljar
Copy link
Contributor

poljar commented Jul 23, 2025

Would this apply to sync too? I ask because if you've ever tried to use EX on an airplane, you'd understand what "unusable network" means 😂 Sync is also affected by this too in extreme cases

You mean that 30 seconds isn't enough to get a sync response?

In my experience, no. Provided the read_timeout is never hit, I don't see a good reason to timeout the sync request to allow for slow networks. Several times I've been on extremely unusably slow networks (think bytes per second with big pauses of no data at all) where EX has never been able to sync for the duration of the flight, but both Signal and WhatsApp are able to sync eventually after several minutes.

Well that would explain why it doesn't work for you, if the request doesn't finish we abort after 30 seconds no matter if there's still data coming in.

The read_timeout is a relatively recent addition to the reqwest crate. It only kills the connection if not data is being received for a certain amount of time.

We should probably migrate away from timeout for most requests if not for all of them.

@frebib
Copy link
Contributor

frebib commented Jul 23, 2025

We should probably migrate away from timeout for most requests if not for all of them.

Yeah that sounds like a good improvement! It would certainly fix/prevent element-hq/element-x-android#1841 in the future
Despite that issue being closed, I'm pretty sure this is still a problem because of the 30s timeout

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants