feat: Put tokens into Rc #2780

Draft: wants to merge 7 commits into main

larseggert
Collaborator

@larseggert larseggert commented Jul 4, 2025

The theory here is that by putting Tokens into an Rc, we can avoid copying them when a Packet gets cloned.
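
A minimal sketch of that theory, using hypothetical stand-in types (`Token`, `OwnedPacket`, and `SharedPacket` are illustrative only and are not neqo's actual definitions): cloning a struct that owns a `Vec` allocates and copies a new buffer, while cloning one that holds an `Rc` only increments a reference count.

```rust
use std::rc::Rc;

// Hypothetical stand-in for a recovery token; not neqo's real type.
#[derive(Clone, Debug, PartialEq)]
struct Token(Vec<u8>);

// A packet that owns its tokens directly: `clone` deep-copies the Vec.
#[derive(Clone)]
struct OwnedPacket {
    tokens: Vec<Token>,
}

// A packet that shares its tokens: `clone` only bumps the Rc count.
#[derive(Clone)]
struct SharedPacket {
    tokens: Rc<Vec<Token>>,
}

// Returns true if cloning an OwnedPacket produced a separate buffer.
fn owned_clone_copies() -> bool {
    let p = OwnedPacket { tokens: vec![Token(vec![1, 2, 3])] };
    let q = p.clone();
    p.tokens.as_ptr() != q.tokens.as_ptr()
}

// Returns (clones point at the same allocation, strong count after the clone).
fn shared_clone_shares() -> (bool, usize) {
    let p = SharedPacket { tokens: Rc::new(vec![Token(vec![1, 2, 3])]) };
    let q = p.clone();
    (Rc::ptr_eq(&p.tokens, &q.tokens), Rc::strong_count(&p.tokens))
}

fn main() {
    println!("owned clone copies the buffer: {}", owned_clone_copies());
    println!("shared clone (same allocation, strong count): {:?}", shared_clone_shares());
}
```

The trade-off in this sketch is that the shared tokens become immutable unless the code goes through `Rc::make_mut` or interior mutability, and every access pays one extra pointer indirection.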

codecov bot commented Jul 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.93%. Comparing base (6942acc) to head (8e2e605).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2780   +/-   ##
=======================================
  Coverage   94.93%   94.93%           
=======================================
  Files         115      115           
  Lines       34425    34425           
  Branches    34425    34425           
=======================================
  Hits        32682    32682           
  Misses       1736     1736           
  Partials        7        7           
| Components | Coverage Δ |
|---|---|
| neqo-common | 97.73% <ø> (ø) |
| neqo-crypto | 89.91% <ø> (ø) |
| neqo-http3 | 93.72% <ø> (ø) |
| neqo-qpack | 95.45% <ø> (ø) |
| neqo-transport | 95.95% <100.00%> (ø) |
| neqo-udp | 89.85% <ø> (ø) |

github-actions bot commented Jul 4, 2025

Client/server transfer results

Performance differences relative to 6942acc.

Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.

| Client vs. server (params) | Mean ± σ | Min | Max | MiB/s ± σ | Δ main (ms) | Δ main (%) |
|---|---|---|---|---|---|---|
| google vs. google | 455.8 ± 4.8 | 449.0 | 465.9 | 70.2 ± 6.7 | | |
| google vs. neqo (cubic, paced) | 271.4 ± 4.4 | 263.8 | 281.4 | 117.9 ± 7.3 | -0.6 | -0.2% |
| msquic vs. msquic | 121.6 ± 11.0 | 108.3 | 174.8 | 263.2 ± 2.9 | | |
| msquic vs. neqo (cubic, paced) | 143.6 ± 14.1 | 124.5 | 189.2 | 222.8 ± 2.3 | -6.6 | -4.4% |
| neqo vs. google (cubic, paced) | 759.8 ± 5.3 | 752.5 | 775.4 | 42.1 ± 6.0 | 1.6 | 0.2% |
| neqo vs. msquic (cubic, paced) | 154.6 ± 4.6 | 146.7 | 163.5 | 207.0 ± 7.0 | -0.6 | -0.4% |
| neqo vs. neqo (cubic) | 92.5 ± 5.8 | 86.1 | 124.0 | 346.1 ± 5.5 | 0.5 | 0.5% |
| neqo vs. neqo (cubic, paced) | 95.4 ± 6.5 | 87.3 | 120.7 | 335.5 ± 4.9 | 💔 3.3 | 3.6% |
| neqo vs. neqo (reno) | 92.1 ± 5.9 | 81.7 | 121.7 | 347.3 ± 5.4 | 0.8 | 0.9% |
| neqo vs. neqo (reno, paced) | 92.6 ± 4.6 | 86.3 | 108.4 | 345.6 ± 7.0 | -0.5 | -0.5% |
| neqo vs. quiche (cubic, paced) | 196.4 ± 5.1 | 187.4 | 205.0 | 162.9 ± 6.3 | 💔 1.9 | 1.0% |
| neqo vs. s2n (cubic, paced) | 217.6 ± 3.8 | 212.4 | 228.7 | 147.1 ± 8.4 | 💚 -3.7 | -1.7% |
| quiche vs. neqo (cubic, paced) | 147.1 ± 5.4 | 133.3 | 161.7 | 217.6 ± 5.9 | -0.9 | -0.6% |
| quiche vs. quiche | 144.0 ± 4.4 | 137.2 | 160.3 | 222.3 ± 7.3 | | |
| s2n vs. neqo (cubic, paced) | 171.4 ± 4.3 | 162.8 | 181.2 | 186.7 ± 7.4 | -1.8 | -1.1% |
| s2n vs. s2n | 247.4 ± 22.4 | 232.5 | 350.9 | 129.3 ± 1.4 | | |

github-actions bot commented Jul 4, 2025

Benchmark results

Performance differences relative to 6942acc.

1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: Change within noise threshold.
       time:   [205.94 ms 206.37 ms 206.83 ms]
       thrpt:  [483.50 MiB/s 484.57 MiB/s 485.57 MiB/s]
change:
       time:   [−1.6390% −1.1895% −0.7469%] (p = 0.00 < 0.05)
       thrpt:  [+0.7525% +1.2038% +1.6663%]

Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high severe

1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: No change in performance detected.
       time:   [304.77 ms 306.15 ms 307.54 ms]
       thrpt:  [32.516 Kelem/s 32.663 Kelem/s 32.812 Kelem/s]
change:
       time:   [−1.1612% −0.5519% +0.0908%] (p = 0.07 > 0.05)
       thrpt:  [−0.0908% +0.5550% +1.1748%]

1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected.
       time:   [28.058 ms 28.165 ms 28.284 ms]
       thrpt:  [35.355   B/s 35.505   B/s 35.640   B/s]
change:
       time:   [−0.8021% −0.1517% +0.5228%] (p = 0.65 > 0.05)
       thrpt:  [−0.5201% +0.1519% +0.8086%]

Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high severe

1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: 💔 Performance has regressed.
       time:   [212.10 ms 212.47 ms 212.88 ms]
       thrpt:  [469.76 MiB/s 470.65 MiB/s 471.48 MiB/s]
change:
       time:   [+1.8940% +2.2714% +2.5722%] (p = 0.00 < 0.05)
       thrpt:  [−2.5077% −2.2209% −1.8588%]

Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe

decode 4096 bytes, mask ff: No change in performance detected.
       time:   [11.617 µs 11.648 µs 11.687 µs]
       change: [−0.4144% +0.1374% +0.7362%] (p = 0.64 > 0.05)

Found 17 outliers among 100 measurements (17.00%)
2 (2.00%) low severe
3 (3.00%) low mild
1 (1.00%) high mild
11 (11.00%) high severe

decode 1048576 bytes, mask ff: No change in performance detected.
       time:   [3.0034 ms 3.0109 ms 3.0202 ms]
       change: [−0.5611% −0.1277% +0.3031%] (p = 0.57 > 0.05)

Found 7 outliers among 100 measurements (7.00%)
7 (7.00%) high severe

decode 4096 bytes, mask 7f: No change in performance detected.
       time:   [19.362 µs 19.409 µs 19.462 µs]
       change: [−0.1582% +0.2703% +0.7067%] (p = 0.25 > 0.05)

Found 18 outliers among 100 measurements (18.00%)
1 (1.00%) low severe
2 (2.00%) low mild
1 (1.00%) high mild
14 (14.00%) high severe

decode 1048576 bytes, mask 7f: No change in performance detected.
       time:   [5.0843 ms 5.1037 ms 5.1325 ms]
       change: [−0.9875% −0.1310% +0.6761%] (p = 0.78 > 0.05)

Found 14 outliers among 100 measurements (14.00%)
2 (2.00%) low mild
12 (12.00%) high severe

decode 4096 bytes, mask 3f: No change in performance detected.
       time:   [5.5231 µs 5.5426 µs 5.5699 µs]
       change: [−0.8346% −0.0983% +0.5372%] (p = 0.80 > 0.05)

Found 15 outliers among 100 measurements (15.00%)
7 (7.00%) low mild
2 (2.00%) high mild
6 (6.00%) high severe

decode 1048576 bytes, mask 3f: No change in performance detected.
       time:   [1.7576 ms 1.7577 ms 1.7578 ms]
       change: [−0.4118% −0.1722% −0.0088%] (p = 0.08 > 0.05)

Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild

coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [88.783 ns 89.099 ns 89.403 ns]
       change: [−0.2520% +0.1920% +0.6408%] (p = 0.41 > 0.05)

Found 11 outliers among 100 measurements (11.00%)
8 (8.00%) high mild
3 (3.00%) high severe

coalesce_acked_from_zero 3+1 entries: No change in performance detected.
       time:   [106.19 ns 106.48 ns 106.80 ns]
       change: [−2.0334% −0.6022% +0.3383%] (p = 0.42 > 0.05)

Found 10 outliers among 100 measurements (10.00%)
10 (10.00%) high severe

coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [105.53 ns 105.91 ns 106.37 ns]
       change: [−1.4662% −0.2845% +0.8182%] (p = 0.67 > 0.05)

Found 14 outliers among 100 measurements (14.00%)
4 (4.00%) low mild
4 (4.00%) high mild
6 (6.00%) high severe

coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [89.544 ns 89.695 ns 89.860 ns]
       change: [−0.7149% +0.2734% +1.4216%] (p = 0.63 > 0.05)

Found 11 outliers among 100 measurements (11.00%)
4 (4.00%) high mild
7 (7.00%) high severe

RxStreamOrderer::inbound_frame(): Change within noise threshold.
       time:   [108.10 ms 108.22 ms 108.37 ms]
       change: [−0.3707% −0.2537% −0.1185%] (p = 0.00 < 0.05)

Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe

sent::Packets::take_ranges: 💚 Performance has improved.
       time:   [4.7108 µs 4.8499 µs 4.9862 µs]
       change: [−47.901% −46.338% −44.751%] (p = 0.00 < 0.05)

Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild

transfer/pacing-false/varying-seeds: Change within noise threshold.
       time:   [36.678 ms 36.769 ms 36.863 ms]
       change: [−1.9152% −1.5743% −1.2273%] (p = 0.00 < 0.05)

Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild

transfer/pacing-true/varying-seeds: Change within noise threshold.
       time:   [37.726 ms 37.855 ms 37.995 ms]
       change: [−1.2351% −0.7861% −0.3161%] (p = 0.00 < 0.05)

Found 9 outliers among 100 measurements (9.00%)
2 (2.00%) low mild
5 (5.00%) high mild
2 (2.00%) high severe

transfer/pacing-false/same-seed: Change within noise threshold.
       time:   [36.299 ms 36.361 ms 36.422 ms]
       change: [−1.4468% −1.2330% −1.0062%] (p = 0.00 < 0.05)

transfer/pacing-true/same-seed: Change within noise threshold.
       time:   [38.303 ms 38.404 ms 38.509 ms]
       change: [−1.4529% −1.1248% −0.7951%] (p = 0.00 < 0.05)

Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe

@martinthomson
Member

Do you need reference counting, or would Box suffice?

@larseggert
Collaborator Author

My theory was that Box::clone would do a memcpy whereas Rc::clone would not. Am I wrong?
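
A small sketch of the difference being asked about, using only the standard library and a 64-byte array as an arbitrary payload (the function names are made up for illustration): `Box::clone` allocates a new box and copies the contents into it, whereas `Rc::clone` returns another handle to the same allocation.

```rust
use std::rc::Rc;

// Box<T>::clone allocates a fresh box and copies T into it.
fn box_clone_is_deep() -> bool {
    let a: Box<[u8; 64]> = Box::new([7; 64]);
    let b = a.clone();
    // Two distinct heap allocations holding equal contents.
    !std::ptr::eq(&*a, &*b) && *a == *b
}

// Rc<T>::clone increments the strong count and shares the allocation.
fn rc_clone_is_shallow() -> bool {
    let a: Rc<[u8; 64]> = Rc::new([7; 64]);
    let b = Rc::clone(&a);
    Rc::ptr_eq(&a, &b) && Rc::strong_count(&a) == 2
}

fn main() {
    println!("Box::clone copies the payload: {}", box_clone_is_deep());
    println!("Rc::clone shares the payload: {}", rc_clone_is_shallow());
}
```

Moving a `Box` (as opposed to cloning it) also avoids copying the payload, so a `Box` would likely suffice if the value were only ever moved; the copy shows up only when `clone` is called.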

@mxinden
Member

mxinden commented Jul 7, 2025

Seems correct, yes.

That said, what is the intention behind this pull request? The following confuses me: Token now uses a SmallVec, i.e., its data is stored inline (on the stack). That inline storage is then wrapped in a pointer (Rc) and is thus heap-allocated anyway.
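
To illustrate the concern with the standard library only, the sketch below uses a hypothetical `InlineBuf` as a stand-in for a small inline buffer such as a SmallVec (it is not the smallvec crate's type, nor neqo's `Token`). Once the buffer is wrapped in an `Rc`, the handle is just a pointer, and the inline bytes live in the `Rc`'s heap allocation next to the reference counts.

```rust
use std::mem::size_of;
use std::rc::Rc;

// Hypothetical stand-in for an inline small buffer (e.g. a SmallVec of 16 bytes).
#[derive(Clone)]
struct InlineBuf {
    len: u8,
    data: [u8; 16],
}

impl InlineBuf {
    // Copies up to 16 bytes into the inline storage.
    fn from_slice(bytes: &[u8]) -> Self {
        assert!(bytes.len() <= 16);
        let mut data = [0u8; 16];
        data[..bytes.len()].copy_from_slice(bytes);
        Self { len: bytes.len() as u8, data }
    }

    fn as_slice(&self) -> &[u8] {
        &self.data[..usize::from(self.len)]
    }
}

// Returns (size of the inline buffer, size of an Rc handle to it).
fn handle_sizes() -> (usize, usize) {
    (size_of::<InlineBuf>(), size_of::<Rc<InlineBuf>>())
}

// Returns true if two Rc handles see the same bytes in the same allocation.
fn shared_contents() -> bool {
    let a = Rc::new(InlineBuf::from_slice(b"tok"));
    let b = Rc::clone(&a);
    Rc::ptr_eq(&a, &b) && b.as_slice() == &b"tok"[..]
}

fn main() {
    let (inline, handle) = handle_sizes();
    println!("inline buffer: {inline} bytes, Rc handle: {handle} bytes");
    println!("handles share contents: {}", shared_contents());
}
```

In this sketch the inline layout no longer avoids a heap allocation; what the `Rc` buys instead is that clones are cheap.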

@larseggert
Collaborator Author

It's a bit of an experiment. In 1f6d802, I put the Vec into an Rc to see if that made a difference. I didn't expect the change to SmallVec to matter, and it seems like it doesn't.

github-actions bot commented Aug 8, 2025

Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to 853e4be.

neqo-latest as client

neqo-latest as server

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server
