ci: add windows-11-arm runner #2591

Draft: wants to merge 29 commits into base: main

@mxinden (Member) commented Apr 17, 2025

github-actions bot commented Apr 17, 2025

Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to 8c65240.

neqo-latest as client

neqo-latest as server

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

@larseggert (Collaborator)

The missing rustup may be due to actions/partner-runner-images#77.

Maybe we just install it?

github-actions bot commented Apr 17, 2025

Benchmark results

Performance differences relative to 847a598.

1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: No change in performance detected.
       time:   [693.78 ms 697.61 ms 701.42 ms]
       thrpt:  [142.57 MiB/s 143.35 MiB/s 144.14 MiB/s]
change:
       time:   [-1.5580% -0.7959% +0.0525%] (p = 0.05 > 0.05)
       thrpt:  [-0.0524% +0.8022% +1.5826%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: No change in performance detected.
       time:   [352.25 ms 353.62 ms 354.98 ms]
       thrpt:  [28.171 Kelem/s 28.279 Kelem/s 28.389 Kelem/s]
change:
       time:   [-0.3659% +0.2205% +0.7982%] (p = 0.47 > 0.05)
       thrpt:  [-0.7918% -0.2200% +0.3673%]
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected.
       time:   [25.488 ms 25.633 ms 25.785 ms]
       thrpt:  [38.782  elem/s 39.012  elem/s 39.234  elem/s]
change:
       time:   [-1.3696% -0.5205% +0.2719%] (p = 0.22 > 0.05)
       thrpt:  [-0.2712% +0.5232% +1.3886%]

Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild

1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: 💚 Performance has improved.
       time:   [1.7915 s 1.8133 s 1.8361 s]
       thrpt:  [54.463 MiB/s 55.147 MiB/s 55.820 MiB/s]
change:
       time:   [-8.3956% -6.6210% -4.8312%] (p = 0.00 < 0.05)
       thrpt:  [+5.0765% +7.0904% +9.1650%]

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
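
The throughput change reported for the Upload benchmark appears to follow directly from the time change: for a fixed payload, throughput scales with the inverse of the elapsed time. The sketch below is a minimal check under that assumption; the helper name `thrpt_change_pct` is hypothetical and not part of the benchmark tooling.

```python
def thrpt_change_pct(time_change_pct: float) -> float:
    """Convert a relative time change (%) into a relative throughput change (%).

    Assumes a fixed payload, so throughput is proportional to 1 / time.
    """
    return (1.0 / (1.0 + time_change_pct / 100.0) - 1.0) * 100.0


# Time changes (lower bound, estimate, upper bound) copied from the Upload result.
for time_change in (-8.3956, -6.6210, -4.8312):
    print(f"time {time_change:+.4f}% -> thrpt {thrpt_change_pct(time_change):+.4f}%")
```

The printed values should agree with the reported thrpt interval up to rounding of the inputs, with the bounds swapping order because a larger time reduction gives a larger throughput gain.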

decode 4096 bytes, mask ff: No change in performance detected.
       time:   [12.070 µs 12.100 µs 12.138 µs]
       change: [-0.4081% -0.1094% +0.1954%] (p = 0.49 > 0.05)

Found 12 outliers among 100 measurements (12.00%)
3 (3.00%) low severe
2 (2.00%) low mild
1 (1.00%) high mild
6 (6.00%) high severe

decode 1048576 bytes, mask ff: No change in performance detected.
       time:   [3.1272 ms 3.1348 ms 3.1442 ms]
       change: [-0.7339% -0.2697% +0.1582%] (p = 0.26 > 0.05)

Found 6 outliers among 100 measurements (6.00%)
6 (6.00%) high severe

decode 4096 bytes, mask 7f: No change in performance detected.
       time:   [20.168 µs 20.224 µs 20.286 µs]
       change: [-0.3661% +0.0032% +0.4244%] (p = 0.98 > 0.05)

Found 21 outliers among 100 measurements (21.00%)
1 (1.00%) low severe
4 (4.00%) low mild
2 (2.00%) high mild
14 (14.00%) high severe

decode 1048576 bytes, mask 7f: No change in performance detected.
       time:   [5.2471 ms 5.2602 ms 5.2751 ms]
       change: [-0.3443% -0.0005% +0.3593%] (p = 1.00 > 0.05)

Found 14 outliers among 100 measurements (14.00%)
1 (1.00%) low mild
13 (13.00%) high severe

decode 4096 bytes, mask 3f: No change in performance detected.
       time:   [7.0074 µs 7.0185 µs 7.0385 µs]
       change: [-0.8880% -0.1196% +0.7775%] (p = 0.80 > 0.05)

Found 15 outliers among 100 measurements (15.00%)
5 (5.00%) low severe
2 (2.00%) low mild
1 (1.00%) high mild
7 (7.00%) high severe

decode 1048576 bytes, mask 3f: No change in performance detected.
       time:   [1.7922 ms 1.7980 ms 1.8051 ms]
       change: [-0.6316% -0.0830% +0.4601%] (p = 0.75 > 0.05)

Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) high mild
6 (6.00%) high severe

1000 streams of 1 bytes/multistream: No change in performance detected.
       time:   [25.576 ms 25.602 ms 25.628 ms]
       change: [-0.0568% +0.0869% +0.2311%] (p = 0.23 > 0.05)

Found 57 outliers among 500 measurements (11.40%)
53 (10.60%) high mild
4 (0.80%) high severe

1000 streams of 1000 bytes/multistream: Change within noise threshold.
       time:   [143.92 ms 143.96 ms 143.99 ms]
       change: [+0.0182% +0.0530% +0.0891%] (p = 0.00 < 0.05)

Found 2 outliers among 500 measurements (0.40%)
2 (0.40%) high mild

coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [94.809 ns 95.147 ns 95.482 ns]
       change: [-0.4014% +0.0250% +0.4502%] (p = 0.91 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
13 (13.00%) high mild

coalesce_acked_from_zero 3+1 entries: No change in performance detected.
       time:   [112.69 ns 113.02 ns 113.37 ns]
       change: [-0.6434% -0.1758% +0.2601%] (p = 0.46 > 0.05)

Found 16 outliers among 100 measurements (16.00%)
1 (1.00%) low severe
2 (2.00%) low mild
13 (13.00%) high severe

coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [111.94 ns 112.29 ns 112.74 ns]
       change: [-0.5043% +0.2238% +0.9630%] (p = 0.59 > 0.05)

Found 15 outliers among 100 measurements (15.00%)
4 (4.00%) low severe
4 (4.00%) low mild
7 (7.00%) high severe

coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [92.961 ns 93.453 ns 93.987 ns]
       change: [-0.5859% +0.4267% +1.4132%] (p = 0.41 > 0.05)

Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) high mild
1 (1.00%) high severe

RxStreamOrderer::inbound_frame(): Change within noise threshold.
       time:   [117.33 ms 117.38 ms 117.44 ms]
       change: [+0.3764% +0.4410% +0.5103%] (p = 0.00 < 0.05)

Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low severe
5 (5.00%) low mild
9 (9.00%) high mild

SentPackets::take_ranges: No change in performance detected.
       time:   [8.2627 µs 8.5307 µs 8.7854 µs]
       change: [-3.2255% -0.6526% +2.0184%] (p = 0.63 > 0.05)

Found 23 outliers among 100 measurements (23.00%)
1 (1.00%) low severe
18 (18.00%) low mild
4 (4.00%) high mild

transfer/pacing-false/varying-seeds: Change within noise threshold.
       time:   [35.824 ms 35.885 ms 35.945 ms]
       change: [+0.7514% +1.0002% +1.2374%] (p = 0.00 < 0.05)
transfer/pacing-true/varying-seeds: Change within noise threshold.
       time:   [36.695 ms 36.799 ms 36.903 ms]
       change: [+0.1973% +0.6341% +1.0569%] (p = 0.00 < 0.05)
transfer/pacing-false/same-seed: Change within noise threshold.
       time:   [35.490 ms 35.542 ms 35.597 ms]
       change: [+0.5478% +0.7464% +0.9507%] (p = 0.00 < 0.05)

Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild

transfer/pacing-true/same-seed: Change within noise threshold.
       time:   [37.377 ms 37.444 ms 37.512 ms]
       change: [+0.2418% +0.4769% +0.7140%] (p = 0.00 < 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) low mild

Client/server transfer results

Performance differences relative to 847a598.

Transfer of 33554432 bytes over loopback, 30 runs. All unit-less numbers are in milliseconds.

| Client | Server | CC | Pacing | Mean ± σ | Min | Max | MiB/s ± σ | Δ main (ms) | Δ main (%) |
|---|---|---|---|---|---|---|---|---|---|
| neqo | neqo | reno | on | 329.9 ± 46.9 | 287.4 | 514.3 | 97.0 ± 0.7 | -3.4 | -1.0% |
| neqo | neqo | reno | | 345.5 ± 111.8 | 287.9 | 848.4 | 92.6 ± 0.3 | -10.0 | -2.8% |
| neqo | neqo | cubic | on | 330.1 ± 48.8 | 299.6 | 508.4 | 96.9 ± 0.7 | 8.0 | 2.5% |
| neqo | neqo | cubic | | 333.7 ± 51.3 | 294.7 | 498.3 | 95.9 ± 0.6 | 3.2 | 1.0% |
| google | neqo | reno | on | 769.7 ± 88.4 | 571.2 | 918.9 | 41.6 ± 0.4 | 8.0 | 1.0% |
| google | neqo | reno | | 771.5 ± 90.3 | 547.4 | 946.8 | 41.5 ± 0.4 | 11.7 | 1.5% |
| google | neqo | cubic | on | 775.1 ± 109.7 | 575.2 | 1185.1 | 41.3 ± 0.3 | 19.7 | 2.6% |
| google | neqo | cubic | | 757.9 ± 83.4 | 556.9 | 886.6 | 42.2 ± 0.4 | -4.3 | -0.6% |
| google | google | | | 575.5 ± 41.0 | 551.8 | 781.3 | 55.6 ± 0.8 | 2.7 | 0.5% |
| neqo | msquic | reno | on | 273.6 ± 42.3 | 243.9 | 430.1 | 117.0 ± 0.8 | 4.9 | 1.8% |
| neqo | msquic | reno | | 263.9 ± 18.6 | 243.4 | 312.4 | 121.2 ± 1.7 | -9.9 | -3.6% |
| neqo | msquic | cubic | on | 270.9 ± 36.5 | 245.6 | 434.1 | 118.1 ± 0.9 | 7.5 | 2.9% |
| neqo | msquic | cubic | | 267.9 ± 38.0 | 240.8 | 448.0 | 119.5 ± 0.8 | 0.5 | 0.2% |
| msquic | msquic | | | 196.8 ± 39.3 | 159.3 | 339.8 | 162.6 ± 0.8 | 0.1 | 0.1% |
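
The MiB/s column in the table above is consistent with dividing the fixed transfer size (33554432 bytes, i.e. 32 MiB) by the mean transfer time. The following is a minimal sketch under that assumption; the function name `throughput_mib_per_s` is hypothetical and is not taken from the benchmark scripts.

```python
TRANSFER_BYTES = 33_554_432  # transfer size stated in the report (32 MiB)
MIB = 1024 * 1024


def throughput_mib_per_s(mean_ms: float, size_bytes: int = TRANSFER_BYTES) -> float:
    """Return throughput in MiB/s for a transfer that took mean_ms milliseconds."""
    return (size_bytes / MIB) / (mean_ms / 1000.0)


# Mean times (ms) copied from three rows of the table above.
rows = [
    ("neqo vs. neqo (reno, pacing on)", 329.9),
    ("google vs. google", 575.5),
    ("msquic vs. msquic", 196.8),
]
for label, mean_ms in rows:
    print(f"{label}: {throughput_mib_per_s(mean_ms):.1f} MiB/s")
```

This only reproduces the mean throughput; how the report derives the ± σ value for MiB/s is not stated in the comment and is not modeled here.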

⬇️ Download logs

@larseggert (Collaborator)

This is now stuck on actions/partner-runner-images#90

@larseggert added the "blocked" (Blocked on something else) label on Apr 24, 2025
codecov bot commented Jul 19, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.91%. Comparing base (967947c) to head (a5ed824).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2591      +/-   ##
==========================================
+ Coverage   94.88%   94.91%   +0.03%     
==========================================
  Files         115      115              
  Lines       34291    34323      +32     
  Branches    34291    34323      +32     
==========================================
+ Hits        32536    32577      +41     
+ Misses       1748     1737      -11     
- Partials        7        9       +2     
| Components | Coverage Δ |
|---|---|
| neqo-common | 97.62% <ø> (+0.11%) ⬆️ |
| neqo-crypto | 89.64% <ø> (ø) |
| neqo-http3 | 93.71% <ø> (ø) |
| neqo-qpack | 95.45% <ø> (ø) |
| neqo-transport | 95.94% <ø> (+0.04%) ⬆️ |
| neqo-udp | 89.85% <ø> (ø) |
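
The project coverage figures in the diff above can be reproduced from the Hits and Lines counts, assuming coverage is simply hits divided by total lines (Codecov's exact treatment of partial hits is not shown in the report, so this is an assumption). The helper name `coverage_percent` is hypothetical.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Return line coverage as a percentage, assuming coverage = hits / lines."""
    return 100.0 * hits / lines


# Hits and Lines for base (main) and head (#2591), copied from the coverage diff.
base = coverage_percent(32536, 34291)
head = coverage_percent(32577, 34323)
print(f"base {base:.2f}%  head {head:.2f}%  delta {head - base:+.2f}%")
```

In both columns, Hits + Misses + Partials equals Lines (for example 32536 + 1748 + 7 = 34291), which suggests that partial hits are not counted as covered in the headline figure.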

@larseggert (Collaborator)

There is a new version of the Win11 ARM image, but it's still shipping with the amd64 llvm toolchain.

github-actions bot commented Jul 19, 2025

Client/server transfer results

Performance differences relative to 967947c.

Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.

| Client vs. server (params) | Mean ± σ | Min | Max | MiB/s ± σ | Δ main (ms) | Δ main (%) |
|---|---|---|---|---|---|---|
| google vs. google | 451.8 ± 4.3 | 443.8 | 461.6 | 70.8 ± 7.4 | | |
| google vs. neqo (cubic, paced) | 270.6 ± 4.1 | 263.3 | 277.6 | 118.3 ± 7.8 | 0.4 | 0.1% |
| msquic vs. msquic | 131.6 ± 39.0 | 111.4 | 485.0 | 243.2 ± 0.8 | | |
| msquic vs. neqo (cubic, paced) | 146.6 ± 12.9 | 122.3 | 184.9 | 218.3 ± 2.5 | -7.3 | -4.7% |
| neqo vs. google (cubic, paced) | 752.6 ± 7.1 | 744.3 | 801.8 | 42.5 ± 4.5 | 1.7 | 0.2% |
| neqo vs. msquic (cubic, paced) | 156.3 ± 5.1 | 148.8 | 186.6 | 204.7 ± 6.3 | 0.2 | 0.1% |
| neqo vs. neqo (cubic) | 90.0 ± 4.5 | 83.0 | 103.7 | 355.5 ± 7.1 | -1.1 | -1.2% |
| neqo vs. neqo (cubic, paced) | 91.3 ± 4.3 | 82.5 | 101.7 | 350.6 ± 7.4 | -1.0 | -1.0% |
| neqo vs. neqo (reno) | 91.1 ± 4.5 | 82.9 | 100.4 | 351.2 ± 7.1 | 💔 1.6 | 1.8% |
| neqo vs. neqo (reno, paced) | 91.6 ± 4.7 | 83.8 | 100.4 | 349.4 ± 6.8 | 1.2 | 1.4% |
| neqo vs. quiche (cubic, paced) | 192.3 ± 4.2 | 186.7 | 204.3 | 166.4 ± 7.6 | 💚 -2.8 | -1.4% |
| neqo vs. s2n (cubic, paced) | 218.7 ± 4.4 | 211.0 | 228.1 | 146.3 ± 7.3 | -0.3 | -0.1% |
| quiche vs. neqo (cubic, paced) | 157.3 ± 5.0 | 149.5 | 168.8 | 203.4 ± 6.4 | -1.1 | -0.7% |
| quiche vs. quiche | 148.8 ± 5.4 | 138.4 | 162.8 | 215.0 ± 5.9 | | |
| s2n vs. neqo (cubic, paced) | 173.0 ± 4.7 | 162.7 | 191.5 | 184.9 ± 6.8 | 💔 2.2 | 1.3% |
| s2n vs. s2n | 247.2 ± 23.3 | 230.9 | 351.3 | 129.4 ± 1.4 | | |

Download data for profiler.firefox.com or download performance comparison data.

github-actions bot commented Jul 19, 2025

Benchmark results

Performance differences relative to 967947c.

1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: No change in performance detected.
       time:   [199.78 ms 200.13 ms 200.48 ms]
       thrpt:  [498.79 MiB/s 499.66 MiB/s 500.55 MiB/s]
change:
       time:   [−0.4357% −0.1466% +0.1287%] (p = 0.33 > 0.05)
       thrpt:  [−0.1286% +0.1468% +0.4376%]

Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) low mild
1 (1.00%) high mild

1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: No change in performance detected.
       time:   [299.93 ms 301.52 ms 303.11 ms]
       thrpt:  [32.991 Kelem/s 33.165 Kelem/s 33.341 Kelem/s]
change:
       time:   [−0.6636% +0.0487% +0.7705%] (p = 0.90 > 0.05)
       thrpt:  [−0.7646% −0.0487% +0.6680%]

Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) low mild
1 (1.00%) high mild

1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected.
       time:   [27.904 ms 28.070 ms 28.258 ms]
       thrpt:  [35.388   B/s 35.625   B/s 35.837   B/s]
change:
       time:   [−0.1936% +0.6986% +1.5772%] (p = 0.13 > 0.05)
       thrpt:  [−1.5527% −0.6937% +0.1940%]

Found 6 outliers among 100 measurements (6.00%)
4 (4.00%) high mild
2 (2.00%) high severe

1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: 💔 Performance has regressed.
       time:   [204.41 ms 204.72 ms 205.10 ms]
       thrpt:  [487.56 MiB/s 488.46 MiB/s 489.21 MiB/s]
change:
       time:   [+1.2373% +1.4793% +1.7197%] (p = 0.00 < 0.05)
       thrpt:  [−1.6907% −1.4577% −1.2221%]

Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe

decode 4096 bytes, mask ff: No change in performance detected.
       time:   [11.816 µs 11.854 µs 11.900 µs]
       change: [−0.5587% −0.0573% +0.5033%] (p = 0.84 > 0.05)

Found 14 outliers among 100 measurements (14.00%)
2 (2.00%) low severe
3 (3.00%) low mild
1 (1.00%) high mild
8 (8.00%) high severe

decode 1048576 bytes, mask ff: No change in performance detected.
       time:   [3.0255 ms 3.0388 ms 3.0549 ms]
       change: [−0.3495% +0.2837% +0.9105%] (p = 0.38 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
1 (1.00%) low mild
12 (12.00%) high severe

decode 4096 bytes, mask 7f: No change in performance detected.
       time:   [20.010 µs 20.069 µs 20.135 µs]
       change: [−0.4156% −0.0211% +0.3534%] (p = 0.92 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
2 (2.00%) low severe
11 (11.00%) high severe

decode 1048576 bytes, mask 7f: No change in performance detected.
       time:   [5.0441 ms 5.0555 ms 5.0685 ms]
       change: [−1.1720% −0.2888% +0.3142%] (p = 0.53 > 0.05)

Found 16 outliers among 100 measurements (16.00%)
1 (1.00%) low mild
1 (1.00%) high mild
14 (14.00%) high severe

decode 4096 bytes, mask 3f: No change in performance detected.
       time:   [8.2625 µs 8.2891 µs 8.3226 µs]
       change: [−0.8723% −0.1852% +0.4155%] (p = 0.60 > 0.05)

Found 9 outliers among 100 measurements (9.00%)
1 (1.00%) high mild
8 (8.00%) high severe

decode 1048576 bytes, mask 3f: No change in performance detected.
       time:   [1.5852 ms 1.5895 ms 1.5951 ms]
       change: [−0.6160% −0.0953% +0.4334%] (p = 0.71 > 0.05)

Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high severe

coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [88.247 ns 88.604 ns 88.970 ns]
       change: [−0.6027% +1.3932% +4.1036%] (p = 0.27 > 0.05)

Found 5 outliers among 100 measurements (5.00%)
2 (2.00%) high mild
3 (3.00%) high severe

coalesce_acked_from_zero 3+1 entries: No change in performance detected.
       time:   [105.52 ns 105.81 ns 106.11 ns]
       change: [−0.5111% −0.1207% +0.2883%] (p = 0.56 > 0.05)

Found 12 outliers among 100 measurements (12.00%)
4 (4.00%) high mild
8 (8.00%) high severe

coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [104.99 ns 105.41 ns 105.92 ns]
       change: [−0.8398% +0.0371% +0.7992%] (p = 0.93 > 0.05)

Found 15 outliers among 100 measurements (15.00%)
4 (4.00%) low severe
2 (2.00%) high mild
9 (9.00%) high severe

coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [88.663 ns 88.761 ns 88.876 ns]
       change: [−8.7172% −2.9078% +0.6883%] (p = 0.42 > 0.05)

Found 10 outliers among 100 measurements (10.00%)
4 (4.00%) high mild
6 (6.00%) high severe

RxStreamOrderer::inbound_frame(): Change within noise threshold.
       time:   [108.64 ms 108.83 ms 109.12 ms]
       change: [−0.8593% −0.6688% −0.3346%] (p = 0.00 < 0.05)

Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severe

sent::Packets::take_ranges: No change in performance detected.
       time:   [8.0637 µs 8.2690 µs 8.4567 µs]
       change: [−0.9199% +5.0453% +15.595%] (p = 0.25 > 0.05)

Found 20 outliers among 100 measurements (20.00%)
4 (4.00%) low severe
12 (12.00%) low mild
3 (3.00%) high mild
1 (1.00%) high severe

transfer/pacing-false/varying-seeds: Change within noise threshold.
       time:   [37.393 ms 37.469 ms 37.548 ms]
       change: [+1.0035% +1.3060% +1.6036%] (p = 0.00 < 0.05)

Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe

transfer/pacing-true/varying-seeds: Change within noise threshold.
       time:   [38.242 ms 38.370 ms 38.498 ms]
       change: [+0.0906% +0.5552% +1.0060%] (p = 0.02 < 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

transfer/pacing-false/same-seed: No change in performance detected.
       time:   [36.780 ms 36.843 ms 36.908 ms]
       change: [−0.1610% +0.0682% +0.3097%] (p = 0.57 > 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

transfer/pacing-true/same-seed: No change in performance detected.
       time:   [38.638 ms 38.730 ms 38.826 ms]
       change: [−0.5422% −0.2261% +0.0730%] (p = 0.16 > 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe

Download data for profiler.firefox.com or download performance comparison data.
