@str4d
Member

@str4d str4d commented Apr 20, 2022

This uses Algorithm 2 from https://eprint.iacr.org/2022/367 to speed up Fp2 and Fp6 multiplication.

@str4d str4d marked this pull request as ready for review April 20, 2022 10:20
@dconnolly

> This uses Algorithm 2 from https://eprint.iacr.org/2022/368 to speed up pairing operations.

ITYM /2022/367?

@str4d
Member Author

str4d commented Apr 20, 2022

Indeed yes, IDK how the increment happened (gonna blame the new key Oard).

@str4d str4d force-pushed the efficient-extension-field-arithmetic branch from 3c80c48 to 2623717 Compare April 20, 2022 19:16
@str4d
Member Author

str4d commented Apr 20, 2022

I accidentally-on-purpose made it faster. Benchmarks on an M1 of the combined Fp2 and Fp6 changes relative to current main:

Benchmarking full pairing: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.8s, enable flat sampling, or reduce sample count to 60.
full pairing            time:   [1.1461 ms 1.1471 ms 1.1483 ms]                          
                        change: [-14.596% -14.434% -14.264%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  2 (2.00%) high mild
  12 (12.00%) high severe

G2 preparation for pairing                                                                            
                        time:   [119.24 us 119.30 us 119.38 us]
                        change: [-10.505% -10.322% -10.140%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  2 (2.00%) high mild
  11 (11.00%) high severe

miller loop for pairing time:   [318.95 us 319.25 us 319.68 us]                                    
                        change: [-25.148% -25.009% -24.876%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  5 (5.00%) high mild
  9 (9.00%) high severe

final exponentiation for pairing                                                                            
                        time:   [708.89 us 709.41 us 710.08 us]
                        change: [-9.3887% -9.2309% -9.0655%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 16 outliers among 100 measurements (16.00%)
  7 (7.00%) high mild
  9 (9.00%) high severe

G1Affine check on curve time:   [154.60 ns 154.66 ns 154.74 ns]                                    
                        change: [-0.4970% -0.2625% -0.0453%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 12 outliers among 100 measurements (12.00%)
  3 (3.00%) high mild
  9 (9.00%) high severe

G1Affine check equality time:   [29.487 ns 29.560 ns 29.627 ns]                                     
                        change: [-0.9213% -0.5789% -0.2616%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

G1Affine scalar multiplication                                                                            
                        time:   [326.08 us 326.34 us 326.72 us]
                        change: [-0.2479% -0.0292% +0.1745%] (p = 0.80 > 0.05)
                        No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
  5 (5.00%) high mild
  8 (8.00%) high severe

G1Affine subgroup check time:   [326.17 us 326.48 us 326.91 us]                                    
                        change: [-0.2842% +0.0110% +0.3325%] (p = 0.95 > 0.05)
                        No change in performance detected.
Found 17 outliers among 100 measurements (17.00%)
  2 (2.00%) high mild
  15 (15.00%) high severe

G1Affine deserialize compressed point                                                                            
                        time:   [354.00 us 354.20 us 354.53 us]
                        change: [-0.3253% -0.0909% +0.1408%] (p = 0.47 > 0.05)
                        No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
  6 (6.00%) high mild
  7 (7.00%) high severe

G1Affine deserialize uncompressed point                                                                            
                        time:   [326.28 us 326.46 us 326.68 us]
                        change: [+0.0999% +0.3978% +0.7055%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 17 outliers among 100 measurements (17.00%)
  5 (5.00%) high mild
  12 (12.00%) high severe

G1Projective check on curve                                                                            
                        time:   [334.82 ns 335.50 ns 336.31 ns]
                        change: [-0.1588% +0.0935% +0.3468%] (p = 0.48 > 0.05)
                        No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
  3 (3.00%) high mild
  10 (10.00%) high severe

G1Projective check equality                                                                            
                        time:   [233.39 ns 233.55 ns 233.78 ns]
                        change: [-0.7959% -0.5866% -0.3619%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 15 outliers among 100 measurements (15.00%)
  5 (5.00%) high mild
  10 (10.00%) high severe

G1Projective to affine  time:   [27.851 us 27.873 us 27.901 us]                                    
                        change: [-0.1425% +0.0676% +0.2896%] (p = 0.55 > 0.05)
                        No change in performance detected.
Found 16 outliers among 100 measurements (16.00%)
  9 (9.00%) high mild
  7 (7.00%) high severe

G1Projective doubling   time:   [488.15 ns 488.47 ns 488.84 ns]                                  
                        change: [-0.2595% -0.0492% +0.1342%] (p = 0.64 > 0.05)
                        No change in performance detected.
Found 15 outliers among 100 measurements (15.00%)
  4 (4.00%) high mild
  11 (11.00%) high severe

G1Projective addition   time:   [789.62 ns 790.23 ns 791.16 ns]                                   
                        change: [-0.9932% -0.6714% -0.3551%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 14 outliers among 100 measurements (14.00%)
  3 (3.00%) high mild
  11 (11.00%) high severe

G1Projective mixed addition                                                                             
                        time:   [688.78 ns 689.36 ns 690.08 ns]
                        change: [-0.1772% +0.0252% +0.2402%] (p = 0.82 > 0.05)
                        No change in performance detected.
Found 16 outliers among 100 measurements (16.00%)
  3 (3.00%) high mild
  13 (13.00%) high severe

G1Projective scalar multiplication                                                                            
                        time:   [326.04 us 326.17 us 326.34 us]
                        change: [-0.2632% -0.0799% +0.1049%] (p = 0.40 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) high mild
  8 (8.00%) high severe

G1Projective batch to affine n=10000                                                                             
                        time:   [2.6787 ms 2.6801 ms 2.6819 ms]
                        change: [+0.7160% +0.7985% +0.8879%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) high mild
  6 (6.00%) high severe

G2Affine check on curve time:   [405.61 ns 406.59 ns 408.25 ns]                                    
                        change: [-13.449% -13.255% -13.051%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
  3 (3.00%) high mild
  12 (12.00%) high severe

G2Affine check equality time:   [55.682 ns 55.795 ns 55.912 ns]                                    
                        change: [+0.4433% +0.6924% +0.9430%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

G2Affine scalar multiplication                                                                            
                        time:   [857.31 us 859.20 us 861.13 us]
                        change: [-27.168% -26.988% -26.817%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

G2Affine subgroup check time:   [857.37 us 859.17 us 861.09 us]                                    
                        change: [-27.184% -27.007% -26.820%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Benchmarking G2Affine deserialize compressed point: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.1s, enable flat sampling, or reduce sample count to 60.
G2Affine deserialize compressed point                                                                             
                        time:   [1.0094 ms 1.0108 ms 1.0124 ms]
                        change: [-25.448% -25.297% -25.148%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

G2Affine deserialize uncompressed point                                                                            
                        time:   [862.29 us 864.20 us 866.02 us]
                        change: [-27.065% -26.889% -26.714%] (p = 0.00 < 0.05)
                        Performance has improved.

G2Projective check on curve                                                                             
                        time:   [920.61 ns 922.53 ns 924.43 ns]
                        change: [-19.949% -19.665% -19.341%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 20 outliers among 100 measurements (20.00%)
  5 (5.00%) high mild
  15 (15.00%) high severe

G2Projective check equality                                                                             
                        time:   [596.70 ns 598.00 ns 599.37 ns]
                        change: [-29.451% -29.256% -29.075%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

G2Projective to affine  time:   [28.221 us 28.233 us 28.247 us]                                    
                        change: [-0.6778% -0.4721% -0.2682%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 12 outliers among 100 measurements (12.00%)
  3 (3.00%) high mild
  9 (9.00%) high severe

G2Projective doubling   time:   [1.3098 us 1.3127 us 1.3158 us]                                   
                        change: [-24.035% -23.850% -23.640%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

G2Projective addition   time:   [2.0576 us 2.0599 us 2.0629 us]                                   
                        change: [-29.439% -29.244% -29.037%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

G2Projective mixed addition                                                                             
                        time:   [1.8443 us 1.8484 us 1.8527 us]
                        change: [-30.083% -29.922% -29.767%] (p = 0.00 < 0.05)
                        Performance has improved.

G2Projective scalar multiplication                                                                            
                        time:   [858.12 us 859.90 us 861.84 us]
                        change: [-27.285% -27.073% -26.880%] (p = 0.00 < 0.05)
                        Performance has improved.

G2Projective batch to affine n=10000                                                                            
                        time:   [6.7734 ms 6.7856 ms 6.7983 ms]
                        change: [-31.236% -31.108% -30.965%] (p = 0.00 < 0.05)
                        Performance has improved.

@str4d str4d changed the title from "Use interleaving to improve Fp6 multiplication" to "Use interleaving to improve performance of G2 arithmetic and pairings" Apr 20, 2022
str4d added 2 commits April 21, 2022 01:55

This uses Algorithm 2 from https://eprint.iacr.org/2022/367 to speed
up pairing operations.

This requires making `Fp2::mul` non-const, in order to be able to use
`Fp::sum_of_products`. But `Fp2` is not public, so it doesn't affect
the crate API.
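
For readers unfamiliar with the idea, here is a minimal sketch of how an Fp2 multiplication can be phrased as two sums of products, assuming the usual construction u² = -1. It is not the code from this PR: the 61-bit prime `P`, the free function `sum_of_products`, its signature, and the `u64` representation are all assumptions chosen so the example fits in machine words. In particular, the interleaved Montgomery reduction of Algorithm 2 is replaced here by a single `%` over a `u128` accumulator, which only illustrates the "accumulate first, reduce once" shape.

```rust
// Toy modulus (assumption): the Mersenne prime 2^61 - 1, which is 3 mod 4,
// so -1 is a non-residue and Fp2 = Fp[u]/(u^2 + 1) is a field.
const P: u64 = (1 << 61) - 1;

/// Computes a[0]*b[0] + a[1]*b[1] mod P with one reduction at the end.
/// Inputs are assumed to be already reduced (< P), so each product is
/// below 2^122 and the sum of two products fits in a u128.
fn sum_of_products(a: [u64; 2], b: [u64; 2]) -> u64 {
    let acc = (a[0] as u128) * (b[0] as u128) + (a[1] as u128) * (b[1] as u128);
    (acc % (P as u128)) as u64
}

/// Additive inverse mod P for a reduced input.
fn neg(a: u64) -> u64 {
    if a == 0 { 0 } else { P - a }
}

/// Element c0 + c1*u of the toy quadratic extension.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Fp2 {
    c0: u64,
    c1: u64,
}

impl Fp2 {
    /// (a0 + a1*u)(b0 + b1*u) = (a0*b0 - a1*b1) + (a0*b1 + a1*b0)*u
    fn mul(self, rhs: Fp2) -> Fp2 {
        Fp2 {
            // a0*b0 + (-a1)*b1
            c0: sum_of_products([self.c0, neg(self.c1)], [rhs.c0, rhs.c1]),
            // a0*b1 + a1*b0
            c1: sum_of_products([self.c0, self.c1], [rhs.c1, rhs.c0]),
        }
    }
}
```

With these definitions, `Fp2 { c0: 0, c1: 1 }.mul(Fp2 { c0: 0, c1: 1 })` yields `Fp2 { c0: P - 1, c1: 0 }`, i.e. u² = -1. Each output coefficient costs one reduction rather than one per product, which is the saving the sum-of-products formulation is after.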
@str4d str4d force-pushed the efficient-extension-field-arithmetic branch from e23e576 to ed8f172 Compare April 20, 2022 23:56
str4d added a commit to zkcrypto/ff that referenced this pull request Apr 26, 2022
@str4d str4d force-pushed the efficient-extension-field-arithmetic branch from 6c3a3ac to ed8f172 Compare May 4, 2022 19:20
@str4d
Member Author

str4d commented May 4, 2022

Tried using the interleaving for `impl Mul for Fp`, but it seems to cause slowdowns on some machines, so I'm leaving it aside. Those kinds of lower-level changes will be better suited to proper per-arch backends.
