sim642 commented Aug 13, 2021

Issue #265.

The pinned fork addresses the performance problems described in #265 (comment) by omitting eta-expansion and unit arguments in simple cases where they are unnecessary. I also opened a PR against ppx_deriving (ocaml-ppx/ppx_deriving#252) to get these optimizations upstream, which would let us drop the pin.
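To make the difference concrete, here is a minimal hand-written sketch of the two shapes. It is not the exact output of ppx_deriving: the module M, the thunk name __0, and both function names are hypothetical and chosen only for illustration. The first definition hides the referenced function behind a unit argument and eta-expands the result, and the second refers to the function directly.

```ocaml
(* Hypothetical stand-in for a module whose type provides its own equal. *)
module M = struct
  type t = int
  let equal (a : t) (b : t) = a = b
end

(* Roughly the old shape (an assumption, not verbatim generated code):
   the referenced function sits behind a unit thunk, and the result is
   eta-expanded, so every call goes through extra closure applications. *)
let equal_expanded : M.t -> M.t -> bool =
  let __0 () = M.equal in
  fun x y -> (__0 ()) x y

(* The simplified shape: a direct reference, with no thunk and no wrapper. *)
let equal_direct : M.t -> M.t -> bool = M.equal

(* Both definitions agree on every input; only the call overhead differs. *)
let () =
  assert (equal_expanded 1 1 = equal_direct 1 1);
  assert (equal_expanded 1 2 = equal_direct 1 2)
```

Both versions compute the same result, so the simplification is only safe to apply where the wrapper has no other purpose, which is why the fork restricts it to simple cases.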

On the added benchmarks (https://github.com/goblint/analyzer/blob/a21f33511183074c693c15269990210b283ae045/bench/deriving/benchEq.ml), this makes derived equal roughly 4 to 6.5 times as fast as before, bringing it more or less on par with a completely manual implementation.
In the output below, the deriving and deriving_module entries are the new code; the deriving_expand and deriving_module_expand entries are the old code.

*** Run benchmarks for path "module"

Throughputs for "manual", "deriving", "deriving_expand", "deriving_expand_simpl1", "deriving_expand_simpl2", "deriving_expand_simpl3" each running for at least 1 CPU second:
                manual:  1.24 WALL ( 1.19 usr +  0.01 sys =  1.20 CPU) @ 1721503707.76/s (n=2061015226)
              deriving:  1.02 WALL ( 1.00 usr +  0.00 sys =  1.00 CPU) @ 1312165613.22/s (n=1318050676)
       deriving_expand:  1.03 WALL ( 1.03 usr +  0.00 sys =  1.03 CPU) @ 319989570.35/s (n=329818050)
deriving_expand_simpl1:  1.07 WALL ( 1.06 usr +  0.00 sys =  1.06 CPU) @ 1316607444.91/s (n=1397158805)
deriving_expand_simpl2:  2.62 WALL ( 2.62 usr +  0.00 sys =  2.62 CPU) @ 1313578735.67/s (n=3441060051)
deriving_expand_simpl3:  1.11 WALL ( 1.11 usr +  0.00 sys =  1.11 CPU) @ 1254004556.80/s (n=1394683804)
                               Rate deriving_expand deriving_expand_simpl3 deriving deriving_expand_simpl2 deriving_expand_simpl1 manual
       deriving_expand  319989570/s              --                   -74%     -76%                   -76%                   -76%   -81%
deriving_expand_simpl3 1254004557/s            292%                     --      -4%                    -5%                    -5%   -27%
              deriving 1312165613/s            310%                     5%       --                    -0%                    -0%   -24%
deriving_expand_simpl2 1313578736/s            311%                     5%       0%                     --                    -0%   -24%
deriving_expand_simpl1 1316607445/s            311%                     5%       0%                     0%                     --   -24%
                manual 1721503708/s            438%                    37%      31%                    31%                    31%     --
**********************************************************************
*** Run benchmarks for path "pair.fst"

Throughputs for "manual_primitive", "deriving_primitive", "manual_module", "deriving_module", "deriving_module_expand", "deriving_module_expand_simpl1", "deriving_module_expand_simpl2" each running for at least 1 CPU second:
             manual_primitive:  1.33 WALL ( 1.32 usr +  0.00 sys =  1.32 CPU) @ 1043208945.28/s (n=1381680174)
           deriving_primitive:  1.14 WALL ( 1.14 usr +  0.00 sys =  1.14 CPU) @ 1391755623.57/s (n=1591025802)
                manual_module:  1.06 WALL ( 1.06 usr +  0.00 sys =  1.06 CPU) @ 1243419689.31/s (n=1318778383)
              deriving_module:  1.13 WALL ( 1.13 usr +  0.00 sys =  1.13 CPU) @ 1560610803.69/s (n=1755868185)
       deriving_module_expand:  1.09 WALL ( 1.09 usr +  0.00 sys =  1.09 CPU) @ 240938154.63/s (n=262628612)
deriving_module_expand_simpl1:  1.09 WALL ( 1.08 usr +  0.00 sys =  1.08 CPU) @ 284771156.32/s (n=308329135)
deriving_module_expand_simpl2:  1.08 WALL ( 1.12 usr +  0.00 sys =  1.12 CPU) @ 1397085184.51/s (n=1558836913)
                                      Rate deriving_module_expand deriving_module_expand_simpl1 manual_primitive manual_module deriving_primitive deriving_module_expand_simpl2 deriving_module
       deriving_module_expand  240938155/s                     --                          -15%             -77%          -81%               -83%                          -83%            -85%
deriving_module_expand_simpl1  284771156/s                    18%                            --             -73%          -77%               -80%                          -80%            -82%
             manual_primitive 1043208945/s                   333%                          266%               --          -16%               -25%                          -25%            -33%
                manual_module 1243419689/s                   416%                          337%              19%            --               -11%                          -11%            -20%
           deriving_primitive 1391755624/s                   478%                          389%              33%           12%                 --                           -0%            -11%
deriving_module_expand_simpl2 1397085185/s                   480%                          391%              34%           12%                 0%                            --            -10%
              deriving_module 1560610804/s                   548%                          448%              50%           26%                12%                           12%              --
**********************************************************************
*** Run benchmarks for path "pair.snd"

Throughputs for "manual_primitive", "deriving_primitive", "manual_module", "deriving_module", "deriving_module_expand", "deriving_module_expand_simpl1", "deriving_module_expand_simpl2" each running for at least 1 CPU second:
             manual_primitive:  1.13 WALL ( 1.13 usr +  0.00 sys =  1.13 CPU) @ 438248508.20/s (n=495223882)
           deriving_primitive:  1.08 WALL ( 1.07 usr +  0.00 sys =  1.07 CPU) @ 442271278.58/s (n=474449610)
                manual_module:  1.05 WALL ( 1.04 usr +  0.00 sys =  1.04 CPU) @ 445978757.51/s (n=466018367)
              deriving_module:  1.07 WALL ( 1.07 usr +  0.00 sys =  1.07 CPU) @ 439682085.65/s (n=468568759)
       deriving_module_expand:  1.06 WALL ( 1.05 usr +  0.00 sys =  1.05 CPU) @ 109149441.96/s (n=114901945)
deriving_module_expand_simpl1:  1.05 WALL ( 1.05 usr +  0.00 sys =  1.05 CPU) @ 163268054.55/s (n=171139534)
deriving_module_expand_simpl2:  1.07 WALL ( 1.07 usr +  0.00 sys =  1.07 CPU) @ 444435898.99/s (n=476877053)
                                     Rate deriving_module_expand deriving_module_expand_simpl1 manual_primitive deriving_module deriving_primitive deriving_module_expand_simpl2 manual_module
       deriving_module_expand 109149442/s                     --                          -33%             -75%            -75%               -75%                          -75%          -76%
deriving_module_expand_simpl1 163268055/s                    50%                            --             -63%            -63%               -63%                          -63%          -63%
             manual_primitive 438248508/s                   302%                          168%               --             -0%                -1%                           -1%           -2%
              deriving_module 439682086/s                   303%                          169%               0%              --                -1%                           -1%           -1%
           deriving_primitive 442271279/s                   305%                          171%               1%              1%                 --                           -0%           -1%
deriving_module_expand_simpl2 444435899/s                   307%                          172%               1%              1%                 0%                            --           -0%
                manual_module 445978758/s                   309%                          173%               2%              1%                 1%                            0%            --
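The percentage cells in these tables are relative differences in throughput, so an entry of 310% means the row is about 4.1 times as fast as the column. A small sketch of that arithmetic, using the two "module" throughputs from the output above (the helper name percent_faster is hypothetical):

```ocaml
(* Percentage by which [fast] exceeds [slow], rounded to the nearest integer. *)
let percent_faster ~slow ~fast =
  int_of_float (Float.round ((fast /. slow -. 1.0) *. 100.0))

(* Throughputs in calls per second, copied from the "module" run. *)
let deriving_expand = 319989570.35
let deriving = 1312165613.22

let () =
  (* The plain ratio is the speedup factor; the table reports ratio - 1 as a percentage. *)
  let ratio = deriving /. deriving_expand in
  Printf.printf "speedup: %.2fx, table entry: %d%%\n" ratio
    (percent_faster ~slow:deriving_expand ~fast:deriving)
```

The same calculation on the pair.fst and pair.snd throughputs gives the 548% and 303% entries for deriving_module against deriving_module_expand.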

sim642 added the "performance" label (Analysis time, memory usage) Aug 13, 2021

sim642 commented Aug 15, 2021

The ppx_deriving PR has been merged upstream, but no release contains it yet. For now, ppx_deriving stays pinned, but to the upstream repository instead of the fork, until a release is published.

michael-schwarz merged commit eccee52 into master Aug 15, 2021
michael-schwarz deleted the optimize-deriving branch August 15, 2021 18:01