Optimize floodfilling for expansion #1707

Merged: 12 commits, Apr 29, 2022

MrStevns (Member) commented Apr 9, 2022

After the recent forum post reporting that the floodfill algorithm had become slow, I started looking into what could be done to make filling faster, with and without expand enabled.
This PR is the result of a collaboration between myself and Scribblemaniac, with some pretty substantial performance improvements in all but one of the benchmark cases.

The floodfill algorithm was already quite fast, but some additional changes were made to improve the speed in some cases, while in others it has gotten slightly slower. The main motivation, however, was to make floodfill with expansion work faster.

Benchmark results

Branch used to perform benchmark: https://github.com/MrStevns/pencil/tree/bucket-tool-benchmark
Benchmark was made at the following commit: 3bace61

The benchmark specs:

| Spec type | Value |
| --- | --- |
| Model Name | MacBook Pro |
| Model Identifier | MacBookPro12,1 |
| Processor Name | Dual-Core Intel Core i7 |
| Processor Speed | 3.1 GHz |
| Number of Processors | 1 |
| Total Number of Cores | 2 |
| L2 Cache (per Core) | 256 KB |
| L3 Cache | 4 MB |
| Hyper-Threading Technology | Enabled |
| Memory | 16 GB |

--- ================== Master ========================	Fri Apr  8 19:41:28 2022
+++ =============== Benchmark bucket_tool_optimization	Fri Apr  8 19:41:28 2022
@@ -1,108 +1,108 @@
-================== Master ===========================================
+=============== Benchmark bucket_tool_optimizations2 =================
 [==========] Running 5 benchmarks.
[ RUN      ] BitmapImage1080pFixture.FloodFill (10 runs, 5 iterations per run)
-[     DONE ] BitmapImage1080pFixture.FloodFill (4530.489401 ms)
-[   RUNS   ]        Average time: 453048.940 us (~6350.387 us)
-                    Fastest time: 446220.267 us (-6828.673 us / -1.507 %)
-                    Slowest time: 466590.746 us (+13541.806 us / +2.989 %)
-                     Median time: 452343.653 us (1st quartile: 446834.360 us | 3rd quartile: 457059.640 us)
+[     DONE ] BitmapImage1080pFixture.FloodFill (4611.106197 ms)
+[   RUNS   ]        Average time: 461110.620 us (~7446.206 us)
+                    Fastest time: 452641.573 us (-8469.047 us / -1.837 %)
+                    Slowest time: 475489.724 us (+14379.104 us / +3.118 %)
+                     Median time: 458457.295 us (1st quartile: 455701.915 us | 3rd quartile: 465509.400 us)

Runs (time per run):
Average:                (461110.620 - 453048.940) / 461110.620 * 100 = 1.74% slower
Fastest:                (452641.573 - 446220.267) / 452641.573 * 100 = 1.41% slower
Slowest:                (475489.724 - 466590.746) / 475489.724 * 100 = 1.87% slower
Median:                 (458457.295 - 452343.653) / 458457.295 * 100 = 1.33% slower  


-             Average performance: 2.20727 runs/s
-                Best performance: 2.24105 runs/s (+0.03378 runs/s / +1.53034 %)
-               Worst performance: 2.14321 runs/s (-0.06406 runs/s / -2.90229 %)
-              Median performance: 2.21071 runs/s (1st quartile: 2.23797 | 3rd quartile: 2.18790)
+             Average performance: 2.16868 runs/s
+                Best performance: 2.20925 runs/s (+0.04058 runs/s / +1.87103 %)
+               Worst performance: 2.10309 runs/s (-0.06558 runs/s / -3.02406 %)
+              Median performance: 2.18123 runs/s (1st quartile: 2.19442 | 3rd quartile: 2.14818)

Performance (runs/s):
Average:                (2.16868 - 2.20727) / 2.16868 * 100          = 1.77% slower
Fastest:                (2.20925 - 2.24105) / 2.20925 * 100          = 1.43% slower
Worst:                  (2.10309 - 2.14321) / 2.10309 * 100          = 1.90% slower
Median:                 (2.18123 - 2.21071) / 2.18123 * 100          = 1.35% slower                                   
                                   
-[ITERATIONS]        Average time: 90609.788 us (~1270.077 us)
-                    Fastest time: 89244.053 us (-1365.735 us / -1.507 %)
-                    Slowest time: 93318.149 us (+2708.361 us / +2.989 %)
-                     Median time: 90468.731 us (1st quartile: 89366.872 us | 3rd quartile: 91411.928 us)
+[ITERATIONS]        Average time: 92222.124 us (~1489.241 us)
+                    Fastest time: 90528.315 us (-1693.809 us / -1.837 %)
+                    Slowest time: 95097.945 us (+2875.821 us / +3.118 %)
+                     Median time: 91691.459 us (1st quartile: 91140.383 us | 3rd quartile: 93101.880 us)

Runs (time per iteration):
Average:                (92222.124 - 90609.788) / 92222.124 * 100          = 1.74% slower
Fastest:                (90528.315 - 89244.053) / 90528.315 * 100          = 1.41% slower
Worst:                  (95097.945 - 93318.149) / 95097.945 * 100          = 1.87% slower
Median:                 (91691.459 - 90468.731) / 91691.459 * 100          = 1.33% slower                                   
                                   
-             Average performance: 11.03634 iterations/s
-                Best performance: 11.20523 iterations/s (+0.16889 iterations/s / +1.53034 %)
-               Worst performance: 10.71603 iterations/s (-0.32031 iterations/s / -2.90229 %)
-              Median performance: 11.05354 iterations/s (1st quartile: 11.18983 | 3rd quartile: 10.93949)
+             Average performance: 10.84339 iterations/s
+                Best performance: 11.04627 iterations/s (+0.20288 iterations/s / +1.87103 %)
+               Worst performance: 10.51547 iterations/s (-0.32791 iterations/s / -3.02406 %)
+              Median performance: 10.90614 iterations/s (1st quartile: 10.97208 | 3rd quartile: 10.74092)

Performance (iterations/s):
Average:                (10.84339 - 11.03634) / 10.84339 * 100          = 1.77% slower
Fastest:                (11.04627 - 11.20523) / 11.04627 * 100          = 1.43% slower
Worst:                  (10.51547 - 10.71603) / 10.51547 * 100          = 1.90% slower
Median:                 (10.90614 - 11.05354) / 10.90614 * 100          = 1.35% slower

 [ RUN      ] BitmapImageEmptyFixture.FloodFillTo1080p (10 runs, 5 iterations per run)
-[     DONE ] BitmapImageEmptyFixture.FloodFillTo1080p (40758.633419 ms)
-[   RUNS   ]        Average time: 4075863.342 us (~42349.720 us)
-                    Fastest time: 4009787.105 us (-66076.237 us / -1.621 %)
-                    Slowest time: 4166157.056 us (+90293.714 us / +2.215 %)
-                     Median time: 4066468.953 us (1st quartile: 4053330.720 us | 3rd quartile: 4084090.062 us)
+[     DONE ] BitmapImageEmptyFixture.FloodFillTo1080p (4506.705095 ms)
+[   RUNS   ]        Average time: 450670.509 us (~13626.122 us)
+                    Fastest time: 435942.980 us (-14727.530 us / -3.268 %)
+                    Slowest time: 477413.027 us (+26742.518 us / +5.934 %)
+                     Median time: 444731.660 us (1st quartile: 440258.731 us | 3rd quartile: 462771.563 us)

Runs (time per run):
Average:                (450670.509 - 4075863.342) / 450670.509 * 100          = 804% faster!
Fastest:                (435942.980 - 4009787.105) / 435942.980 * 100          = 819% faster!
Worst:                  (477413.027 - 4166157.056) / 477413.027 * 100          = 772% faster!
Median:                 (444731.660 - 4066468.953) / 444731.660 * 100          = 814% faster!
                                   
-             Average performance: 0.24535 runs/s
-                Best performance: 0.24939 runs/s (+0.00404 runs/s / +1.64787 %)
-               Worst performance: 0.24003 runs/s (-0.00532 runs/s / -2.16731 %)
-              Median performance: 0.24591 runs/s (1st quartile: 0.24671 | 3rd quartile: 0.24485)
+             Average performance: 2.21892 runs/s
+                Best performance: 2.29388 runs/s (+0.07496 runs/s / +3.37832 %)
+               Worst performance: 2.09462 runs/s (-0.12429 runs/s / -5.60155 %)
+              Median performance: 2.24855 runs/s (1st quartile: 2.27139 | 3rd quartile: 2.16089)

Performance (runs/s):
Average:                (2.21892 - 0.24535) / 2.21892 * 100          = 88% faster!
Fastest:                (2.29388 - 0.24939) / 2.29388 * 100          = 89% faster!
Worst:                  (2.09462 - 0.24003) / 2.09462 * 100          = 88% faster!
Median:                 (2.24855 - 0.24591) / 2.24855 * 100          = 89% faster!
                                   
-[ITERATIONS]        Average time: 815172.668 us (~8469.944 us)
-                    Fastest time: 801957.421 us (-13215.247 us / -1.621 %)
-                    Slowest time: 833231.411 us (+18058.743 us / +2.215 %)
-                     Median time: 813293.791 us (1st quartile: 810666.144 us | 3rd quartile: 816818.012 us)
+[ITERATIONS]        Average time: 90134.102 us (~2725.224 us)
+                    Fastest time: 87188.596 us (-2945.506 us / -3.268 %)
+                    Slowest time: 95482.605 us (+5348.503 us / +5.934 %)
+                     Median time: 88946.332 us (1st quartile: 88051.746 us | 3rd quartile: 92554.313 us)

Runs (time per iteration):
Average:                (90134.102 - 815172.668) / 90134.102 * 100          = 804% faster!
Fastest:                (87188.596 - 801957.421) / 87188.596 * 100          = 819% faster!
Worst:                  (95482.605 - 833231.411) / 95482.605 * 100          = 772% faster!
Median:                 (88946.332 - 813293.791) / 88946.332 * 100          = 814% faster!
                                   
-             Average performance: 1.22673 iterations/s
-                Best performance: 1.24695 iterations/s (+0.02022 iterations/s / +1.64787 %)
-               Worst performance: 1.20015 iterations/s (-0.02659 iterations/s / -2.16731 %)
-              Median performance: 1.22957 iterations/s (1st quartile: 1.23355 | 3rd quartile: 1.22426)
+             Average performance: 11.09458 iterations/s
+                Best performance: 11.46939 iterations/s (+0.37481 iterations/s / +3.37832 %)
+               Worst performance: 10.47311 iterations/s (-0.62147 iterations/s / -5.60155 %)
+              Median performance: 11.24273 iterations/s (1st quartile: 11.35696 | 3rd quartile: 10.80447)

Performance (iterations/s):
Average:                (11.09458 - 1.22673) / 11.09458 * 100          = 88% faster!
Fastest:                (11.46939 - 1.24695) / 11.46939 * 100          = 89% faster!
Worst:                  (10.47311 - 1.20015) / 10.47311 * 100          = 88% faster!
Median:                 (11.24273 - 1.22957) / 11.24273 * 100          = 89% faster!

 [ RUN      ] BitmapImage1080pFixture.ExpandFill (10 runs, 5 iterations per run)
-[     DONE ] BitmapImage1080pFixture.ExpandFill (6955.762108 ms)
-[   RUNS   ]        Average time: 695576.211 us (~6425.858 us)
-                    Fastest time: 687291.201 us (-8285.010 us / -1.191 %)
-                    Slowest time: 707410.629 us (+11834.418 us / +1.701 %)
-                     Median time: 692907.324 us (1st quartile: 691267.696 us | 3rd quartile: 700121.380 us)
+[     DONE ] BitmapImage1080pFixture.ExpandFill (4697.038929 ms)
+[   RUNS   ]        Average time: 469703.893 us (~10115.183 us)
+                    Fastest time: 458384.736 us (-11319.157 us / -2.410 %)
+                    Slowest time: 494947.449 us (+25243.556 us / +5.374 %)
+                     Median time: 467526.077 us (1st quartile: 463464.587 us | 3rd quartile: 472521.326 us)

Runs (time per run):
Average:                (469703.893 - 695576.211) / 469703.893 * 100          = 48% faster!
Fastest:                (458384.736 - 687291.201) / 458384.736 * 100          = 49% faster!
Worst:                  (494947.449 - 707410.629) / 494947.449 * 100          = 42% faster!
Median:                 (467526.077 - 692907.324) / 467526.077 * 100          = 48% faster!
                                   
-             Average performance: 1.43766 runs/s
-                Best performance: 1.45499 runs/s (+0.01733 runs/s / +1.20546 %)
-               Worst performance: 1.41361 runs/s (-0.02405 runs/s / -1.67292 %)
-              Median performance: 1.44319 runs/s (1st quartile: 1.44662 | 3rd quartile: 1.42832)
+             Average performance: 2.12900 runs/s
+                Best performance: 2.18157 runs/s (+0.05257 runs/s / +2.46936 %)
+               Worst performance: 2.02042 runs/s (-0.10858 runs/s / -5.10025 %)
+              Median performance: 2.13892 runs/s (1st quartile: 2.15766 | 3rd quartile: 2.11631)

Performance (runs/s):
Average:                (2.12900 - 1.43766) / 2.12900 * 100          = 32% faster!
Fastest:                (2.18157 - 1.45499) / 2.18157 * 100          = 33% faster!
Worst:                  (2.02042 - 1.41361) / 2.02042 * 100          = 30% faster!
Median:                 (2.13892 - 1.44319) / 2.13892 * 100          = 32% faster!
                                   
-[ITERATIONS]        Average time: 139115.242 us (~1285.172 us)
-                    Fastest time: 137458.240 us (-1657.002 us / -1.191 %)
-                    Slowest time: 141482.126 us (+2366.884 us / +1.701 %)
-                     Median time: 138581.465 us (1st quartile: 138253.539 us | 3rd quartile: 140024.276 us)
+[ITERATIONS]        Average time: 93940.779 us (~2023.037 us)
+                    Fastest time: 91676.947 us (-2263.831 us / -2.410 %)
+                    Slowest time: 98989.490 us (+5048.711 us / +5.374 %)
+                     Median time: 93505.215 us (1st quartile: 92692.917 us | 3rd quartile: 94504.265 us)

Runs (time per iteration):
Average:                (93940.779 - 139115.242) / 93940.779 * 100          = 48% faster!
Fastest:                (91676.947 - 137458.240) / 91676.947 * 100          = 49% faster!
Worst:                  (98989.490 - 141482.126) / 98989.490 * 100          = 42% faster!
Median:                 (93505.215 - 138581.465) / 93505.215 * 100          = 48% faster!
                                   
-             Average performance: 7.18828 iterations/s
-                Best performance: 7.27494 iterations/s (+0.08665 iterations/s / +1.20546 %)
-               Worst performance: 7.06803 iterations/s (-0.12025 iterations/s / -1.67292 %)
-              Median performance: 7.21597 iterations/s (1st quartile: 7.23309 | 3rd quartile: 7.14162)
+             Average performance: 10.64500 iterations/s
+                Best performance: 10.90787 iterations/s (+0.26286 iterations/s / +2.46936 %)
+               Worst performance: 10.10208 iterations/s (-0.54292 iterations/s / -5.10025 %)
+              Median performance: 10.69459 iterations/s (1st quartile: 10.78831 | 3rd quartile: 10.58153)

Performance (iterations/s):
Average:                (10.64500 - 7.18828) / 10.64500 * 100          = 32% faster!
Fastest:                (10.90787 - 7.27494) / 10.90787 * 100          = 33% faster!
Worst:                  (10.10208 - 7.06803) / 10.10208 * 100          = 30% faster!
Median:                 (10.69459 - 7.21597) / 10.69459 * 100          = 32% faster!

 [ RUN      ] BitmapImageEmptyFixture.ExpandFillTo1080p (10 runs, 5 iterations per run)
-[     DONE ] BitmapImageEmptyFixture.ExpandFillTo1080p (43237.915080 ms)
-[   RUNS   ]        Average time: 4323791.508 us (~63580.763 us)
-                    Fastest time: 4258401.458 us (-65390.050 us / -1.512 %)
-                    Slowest time: 4479067.309 us (+155275.801 us / +3.591 %)
-                     Median time: 4306999.570 us (1st quartile: 4284776.485 us | 3rd quartile: 4331160.437 us)
+[     DONE ] BitmapImageEmptyFixture.ExpandFillTo1080p (4468.471240 ms)
+[   RUNS   ]        Average time: 446847.124 us (~13304.307 us)
+                    Fastest time: 435109.549 us (-11737.575 us / -2.627 %)
+                    Slowest time: 480501.513 us (+33654.389 us / +7.532 %)
+                     Median time: 441647.754 us (1st quartile: 439750.535 us | 3rd quartile: 447650.451 us)

Runs (time per run):
Average:                (446847.124 - 4323791.508) / 446847.124 * 100          = 867% faster!
Fastest:                (435109.549 - 4258401.458) / 435109.549 * 100          = 878% faster!
Worst:                  (480501.513 - 4479067.309) / 480501.513 * 100          = 832% faster!
Median:                 (441647.754 - 4306999.570) / 441647.754 * 100          = 875% faster!
                                   
-             Average performance: 0.23128 runs/s
-                Best performance: 0.23483 runs/s (+0.00355 runs/s / +1.53555 %)
-               Worst performance: 0.22326 runs/s (-0.00802 runs/s / -3.46670 %)
-              Median performance: 0.23218 runs/s (1st quartile: 0.23338 | 3rd quartile: 0.23089)
+             Average performance: 2.23790 runs/s
+                Best performance: 2.29827 runs/s (+0.06037 runs/s / +2.69761 %)
+               Worst performance: 2.08116 runs/s (-0.15674 runs/s / -7.00401 %)
+              Median performance: 2.26425 runs/s (1st quartile: 2.27402 | 3rd quartile: 2.23389)

Performance (runs/s):
Average:                (2.23790 - 0.23128) / 2.23790 * 100          = 89% faster!
Fastest:                (2.29827 - 0.23483) / 2.29827 * 100          = 89% faster!
Worst:                  (2.08116 - 0.22326) / 2.08116 * 100          = 89% faster!
Median:                 (2.26425 - 0.23218) / 2.26425 * 100          = 89% faster!
                                   
-[ITERATIONS]        Average time: 864758.302 us (~12716.153 us)
-                    Fastest time: 851680.292 us (-13078.010 us / -1.512 %)
-                    Slowest time: 895813.462 us (+31055.160 us / +3.591 %)
-                     Median time: 861399.914 us (1st quartile: 856955.297 us | 3rd quartile: 866232.087 us)
+[ITERATIONS]        Average time: 89369.425 us (~2660.861 us)
+                    Fastest time: 87021.910 us (-2347.515 us / -2.627 %)
+                    Slowest time: 96100.303 us (+6730.878 us / +7.532 %)
+                     Median time: 88329.551 us (1st quartile: 87950.107 us | 3rd quartile: 89530.090 us)

Runs (time per iteration):
Average:                (89369.425 - 864758.302) / 89369.425 * 100          = 867% faster!
Fastest:                (87021.910 - 851680.292) / 87021.910 * 100          = 878% faster!
Worst:                  (96100.303 - 895813.462) / 96100.303 * 100          = 832% faster!
Median:                 (88329.551 - 861399.914) / 88329.551 * 100          = 875% faster!
                                   
-             Average performance: 1.15639 iterations/s
-                Best performance: 1.17415 iterations/s (+0.01776 iterations/s / +1.53555 %)
-               Worst performance: 1.11630 iterations/s (-0.04009 iterations/s / -3.46670 %)
-              Median performance: 1.16090 iterations/s (1st quartile: 1.16692 | 3rd quartile: 1.15443)
+             Average performance: 11.18951 iterations/s
+                Best performance: 11.49136 iterations/s (+0.30185 iterations/s / +2.69761 %)
+               Worst performance: 10.40579 iterations/s (-0.78371 iterations/s / -7.00401 %)
+              Median performance: 11.32124 iterations/s (1st quartile: 11.37008 | 3rd quartile: 11.16943)

Performance (iterations/s):
Average:                (11.18951 - 1.15639) / 11.18951 * 100          = 89% faster!
Fastest:                (11.49136 - 1.17415) / 11.49136 * 100          = 89% faster!
Worst:                  (10.40579 - 1.11630) / 10.40579 * 100          = 89% faster!
Median:                 (11.32124 - 1.16090) / 11.32124 * 100          = 89% faster!

 [ RUN      ] BitmapImageContourFixture.ExpandFillTo1080p (10 runs, 5 iterations per run)
-[     DONE ] BitmapImageContourFixture.ExpandFillTo1080p (189621.906654 ms)
-[   RUNS   ]        Average time: 18962190.665 us (~978157.563 us)
-                    Fastest time: 17307715.584 us (-1654475.081 us / -8.725 %)
-                    Slowest time: 20021912.750 us (+1059722.085 us / +5.589 %)
-                     Median time: 18976471.604 us (1st quartile: 18293238.608 us | 3rd quartile: 19921682.275 us)
+[     DONE ] BitmapImageContourFixture.ExpandFillTo1080p (36661.419619 ms)
+[   RUNS   ]        Average time: 3666141.962 us (~39787.390 us)
+                    Fastest time: 3619969.962 us (-46172.000 us / -1.259 %)
+                    Slowest time: 3753313.138 us (+87171.176 us / +2.378 %)
+                     Median time: 3656865.183 us (1st quartile: 3639021.966 us | 3rd quartile: 3673942.287 us)

Runs (time per run):
Average:                (3666141.962 - 18962190.665) / 3666141.962 * 100          = 417% faster!
Fastest:                (3619969.962 - 17307715.584) / 3619969.962 * 100          = 378% faster!
Worst:                  (3753313.138 - 20021912.750) / 3753313.138 * 100          = 433% faster!
Median:                 (3656865.183 - 18976471.604) / 3656865.183 * 100          = 418% faster!
                                   
-             Average performance: 0.05274 runs/s
-                Best performance: 0.05778 runs/s (+0.00504 runs/s / +9.55918 %)
-               Worst performance: 0.04995 runs/s (-0.00279 runs/s / -5.29281 %)
-              Median performance: 0.05270 runs/s (1st quartile: 0.05467 | 3rd quartile: 0.05020)
+             Average performance: 0.27277 runs/s
+                Best performance: 0.27625 runs/s (+0.00348 runs/s / +1.27548 %)
+               Worst performance: 0.26643 runs/s (-0.00634 runs/s / -2.32251 %)
+              Median performance: 0.27346 runs/s (1st quartile: 0.27480 | 3rd quartile: 0.27219)

Performance (runs/s):
Average:                (0.27277 - 0.05274) / 0.27277 * 100          = 80% faster!
Fastest:                (0.27625 - 0.05778) / 0.27625 * 100          = 79% faster!
Worst:                  (0.26643 - 0.04995) / 0.26643 * 100          = 81% faster!
Median:                 (0.27346 - 0.05270) / 0.27346 * 100          = 80% faster!
                                   
-[ITERATIONS]        Average time: 3792438.133 us (~195631.513 us)
-                    Fastest time: 3461543.117 us (-330895.016 us / -8.725 %)
-                    Slowest time: 4004382.550 us (+211944.417 us / +5.589 %)
-                     Median time: 3795294.321 us (1st quartile: 3658647.722 us | 3rd quartile: 3984336.455 us)
+[ITERATIONS]        Average time: 733228.392 us (~7957.478 us)
+                    Fastest time: 723993.992 us (-9234.400 us / -1.259 %)
+                    Slowest time: 750662.628 us (+17434.235 us / +2.378 %)
+                     Median time: 731373.037 us (1st quartile: 727804.393 us | 3rd quartile: 734788.457 us)

Runs (time per iteration):
Average:                (733228.392 - 3792438.133) / 733228.392 * 100          = 417% faster!
Fastest:                (723993.992 - 3461543.117) / 723993.992 * 100          = 378% faster!
Worst:                  (750662.628 - 4004382.550) / 750662.628 * 100          = 433% faster!
Median:                 (731373.037 - 3795294.321) / 731373.037 * 100          = 418% faster!
                                   
-             Average performance: 0.26368 iterations/s
-                Best performance: 0.28889 iterations/s (+0.02521 iterations/s / +9.55918 %)
-               Worst performance: 0.24973 iterations/s (-0.01396 iterations/s / -5.29281 %)
-              Median performance: 0.26348 iterations/s (1st quartile: 0.27333 | 3rd quartile: 0.25098)
+             Average performance: 1.36383 iterations/s
+                Best performance: 1.38123 iterations/s (+0.01740 iterations/s / +1.27548 %)
+               Worst performance: 1.33216 iterations/s (-0.03168 iterations/s / -2.32251 %)
+              Median performance: 1.36729 iterations/s (1st quartile: 1.37400 | 3rd quartile: 1.36094)

Performance (iterations/s):
Average:                (1.36383 - 0.26368) / 1.36383 * 100          = 80% faster!
Fastest:                (1.38123 - 0.28889) / 1.38123 * 100          = 79% faster!
Worst:                  (1.33216 - 0.24973) / 1.33216 * 100          = 81% faster!
Median:                 (1.36729 - 0.26348) / 1.36729 * 100          = 80% faster!
 [==========] Ran 5 benchmarks.
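
The percentages in the listing above are computed as (new - old) / new * 100, so the time-based and throughput-based figures for the same benchmark differ (for example 804% versus 88% for FloodFillTo1080p). A minimal sketch of that arithmetic, with a plain speedup factor added for comparison; the function names are illustrative and not part of the benchmark code:

```cpp
#include <cassert>

// Relative change with the new value as denominator, as in the listing above.
// For times, a negative result means the new build is faster.
double relativeToNewPercent(double newValue, double oldValue)
{
    assert(newValue != 0.0);
    return (newValue - oldValue) / newValue * 100.0;
}

// Speedup factor: old time divided by new time (2.0 means twice as fast).
double speedupFactor(double oldTimeUs, double newTimeUs)
{
    assert(newTimeUs > 0.0);
    return oldTimeUs / newTimeUs;
}
```

With the FloodFillTo1080p averages (4075863.342 us before, 450670.509 us after), the first function gives roughly -804% and the second roughly 9.04, so "804% faster" and "88% faster" describe the same roughly ninefold gain.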

MrStevns and others added 6 commits April 2, 2022 20:39
- Minimizes the number of allocations required to fill by only allocating once the new size has been calculated
- Calculate the new fill rect, and only expand to what's required. This should give a substantial performance improvement to the expand algorithm
- Create replaceImage only after working out the optimal size to search when calling expand
- Replace hashmaps with ptr array.
  - Credit goes to scribblemaniac for the solution to traverse a ptr instead of using hashmaps.
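
The hashmap replacement in the list above can be illustrated with a small sketch. This is not the PR's code: `floodFillMask` and its parameters are hypothetical, and plain 32-bit values stand in for QRgb pixels. The visited set is a flat array indexed by `y * width + x`, so each lookup is a pointer offset rather than a hash computation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Returns a mask holding 1 for every pixel that is 4-connected to the start
// pixel and has exactly the same value. Hypothetical helper for illustration.
std::vector<std::uint8_t> floodFillMask(const std::vector<std::uint32_t>& pixels,
                                        int width, int height,
                                        int startX, int startY)
{
    assert(width > 0 && height > 0);
    assert(pixels.size() == static_cast<std::size_t>(width) * height);
    assert(startX >= 0 && startX < width && startY >= 0 && startY < height);

    std::vector<std::uint8_t> filled(pixels.size(), 0);
    const std::uint32_t* src = pixels.data();
    std::uint8_t* mask = filled.data();  // flat "visited" array, no hashing
    const std::uint32_t target =
        src[static_cast<std::size_t>(startY) * width + startX];

    std::vector<std::pair<int, int>> stack;
    stack.emplace_back(startX, startY);
    while (!stack.empty()) {
        const std::pair<int, int> p = stack.back();
        stack.pop_back();
        const int x = p.first;
        const int y = p.second;
        if (x < 0 || x >= width || y < 0 || y >= height) continue;
        const std::size_t i = static_cast<std::size_t>(y) * width + x;
        if (mask[i] != 0 || src[i] != target) continue;
        mask[i] = 1;
        stack.emplace_back(x + 1, y);
        stack.emplace_back(x - 1, y);
        stack.emplace_back(x, y + 1);
        stack.emplace_back(x, y - 1);
    }
    return filled;
}
```

The trade-off is memory proportional to the searched area instead of to the number of filled pixels, which is one reason to keep that area as small as possible.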
This change will only affect when filling inside a contour, but that also means that even on a huge 5000x5000 canvas, filling inside a 200x200 contour will be much faster.
Taken from Scribblemaniac's branch.
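
The remark above about a 200x200 contour on a 5000x5000 canvas comes down to limiting later passes to the area the fill actually touched. One way such a saving can arise, sketched under the assumption that the fill records a bounding rectangle as it goes (`FillBounds` is a hypothetical helper, not a type from the PR):

```cpp
#include <algorithm>
#include <cassert>

// Inclusive bounding rectangle of the pixels touched so far.
struct FillBounds {
    int left = 0, top = 0, right = -1, bottom = -1;  // right < left means empty

    bool isEmpty() const { return right < left || bottom < top; }

    // Grow the rectangle to cover pixel (x, y); call once per filled pixel.
    void include(int x, int y)
    {
        assert(x >= 0 && y >= 0);
        if (isEmpty()) {
            left = right = x;
            top = bottom = y;
            return;
        }
        left = std::min(left, x);
        right = std::max(right, x);
        top = std::min(top, y);
        bottom = std::max(bottom, y);
    }

    // Number of pixels a follow-up pass over the rectangle has to visit.
    long long area() const
    {
        if (isEmpty()) return 0;
        return static_cast<long long>(right - left + 1) * (bottom - top + 1);
    }
};
```

For a fill spanning (2400, 2400) to (2599, 2599), a follow-up pass visits 40,000 pixels instead of the 25,000,000 on the full canvas.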
MrStevns changed the title from "Optimize floodfill and expand algorithm" to "Optimize floodfilling for expansion" on Apr 9, 2022
scribblemaniac and others added 4 commits April 12, 2022 19:20
This change is useless for release builds as the compiler will
optimize this, but for debug builds this results in a roughly 3x
speed improvement.
This treats everything outside the minimally bounded target image
as a single region (which it is). compareColor only needs to be
called once, and if it's true, then all of the transparent "border"
region can be filled at once. Ideally this would be done with a
rectangular fill with a mask when creating replaceImage, but
unfortunately that information needs to be present in filledPixels
if expandFill is run. Thankfully this is pretty fast anyway.
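
The border handling described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `Rect`, `colorsMatch` and `fillBorderRegion` are hypothetical names, `colorsMatch` is only a stand-in for compareColor (whose real signature is not shown here), and every pixel outside the target image's bounds is assumed to be fully transparent (value 0).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

struct Rect { int left, top, right, bottom; };  // inclusive edges

bool contains(const Rect& r, int x, int y)
{
    return x >= r.left && x <= r.right && y >= r.top && y <= r.bottom;
}

// Stand-in for compareColor: every 8-bit channel must differ by <= tolerance.
bool colorsMatch(std::uint32_t a, std::uint32_t b, int tolerance)
{
    for (int shift = 0; shift < 32; shift += 8) {
        const int ca = static_cast<int>((a >> shift) & 0xFFu);
        const int cb = static_cast<int>((b >> shift) & 0xFFu);
        if (std::abs(ca - cb) > tolerance) return false;
    }
    return true;
}

// Marks every canvas pixel outside `target` as filled when transparent matches
// the colour being replaced. Returns the number of pixels marked.
std::size_t fillBorderRegion(std::vector<std::uint8_t>& filledPixels,
                             const Rect& canvas, const Rect& target,
                             std::uint32_t fillTargetColor, int tolerance)
{
    const int width = canvas.right - canvas.left + 1;
    const int height = canvas.bottom - canvas.top + 1;
    assert(width > 0 && height > 0);
    assert(filledPixels.size() == static_cast<std::size_t>(width) * height);

    const std::uint32_t transparent = 0;
    // A single comparison decides the whole border region.
    if (!colorsMatch(transparent, fillTargetColor, tolerance)) return 0;

    std::size_t marked = 0;
    for (int y = canvas.top; y <= canvas.bottom; ++y) {
        for (int x = canvas.left; x <= canvas.right; ++x) {
            if (contains(target, x, y)) continue;
            const std::size_t i =
                static_cast<std::size_t>(y - canvas.top) * width + (x - canvas.left);
            filledPixels[i] = 1;
            ++marked;
        }
    }
    return marked;
}
```

The loop still writes each border pixel into the mask, matching the note that the information has to be present in filledPixels, but the colour comparison runs once instead of once per pixel.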
Issue 1. The maxBounds shouldn't use the fillBounds, because it has an extra pixel boundary; instead, it should be based on the targetImage bounds.

Issue 2. Qt::GlobalColor and QRgb are not interchangeable; one can't compare Qt::transparent against 0.
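
Issue 2 can be shown without Qt itself. In the sketch below, the `sketch` namespace is a stand-in that mirrors a few Qt definitions as documented: the transparent entry of the GlobalColor enum has the value 19, and QRgb is an unsigned 0xAARRGGBB value. The stand-in names and `isFullyTransparent` are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Mirrors part of Qt's GlobalColor enum; only some entries are listed.
enum GlobalColor { color0 = 0, color1 = 1, black = 2, white = 3, transparent = 19 };

using Rgb = std::uint32_t;  // stand-in for QRgb, laid out as 0xAARRGGBB

// Mirrors qRgba(): packs four 8-bit channels into one value.
inline Rgb makeRgba(int r, int g, int b, int a)
{
    assert(r >= 0 && r <= 255 && g >= 0 && g <= 255);
    assert(b >= 0 && b <= 255 && a >= 0 && a <= 255);
    return (static_cast<Rgb>(a) << 24) | (static_cast<Rgb>(r) << 16) |
           (static_cast<Rgb>(g) << 8) | static_cast<Rgb>(b);
}

// Mirrors qAlpha(): extracts the alpha channel.
inline int alphaOf(Rgb c) { return static_cast<int>(c >> 24); }

}  // namespace sketch

// Checks the pixel value itself instead of comparing it with an enum constant.
bool isFullyTransparent(sketch::Rgb pixel)
{
    return sketch::alphaOf(pixel) == 0;
}
```

Comparing a pixel with the enum constant compares it with the integer 19, which is not the value of a transparent black pixel (0), so the check has to be made on the pixel's channels.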
MrStevns (Member, Author) commented Apr 13, 2022

PR is ready for review.

Here's a new benchmark comparing the latest changes with the previous benchmark.
Overall, not a huge improvement over the previous changes: slightly slower in some cases and a bit faster in others.

--- =============== Benchmark bucket_tool_optimization	Fri Apr  8 19:41:28 2022
+++ =============== Benchmark bucket_tool_optimization	Fri Apr  13 14:00:02 2022
@@ -1,108 +1,108 @@
-================== Master ===========================================
+=============== Benchmark bucket_tool_optimizations2 =================
 [==========] Running 5 benchmarks.
[ RUN      ] BitmapImage1080pFixture.FloodFill (10 runs, 5 iterations per run)
-[     DONE ] BitmapImage1080pFixture.FloodFill (4611.106197 ms)
-[   RUNS   ]        Average time: 461110.620 us (~7446.206 us)
-                    Fastest time: 452641.573 us (-8469.047 us / -1.837 %)
-                    Slowest time: 475489.724 us (+14379.104 us / +3.118 %)
-                     Median time: 458457.295 us (1st quartile: 455701.915 us | 3rd quartile: 465509.400 us)
+[     DONE ] BitmapImage1080pFixture.FloodFill (4880.580094 ms)
+[   RUNS   ]        Average time: 488058.009 us (~27866.416 us)
+                    Fastest time: 453688.253 us (-34369.756 us / -7.042 %)
+                    Slowest time: 532625.082 us (+44567.073 us / +9.132 %)
+                     Median time: 492077.907 us (1st quartile: 458973.391 us | 3rd quartile: 505361.628 us)

Runs (time per run):
Average:                (488058.009 - 461110.620) / 488058.009 * 100 = 5.52% slower
Fastest:                (453688.253 - 452641.573) / 453688.253 * 100 = 0.23% slower
Slowest:                (532625.082 - 475489.724) / 532625.082 * 100 = 10.72% slower
Median:                 (492077.907 - 458457.295) / 492077.907 * 100 = 6.83% slower  


-             Average performance: 2.16868 runs/s
-                Best performance: 2.20925 runs/s (+0.04058 runs/s / +1.87103 %)
-               Worst performance: 2.10309 runs/s (-0.06558 runs/s / -3.02406 %)
-              Median performance: 2.18123 runs/s (1st quartile: 2.19442 | 3rd quartile: 2.14818)
+             Average performance: 2.04894 runs/s
+                Best performance: 2.20416 runs/s (+0.15522 runs/s / +7.57563 %)
+               Worst performance: 1.87749 runs/s (-0.17144 runs/s / -8.36744 %)
+              Median performance: 2.03220 runs/s (1st quartile: 2.17878 | 3rd quartile: 1.97878)

Performance (runs/s):
Average:                (2.04894 - 2.16868) / 2.04894 * 100          = 5.84% slower
Fastest:                (2.20416 - 2.20925) / 2.20416 * 100          = 0.23% slower
Worst:                  (1.87749 - 2.10309) / 1.87749 * 100          = 12.02% slower
Median:                 (2.03220 - 2.18123) / 2.03220 * 100          = 7.33% slower
                                   
-[ITERATIONS]        Average time: 92222.124 us (~1489.241 us)
-                    Fastest time: 90528.315 us (-1693.809 us / -1.837 %)
-                    Slowest time: 95097.945 us (+2875.821 us / +3.118 %)
-                     Median time: 91691.459 us (1st quartile: 91140.383 us | 3rd quartile: 93101.880 us)
+[ITERATIONS]        Average time: 97611.602 us (~5573.283 us)
+                    Fastest time: 90737.651 us (-6873.951 us / -7.042 %)
+                    Slowest time: 106525.016 us (+8913.415 us / +9.132 %)
+                     Median time: 98415.581 us (1st quartile: 91794.678 us | 3rd quartile: 101072.326 us)

Runs (time per iteration):
Average:                (97611.602  - 92222.124) / 97611.602  * 100          = 5.52% slower
Fastest:                (90737.651  - 90528.315) / 90737.651  * 100          = 0.23% slower
Worst:                  (106525.016 - 95097.945) / 106525.016 * 100          = 10.72% slower
Median:                 (98415.581  - 91691.459) / 98415.581  * 100          = 6.83% slower                                   
                                   
-             Average performance: 10.84339 iterations/s
-                Best performance: 11.04627 iterations/s (+0.20288 iterations/s / +1.87103 %)
-               Worst performance: 10.51547 iterations/s (-0.32791 iterations/s / -3.02406 %)
-              Median performance: 10.90614 iterations/s (1st quartile: 10.97208 | 3rd quartile: 10.74092)
+             Average performance: 10.24468 iterations/s
+                Best performance: 11.02078 iterations/s (+0.77610 iterations/s / +7.57563 %)
+               Worst performance: 9.38747  iterations/s (-0.85722 iterations/s / -8.36744 %)
+              Median performance: 10.16099 iterations/s (1st quartile: 10.89388 | 3rd quartile: 9.89391)

Performance (iterations/s):
Average:                (10.24468 - 10.84339) / 10.24468 * 100          = 5.84% slower
Fastest:                (11.02078 - 11.04627) / 11.02078 * 100          = 0.23% slower
Worst:                  (9.38747  - 10.51547) / 9.38747  * 100          = 12.02% slower
Median:                 (10.16099 - 10.90614) / 10.16099 * 100          = 7.33% slower

 [ RUN      ] BitmapImageEmptyFixture.FloodFillTo1080p (10 runs, 5 iterations per run)
-[     DONE ] BitmapImageEmptyFixture.FloodFillTo1080p (4506.705095 ms)
-[   RUNS   ]        Average time: 450670.509 us (~13626.122 us)
-                    Fastest time: 435942.980 us (-14727.530 us / -3.268 %)
-                    Slowest time: 477413.027 us (+26742.518 us / +5.934 %)
-                     Median time: 444731.660 us (1st quartile: 440258.731 us | 3rd quartile: 462771.563 us)
+[     DONE ] BitmapImageEmptyFixture.FloodFillTo1080p (4032.089759 ms)
+[   RUNS   ]        Average time: 403208.976 us (~5620.114 us)
+                    Fastest time: 398307.949 us (-4901.027 us / -1.216 %)
+                    Slowest time: 417317.197 us (+14108.221 us / +3.499 %)
+                     Median time: 402028.231 us (1st quartile: 399732.100 us | 3rd quartile: 403416.067 us)

Runs (time per run):
Average:                (403208.976 - 450670.509) / 403208.976 * 100          = 11.77% faster!
Fastest:                (398307.949 - 435942.980) / 398307.949 * 100          = 9.44% faster!
Worst:                  (417317.197 - 477413.027) / 417317.197 * 100          = 14.40% faster!
Median:                 (402028.231 - 444731.660) / 402028.231 * 100          = 10.62% faster!
                                   
-             Average performance: 2.21892 runs/s
-                Best performance: 2.29388 runs/s (+0.07496 runs/s / +3.37832 %)
-               Worst performance: 2.09462 runs/s (-0.12429 runs/s / -5.60155 %)
-              Median performance: 2.24855 runs/s (1st quartile: 2.27139 | 3rd quartile: 2.16089)
+             Average performance: 2.48010 runs/s
+                Best performance: 2.51062 runs/s (+0.03052 runs/s / +1.23046 %)
+               Worst performance: 2.39626 runs/s (-0.08384 runs/s / -3.38069 %)
+              Median performance: 2.48739 runs/s (1st quartile: 2.50168 | 3rd quartile: 2.47883)

Performance (runs/s):
Average:                (2.48010 - 2.21892) / 2.48010 * 100          = 10.53% faster!
Fastest:                (2.51062 - 2.29388) / 2.51062 * 100          = 8.63% faster!
Worst:                  (2.39626 - 2.09462) / 2.39626 * 100          = 12.58% faster!
Median:                 (2.48739 - 2.24855) / 2.48739 * 100          = 9.60% faster!
                                   
-[ITERATIONS]        Average time: 90134.102 us (~2725.224 us)
-                    Fastest time: 87188.596 us (-2945.506 us / -3.268 %)
-                    Slowest time: 95482.605 us (+5348.503 us / +5.934 %)
-                     Median time: 88946.332 us (1st quartile: 88051.746 us | 3rd quartile: 92554.313 us)
+[ITERATIONS]        Average time: 80641.795 us (~1124.023 us)
+                    Fastest time: 79661.590 us (-980.205 us / -1.216 %)
+                    Slowest time: 83463.439 us (+2821.644 us / +3.499 %)
+                     Median time: 80405.646 us (1st quartile: 79946.420 us | 3rd quartile: 80683.213 us)

Runs (time per iteration):
Average:                (80641.795 - 90134.102) / 80641.795 * 100          = 11.77% faster!
Fastest:                (79661.590 - 87188.596) / 79661.590 * 100          = 9.44% faster!
Worst:                  (83463.439 - 95482.605) / 83463.439 * 100          = 14.40% faster!
Median:                 (80405.646 - 88946.332) / 80405.646 * 100          = 10.62% faster!
                                   
-             Average performance: 11.09458 iterations/s
-                Best performance: 11.46939 iterations/s (+0.37481 iterations/s / +3.37832 %)
-               Worst performance: 10.47311 iterations/s (-0.62147 iterations/s / -5.60155 %)
-              Median performance: 11.24273 iterations/s (1st quartile: 11.35696 | 3rd quartile: 10.80447)
+             Average performance: 12.40052 iterations/s
+                Best performance: 12.55310 iterations/s (+0.15258 iterations/s / +1.23046 %)
+               Worst performance: 11.98129 iterations/s (-0.41922 iterations/s / -3.38069 %)
+              Median performance: 12.43694 iterations/s (1st quartile: 12.50838 | 3rd quartile: 12.39415)

Performance (iterations/s):
Average:                (12.40052 - 11.09458) / 12.40052 * 100          = 10.53% faster!
Fastest:                (12.55310 - 11.46939) / 12.55310 * 100          = 8.63% faster!
Worst:                  (11.98129 - 10.47311) / 11.98129 * 100          = 12.58% faster!
Median:                 (12.43694 - 11.24273) / 12.43694 * 100          = 9.60% faster!

 [ RUN      ] BitmapImage1080pFixture.ExpandFill (10 runs, 5 iterations per run)
-[     DONE ] BitmapImage1080pFixture.ExpandFill (4697.038929 ms)
-[   RUNS   ]        Average time: 469703.893 us (~10115.183 us)
-                    Fastest time: 458384.736 us (-11319.157 us / -2.410 %)
-                    Slowest time: 494947.449 us (+25243.556 us / +5.374 %)
-                     Median time: 467526.077 us (1st quartile: 463464.587 us | 3rd quartile: 472521.326 us)
+[     DONE ] BitmapImage1080pFixture.ExpandFill (4697.038929 ms)
+[   RUNS   ]        Average time: 465087.265 us (~19217.560 us)
+                    Fastest time: 452785.872 us (-12301.394 us / -2.645 %)
+                    Slowest time: 517690.645 us (+52603.380 us / +11.310 %)
+                     Median time: 457856.551 us (1st quartile: 455050.232 us | 3rd quartile: 466857.769 us)

Runs (time per run):
Average:                (465087.265 - 469703.893) / 465087.265 * 100          = 0.99% faster!
Fastest:                (452785.872 - 458384.736) / 452785.872 * 100          = 1.23% faster!
Worst:                  (517690.645 - 494947.449) / 517690.645 * 100          = 4.39% slower!
Median:                 (457856.551 - 467526.077) / 457856.551 * 100          = 2.11% faster!
                                   
-             Average performance: 2.12900 runs/s
-                Best performance: 2.18157 runs/s (+0.05257 runs/s / +2.46936 %)
-               Worst performance: 2.02042 runs/s (-0.10858 runs/s / -5.10025 %)
-              Median performance: 2.13892 runs/s (1st quartile: 2.15766 | 3rd quartile: 2.11631)
+             Average performance: 2.15013 runs/s
+                Best performance: 2.20855 runs/s (+0.05842 runs/s / +2.71682 %)
+               Worst performance: 1.93166 runs/s (-0.21848 runs/s / -10.16116 %)
+              Median performance: 2.18409 runs/s (1st quartile: 2.19756 | 3rd quartile: 2.14198)

Performance (runs/s):
Average:                (2.15013 - 2.12900) / 2.15013 * 100          = 0.98% faster!
Fastest:                (2.20855 - 2.18157) / 2.20855 * 100          = 1.22% faster!
Worst:                  (1.93166 - 2.02042) / 1.93166 * 100          = 4.59% slower!
Median:                 (2.18409 - 2.13892) / 2.18409 * 100          = 2.06% faster!
                                   
-[ITERATIONS]        Average time: 93940.779 us (~2023.037 us)
-                    Fastest time: 91676.947 us (-2263.831 us / -2.410 %)
-                    Slowest time: 98989.490 us (+5048.711 us / +5.374 %)
-                     Median time: 93505.215 us (1st quartile: 92692.917 us | 3rd quartile: 94504.265 us)
+[ITERATIONS]        Average time: 93017.453  us (~3843.512 us)
+                    Fastest time: 90557.174  us (-2460.279 us / -2.645 %)
+                    Slowest time: 103538.129 us (+10520.676 us / +11.310 %)
+                     Median time: 91571.310  us (1st quartile: 91010.046 us | 3rd quartile: 93371.554 us)

Runs (time per iteration):
Average:                (93017.453  - 93940.779) / 93017.453  * 100          = 0.99% faster!
Fastest:                (90557.174  - 91676.947) / 90557.174  * 100          = 1.23% faster!
Worst:                  (103538.129 - 98989.490) / 103538.129 * 100          = 4.39% slower!
Median:                 (91571.310  - 93505.215) / 91571.310  * 100          = 2.11% faster!
                                   
-             Average performance: 10.64500 iterations/s
-                Best performance: 10.90787 iterations/s (+0.26286 iterations/s / +2.46936 %)
-               Worst performance: 10.10208 iterations/s (-0.54292 iterations/s / -5.10025 %)
-              Median performance: 10.69459 iterations/s (1st quartile: 10.78831 | 3rd quartile: 10.58153)
+             Average performance: 10.75067 iterations/s
+                Best performance: 11.04275 iterations/s (+0.29208 iterations/s / +2.71682 %)
+               Worst performance: 9.65828  iterations/s (-1.09239 iterations/s / -10.16116 %)
+              Median performance: 10.92045 iterations/s (1st quartile: 10.98780 | 3rd quartile: 10.70990)

Performance (iterations/s):
Average:                (10.75067 - 10.64500) / 10.75067 * 100          = 0.98% faster!
Fastest:                (11.04275 - 10.90787) / 11.04275 * 100          = 1.22% faster!
Worst:                  (9.65828  - 10.10208) / 9.65828  * 100          = 4.59% slower!
Median:                 (10.92045 - 10.69459) / 10.92045 * 100          = 2.06% faster!

 [ RUN      ] BitmapImageEmptyFixture.ExpandFillTo1080p (10 runs, 5 iterations per run)
-[     DONE ] BitmapImageEmptyFixture.ExpandFillTo1080p (4468.471240 ms)
-[   RUNS   ]        Average time: 446847.124 us (~13304.307 us)
-                    Fastest time: 435109.549 us (-11737.575 us / -2.627 %)
-                    Slowest time: 480501.513 us (+33654.389 us / +7.532 %)
-                     Median time: 441647.754 us (1st quartile: 439750.535 us | 3rd quartile: 447650.451 us)
+[     DONE ] BitmapImageEmptyFixture.ExpandFillTo1080p (4468.471240 ms)
+[   RUNS   ]        Average time: 430470.024 us (~51435.871 us)
+                    Fastest time: 398677.488 us (-31792.536 us / -7.386 %)
+                    Slowest time: 562161.587 us (+131691.563 us / +30.593 %)
+                     Median time: 408095.926 us (1st quartile: 399905.314 us | 3rd quartile: 447659.435 us)

Runs (time per run):
Average:                (430470.024 - 446847.124) / 430470.024 * 100          = 3.80% faster!
Fastest:                (398677.488 - 435109.549) / 398677.488 * 100          = 9.13% faster!
Worst:                  (562161.587 - 480501.513) / 562161.587 * 100          = 14.52% slower!
Median:                 (408095.926 - 441647.754) / 408095.926 * 100          = 8.22% faster!
                                   
-             Average performance: 2.23790 runs/s
-                Best performance: 2.29827 runs/s (+0.06037 runs/s / +2.69761 %)
-               Worst performance: 2.08116 runs/s (-0.15674 runs/s / -7.00401 %)
-              Median performance: 2.26425 runs/s (1st quartile: 2.27402 | 3rd quartile: 2.23389)
+             Average performance: 2.32304 runs/s
+                Best performance: 2.50829 runs/s (+0.18525 runs/s / +7.97450 %)
+               Worst performance: 1.77885 runs/s (-0.54419 runs/s / -23.42593 %)
+              Median performance: 2.45040 runs/s (1st quartile: 2.50059 | 3rd quartile: 2.23384)

Performance (runs/s):
Average:                (2.32304 - 2.23790) / 2.32304 * 100          = 3.66% faster!
Fastest:                (2.50829 - 2.29827) / 2.50829 * 100          = 8.37% faster!
Worst:                  (1.77885 - 2.08116) / 1.77885 * 100          = 16.99% slower!
Median:                 (2.45040 - 2.26425) / 2.45040 * 100          = 7.59% faster!
                                   
-[ITERATIONS]        Average time: 89369.425 us (~2660.861 us)
-                    Fastest time: 87021.910 us (-2347.515 us / -2.627 %)
-                    Slowest time: 96100.303 us (+6730.878 us / +7.532 %)
-                     Median time: 88329.551 us (1st quartile: 87950.107 us | 3rd quartile: 89530.090 us)
+[ITERATIONS]        Average time: 86094.005  us (~10287.174 us)
+                    Fastest time: 79735.498  us (-6358.507 us / -7.386 %)
+                    Slowest time: 112432.317 us (+26338.313 us / +30.593 %)
+                     Median time: 81619.185  us (1st quartile: 79981.063 us | 3rd quartile: 89531.887 us)

Runs (time per iteration):
Average:                (86094.005  - 89369.425) / 86094.005  * 100          = 3.80% faster!
Fastest:                (79735.498  - 87021.910) / 79735.498  * 100          = 9.13% faster!
Worst:                  (112432.317 - 96100.303) / 112432.317 * 100          = 14.52% slower!
Median:                 (81619.185  - 88329.551) / 81619.185  * 100          = 8.22% faster!
                                   
-             Average performance: 11.18951 iterations/s
-                Best performance: 11.49136 iterations/s (+0.30185 iterations/s / +2.69761 %)
-               Worst performance: 10.40579 iterations/s (-0.78371 iterations/s / -7.00401 %)
-              Median performance: 11.32124 iterations/s (1st quartile: 11.37008 | 3rd quartile: 11.16943)
+             Average performance: 11.61521 iterations/s
+                Best performance: 12.54147 iterations/s (+0.92625 iterations/s / +7.97450 %)
+               Worst performance: 8.89424  iterations/s (-2.72097 iterations/s / -23.42593 %)
+              Median performance: 12.25202 iterations/s (1st quartile: 12.50296 | 3rd quartile: 11.16921)

Performance (iterations/s):
Average:                (11.61521 - 11.18951) / 11.61521 * 100          = 3.66% faster!
Fastest:                (12.54147 - 11.49136) / 12.54147 * 100          = 8.37% faster!
Worst:                  (8.89424  - 10.40579) / 8.89424  * 100          = 16.99% slower!
Median:                 (12.25202 - 11.32124) / 12.25202 * 100          = 7.59% faster!

 [ RUN      ] BitmapImageContourFixture.ExpandFillTo1080p (10 runs, 5 iterations per run)
-[     DONE ] BitmapImageContourFixture.ExpandFillTo1080p (36661.419619 ms)
-[   RUNS   ]        Average time: 3666141.962 us (~39787.390 us)
-                    Fastest time: 3619969.962 us (-46172.000 us / -1.259 %)
-                    Slowest time: 3753313.138 us (+87171.176 us / +2.378 %)
-                     Median time: 3656865.183 us (1st quartile: 3639021.966 us | 3rd quartile: 3673942.287 us)
+[     DONE ] BitmapImageContourFixture.ExpandFillTo1080p (36661.419619 ms)
+[   RUNS   ]        Average time: 3704239.943 us (~79696.742 us)
+                    Fastest time: 3605367.074 us (-98872.869 us / -2.669 %)
+                    Slowest time: 3837315.506 us (+133075.563 us / +3.593 %)
+                     Median time: 3674391.197 us (1st quartile: 3661619.050 us | 3rd quartile: 3777444.493 us)

Runs (time per run):
Average:                (3704239.943 - 3666141.962) / 3704239.943 * 100          = 1.02% slower!
Fastest:                (3605367.074 - 3619969.962) / 3605367.074 * 100          = 0.40% faster!
Worst:                  (3837315.506 - 3753313.138) / 3837315.506 * 100          = 2.18% slower!
Median:                 (3674391.197 - 3656865.183) / 3674391.197 * 100          = 0.47% slower!
                                   
-             Average performance: 0.27277 runs/s
-                Best performance: 0.27625 runs/s (+0.00348 runs/s / +1.27548 %)
-               Worst performance: 0.26643 runs/s (-0.00634 runs/s / -2.32251 %)
-              Median performance: 0.27346 runs/s (1st quartile: 0.27480 | 3rd quartile: 0.27219)
+             Average performance: 0.26996 runs/s
+                Best performance: 0.27736 runs/s (+0.00740 runs/s / +2.74238 %)
+               Worst performance: 0.26060 runs/s (-0.00936 runs/s / -3.46793 %)
+              Median performance: 0.27215 runs/s (1st quartile: 0.27310 | 3rd quartile: 0.26473)

Performance (runs/s):
Average:                (0.26996 - 0.27277) / 0.26996 * 100          = 1.04% slower!
Fastest:                (0.27736 - 0.27625) / 0.27736 * 100          = 0.40% faster!
Worst:                  (0.26060 - 0.26643) / 0.26060 * 100          = 2.23% slower!
Median:                 (0.27215 - 0.27346) / 0.27215 * 100          = 0.48% slower!
                                   
-[ITERATIONS]        Average time: 733228.392 us (~7957.478 us)
-                    Fastest time: 723993.992 us (-9234.400 us / -1.259 %)
-                    Slowest time: 750662.628 us (+17434.235 us / +2.378 %)
-                     Median time: 731373.037 us (1st quartile: 727804.393 us | 3rd quartile: 734788.457 us)
+[ITERATIONS]        Average time: 740847.989 us (~15939.348 us)
+                    Fastest time: 721073.415 us (-19774.574 us / -2.669 %)
+                    Slowest time: 767463.101 us (+26615.113 us / +3.593 %)
+                     Median time: 734878.239 us (1st quartile: 732323.810 us | 3rd quartile: 755488.899 us)

Runs (time per iteration):
Average:                (740847.989 - 733228.392) / 740847.989 * 100          = 1.02% slower!
Fastest:                (721073.415 - 723993.992) / 721073.415 * 100          = 0.40% faster!
Worst:                  (767463.101 - 750662.628) / 767463.101 * 100          = 2.18% slower!
Median:                 (734878.239 - 731373.037) / 734878.239 * 100          = 0.47% slower!
                                   
-             Average performance: 1.36383 iterations/s
-                Best performance: 1.38123 iterations/s (+0.01740 iterations/s / +1.27548 %)
-               Worst performance: 1.33216 iterations/s (-0.03168 iterations/s / -2.32251 %)
-              Median performance: 1.36729 iterations/s (1st quartile: 1.37400 | 3rd quartile: 1.36094)
+             Average performance: 1.34980 iterations/s
+                Best performance: 1.38682 iterations/s (+0.03702 iterations/s / +2.74238 %)
+               Worst performance: 1.30299 iterations/s (-0.04681 iterations/s / -3.46793 %)
+              Median performance: 1.36077 iterations/s (1st quartile: 1.36552 | 3rd quartile: 1.32365)

Performance (iterations/s):
Average:                (1.34980 - 1.36383) / 1.34980 * 100          = 1.03% slower!
Fastest:                (1.38682 - 1.38123) / 1.38682 * 100          = 0.40% faster!
Worst:                  (1.30299 - 1.33216) / 1.30299 * 100          = 2.23% slower!
Median:                 (1.36077 - 1.36729) / 1.36077 * 100          = 0.47% slower!
 [==========] Ran 5 benchmarks.
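
The percentage summaries above all follow the same pattern, `(new - old) / new * 100`, with the sign deciding whether a result counts as faster or slower. A minimal sketch of that calculation is shown here as a way to double-check the figures. The function names `relative_change` and `describe` are hypothetical and not part of the benchmark branch, and dividing by the new value simply mirrors the convention used in the summaries.

```python
def relative_change(new: float, old: float) -> float:
    """Percent change relative to the new value, as in the summaries above."""
    return (new - old) / new * 100


def describe(new: float, old: float, higher_is_better: bool) -> str:
    """Label a change as faster or slower.

    For timings (us) a lower value is better, so a negative change is an
    improvement. For throughput (runs/s, iterations/s) a higher value is
    better, so a positive change is an improvement.
    """
    change = relative_change(new, old)
    improved = change > 0 if higher_is_better else change < 0
    return f"{abs(change):.2f}% {'faster' if improved else 'slower'}"


# Median time per iteration, BitmapImageEmptyFixture.FloodFillTo1080p
print(describe(80405.646, 88946.332, higher_is_better=False))

# Worst time per run, BitmapImage1080pFixture.ExpandFill
print(describe(517690.645, 494947.449, higher_is_better=False))

# Average iterations/s, BitmapImageEmptyFixture.FloodFillTo1080p
print(describe(12.40052, 11.09458, higher_is_better=True))
```

Note that this sketch rounds to two decimals, so the last digit can differ from a hand-computed figure that was truncated instead.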

@chchwy chchwy self-requested a review April 29, 2022 03:33
@chchwy
Member

chchwy commented Apr 29, 2022

I have tested this PR, and the speed improvement is significant. The time goes down from a few seconds to a blink when I use the bucket tool with a 4000x3000 canvas. That's very good.

Are there other changes coming, @MrStevns @scribblemaniac? Otherwise I will merge it.

Do you plan to push the benchmark code into the official repo as well? I think it could be useful in other scenarios.

@MrStevns
Member Author

There are no more planned changes, so it's ready to be merged.

@chchwy
Member

chchwy commented Apr 29, 2022

Thanks @MrStevns @scribblemaniac

@chchwy chchwy merged commit be2cdb3 into pencil2d:master Apr 29, 2022
@chchwy chchwy added this to the 0.7.0 milestone May 20, 2022
@MrStevns MrStevns modified the milestones: 0.7.0, v0.6.7 Jul 5, 2023