Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Map.traverse performance is non-linear. #1633

Closed
udalrich opened this issue Oct 19, 2016 · 2 comments · Fixed by #1671
Closed

Map.traverse performance is non-linear. #1633

udalrich opened this issue Oct 19, 2016 · 2 comments · Fixed by #1671

Comments

@udalrich
Copy link

It appears that Map.traverse calls append instead of prepend and reverse, resulting in quadratic performance.

Timing code:

public void timing() {
    int N = 100;
    while (true) {
        val timer = new ElapsedTime();
        val map = HashMap.ofEntries(List.range(0, N).
                                        map(n -> Tuple.of(n, n*n)));

        logger.info("Created list with {} entries in {}", N, timer);
        logScaledValues(N, timer);

        val result = traverse((a,b) -> a + b);
        logger.info("  Traverse after {}", timer);
        logScaledValues(N, timer);

        val sum = result.sum();
        logger.info("  Sum is {} after {}", sum, timer);
        logScaledValues(N, timer);

        N = N * 4;
    }
}
private void logScaledValues(int n, ElapsedTime timer) {
    double delta = timer.getElapsed();
    logger.info("\t{}\t{}\t{}", delta, delta / n, delta / (n*n));
}

ElapsedTime just stores System.nanoTime and then returns the elapsed time since the constructor was called.

I get these timing results:

Created list with 100 entries in 31.0 ms
   3.6121375E7  361213.75   3612.1375
  Traverse after 42.2 ms
   4.250858E7   425085.8    4250.858
  Sum is 333300 after 43.1 ms
   4.336273E7   433627.3    4336.273
Created list with 400 entries in 1.6 ms
   1906584.0    4766.46 11.91615
  Traverse after 10.3 ms
   1.0587551E7  26468.8775  66.17219375
  Sum is 21333200 after 11.1 ms
   1.1184696E7  27961.74    69.90435
Created list with 1600 entries in 4.0 ms
   4238637.0    2649.148125 1.655717578125
  Traverse after 48.8 ms
   4.9086191E7  30678.869375    19.174293359375
  Sum is 1365332800 after 49.4 ms
   4.9560259E7  30975.161875    19.359476171875
Created list with 6400 entries in 5.4 ms
   5773190.0    902.0609375 0.140947021484375
  Traverse after 463.6 ms
   4.64015186E8 72502.3728125   11.328495751953126
  Sum is 87381331200 after 464.5 ms
   4.6466207E8  72603.4484375   11.344288818359376
Created list with 25600 entries in 13.0 ms
   1.3337646E7  521.001796875   0.020351632690429687
  Traverse after 4.9 seconds
   4.870842472E9    190267.2840625  7.432315783691406
  Sum is 5592405324800 after 4.9 seconds
   4.871352589E9    190287.2105078125   7.433094160461426
Created list with 102400 entries in 31.3 ms
   3.1552248E7  308.127421875   0.01664301357438079
  Traverse after 66.9 seconds
   6.6876570864E10  653091.51234375 35.275701328716444
  Sum is 22073268555776 after 66.9 seconds
   6.6877863187E10  653104.1326855469   35.27638299644521
Created list with 409600 entries in 71.6 ms
   7.1933567E7  175.61906005859376  0.2679734192788601

The final traverse of 409600 entries is still running after 15 minutes.

@danieldietrich
Copy link
Contributor

danieldietrich commented Oct 19, 2016

Thank you, that's right! We will fix it.
(Note to myself: Maybe Vector or Array will be faster because they need not to be reversed)

@danieldietrich danieldietrich modified the milestones: 2.0.5, 2.1.0 Nov 2, 2016
@danieldietrich
Copy link
Contributor

@mduesterhoeft Please take a look at Map.traverse() and Multimap.traverse().

The List.append() calls are O(n) for each append. The foldLeft calls append n times => we have an overall O(n^2).

There are two possibilities to solve this issue:

  1. Use another Seq implementation than List. Vector is the way to go!
  2. Eliminate append() and use prepend() instead. Then foldLeft(...).reverse() all the things.

List has the least memory overhead but is iterated again when calling reverse(). Therefore I think we should implement 1) instead and use Vector.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants