Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Need an unzip() operator. #323

Closed
icefoxen opened this issue Apr 24, 2017 · 7 comments
Closed

Need an unzip() operator. #323

icefoxen opened this issue Apr 24, 2017 · 7 comments

Comments

@icefoxen
Copy link

Essentially, I want to collect N vec's together, map a tuple over them that returns N values, and divide that back up into N vec's, ideally without allocating anything new. Currently, it works quite well for N=1:

v1.par_iter()
    .map(|i1| ...do stuff and return n1)
    .collect_into(results1);

What I want to do is something like this:

v1.par_iter()
    .zip(v2)
    .map(|(i1, i2)| ...do stuff and return (n1, n2))
    .collect_into_many(results1, results2);

If there were an unzip() I could do something like this maybe:

let (r1, r2) = v1.par_iter()
    .zip(v2)
    .map(|(i1, i2)| ...do stuff and return (n1, n2))
    .unzip();
r1.collect_into(results1);
r2.collect_into(results2);

and it would be a little awkward but would generalize to any N (for my use case, N ~ 6 or 8). But if there's a way to do this with the current API I haven't yet figured it out.

@cuviper
Copy link
Member

cuviper commented Apr 24, 2017

I actually have been thinking about this. The signature of Iterator::unzip is:

fn unzip<A, B, FromA, FromB>(self) -> (FromA, FromB)
  where FromA: Default + Extend<A>,
        FromB: Default + Extend<B>,
        Self: Iterator<Item=(A, B)>

To that end, I've started implementing a ParallelExtend trait for us to mirror this. But while that trait is not too hard, employing it for a parallel unzip is more challenging, to the point that I haven't decided if this is worth it. Sequentially, they basically just need to push one item into each collection at a time, but in parallel it would be nice to be smarter than that.

However, note that unzip does not return distinct iterators as you've seemed to outline, but rather returns the resulting collections directly. There would have to be buffering of some sort to support splitting the iterators even in the sequential case, and I shudder to think what that would look like in the parallel case.

I think as a first step, we could implement an unzip_into for vectors only, just like collect_into, and that will probably satisfy a large percentage of folks who would want this.

icefoxen referenced this issue in ggez/ggez-goodies Apr 24, 2017
@cuviper
Copy link
Member

cuviper commented Apr 24, 2017

You could manually do this with current APIs if you just zip the result vectors in there too, writing with for_each. Something like:

v1.par_iter().zip(&v2)
  .zip(r1.par_iter_mut().zip(&mut r2)
  .for_each(|((a, b), (x, y))| {
      *x = a + b;
      *y = a * b;
  });

You can generalize this to as large of N as you like. But this does mean you have to initialize the values in the result vectors, or deal with the unsafety of using them uninitialized.

@cuviper
Copy link
Member

cuviper commented Apr 27, 2017

See #326 for unzip and unzip_into. (for your N=2 case, at least)

@cuviper
Copy link
Member

cuviper commented May 1, 2017

I was thinking whether we could support a truly general version of unzip using the same mechanism I've used in #326 to gather separate consumers. Something like:

fn unzip_with<F1, R1, F2, R2>(self, left: F1, right: F2) -> (R1, R2)
    where F1: Fn(ParallelIterator) -> R1,
          F2: Fn(ParallelIterator) -> R2;

But you can't really do that, because Fn(ParallelIterator) is trying to take an unsized trait by value, and we can't actually name the real iterator types. If closures could themselves be generic, it might be OK, but otherwise you end up having a manual callback like our existing ProducerCallback. Not really ergonomic for a public API.

Anyway, the standard Iterator doesn't have anything along these lines either, so I'm not too worried about it -- just musing. :)

@nikomatsakis
Copy link
Member

Given the existing unzip support, should we close this? Or you think there is still more to be done here?

@cuviper
Copy link
Member

cuviper commented Jun 27, 2017

IMO we're done. @icefoxen are you satisfied?

@icefoxen
Copy link
Author

icefoxen commented Jul 2, 2017

Yep, totally. Looks good, thanks!

@icefoxen icefoxen closed this as completed Jul 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants