Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Halton Sequence and Quasi Random Numbers #182

Closed
Tracked by #1432
glfmn opened this issue Oct 24, 2017 · 8 comments
Closed
Tracked by #1432

Halton Sequence and Quasi Random Numbers #182

glfmn opened this issue Oct 24, 2017 · 8 comments
Labels
E-easy Participation: easy job

Comments

@glfmn
Copy link

glfmn commented Oct 24, 2017

For an optimization library I'm writing I need a fast sampling algorithm so I started by implemented a fast Halton sequence generator. The rand crate provides a really nice API for randomness and I don't want to go through the trouble of re-implementing a similar API for my sampling algorithms so I implemented Rng trait for my generator.

The halton sequence is a range of floating point values from 0 to 1 and is defined by its base, which should be a prime number, and will produce regular values with a base of 2 like:

1/2,  1/4,  3/4,  1/8,  5/8,  3/8,  7/8,  1/16,  9/16

and with a base of 3 like:

 1/3,  2/3,  1/9,  4/9,  7/9,  2/9,  5/9,  8/9,  1/27

To fill the domain in a pretty even way. You can read more on Wikipedia.

Is there any space in the rand crate for quasi-random number generators like the halton sequence? I'd love to open a PR and add my implementation if it's appropriate.

@dhardy
Copy link
Member

dhardy commented Oct 24, 2017

Sorry but I don't think it belongs in rand itself. It's quite easy to publish a small crate yourself.

Use of Rng gets you conversion from integers to floating point numbers if you're producing integers internally, but it doesn't sound like that's what you're doing, in which case why not just implement Iterator<f64>?

@glfmn
Copy link
Author

glfmn commented Oct 24, 2017

That's exactly what I started with actually. It wasn't hard to simply replace the default definitions of the Rng::next_u64(), Rng::next_u32(), Rng::next_f64(), Rng::next_f32() methods with ones appropriate for my generator.

However, the rand crate provides the to derive Rand for types, especially arbitrary structs of numeric data! The library I'm working on is essentially a monte-carlo optimization method, but we normally use a quasi-random sequence because it tends to converge faster that way. It's better to use structs with fields to store our states because they can be pretty complicated and it's easier to understand a field name than an index in a fixed-sized array, and being able to implement methods on our states is useful too. I'll stick with using rand for now, but I might decide in the future to make a crate focused on monte-carlo and quasi-monte-carlo specifically.

To play off the example from the Readme:

use rand::Rng;
use rand::distributions::{IndependentSample, Range};
use std::f64::consts::PI;

// Define a function to take two random number generators from which to sample
fn monte_carlo_pi<R: Rng>(points: usize, mut s1: R, mut s2: R) -> f64 {

    let between = Range::new(-1f64, 1.);
    let mut in_circle = 0;

    for _ in 0..points {
        let a = between.ind_sample(&mut s1);
        let b = between.ind_sample(&mut s2);
        if a*a + b*b <= 1. {
            in_circle += 1;
        }
    }

    4. * (in_circle as f64) / (points as f64)
}

// Create two halton sequence generators with coprime bases 17 and 19
let h_est = monte_carlo_pi(10_000, Halton::new(1,17), Halton::new(1,19));

// Estimate using standard thread_rng
let r_est = monte_carlo_pi(1_000_000, rand::thread_rng(), rand::thread_rng());

// The error is less than the standard number generator with 100 times less points
assert!((h_est-PI).abs() < (r_est-PI).abs());

@dhardy
Copy link
Member

dhardy commented Mar 4, 2018

I suppose we could add this as a "mock RNG" (i.e. add a HaltonSeq struct implementing RngCore in the new mock module). (We could rename StepRngStepSeq for consistency.)

Anyone want to make a PR? Doesn't look too hard.

@aldanor
Copy link

aldanor commented Aug 29, 2019

Just to chime in, I’ve been recently working on a qrng crate and I just got a Gray-code based Sobol sequence generator working which is a few times faster than both the GSL one and the SmallRng gen (just so as to establish some baseline). Sobol sequence is probably one of the best options currently available for high dim data, especially if scrambled. I’m planning to add a few scrambling options, SIMD support and perhaps a few weaker options like Halton as well. If it would be of any interest, I could post it back here once it’s in any usable state.

I’m not sure it fits the rand interface though since for high dimensional samples you really want to do it all at once, without allocations, ideally while exploiting simd etc - which means writing to the same buffer over and over. But that we can figure out later.

@vks
Copy link
Collaborator

vks commented Aug 29, 2019 via email

@aldanor
Copy link

aldanor commented Aug 29, 2019

I think the fill interface satisfies your requirements, doesn't it?

You mean Rng::fill()? Not really, that's for filling integer-type slices when you can already generate an integer scalar.

In this case, we're talking about slice-type samples.

@dhardy
Copy link
Member

dhardy commented Aug 31, 2019

Are you are talking about some type of correlation within matrix / higher-dimensional tensors or just efficiently filling the data space? I assume the latter, though the same API may work for both.

For generating a fixed-size object like [f32; 4] or (R, S, T) in one go, Rng::gen can already do this (via the Distribution trait and impls and RngCore source). However, if you want to generate such values directly without going through an interface like RngCore::next_u32, then it may be best to remove the generator interface (RngCore) and use your own traits.

/// The generator state
pub struct GenState { ... }

/// Constructing objects
pub trait GenNew<T> {
    pub fn gen_new(state: &mut GenState) -> T;
}

/// Filling objects
pub trait GenFill<T> {
    pub fn gen_fill(obj: &mut T, state: &mut GenState);
}

At this point you basically have your own lib which shares no code with rand. The only point of using the rand lib is if you wish to implement RngCore and leave all the stuff about converting to output types to us.

Note that we only partially solved the question of how to efficiently generate SIMD values within this project. Block-rngs like ChaCha may use SIMD internally but there is no way to directly output large values from the RNG (we did discuss adding next_u128, but that fell out of favour — extra complexity in what should be a minimal trait, and not much benefit in whatever benches we ran at the time). We have support for generating SIMD types (both via Standard and Uniform distributions), but this uses fill_bytes to bridge from the generator to the SIMD type.

@MichaelOwenDyer
Copy link
Member

Closing this issue, hopefully there are no objections after all this time 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
E-easy Participation: easy job
Projects
None yet
Development

No branches or pull requests

5 participants