Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Floats are problematic #39

Open
4 tasks
workingjubilee opened this issue Oct 14, 2020 · 8 comments
Open
4 tasks

Floats are problematic #39

workingjubilee opened this issue Oct 14, 2020 · 8 comments
Labels
A-floating-point Area: Floating point numbers and arithmetic C-tracking-issue Ongoing issue with checkboxes for partial progress and such ISA-ARMv7 ARM's 7th generation of CPUs ISA-Neon Arm's Neon SIMD unit O-MIPS RISC CPUs from MIPS Technologies O-RISC-V O-Wasm WebAssembly's virtual architecture O-x86 Intel and AMD's x86 processors

Comments

@workingjubilee
Copy link
Member

It is well known that floating point code often merits extra caution due to the details of rounding math, but the fact that floating point numbers are more interesting to handle applies not just to software interacting with floating point numbers and the floating point environment (all of which can affect our API significantly), but also applies to hardware implementations of both vector and scalar floating point units, so to both our SIMD code and our scalar fallbacks. These aren't bugs per se because it's not a bug, it's a feature, but they are features requiring enhanced attention to detail. This is related to rust-lang/unsafe-code-guidelines#237 and rust-lang/rust#73328 as well.

Most domains involving SIMD code use floats extensively, so this library has a "front-row seat" to problems with floating point operations. Thus, in order for SIMD code to be reasonably portable, we should try to discover where these... unique... implementations lie and decide how to work around them, or at least inform our users of the Fun Facts we learn.

Arch-specific concerns remain for:

  • 32-bit x86
  • 32-bit ARM
  • MIPS
  • Wasm

32-bit x86

The x87 80-bit "long double" float registers can do interesting things to NaN, and in general if a floating point value ends up in them and experiences an operation, this can introduce extra precision that may lead to incorrect mathematical conclusions later on. Further, Rust no longer supports MMX code at all because their interaction with the x87 registers were just entirely too much trouble. Altogether, this is probably why the x86-64 System V ABI specifies usage of the XMM registers for handling floating point values, which do not have these problems.

32-bit ARM

There are so many different floating point implementations on 32-bit ARM that armclang includes a compiler flag for FPU architecture and another one for float ABI.

  • VFP AKA Vector Floating Point: VFP units that appear on ARMv7 seem to default to flushing denormals to zero unless the appropriate control register has the "FZ bit" set appropriately.
  • Neon AKA Advanced SIMD: Vector registers flush denormals to zero always. This is not true of aarch64's Enhanced Neon AKA Advanced SIMD v2.
  • Aarch32: Lest we imagine that ARMv8-A is completely free of problems, the "aarch32" execution mode has an unspecified default value for the FZ bit even if Neon v2 is available.

MIPS

NaNs sometimes do weird things on this platform, resulting in some of the packed_simd tests getting the very interesting number -3.0.

Wasm

Technically Wasm is IEEE754 compliant but it is very "...technically!" here because it specifies a canonicalizing behavior on NaNs that constitutes an interesting choice amongst architectures and may not be expected by a programmer that is used to e.g. being able to rely on NaN bitfields being fairly stable, so we will want to watch out for it when implementing our initial floating point test suite.

The Good News

We have reasonable confidence in float behavior in x86's XMM-and-later registers (so from SSE2 onwards), aarch64's Neon v2 registers, PowerPC, and z/Architecture (s390x). As far as we know, on these architectures ordinary binary32 and binary64s are what they say they are, support basic operations in a reasonably consistent fashion, and nothing particularly weird occurs even with NaNs. So we are actually in a pretty good position for actually using most vector ISAs! It's just all the edge cases that pile up.

Main Takeaway

We want to extensively test even "simple" scalar operations on floating point numbers, especially casts and bitwhacking, or anything else that might possibly be affected by denormal numbers or NaN, so as to surface known and unknown quirks in floating point architectures and how LLVM handles them.

@workingjubilee workingjubilee added C-tracking-issue Ongoing issue with checkboxes for partial progress and such O-x86 Intel and AMD's x86 processors ISA-Neon Arm's Neon SIMD unit ISA-ARMv7 ARM's 7th generation of CPUs A-floating-point Area: Floating point numbers and arithmetic O-MIPS RISC CPUs from MIPS Technologies O-RISC-V O-Wasm WebAssembly's virtual architecture labels Oct 14, 2020
@programmerjake
Copy link
Member

IEEE 754 states that implementations should use the MSB of the mantissa being set to indicate a quiet NaN, however MIPS before the 2008 revision and PA-RISC use the MSB of the mantissa being clear to indicate a quiet NaN.

@workingjubilee
Copy link
Member Author

@programmerjake Ah, thank you, I kept hearing the detail of swapped qNaN and sNaNs but for some reason "it's the most significant bit in the mantissa" kept sliding from my mind while writing from this up. This makes sense.

MIPS doesn't seem to have a lot of usage, but it is on our Platform Support list and we do build hosts on it. Anyways, for this reason we might run into troubles exploiting its vector ISAs, like we might with Neon v1, but this difference at least seems more soluble and I would prefer if we could at least get basic functionality together (i.e. scalar fallbacks). And really it was IEEE754-1985's fault for needlessly avoiding specifying which way around it should go.

It doesn't seem like there's a clear target division between the different eras of MIPS targets in rustc's targets list, and MIPS64 is older than 2008. I suppose we have to assume all MIPS FPUs are Like That, then?

@programmerjake
Copy link
Member

AFAICT, the mips32r6 and mips64r6 targets are the ones that switch to the non-weird NaN encodings:

https://github.com/llvm/llvm-project/blob/0069824feab0af5ade571d975deb1efd893c2466/clang/lib/Basic/Targets/Mips.h#L84-L86

@workingjubilee
Copy link
Member Author

workingjubilee commented Oct 14, 2020

Ah. Those are on tier 3 support under the mipsisa{32,64}{,el} names, and it's not even clear we can build std for them, so it seems they are not in our field of concern and all the MIPS targets we should be keeping an eye on do not incorporate the IEEE754-2008 behavior.

@clarfonthey
Copy link

clarfonthey commented May 2, 2022

So, I saw this thread and wanted to add something since I didn't see it discussed: do we want to provide any guarantee on the rounding behaviour of the reduce operations? In particular, for x86, I did some searching around and didn't find anything… all the docs I've seen places just say that HADDPD "adds everything" and doesn't specify the rounding behaviour, which I assume would be equivalent to performing a single round after the operation (all other versions don't make sense).

However, if we guaranteed this at the library level, that means that any implementation that's forced to rely on software instead of dedicated instructions will have a performance penalty. I also don't know how the existing software implementations deal with this.

@calebzulawski
Copy link
Member

Somewhat related to #235.

IEEE 754 specifies that reductions may associate in any order, and may use any wider intermediate format. I'm partial to that because it's specified in the standard, fast, and I don't think it sacrifices much.

If you want to guarantee any particular behavior, there's always the option to use swizzles.

@clarfonthey
Copy link

I'm a bit confused how swizzles guarantee the behaviour if the ordering is always unspecified; do you mean just performing the sums manually instead of using the reduction methods?

@calebzulawski
Copy link
Member

Yeah, sorry, I just mean manually implementing the order and reduction function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-floating-point Area: Floating point numbers and arithmetic C-tracking-issue Ongoing issue with checkboxes for partial progress and such ISA-ARMv7 ARM's 7th generation of CPUs ISA-Neon Arm's Neon SIMD unit O-MIPS RISC CPUs from MIPS Technologies O-RISC-V O-Wasm WebAssembly's virtual architecture O-x86 Intel and AMD's x86 processors
Projects
None yet
Development

No branches or pull requests

4 participants