Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add x86 intrinsics for bit manipulation (BMI 1.0, BMI 2.0, and TBM). #34412

Merged
merged 4 commits into from
Jul 5, 2016

Conversation

gnzlbg
Copy link
Contributor

@gnzlbg gnzlbg commented Jun 22, 2016

This PR adds the LLVM x86 intrinsics for the bit manipulation instruction sets (BMI 1.0, BMI 2.0, and TBM).

The objective of this pull-request is to allow building a library that implements all the algorithms offered by those instruction sets, using compiler intrinsics for the targets that support them (by means of target_feature).

The target features added are:

  • bmi: Bit Manipulation Instruction Set 1.0, available in Intel >= Haswell and AMD's >= Jaguar/Piledriver,
  • bmi2: Bit Manipulation Instruction Set 2.0, available in Intel >= Haswell and AMD's >= Excavator,
  • tbm: Trailing Bit Manipulation, available only in AMD's Piledriver (won't be available in newer CPUs).

The intrinsics added are:

  • BMI 1.0:
    • bextr: Bit field extract (with register).
  • BMI 2.0:
    • bzhi: Zero high bits starting with specified bit position.
    • pdep: Parallel bits deposit.
    • pext: Parallel bits extract.
  • TBM:
    • bextri: Bit field extract (with immediate).

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @nikomatsakis (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@gnzlbg gnzlbg force-pushed the document_platform_intrinsics_generate branch from 7ee154e to 483bec7 Compare June 22, 2016 15:11
@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jun 23, 2016

I don' have access to all the platforms that the SIMD crate supports. Is there a better way to test it than compiling the SIMD crate for all the platforms it supports?

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jun 24, 2016

@huonw you used to work on the SIMD crate, any ideas about how to test it?

Ping: @ranma42 @brson @alexcrichton

@brson brson added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue. I-nominated labels Jun 25, 2016
@brson
Copy link
Contributor

brson commented Jun 25, 2016

Might be good to get some more eyes on this still. Previous issue didn't generate much discussion.

Niko's on vacation. r? @nrc

@rust-highfive rust-highfive assigned nrc and unassigned nikomatsakis Jun 25, 2016
@brson brson mentioned this pull request Jun 25, 2016
@alexcrichton
Copy link
Member

Currently the src/etc/platform-intrinsics folder is primarily used for SIMD stuff, and I don't think that we basically expose any other platform intrinsics just yet (afaik). Doing it through that interface seems somewhat reasonable to me, however, especially when there's a wide number of variations in terms of types for one intrinsics to define in LLVM.

@nrc
Copy link
Member

nrc commented Jun 27, 2016

I like the looks of this PR. I'm not sure about the second commit - I don't understand the issues there. So, r? @alexcrichton

@rust-highfive rust-highfive assigned alexcrichton and unassigned nrc Jun 27, 2016
@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jun 27, 2016

@nrc

I'm not sure about the second commit - I don't understand the issues there.

It is what @alexcrichton mentioned above, the src/etc/platform-intrinsics is only used for SIMD instructions, and as a consequence for x86 all the intrinsics had the name x86_mm_... (mm for multi-media).

I could probably have used x86_mm_ for the bit manipulation instructions as well, but I read somewhere (in the docs of the generator.py?) that the intrinsic functions should be "closely named" to the ones exported by clang's C headers.

The second commit allows having differently named instruction sets per platform, e.g., x86_bmi_... along x86_mm_, and handling those in the match statement for the platform (e.g. x86).

Doing it through that interface seems somewhat reasonable to me, however, especially when there's a wide number of variations in terms of types for one intrinsics to define in LLVM.

While generator.py was a bit overkill for these intrinsics (the number of variations is ~2 per intrinsic), it was actually useful to generate both the match statement ("compiler-defs) and the function signatures for the intrinsics automatically.

@alexcrichton
Copy link
Member

Discussed during the libs triage meeting yesterday, the conclusion was that this is fine to merge, thanks for the PR @gnzlbg! We're in general pretty amenable to adding any support for fancy intrinsics like this, although we'd be much more hesitant about adding bits and pieces to the standard library, but that's not happening here!

@bors: r+ 483bec7

@alexcrichton alexcrichton removed I-nominated T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Jun 28, 2016
@bors
Copy link
Contributor

bors commented Jun 29, 2016

⌛ Testing commit 483bec7 with merge 849864d...

@bors
Copy link
Contributor

bors commented Jun 29, 2016

💔 Test failed - auto-linux-64-x-android-t

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jun 29, 2016

It's my first PR so I don't really know what to look for in the build logs yet to figure out what failed, but this looks like some sort of time out while compiling.

@alexcrichton
Copy link
Member

Ah no worries that just means that the Android build timed out, which is unlikely to be related to this patch. so let's just...

@bors: retry

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jun 30, 2016

Maybe I did something wrong?

@alexcrichton
Copy link
Member

Oh nah this PR is fine, we're just having a lot of automation trouble lately apparently so it may take awhile for bors to get around to this PR

@bors
Copy link
Contributor

bors commented Jul 1, 2016

⌛ Testing commit 483bec7 with merge 7e269b2...

@bors
Copy link
Contributor

bors commented Jul 1, 2016

💔 Test failed - auto-win-msvc-64-opt-rustbuild

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jul 1, 2016

@bors: retry

@alexcrichton
Copy link
Member

alexcrichton commented Jul 1, 2016

@bors
Copy link
Contributor

bors commented Jul 1, 2016

⌛ Testing commit 483bec7 with merge f8be993...

@bors
Copy link
Contributor

bors commented Jul 1, 2016

💔 Test failed - auto-win-msvc-64-opt-rustbuild

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jul 1, 2016

Come on @bors you can do it!

@bors: retry

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jul 5, 2016

@bors: retry

1 similar comment
@GuillaumeGomez
Copy link
Member

@bors: retry

@bors
Copy link
Contributor

bors commented Jul 5, 2016

⌛ Testing commit 483bec7 with merge ec58d0c...

bors added a commit that referenced this pull request Jul 5, 2016
…r=alexcrichton

Add x86 intrinsics for bit manipulation (BMI 1.0, BMI 2.0, and TBM).

This PR adds the LLVM x86 intrinsics for the bit manipulation instruction sets (BMI 1.0, BMI 2.0, and TBM).

The objective of this pull-request is to allow building a library that implements all the algorithms offered by those instruction sets, using compiler intrinsics for the targets that support them (by means of `target_feature`).

The target features added are:

- `bmi`: Bit Manipulation Instruction Set 1.0, available in Intel >= Haswell and AMD's >= Jaguar/Piledriver,
- `bmi2`: Bit Manipulation Instruction Set 2.0, available in Intel >= Haswell and AMD's >= Excavator,
- `tbm`: Trailing Bit Manipulation, available only in AMD's Piledriver (won't be available in newer CPUs).

The intrinsics added are:

- BMI 1.0:
  - `bextr`: Bit field extract (with register).
- BMI 2.0:
  - `bzhi`: Zero high bits starting with specified bit position.
  - `pdep`: Parallel bits deposit.
  - `pext`: Parallel bits extract.
- TBM:
 - `bextri`: Bit field extract (with immediate).
@bors bors merged commit 483bec7 into rust-lang:master Jul 5, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants