
Current beta breaks my build that works on stable #40573

Closed
Xaeroxe opened this issue Mar 16, 2017 · 21 comments
Assignees
Labels
P-high High priority regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@Xaeroxe
Contributor

Xaeroxe commented Mar 16, 2017

Current beta compiler is segfaulting when attempting to build my project that previously worked on stable: https://travis-ci.org/Xaeroxe/nitro-game-engine/builds/211212522

@alexcrichton alexcrichton added I-nominated regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Mar 16, 2017
@Xaeroxe
Contributor Author

Xaeroxe commented Mar 16, 2017

It's also worth noting: this only happens on Linux and OSX. My Windows build still works fine.

@Xaeroxe
Contributor Author

Xaeroxe commented Mar 16, 2017

@nagisa
Member

nagisa commented Mar 16, 2017

Stack overflow building MIR. Looking into it more.

@Xaeroxe
Contributor Author

Xaeroxe commented Mar 16, 2017

This project is pretty reliant on generics and auto-generated code so I'm not entirely surprised that's the cause.

@nagisa
Member

nagisa commented Mar 16, 2017

Does not reproduce on a nightly build of my own, but reproduces just fine on the most recent nightly from bots. Ugh!

That does suggest some stupid decisions being made by the optimiser. I feel like at this point just marking a bunch of functions in the MIR builder no-inline would fix a lot of our problems wrt this.


The workaround is to set the RUST_MIN_STACK environment variable to some large value, such as 32MiB (which worked for me; the stack currently defaults to 16MiB). Due to the release happening today, this regression will slip into stable.
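The mechanism behind the workaround: the Rust standard library reads RUST_MIN_STACK (a size in bytes) as the default stack size for threads it spawns, and rustc runs much of its work on such a thread. A minimal sketch of the same mechanism, using an explicit `Builder::stack_size` instead of the environment variable (the function names here are hypothetical, and 32 MiB is just the value reported to work above):

```rust
use std::thread;

// Deep recursion that would overflow a small stack in a debug build;
// go(n) simply counts down, so go(n) == n.
fn go(n: u32) -> u32 {
    if n == 0 { 0 } else { 1 + go(n - 1) }
}

// Run the recursion on a thread with an explicit stack size. The standard
// library also honors RUST_MIN_STACK (in bytes) as the default size for
// spawned threads, which is why setting it changes how deeply the compiler
// itself can recurse.
fn run_with_big_stack(bytes: usize) -> u32 {
    thread::Builder::new()
        .stack_size(bytes)
        .spawn(|| go(100_000))
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}

fn main() {
    // 32 MiB, vs. the 16 MiB default mentioned above.
    println!("{}", run_with_big_stack(32 * 1024 * 1024));
}
```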

@nagisa
Member

nagisa commented Mar 16, 2017

To compare, my local build easily handles 3000 frames on an 8MiB stack, whereas with our binary distribution builds there are approximately 500 stack frames on a 16MiB stack.

(8MiB is the comparison point here because that is also the size at which the stack overflows locally for me.)

@Xaeroxe
Contributor Author

Xaeroxe commented Mar 16, 2017

Well thanks for the workaround. Evidently I didn't report this fast enough, so my bad there.

@Xaeroxe
Contributor Author

Xaeroxe commented Mar 16, 2017

So some notes: it seems I did this build in the middle of today's release.

- Rust 1.15 (yesterday's stable): works
- Rust 1.16 (today's stable): works
- Rust 1.17 (today's beta): doesn't work

Meaning we still have another 6 weeks before this bug would hit stable.

@nikomatsakis
Contributor

triage: P-high

@rust-highfive rust-highfive added P-high High priority and removed I-nominated labels Mar 16, 2017
@nagisa
Member

nagisa commented Mar 16, 2017

@arielb1
Contributor

arielb1 commented Mar 27, 2017

It appears that build::expr::into::into_expr is taking up 25 kB of stack per call, while on rustc 1.16 it takes 7.5 kB per call. 25 kB × ~320 call depth ≈ 8 MB.

There does not appear to be an across-the-board increase in stack sizes; for example, expr_as_operand grew only from 936 to 968 bytes of stack.

@arielb1
Contributor

arielb1 commented Mar 27, 2017

It seems that LLVM has a problem with allocas with overlapping lifetimes that causes stack usage blowups. For example, this:

```rust
#![crate_type="rlib"]

pub struct Big {
    drop_me: [Option<Box<u8>>; 1024]
}

pub fn supersize_me(big: Big, may_panic: fn(), cond: bool) {
    may_panic();
    let _big2 = big;
    may_panic();
    if cond {
        let _big3 = _big2;
        may_panic();
        let _big4 = _big3;
        may_panic();
        let _big5 = _big4;
        may_panic();
        let _big6 = _big5;
    } else {
        let _big7 = _big2;
        may_panic();
        let _big8 = _big7;
        may_panic();
        let _big9 = _big8;
        may_panic();
        let _bigA = _big9;
    }
}
```

needs 72k of stack space for 9 allocas (on all rustc versions).

Compiling librustc_mir with no landing pads uses less than 2k of stack space per call site.
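One rough way to see per-frame stack cost from outside a debugger is to compare the addresses of stack locals across recursion depths. This is a platform-dependent sketch (the function names are made up, and the estimate is approximate, especially under optimization), not how the numbers above were measured:

```rust
use std::hint::black_box;

// Rough, platform-dependent sketch: estimate a recursive function's stack
// frame size by measuring the address distance between a local in the
// caller and a local at the bottom of the recursion.
#[inline(never)]
fn probe(depth: u32, base: usize) -> usize {
    let local = 0u8;
    let here = black_box(&local) as *const u8 as usize;
    let result = if depth == 0 {
        // Total stack distance from the caller's local down to this frame.
        base.abs_diff(here)
    } else {
        probe(depth - 1, base)
    };
    // Touch the local after the recursive call so each depth keeps a live
    // frame (this also prevents the call from being a tail call).
    black_box(&local);
    result
}

fn main() {
    let anchor = 0u8;
    let base = &anchor as *const u8 as usize;
    let depth = 64;
    let total = probe(depth, base);
    println!("~{} bytes per frame", total / depth as usize);
}
```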

@arielb1
Contributor

arielb1 commented Mar 27, 2017

However, to some degree this looks like a codegen regression: stage1 uses 0x29e8 bytes of stack per call, but stage2 uses 0x6448.

@nagisa
Copy link
Member

nagisa commented Mar 28, 2017 via email

@arielb1
Contributor

arielb1 commented Mar 28, 2017

I more guessed than bisected, but 906c06a (aka the main part of #40133) is the cause of the regression - reverting it on top of master reduces stack usage to 0x28c8.

@arielb1
Contributor

arielb1 commented Mar 28, 2017

Marking start_new_block as #[inline(never)] reduces stack usage to 10k (0x27e8) both with and without the PR, so we have a band-aid fix.

@arielb1
Contributor

arielb1 commented Mar 28, 2017

After further analysis, it looks like the cause of the regression is that, with the commit applied, LLVM decides to inline start_new_block, which triggers the (pre-existing) overlapping-alloca bug and causes stack space to blow up: marking start_new_block as #[inline(always)] causes 25kB of stack usage on both stages, and marking it as #[inline(never)] causes 10kB of stack usage on both stages.
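The shape of the band-aid can be illustrated with a minimal sketch (hypothetical names, not the actual rustc internals): a helper with large locals called from a recursive function. If LLVM inlines the helper, its allocas land in every recursive frame; `#[inline(never)]` confines that cost to the helper's own short-lived frame.

```rust
// Stand-in for start_new_block: a helper whose locals are large
// (start_new_block's allocas totalled about 1216 bytes). Without the
// attribute, LLVM may inline this into the recursive caller, adding
// ~1216 bytes to *every* recursive frame.
#[inline(never)]
fn start_new_block_like(seed: u8) -> u8 {
    let scratch = [seed; 1216];
    scratch.iter().fold(0u8, |a, b| a.wrapping_add(*b))
}

// Stand-in for the recursive expr::into MIR-building walk.
fn build_expr(depth: u32) -> u8 {
    if depth == 0 {
        return 0;
    }
    let block = start_new_block_like(depth as u8);
    block.wrapping_add(build_expr(depth - 1))
}

fn main() {
    println!("{}", build_expr(500));
}
```

With `#[inline(never)]`, each recursive frame of `build_expr` only pays for a pointer-sized call, and the 1216-byte scratch area lives briefly in `start_new_block_like`'s frame instead of being multiplied by the recursion depth.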

@arielb1
Contributor

arielb1 commented Mar 28, 2017

Tracking the root cause at #40883 - we may want to fix this with the #[inline(never)] if I can't find a beta-ready fix for the root cause.

@arielb1
Contributor

arielb1 commented Mar 29, 2017

With a few small fixes, I was able to get stack usage down to 0x2408 (9224 bytes), but marking start_new_block as #[inline(never)] reduces it even further, to 5384 bytes (start_new_block by itself takes 944 bytes, so stack use is still roughly quadrupled).

@alexcrichton
Member

Looks like this regression may also affect the boiler crate

cc @TF2Maps

arielb1 added a commit to arielb1/rust that referenced this issue Apr 2, 2017
LLVM has a bug - PR32488 - where it fails to deduplicate allocas in some
circumstances. The function `start_new_block` has allocas totalling 1216
bytes, and when LLVM inlines several copies of that function into
the recursive function `expr::into`, that function's stack space usage
goes into tens of kiBs, causing stack overflows.

Mark `start_new_block` as inline(never) to keep it from being inlined,
getting stack usage under control.

Fixes rust-lang#40493.
Fixes rust-lang#40573.
arielb1 pushed a commit to arielb1/rust that referenced this issue Apr 5, 2017
…hton

mark build::cfg::start_new_block as inline(never)

LLVM has a bug - [PR32488](https://bugs.llvm.org//show_bug.cgi?id=32488) - where it fails to deduplicate allocas in some
circumstances. The function `start_new_block` has allocas totalling 1216
bytes, and when LLVM inlines several copies of that function into
the recursive function `expr::into`, that function's stack space usage
goes into tens of kiBs, causing stack overflows.

Mark `start_new_block` as inline(never) to keep it from being inlined,
getting stack usage under control.

Fixes rust-lang#40493.
Fixes rust-lang#40573.

r? @eddyb
@Xaeroxe
Contributor Author

Xaeroxe commented Apr 6, 2017

Thanks everyone! I really appreciate it!
