cunn tests fail with cuda 8.0rc on GTX1070 #321

Closed
htoyryla opened this issue Aug 14, 2016 · 4 comments

Comments

@htoyryla

I made a fresh install of Ubuntu 14.04 with a GTX1070, cuda 8.0rc, and torch + cutorch + cunn. The Nvidia driver version is 367.35.

Testing cunn fails, as reported here: https://gist.github.com/htoyryla/6ab33ffc9794d3d002ce066553142cd6

During installation of cutorch and cunn, architecture 6.1 is correctly detected. The install log for cunn is here: https://gist.github.com/htoyryla/aa77379e493bf155de3eac709a2cb41f

Programs like neural-style fail with cunn but work with cudnn. As far as I can see, torch works with cudnn without problems.

Error output from neural-style when running without cudnn (the same error is also reported by cunn.test(), see the gist above):

/home/hannu/torch/install/bin/luajit: /home/hannu/torch/install/share/lua/5.1/nn/Container.lua:67:
In 7 module of nn.Sequential:
/home/hannu/torch/install/share/lua/5.1/nn/THNN.lua:109: wrong number of arguments for function call
stack traceback:
    [C]: in function 'v'
    /home/hannu/torch/install/share/lua/5.1/nn/THNN.lua:109: in function 'SpatialMaxPooling_updateOutput'
    ...nnu/torch/install/share/lua/5.1/nn/SpatialMaxPooling.lua:42: in function <...nnu/torch/install/share/lua/5.1/nn/SpatialMaxPooling.lua:31>
    [C]: in function 'xpcall'
    /home/hannu/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
    /home/hannu/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    neural_style.lua:204: in function 'main'
    neural_style.lua:515: in main chunk
    [C]: in function 'dofile'
    ...annu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x004065d0
@soumith
Member

soumith commented Aug 14, 2016

Update nn, cutorch, and cunn:

luarocks install nn

luarocks install cutorch

luarocks install cunn
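
For anyone landing here with the same "wrong number of arguments for function call" error, the three commands above can be wrapped in a small shell function. This is only a sketch, assuming the `luarocks` from the Torch install is on the PATH. The function name `reinstall_torch_rocks` and the `DRY_RUN` switch are made up for illustration and are not part of luarocks or Torch.

```shell
#!/bin/sh
# Sketch: reinstall nn, cutorch and cunn in that order (cunn builds on the other two).
# DRY_RUN is a hypothetical switch for this sketch, not a luarocks option:
# with DRY_RUN=1 the commands are only printed, not executed.
reinstall_torch_rocks() {
    for rock in nn cutorch cunn; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "luarocks install $rock"
        else
            # Stop at the first failure so a broken nn build is not masked
            # by the later installs.
            luarocks install "$rock" || return 1
        fi
    done
}

# Usage: DRY_RUN=1 reinstall_torch_rocks   (prints the three commands)
#        reinstall_torch_rocks             (actually reinstalls the rocks)
```

Running it once with `DRY_RUN=1` shows what would be executed before touching the installed rocks.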

@htoyryla
Author

OK, that helped. Reinstalling nn was the trick; the others I had already reinstalled a few times. Silly of me not to go back to nn immediately.

There was still this:

VolumetricDilatedConvolution 
 Function call failed
/home/hannu/torch/install/share/lua/5.1/nn/THNN.lua:109: $ Torch: not enough memory: you tried to allocate 21GB. Buy new RAM! at /tmp/luarocks_torch-scm-1-8091/torch7/lib/TH/THGeneral.c:226
stack traceback:
    [C]: in function 'v'
    /home/hannu/torch/install/share/lua/5.1/nn/THNN.lua:109: in function 'VolumetricDilatedConvolution_updateOutput'
    ...nstall/share/lua/5.1/nn/VolumetricDilatedConvolution.lua:32: in function 'forward'
    /home/hannu/torch/install/share/lua/5.1/cunn/test.lua:5346: in function 'v'
    /home/hannu/torch/install/share/lua/5.1/cunn/test.lua:6029: in function </home/hannu/torch/install/share/lua/5.1/cunn/test.lua:6027>
    [C]: in function 'xpcall'
    /home/hannu/torch/install/share/lua/5.1/torch/Tester.lua:477: in function '_pcall'
    /home/hannu/torch/install/share/lua/5.1/torch/Tester.lua:436: in function '_run'
    /home/hannu/torch/install/share/lua/5.1/torch/Tester.lua:355: in function 'run'
    /home/hannu/torch/install/share/lua/5.1/cunn/test.lua:6049: in function 'test'
    (command line):1: in main chunk
    [C]: at 0x004065d0

--------------------------------------------------------------------------------
luajit: /home/hannu/torch/install/share/lua/5.1/torch/Tester.lua:363: An error was found while running tests!
stack traceback:
    [C]: in function 'assert'
    /home/hannu/torch/install/share/lua/5.1/torch/Tester.lua:363: in function 'run'
    /home/hannu/torch/install/share/lua/5.1/cunn/test.lua:6049: in function 'test'
    (command line):1: in main chunk
    [C]: at 0x004065d0

but even that disappeared when I ran the test a second time (after installing cudnn too).

@soumith
Member

soumith commented Aug 14, 2016

That looks like a bad test; I'll fix it. Don't treat it as a failure for now.

@IEWbgfnYDwHRoRRSKtkdyMDUzgdwuBYgDKtDJWd

@htoyryla Would you be able to say which drivers you used and which steps you took to install the drivers and cuda on 14.04? I'm having LOTS of issues, and could even tip you btc for the help. My original issue is here: jcjohnson/neural-style#460, if you have the time to look. Thanks!
