"kernel NULL pointer dereference" on modprobe zfs #1346

Closed
posativ opened this issue Mar 9, 2013 · 6 comments
Labels
Type: Building Indicates an issue related to building binaries

@posativ

posativ commented Mar 9, 2013

I'm using the Gentoo hardened profile with Linux 3.7.10, and after enabling some new configuration flags to support libvirt/qemu, I'm no longer able to load zfs:

SPL: Loaded module v0.6.0-rc14
znvpair: module license 'CDDL' taints kernel.
Disabling lock debugging due to kernel taint
BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
IP: [<ffffffff81803059>] _raw_spin_lock+0x9/0x30
PGD 0 
Oops: 0002 [#1] SMP 
Modules linked in: znvpair(PO+) spl(O) xt_TCPMSS ipt_rpfilter xt_statistic xt_LOG xt_time xt_connlimit xt_realm xt_addrtype xt_comment xt_recent xt_nat ipt_MASQUERADE ipt_ECN ipt_ah nf_nat_sip nf_nat_irc nf_nat_ftp xt_tcpmss xt_pkttype xt_owner xt_NFQUEUE xt_NFLOG xt_multiport xt_mark xt_mac xt_limit xt_length xt_iprange xt_helper xt_hashlimit xt_DSCP xt_dscp xt_dccp xt_connmark xt_CLASSIFY xt_AUDIT w83627ehf hwmon_vid xt_conntrack iptable_nat nf_nat_ipv4 nf_nat arc4 ath9k_htc coretemp ath9k ath9k_common ath9k_hw ath mac80211 cfg80211
CPU 1 
Pid: 3897, comm: modprobe Tainted: P           O 3.7.10-gentoo #4                  /S1200KP
RIP: 0010:[<ffffffff81803059>]  [<ffffffff81803059>] _raw_spin_lock+0x9/0x30
RSP: 0018:ffff8803f39e9e88  EFLAGS: 00010282
RAX: 0000000000000100 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff88040d01e0b0 RSI: ffff8803f39e9ed8 RDI: 0000000000000004
RBP: ffff8803f39e9e88 R08: 0000000000000000 R09: ffff88040f00accc
R10: 8080808080808080 R11: 0000000000000000 R12: ffff8803f39e9ed8
R13: ffffffffa020a950 R14: 00007f8a088fd301 R15: 0000000000040000
FS:  00007f8a0832a700(0000) GS:ffff88041fb00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000004 CR3: 00000003f39d0000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 3897, threadinfo ffff8803f39e8000, task ffff88040d01e0b0)
Stack:
 ffff8803f39e9eb8 ffffffffa01f919a 00007ffffffff000 0000000000000000
 00007ffffffff000 0000000000000000 ffff8803f39e9ef8 ffffffffa01f928c
 0000000000000001 ffffffff00000000 ffff88040d5b2520 ffff88040be69000
Call Trace:
 [<ffffffffa01f919a>] set_fs_pwd+0x1a/0x60 [spl]
 [<ffffffffa01f928c>] vn_set_pwd+0xac/0xc0 [spl]
 [<ffffffffa01f9740>] spl_setup+0x10/0x30 [spl]
 [<ffffffffa020a959>] init_module+0x9/0x10 [znvpair]
 [<ffffffff810001fa>] do_one_initcall+0x3a/0x160
 [<ffffffff810d74da>] sys_init_module+0x8a/0x1f0
 [<ffffffff8180a912>] system_call_fastpath+0x16/0x1b
Code: 38 c2 74 0f 66 0f 1f 44 00 00 f3 90 0f b6 03 38 c2 75 f7 48 83 c4 08 5b 5d c3 0f 1f 84 00 00 00 00 00 55 b8 00 01 00 00 48 89 e5 <f0> 66 0f c1 07 0f b6 d4 38 c2 74 0c 0f 1f 00 f3 90 0f b6 07 38 
RIP  [<ffffffff81803059>] _raw_spin_lock+0x9/0x30
 RSP <ffff8803f39e9e88>
CR2: 0000000000000004
---[ end trace 5b8e2aa935261552 ]---

I'm still trying to figure out which configuration setting causes the exception. It worked before with 3.7.10, FWIW. zfs and zfs-kmod are from the master branch.
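
One possible reading of the trace above, not confirmed anywhere in this thread: a fault at address 0x4 inside _raw_spin_lock, reached from set_fs_pwd, is what you would expect if a lock embedded at offset 4 of a structure were taken through a NULL structure pointer. The user-space C sketch below only illustrates that arithmetic. `struct fs_mock` and `lock_address` are hypothetical names, and the layout (a 32-bit counter followed by the lock) is an assumption for illustration; the real `struct fs_struct` layout depends on kernel version and configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel structure with an embedded lock.
 * Assumption (this mock only): one 32-bit field precedes the lock. */
struct fs_mock {
    int32_t  users;
    uint32_t lock;
};

/* Make the layout assumption explicit at compile time. */
static_assert(offsetof(struct fs_mock, lock) == 4, "mock layout assumption");

/* Address a lock routine would be handed for a given base pointer value.
 * Computed on integers so that no invalid pointer is ever formed. */
uintptr_t lock_address(uintptr_t base)
{
    return base + offsetof(struct fs_mock, lock);
}
```

With a base of 0 (a NULL structure pointer), `lock_address` yields 0x4, which matches the CR2 and RDI values in the oops under the stated layout assumption.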

@ryao
Contributor

ryao commented Mar 9, 2013

It looks like autotools did something wrong. Would you paste the config.log from spl?

@posativ
Author

posativ commented Mar 9, 2013

config.log, spl-9999 uses the latest master from GitHub.

@posativ
Author

posativ commented Mar 11, 2013

It seems that my autoconf version has changed from 2.13 (working state) to 2.69 (no longer working). Is this a ZFS fault or rather a Gentoo build issue?

@posativ
Author

posativ commented Mar 11, 2013

Moving back to an earlier "snapshot" of spl (spl-0.6.0_rc14-r1) solved this issue for me: modprobe zfs works again.

@posativ posativ closed this as completed Mar 11, 2013
b333z pushed a commit to b333z/zfs that referenced this issue May 17, 2013
…ary clones

1356 zfs dataset prefetch code not working
Reviewed by: Matthew Ahrens <matt@delphix.com>
Reviewed by: Dan McDonald <danmcd@nexenta.com>
Approved by: Gordon Ross <gwr@nexenta.com>

References to Illumos issue:
  https://www.illumos.org/issues/1346
  https://www.illumos.org/issues/1356

Ported-by: Richard Yao <ryao@cs.stonybrook.edu>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#647
@acertain

acertain commented May 2, 2018

NixOS/nixpkgs#39225 might be similar to this? The dmesg is posted there. https://gist.github.com/fread2281/c2de6722605838c5d517630274b74b8b is a build log for spl (a different version from the one in the dmesg), and https://gist.github.com/fread2281/c5fd6e556ad02c91b41b198b27acbcce is a build log for zfs.

@clefru
Contributor

clefru commented Jun 6, 2018

FYI, it seems that current->fs can be NULL under 4.16, see linked NixOS bug.
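
A minimal user-space sketch of the kind of guard this observation suggests, assuming the caller can simply report an error when the task has no filesystem context. `struct task_mock`, `struct fs_ctx_mock`, and `set_pwd_guarded` are hypothetical names for illustration, and the plain integer "lock" is a placeholder; this is not the SPL code and not necessarily how the linked NixOS bug was resolved.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical mock of a per-task filesystem context. */
struct fs_ctx_mock {
    int         users;
    int         lock;   /* placeholder for a real spinlock */
    const char *pwd;
};

/* Hypothetical mock of a task whose fs pointer may be NULL. */
struct task_mock {
    struct fs_ctx_mock *fs;
};

/* Set the working directory only if the task has an fs context.
 * Returning -EINVAL is an illustrative choice, not a documented convention. */
int set_pwd_guarded(struct task_mock *task, const char *path)
{
    assert(path != NULL);              /* precondition of this sketch */

    if (task == NULL || task->fs == NULL)
        return -EINVAL;                /* refuse instead of dereferencing NULL */

    task->fs->lock = 1;                /* stands in for taking the lock */
    task->fs->pwd  = path;
    task->fs->lock = 0;                /* stands in for releasing it */
    return 0;
}
```

The point of the sketch is the order of operations: the NULL check happens before anything takes the address of a member inside the structure.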
