zdb core dump #4479

Closed
jhetrick opened this issue Mar 31, 2016 · 7 comments

@jhetrick

In researching issues 3708 and 1126 I've had intermittent problems with zdb dumping core on anything other than plain "zdb".

System is CentOS 7: 3.10.0-327.10.1.el7.x86_64 #1 SMP Sat Jan 23 04:54:55 EST 2016 x86_64 x86_64 x86_64 GNU/Linux

zfs-debuginfo-0.6.5.5-1.el7.centos.x86_64
libzfs2-0.6.5.5-1.el7.centos.x86_64
zfs-0.6.5.5-1.el7.centos.x86_64
zfs-dkms-0.6.5.5-1.el7.centos.noarch
zfs-release-1-2.el7.centos.noarch

and GDB:

`Core was generated by zdb -dddd dpool04.
Program terminated with signal 6, Aborted.
#0 0x00007f3e448425f7 in raise () from /lib64/libc.so.6

(gdb) bt
#0 0x00007f3e448425f7 in raise () from /lib64/libc.so.6
#1 0x00007f3e44843ce8 in abort () from /lib64/libc.so.6
#2 0x00007f3e45d82fee in vn_rdwr (uio=, vp=0x7f3e28000920, addr=0x7f3e28003800, len=8192,
    offset=8001481613312, x1=x1@entry=1, x2=x2@entry=0, x3=x3@entry=18446744073709551615, x4=x4@entry=0x0,
    residp=residp@entry=0x7f3e467a7e00) at kernel.c:734
#3 0x00007f3e45df4803 in vdev_file_io_strategy (arg=0x7f3e280033f0) at ../../module/zfs/vdev_file.c:154
#4 0x00007f3e45d842d4 in taskq_thread (arg=0x204d0a0) at taskq.c:252
#5 0x00007f3e45d81001 in zk_thread_helper (arg=0x204da50) at kernel.c:134
#6 0x00007f3e44bd5dc5 in start_thread () from /lib64/libpthread.so.0
#7 0x00007f3e4490328d in clone () from /lib64/libc.so.6

`

But really any zdb argument causes a core dump, not just -dddd.

@behlendorf
Contributor

You appear to be running afoul of this assertion, and pread64() is returning EINVAL. According to the man page this can happen for a few reasons:

       EINVAL fd is attached to an object which is unsuitable for reading;  or
              the  file  was  opened  with  the  O_DIRECT flag, and either the
              address specified in buf, the value specified in count,  or  the
              current file offset is not suitably aligned.

You could try removing the O_DIRECT flag from vn_open() to see if this is due to an alignment issue. The issue might also be the offset, which in this case looks suspiciously large to me: 8T. Are these really 8T devices?
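
To illustrate the alignment requirement quoted from the man page, here is a minimal sketch of allocating a read buffer on a chosen boundary with `posix_memalign()`. The helper names `is_aligned` and `alloc_aligned` are hypothetical and are not taken from the ZFS source; the 4096-byte boundary in the usage note is an assumption for a 4K-sector device.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: nonzero if value is a multiple of align.
 * align must be a nonzero power of two. */
static int is_aligned(uintptr_t value, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value & (align - 1)) == 0;
}

/* Hypothetical helper: allocate len bytes starting on an align-byte
 * boundary. Returns NULL on failure; release the buffer with free(). */
static void *alloc_aligned(size_t align, size_t len)
{
    void *buf = NULL;

    if (posix_memalign(&buf, align, len) != 0)
        return NULL;
    return buf;
}
```

A buffer from `alloc_aligned(4096, 8192)` could then be passed to `pread()` on a descriptor opened with O_DIRECT, provided the count and file offset are also suitably aligned.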

@jhetrick
Author

Indeed; they are 8T HGST drives.

That might explain why this doesn't seem to happen with file devices in testing, or against the slog SSDs in this system.

@behlendorf
Contributor

@jhetrick can you check what the sector size is for those disks, both logical and physical?

@jhetrick
Author

Certainly! They are 4Kn.
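
For reference, one way to read both sector sizes programmatically on Linux is the `BLKSSZGET` (logical) and `BLKPBSZGET` (physical) ioctls from `<linux/fs.h>`. The sketch below is only an illustration: `get_sector_sizes` is a hypothetical helper name, and the device path is supplied by the caller.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fs.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: query the logical and physical sector sizes of a
 * block device. Returns 0 on success, -1 if the device cannot be opened
 * or is not a block device. */
static int get_sector_sizes(const char *dev, int *logical,
    unsigned int *physical)
{
    int fd, rc = 0;

    assert(dev != NULL && logical != NULL && physical != NULL);

    fd = open(dev, O_RDONLY);
    if (fd < 0)
        return -1;

    /* BLKSSZGET fills an int, BLKPBSZGET an unsigned int. */
    if (ioctl(fd, BLKSSZGET, logical) != 0 ||
        ioctl(fd, BLKPBSZGET, physical) != 0)
        rc = -1;

    close(fd);
    return rc;
}
```

On a 4Kn drive both values would be expected to come back as 4096, whereas a 512e drive reports a 512-byte logical and a 4096-byte physical sector.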

@behlendorf
Contributor

OK, well that should work fine. You'll need to keep digging into why exactly the read is failing.

@jhetrick
Author

I'll see what happens with O_DIRECT out of the picture; thanks.

@jhetrick
Author

I have not been able to test this. Closing until I can do so.

behlendorf added a commit to behlendorf/zfs that referenced this issue Jun 29, 2016
Here's the problem - on 4K native devices in userland on
Linux using O_DIRECT, buffers must be 4K aligned or I/O
will fail with EINVAL, causing zdb (and others) to coredump.
Since userland probably doesn't need optimized buffer caches,
we just force 4K alignment on everything.

Issue openzfs#4479
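
As a quick check of this explanation against the backtrace in the original report, the sketch below takes the `addr`, `len`, and `offset` arguments from frame #2 (`vn_rdwr`) and computes how far each one is from a 512-byte and a 4096-byte boundary. The constant names and the `misalignment` helper are hypothetical; only the three numeric values come from the report.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from frame #2 (vn_rdwr) of the reported backtrace. */
static const uint64_t BT_ADDR   = 0x7f3e28003800ULL;
static const uint64_t BT_LEN    = 8192ULL;
static const uint64_t BT_OFFSET = 8001481613312ULL;

/* Hypothetical helper: bytes by which value overshoots the previous
 * align-byte boundary (0 means aligned). align must be a nonzero
 * power of two. */
static uint64_t misalignment(uint64_t value, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return value & (align - 1);
}
```

With these values, the length and offset are both multiples of 4096, while the buffer address is 512-byte aligned but sits 2048 bytes past a 4096-byte boundary, which is consistent with the EINVAL described in the commit message.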
behlendorf added a commit to behlendorf/zfs that referenced this issue Jul 12, 2016
behlendorf added a commit to behlendorf/zfs that referenced this issue Jul 14, 2016
behlendorf added a commit to behlendorf/zfs that referenced this issue Jul 28, 2016
Here's the problem - on 4K native devices in userland on
Linux using O_DIRECT, buffers must be 4K aligned or I/O
will fail with EINVAL, causing zdb (and others) to coredump.
Since userland probably doesn't need optimized buffer caches,
we just force 4K alignment on everything.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Closes openzfs#4479