Panic when importing pool #1114

Closed
olegkrutov opened this issue Nov 26, 2012 · 6 comments

Comments

@olegkrutov

:~ uname -a

Linux xubuntu 3.2.0-29-generic #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

:~ zpool import

pool: data
id: 17502164096278276766
state: ONLINE
status: The pool was last accessed by another system.
action: The pool can be imported using its name or numeric identifier and
the '-f' flag.
see: http://zfsonlinux.org/msg/ZFS-8000-EY
config:

data                                     ONLINE
  raidz2-0                                     ONLINE
    ata-WDC_WD1003FBYX-01Y7B0_WD-WCAW33240308  ONLINE
    ata-WDC_WD1003FBYX-01Y7B0_WD-WCAW33255745  ONLINE
    ata-WDC_WD1003FBYX-01Y7B0_WD-WCAW33280767  ONLINE
    ata-WDC_WD1003FBYX-01Y7B0_WD-WCAW33271246  ONLINE

:~ zpool import data -f
:~

:~ dmesg

[ 1318.919972] SPL: Loaded module v0.6.0.87-rc12
[ 1318.920246] zunicode: module license 'CDDL' taints kernel.
[ 1318.920248] Disabling lock debugging due to kernel taint
[ 1318.937032] ZFS: Loaded module v0.6.0.87-rc12, ZFS pool version 28, ZFS filesystem version 5
[ 1743.606988] SPL: using hostid 0x007f0101
[ 2585.494040] general protection fault: 0000 [#1] SMP
[ 2585.496380] CPU 2
[ 2585.496391] Modules linked in: zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl(O) dm_crypt snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq parport_pc snd_timer eeepc_wmi snd_seq_device ppdev asus_wmi bnep sparse_keymap rfcomm snd mac_hid dm_multipath serio_raw lp soundcore snd_page_alloc mei(C) bluetooth parport squashfs overlayfs nfs lockd fscache auth_rpcgss nfs_acl sunrpc dm_raid45 xor dm_mirror dm_region_hash dm_log btrfs zlib_deflate libcrc32c usbhid hid wmi i915 r8169 drm_kms_helper drm i2c_algo_bit video
[ 2585.502881]
[ 2585.504260] Pid: 3966, comm: spl_system_task Tainted: P C O 3.2.0-29-generic #46-Ubuntu System manufacturer System Product Name/P8H67-M LE
[ 2585.505749] RIP: 0010:[] [] spl_kmem_cache_alloc+0x4c/0xdd0 [spl]
[ 2585.507281] RSP: 0018:ffff88022c67f210 EFLAGS: 00010246
[ 2585.508811] RAX: 0002007400012ce7 RBX: 000000000058c600 RCX: 0000000011ff0000
[ 2585.510372] RDX: 0000000047fc0000 RSI: 0000000000000230 RDI: 0002007400012487
[ 2585.511923] RBP: ffff88022c67f300 R08: 0000000000000000 R09: 0000000000000006
[ 2585.513459] R10: 0000000000000000 R11: 0000000000000001 R12: 0002007400012487
[ 2585.514983] R13: 0000000000000001 R14: ffff8801c1eb8130 R15: ffffffffa05e1740
[ 2585.516501] FS: 0000000000000000(0000) GS:ffff88023fb00000(0000) knlGS:0000000000000000
[ 2585.518035] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2585.519568] CR2: 00007ff2072b89a0 CR3: 000000022c66f000 CR4: 00000000000406e0
[ 2585.521140] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2585.522721] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2585.524290] Process spl_system_task (pid: 3966, threadinfo ffff88022c67e000, task ffff8802318c5c00)
[ 2585.525898] Stack:
[ 2585.527497] ffff8801c59b1f40 ffff8801c77d3000 ffff88022c67f260 ffffffffa05a9ceb
[ 2585.529148] ffffffffa05713b0 ffff8801c59b1f40 ffff88022c67f260 ffffffff816583dd
[ 2585.530808] ffff8801c77d5098 0000000000000202 ffff88022c67f290 ffffffffa05683cc
[ 2585.532471] Call Trace:
[ 2585.534158] [] ? zio_vdev_io_start+0xab/0x2f0 [zfs]
[ 2585.535877] [] ? vdev_raidz_asize+0x60/0x60 [zfs]
[ 2585.537574] [] ? mutex_lock+0x1d/0x50
[ 2585.539304] [] ? vdev_dtl_contains+0x6c/0x90 [zfs]
[ 2585.541051] [] ? vdev_raidz_io_start+0x3e0/0x6b0 [zfs]
[ 2585.542800] [] ? vdev_raidz_asize+0x60/0x60 [zfs]
[ 2585.544552] [] zio_buf_alloc+0x23/0x30 [zfs]
[ 2585.546293] [] arc_get_data_buf.isra.20+0x27d/0x450 [zfs]
[ 2585.548052] [] arc_buf_alloc+0xba/0xf0 [zfs]
[ 2585.549812] [] arc_read_nolock+0x42d/0x7c0 [zfs]
[ 2585.551564] [] ? hrtick_update+0x38/0x40
[ 2585.553317] [] ? dequeue_task_fair+0xb8/0x100
[ 2585.555067] [] ? __switch_to+0xf5/0x360
[ 2585.556812] [] arc_read+0x82/0x170 [zfs]
[ 2585.558557] [] dsl_read+0x31/0x40 [zfs]
[ 2585.560292] [] traverse_prefetcher+0x10e/0x150 [zfs]
[ 2585.562038] [] traverse_visitbp+0x238/0x5b0 [zfs]
[ 2585.563785] [] ? arc_buf_remove_ref+0x110/0x110 [zfs]
[ 2585.565529] [] ? default_spin_lock_flags+0x9/0x10
[ 2585.567290] [] traverse_dnode+0x80/0x110 [zfs]
[ 2585.569059] [] traverse_visitbp+0x41d/0x5b0 [zfs]
[ 2585.570829] [] ? arc_read+0xca/0x170 [zfs]
[ 2585.572606] [] traverse_visitbp+0x339/0x5b0 [zfs]
[ 2585.574405] [] ? arc_read+0xca/0x170 [zfs]
[ 2585.576216] [] traverse_visitbp+0x339/0x5b0 [zfs]
[ 2585.578021] [] ? arc_read+0xca/0x170 [zfs]
[ 2585.579942] [] traverse_visitbp+0x339/0x5b0 [zfs]
[ 2585.581755] [] ? arc_read+0xca/0x170 [zfs]
[ 2585.583576] [] traverse_visitbp+0x339/0x5b0 [zfs]
[ 2585.585402] [] ? arc_read+0xca/0x170 [zfs]
[ 2585.587241] [] traverse_visitbp+0x339/0x5b0 [zfs]
[ 2585.589079] [] ? arc_read+0xca/0x170 [zfs]
[ 2585.590918] [] traverse_visitbp+0x339/0x5b0 [zfs]
[ 2585.592770] [] traverse_dnode+0x80/0x110 [zfs]
[ 2585.594624] [] traverse_visitbp+0x4ba/0x5b0 [zfs]
[ 2585.596493] [] traverse_prefetch_thread+0x7d/0xb0 [zfs]
[ 2585.598371] [] ? dmu_recv_end+0x230/0x230 [zfs]
[ 2585.600227] [] taskq_thread+0x23b/0x590 [spl]
[ 2585.602071] [] ? finish_task_switch+0x4a/0xf0
[ 2585.603910] [] ? try_to_wake_up+0x200/0x200
[ 2585.605708] [] ? task_done+0x140/0x140 [spl]
[ 2585.607454] [] kthread+0x8c/0xa0
[ 2585.609151] [] kernel_thread_helper+0x4/0x10
[ 2585.610807] [] ? flush_kthread_worker+0xa0/0xa0
[ 2585.612424] [] ? gs_change+0x13/0x13
[ 2585.613994] Code: 66 66 90 f6 05 15 28 01 00 01 49 89 fc 89 75 94 74 0d f6 05 0f 28 01 00 08 0f 85 10 01 00 00 49 8d 84 24 60 08 00 00 48 89 45 88 41 ff 84 24 60 08 00 00 9c 58 66 66 90 66 90 49 89 c6 fa 66
[ 2585.617352] RIP [] spl_kmem_cache_alloc+0x4c/0xdd0 [spl]
[ 2585.619017] RSP
[ 2585.673447] ---[ end trace 404707b601d563e0 ]---
[ 2585.673450] BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
[ 2585.673455] IP: [] zio_vdev_child_io+0x2d/0xe0 [zfs]
[ 2585.673492] PGD 1c4b6b067 PUD 1c49ab067 PMD 0
[ 2585.673496] Oops: 0000 [#2] SMP
[ 2585.673498] CPU 3
[ 2585.673499] Modules linked in: zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl(O) dm_crypt snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq parport_pc snd_timer eeepc_wmi snd_seq_device ppdev asus_wmi bnep sparse_keymap rfcomm snd mac_hid dm_multipath serio_raw lp soundcore snd_page_alloc mei(C) bluetooth parport squashfs overlayfs nfs lockd fscache auth_rpcgss nfs_acl sunrpc dm_raid45 xor dm_mirror dm_region_hash dm_log btrfs zlib_deflate libcrc32c usbhid hid wmi i915 r8169 drm_kms_helper drm i2c_algo_bit video
[ 2585.673531]
[ 2585.673534] Pid: 5220, comm: zpool Tainted: P D C O 3.2.0-29-generic #46-Ubuntu System manufacturer System Product Name/P8H67-M LE
[ 2585.673538] RIP: 0010:[] [] zio_vdev_child_io+0x2d/0xe0 [zfs]
[ 2585.673568] RSP: 0018:ffff8801c84470d8 EFLAGS: 00010202
[ 2585.673571] RAX: ffff8801c62a0480 RBX: 0000000000000001 RCX: 000000000103f000
[ 2585.673573] RDX: 0000000000000000 RSI: ffff8801c62a0480 RDI: ffff8801c62a0420
[ 2585.673575] RBP: ffff8801c8447138 R08: ffffc900077cbc00 R09: 0000000000000400
[ 2585.673577] R10: 0000000000000000 R11: 0000000000000003 R12: 00000000001f0000
[ 2585.673579] R13: 0000000000000001 R14: ffff88022d76d250 R15: 0000000000000400
[ 2585.673582] FS: 00007f752835ab80(0000) GS:ffff88023fb80000(0000) knlGS:0000000000000000
[ 2585.673585] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2585.673587] CR2: 0000000000000090 CR3: 000000022c66f000 CR4: 00000000000406e0
[ 2585.673589] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2585.673591] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2585.673594] Process zpool (pid: 5220, threadinfo ffff8801c8446000, task ffff8801c4bb5c00)
[ 2585.673596] Stack:
[ 2585.673597] 00000030c8447118 ffff88022d76d240 0000000000000003 ffff88022d76d240
[ 2585.673601] ffff8801c62a0480 ffff8801c62a0420 ffff88022d76d240 0000000000000001
[ 2585.673604] ffff88022d76d250 0000000000000400 ffff8801c62a0420 ffff88022d76d240
[ 2585.673608] Call Trace:
[ 2585.673639] [] vdev_mirror_io_start+0x24c/0x3c0 [zfs]
[ 2585.673668] [] ? vdev_mirror_map_free+0x30/0x30 [zfs]
[ 2585.673697] [] ? spa_config_enter+0xb3/0x100 [zfs]
[ 2585.673726] [] zio_vdev_io_start+0x237/0x2f0 [zfs]
[ 2585.673753] [] zio_nowait+0xaf/0x130 [zfs]
[ 2585.673781] [] spa_load_verify_cb+0x8e/0xb0 [zfs]
[ 2585.673804] [] traverse_visitbp+0x238/0x5b0 [zfs]
[ 2585.673832] [] ? vdev_mirror_map_free+0x30/0x30 [zfs]
[ 2585.673848] [] ? arc_buf_remove_ref+0x110/0x110 [zfs]
[ 2585.673853] [] ? default_spin_lock_flags+0x9/0x10
[ 2585.673875] [] traverse_dnode+0x80/0x110 [zfs]
[ 2585.673896] [] traverse_visitbp+0x41d/0x5b0 [zfs]
[ 2585.673917] [] traverse_visitbp+0x339/0x5b0 [zfs]
[ 2585.673938] [] traverse_visitbp+0x339/0x5b0 [zfs]
[ 2585.673958] [] traverse_visitbp+0x339/0x5b0 [zfs]
[ 2585.673978] [] traverse_visitbp+0x339/0x5b0 [zfs]
[ 2585.673999] [] traverse_visitbp+0x339/0x5b0 [zfs]
[ 2585.674019] [] traverse_visitbp+0x339/0x5b0 [zfs]
[ 2585.674039] [] traverse_dnode+0x80/0x110 [zfs]
[ 2585.674059] [] traverse_visitbp+0x4ba/0x5b0 [zfs]
[ 2585.674064] [] ? __wake_up+0x53/0x70
[ 2585.674084] [] traverse_impl+0x172/0x2f0 [zfs]
[ 2585.674111] [] ? spa_error_entry_compare+0x30/0x30 [zfs]
[ 2585.674132] [] traverse_dataset+0x37/0x40 [zfs]
[ 2585.674153] [] traverse_pool+0x19d/0x310 [zfs]
[ 2585.674180] [] ? spa_error_entry_compare+0x30/0x30 [zfs]
[ 2585.674208] [] spa_load_verify+0x73/0x1cf [zfs]
[ 2585.674236] [] spa_load+0xed3/0x1430 [zfs]
[ 2585.674263] [] spa_load_best+0x4e/0x200 [zfs]
[ 2585.674289] [] spa_import+0x190/0x690 [zfs]
[ 2585.674319] [] zfs_ioc_pool_import+0xf4/0x130 [zfs]
[ 2585.674347] [] zfsdev_ioctl+0xdc/0x1b0 [zfs]
[ 2585.674352] [] do_vfs_ioctl+0x8a/0x340
[ 2585.674355] [] ? putname+0x35/0x50
[ 2585.674358] [] ? do_sys_open+0x171/0x220
[ 2585.674362] [] sys_ioctl+0x91/0xa0
[ 2585.674365] [] system_call_fastpath+0x16/0x1b
[ 2585.674367] Code: 89 e5 41 54 53 48 83 ec 50 66 66 66 66 90 8b 5d 10 48 89 f0 49 89 d2 83 fb 01 75 09 48 85 f6 0f 85 99 00 00 00 41 bc 00 00 17 00 <49> 83 ba 90 00 00 00 00 48 8d 91 00 00 40 00 44 8b 9f 58 02 00
[ 2585.674392] RIP [] zio_vdev_child_io+0x2d/0xe0 [zfs]
[ 2585.674419] RSP
[ 2585.674421] CR2: 0000000000000090
[ 2585.674447] ---[ end trace 404707b601d563e1 ]---
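The log above contains two separate faults: a general protection fault in the spl_system_task thread and a NULL pointer dereference in the zpool process. As a rough aid for reading long traces like this one, the sketch below pulls the process name and faulting function out of each report. It assumes the line shapes seen in this paste ("Pid: N, comm: NAME" followed by "RIP: ... func+0xOFF/0xLEN [module]"); the function name `summarize_oopses` is made up for illustration and is not part of any ZFS or kernel tooling.

```python
import re

# Opening line of each report: "Pid: <pid>, comm: <process name> ..."
COMM_RE = re.compile(r"Pid: (\d+), comm: (\S+)")
# "RIP: 0010:... func+0xOFF/0xLEN [module]"; the addresses may be missing,
# as they are in this paste. The trailing summary line "RIP [] ..." has no
# colon after RIP and is intentionally not matched.
RIP_RE = re.compile(r"RIP: .*?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]")


def summarize_oopses(dmesg_text):
    """Return one (pid, comm, function, module) tuple per report, in log order."""
    results = []
    current = None  # (pid, comm) of the report currently being read
    for line in dmesg_text.splitlines():
        m = COMM_RE.search(line)
        if m:
            current = (int(m.group(1)), m.group(2))
            continue
        m = RIP_RE.search(line)
        if m and current is not None:
            results.append(current + (m.group(1), m.group(2)))
            current = None
    return results
```

Run over the dmesg text above, this should report spl_kmem_cache_alloc in module spl for pid 3966 and zio_vdev_child_io in module zfs for pid 5220.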

@olegkrutov
Author

Some additional data. This pool cannot be imported under the latest version of Nexenta either (it crashes and reboots). zdb -ddddd -e crashes on one of the regular files, so I cannot even automatically get information about the files that come after the crash point. But if I tell zdb to show details of the file following the one it crashes on, I get proper data. I think that if I could somehow "free" the bad file record that zdb crashes on, I could write a program to recover the pool data. Is that possible? I know the number and dataset name of the element zdb crashes on. But AFAIK I cannot write directly to ZFS disks as easily as I can read them via zdb or another official tool... Any ideas? Could zinject be the right tool for this?

@behlendorf
Contributor

The pool is probably savable with enough effort. Have you tried importing the pool using the previous txg number? zpool import -T txg pool. As for other recovery tools, zdb is read-only and meant for investigating what happened; everything else should be cleanly handled by zpool import. Obviously in this case it's not, so we'll want to run down the bug.

@olegkrutov
Author

OK, I tried zpool import -T txg -f pool; for txg I used numbers less than or equal to the last txg number from the labels. Is that right? It does nothing and just says: cannot import 'data': one or more devices is currently unavailable. The behaviour is the same on the stable and daily branches (0.6.0.86/87). But all devices are shown as online by zpool import.

@ryao
Contributor

ryao commented Nov 30, 2012

How much lower are the numbers you chose? It should have worked with the most recent transaction number - 1.

Would you try zpool import -T txg -f -o readonly=on pool and see what happens?
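For anyone scripting this, here is a minimal sketch that only assembles the command suggested above; it does not run it. The flags are exactly the ones quoted in this thread (-T, -f, -o readonly=on). The helper name `readonly_import_argv` and the txg value in the usage note are placeholders, not values taken from this pool.

```python
import shlex


def readonly_import_argv(pool, txg):
    """Build (but do not run) the argv for a forced, read-only import at a given txg."""
    txg = int(txg)
    if txg <= 0:
        raise ValueError("txg must be a positive integer")
    return ["zpool", "import", "-T", str(txg), "-f", "-o", "readonly=on", pool]


def as_shell_line(argv):
    """Render an argv list as a single, safely quoted shell command line."""
    return shlex.join(argv)
```

With a placeholder txg of 1216, `as_shell_line(readonly_import_argv("data", 1216))` gives `zpool import -T 1216 -f -o readonly=on data`. Keeping the import read-only means a rewind attempt at the wrong txg should not write anything to the disks.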

@ryao
Contributor

ryao commented Nov 30, 2012

@olegkrutov I think I see the problem. The label transaction numbers are not the most recent ones for your pool. You should do zdb -lu /path/to/device to get the transaction numbers from the uberblock history. Use the second latest one.
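Picking "the second latest one" by eye is error-prone when several disks each print many uberblocks, so here is a small sketch that does it from saved `zdb -lu` output. It assumes each uberblock prints its transaction group on a line of the form `txg = <number>`, while label config entries use `txg: <number>` and are skipped; check that against your own output before trusting it. The function names are hypothetical.

```python
import re

# Assumed uberblock line format in `zdb -lu` output: "txg = <number>".
# Label config lines of the form "txg: <number>" are deliberately not matched,
# since the label txg may lag behind the uberblock history.
UBERBLOCK_TXG_RE = re.compile(r"^\s*txg = (\d+)\s*$", re.MULTILINE)


def uberblock_txgs(zdb_lu_output):
    """Return the distinct uberblock txg values found in the text, newest first."""
    txgs = {int(n) for n in UBERBLOCK_TXG_RE.findall(zdb_lu_output)}
    return sorted(txgs, reverse=True)


def second_latest_txg(zdb_lu_output):
    """Return the second-highest uberblock txg, or None if fewer than two exist."""
    txgs = uberblock_txgs(zdb_lu_output)
    return txgs[1] if len(txgs) >= 2 else None
```

To use it, concatenate the `zdb -lu` output from every device in the pool into one string and pass it to `second_latest_txg`; duplicates across labels and disks are collapsed before sorting.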

@FransUrbo
Contributor

@olegkrutov @ryao @behlendorf Considering this is old (almost a year and a half) and against an old/ancient version of ZoL and kernel, should we keep this open or close it as stale?

@behlendorf behlendorf modified the milestone: 0.6.6 Nov 8, 2014
4 participants