----- On Mar 29, 2021, at 12:58 PM, Bernd Lentes bernd.lentes(a)helmholtz-muenchen.de
wrote:
Hi,
we have a two-node cluster with Pacemaker and a SAN.
The resources are inside virtual domains.
The images of the virtual disks reside on the SAN.
On one domain I see hard disk errors in my log:
2021-03-24T21:02:28.416504+01:00 geneious kernel: [2159685.909613] JBD2:
Detected IO errors while flushing file data on dm-1-8
2021-03-24T21:02:46.505323+01:00 geneious kernel: [2159704.012213] JBD2:
Detected IO errors while flushing file data on dm-1-8
2021-03-24T21:02:55.573149+01:00 geneious kernel: [2159713.078560] JBD2:
Detected IO errors while flushing file data on dm-1-8
2021-03-24T21:03:23.702946+01:00 geneious kernel: [2159741.202546] JBD2:
Detected IO errors while flushing file data on dm-1-8
2021-03-24T21:03:30.289606+01:00 geneious kernel: [2159747.796192] ------------[
cut here ]------------
2021-03-24T21:03:30.289635+01:00 geneious kernel: [2159747.796207] WARNING: CPU:
0 PID: 457 at ../fs/buffer.c:1108 mark_buffer_dirty+0xe8/0x100
2021-03-24T21:03:30.289637+01:00 geneious kernel: [2159747.796208] Modules
linked in: st sr_mod cdrom lp parport_pc ppdev parport xfrm_user xfrm_algo
binfmt_misc uinput nf_log_ipv6 xt_comment nf_log_ipv4 nf_log_common xt_LOG
xt_limit af_packet iscsi_ibft iscsi_boot_sysfs ip6t_REJECT nf_conntrack_ipv6
nf_defrag_ipv6 ipt_REJECT xt_pkttype xt_tcpudp iptable_filter ip6table_mangle
nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4
nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack libcrc32c ip6table_filter
ip6_tables x_tables joydev virtio_net net_failover failover virtio_balloon
i2c_piix4 qemu_fw_cfg pcspkr button ext4 crc16 jbd2 mbcache ata_generic
hid_generic usbhid ata_piix sd_mod virtio_rng ahci floppy libahci serio_raw
ehci_pci bochs_drm drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops ttm drm uhci_hcd ehci_hcd usbcore virtio_pci
2021-03-24T21:03:30.289637+01:00 geneious kernel: [2159747.796374]
drm_panel_orientation_quirks libata dm_mirror dm_region_hash dm_log sg
dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4
[last unloaded: parport_pc]
2021-03-24T21:03:30.289643+01:00 geneious kernel: [2159747.796400] Supported:
Yes
2021-03-24T21:03:30.289644+01:00 geneious kernel: [2159747.796405] CPU: 0 PID:
457 Comm: jbd2/dm-0-8 Not tainted 4.12.14-122.57-default #1 SLE12-SP5
2021-03-24T21:03:30.289644+01:00 geneious kernel: [2159747.796406] Hardware
name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.0-0-ga698c89-rebuilt.suse.com 04/01/2014
2021-03-24T21:03:30.289645+01:00 geneious kernel: [2159747.796407] task:
ffff8ba32766c380 task.stack: ffff99954124c000
2021-03-24T21:03:30.289645+01:00 geneious kernel: [2159747.796409] RIP:
0010:mark_buffer_dirty+0xe8/0x100
2021-03-24T21:03:30.289646+01:00 geneious kernel: [2159747.796409] RSP:
0018:ffff99954124fcf0 EFLAGS: 00010246
2021-03-24T21:03:30.289650+01:00 geneious kernel: [2159747.796413] RAX:
0000000000a20828 RBX: ffff8ba209a58d90 RCX: ffff8ba3292d7958
2021-03-24T21:03:30.289651+01:00 geneious kernel: [2159747.796413] RDX:
ffff8ba209a585b0 RSI: ffff8ba24270b690 RDI: ffff8ba3292d7958
2021-03-24T21:03:30.289652+01:00 geneious kernel: [2159747.796414] RBP:
ffff8ba3292d7958 R08: ffff8ba209a585b0 R09: 0000000000000001
2021-03-24T21:03:30.289652+01:00 geneious kernel: [2159747.796415] R10:
ffff8ba328c1c0b0 R11: ffff8ba287805380 R12: ffff8ba3292d795a
2021-03-24T21:03:30.289653+01:00 geneious kernel: [2159747.796415] R13:
0000000000000000 R14: ffff8ba3292d7958 R15: ffff8ba209a58d90
2021-03-24T21:03:30.289653+01:00 geneious kernel: [2159747.796417] FS:
0000000000000000(0000) GS:ffff8ba333c00000(0000) knlGS:0000000000000000
2021-03-24T21:03:30.289654+01:00 geneious kernel: [2159747.796417] CS: 0010 DS:
0000 ES: 0000 CR0: 0000000080050033
2021-03-24T21:03:30.289654+01:00 geneious kernel: [2159747.796418] CR2:
0000000099bff000 CR3: 0000000101b06000 CR4: 00000000000006f0
2021-03-24T21:03:30.289655+01:00 geneious kernel: [2159747.796424] Call Trace:
2021-03-24T21:03:30.289656+01:00 geneious kernel: [2159747.796470]
__jbd2_journal_refile_buffer+0xbb/0xe0 [jbd2]
2021-03-24T21:03:30.289656+01:00 geneious kernel: [2159747.796479]
jbd2_journal_commit_transaction+0xf1a/0x1870 [jbd2]
2021-03-24T21:03:30.289657+01:00 geneious kernel: [2159747.796489] ?
__switch_to_asm+0x41/0x70
2021-03-24T21:03:30.289658+01:00 geneious kernel: [2159747.796490] ?
__switch_to_asm+0x35/0x70
2021-03-24T21:03:30.289662+01:00 geneious kernel: [2159747.796493]
kjournald2+0xbb/0x230 [jbd2]
2021-03-24T21:03:30.289663+01:00 geneious kernel: [2159747.796499] ?
wait_woken+0x80/0x80
2021-03-24T21:03:30.289663+01:00 geneious kernel: [2159747.796503]
kthread+0xf6/0x130
2021-03-24T21:03:30.289664+01:00 geneious kernel: [2159747.796508] ?
commit_timeout+0x10/0x10 [jbd2]
2021-03-24T21:03:30.289664+01:00 geneious kernel: [2159747.796510] ?
kthread_bind+0x10/0x10
2021-03-24T21:03:30.289665+01:00 geneious kernel: [2159747.796511]
ret_from_fork+0x35/0x40
2021-03-24T21:03:30.289665+01:00 geneious kernel: [2159747.796517] Code: 1b 48
8b 03 48 8b 7b 08 48 83 c3 18 48 89 ee e8 bf 42 76 00 48 8b 03 48 85 c0 75 e8
e9 3c ff ff ff 48 89 df 5b 5d e9
c8 35 fb ff <0f> 0b e9 26 ff ff ff 48 83 e8 01 e9 5b ff ff ff 0f 1f 84 00 00
2021-03-24T21:03:30.289670+01:00 geneious kernel: [2159747.796533] ---[ end
trace db796891c8ff94af ]---
2021-03-24T21:03:46.593225+01:00 geneious kernel: [2159764.100145] JBD2:
Detected IO errors while flushing file data on dm-1-8
2021-03-24T21:05:09.372772+01:00 geneious kernel: [2159846.877201] JBD2:
Detected IO errors while flushing file data on dm-1-8
2021-03-24T21:06:39.943519+01:00 geneious kernel: [2159937.381068] JBD2:
Detected IO errors while flushing file data on dm-1-8
2021-03-24T21:07:42.364311+01:00 geneious kernel: [2159999.793805] JBD2:
Detected IO errors while flushing file data on dm-1-8
2021-03-24T21:07:57.822133+01:00 geneious kernel: [2160015.291776] JBD2:
Detected IO errors while flushing file data on dm-1-8
First, I'm wondering: what is dm-1-8?
I don't have a device like that.
geneious:~ # find /dev -iname '*dm*'
/dev/dm-1
/dev/dm-0
/dev/disk/by-id/dm-uuid-LVM-a9Cy1cweHgXlAEECqZL5KZBfnuigUG6lq0ntdZJxxLIIp5G8XihsuYrTbx7Rs0vc
/dev/disk/by-id/dm-name-vg_local-lv_var
/dev/disk/by-id/dm-uuid-LVM-a9Cy1cweHgXlAEECqZL5KZBfnuigUG6l3fdsOpBFoDWral3Fa7c6ZeYECmLd6FFj
/dev/disk/by-id/dm-name-vg_local-lv_root
/dev/cpu_dma_latency
The only thing I can find is /proc/fs/jbd2/dm-1-8.
There is a file /proc/fs/jbd2/dm-1-8/info:
453005 transactions (319055 requested), each up to 8192 blocks
average:
0ms waiting for transaction
12ms request delay
5124ms running transaction
0ms transaction was being locked
0ms flushing data (in ordered mode)
44ms logging transaction
8031us average transaction commit time
64 handles per transaction
5 blocks per transaction
6 logged blocks per transaction
What is that?
The log file also mentions dm-0-8:
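A minimal sketch for checking where such a name might come from. It assumes (this is not confirmed anywhere in this mail) that jbd2 names a journal <block device>-<journal inode number>, so that dm-1-8 would be the journal in inode 8 of the filesystem on /dev/dm-1. The function name split_jbd2_name is made up for illustration; the mapped name is read from the sysfs attribute /sys/block/<dev>/dm/name.

```shell
#!/bin/sh
# Split a jbd2 journal name such as "dm-1-8" into device and inode.
# Assumption: the part after the last "-" is the journal inode number.
split_jbd2_name() {
    name=$1
    dev=${name%-*}      # drop the shortest "-suffix" at the end
    ino=${name##*-}     # keep only what follows the last "-"
    echo "$dev $ino"
}

# Walk all journals known to the running kernel and resolve their devices.
for j in /proc/fs/jbd2/*; do
    [ -d "$j" ] || continue
    set -- $(split_jbd2_name "$(basename "$j")")
    dev=$1
    ino=$2
    # Device-mapper devices expose their mapped name in sysfs.
    if [ -r "/sys/block/$dev/dm/name" ]; then
        mapped=$(cat "/sys/block/$dev/dm/name")
    else
        mapped="(not a device-mapper device)"
    fi
    echo "journal $(basename "$j"): device /dev/$dev, inode $ino, name $mapped"
done
```

If the assumption holds, the output should pair dm-0 and dm-1 with the two logical volumes listed under /dev/disk/by-id above.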
2021-03-24T21:03:30.289644+01:00 geneious kernel: [2159747.796405] CPU: 0 PID:
457 Comm: jbd2/dm-0-8 Not tainted 4.12.14-122.57-default #1 SLE12-SP5
geneious:~ # find / -iname dm-0-8
/proc/fs/jbd2/dm-0-8
geneious:~ # ll /proc/fs/jbd2/dm-0-8
total 0
-r--r--r-- 1 root root 0 Mar 29 12:56 info
geneious:~ # cat /proc/fs/jbd2/dm-0-8/info
7356 transactions (556 requested), each up to 8192 blocks
average:
0ms waiting for transaction
20ms request delay
5628ms running transaction
4ms transaction was being locked
0ms flushing data (in ordered mode)
132ms logging transaction
134769us average transaction commit time
52 handles per transaction
18 blocks per transaction
19 logged blocks per transaction
geneious:~ #
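To compare the two journals side by side, the commit-time line can be pulled out of each info file. This is only a sketch that assumes the file layout shown above; the function name commit_time_us is made up for illustration.

```shell
#!/bin/sh
# Print the average transaction commit time (in microseconds) from jbd2
# "info" text read on stdin. Assumes a line of the form
# "8031us average transaction commit time", as shown in this mail.
commit_time_us() {
    awk '/average transaction commit time/ { sub(/us$/, "", $1); print $1 }'
}

# Report the value for every journal the running kernel knows about.
for f in /proc/fs/jbd2/*/info; do
    [ -r "$f" ] || continue
    printf '%s: %s us\n' "$(basename "$(dirname "$f")")" "$(commit_time_us < "$f")"
done
```

With the two info files quoted above, the extracted values would be 8031 for dm-1-8 and 134769 for dm-0-8.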
I assume I have a hard disk problem. I'm currently checking the SAN with its
own tools, via a web interface.
Afterwards I want to stop the domain, boot it from a live CD and run badblocks
and fsck.ext3.
What else can I do?
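The planned check could look roughly like the following sketch. Both tools are run read-only here (badblocks without -n or -w, fsck.ext3 with -n), so nothing on the disk is modified. The device path is a placeholder and is not taken from the logs above; run and check_readonly are made-up helper names. With DRY_RUN=1 (the default in this sketch) the commands are only printed.

```shell
#!/bin/sh
# Read-only disk and filesystem check, meant to be run from the live CD
# while the filesystem is NOT mounted.
# DRY_RUN=1 (default) only prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

check_readonly() {
    dev=$1
    # badblocks defaults to a non-destructive read-only scan;
    # -s shows progress, -v is verbose.
    run badblocks -s -v "$dev"
    # -f forces a check even if the filesystem looks clean,
    # -n opens it read-only and answers "no" to every repair question.
    run fsck.ext3 -f -n "$dev"
}

# /dev/EXAMPLE is a placeholder, replace it with the real device.
check_readonly "${1:-/dev/EXAMPLE}"
```

Running it with DRY_RUN=0 and the real device as the first argument would execute the two commands instead of printing them.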
Bernd
I forgot:
The host is SLES 12 SP5, and so is the virtual domain.
The image file is in raw format.