
I am able to boot VMs using RBD as the root disk. Stopping and restarting a VM works fine; however, whenever the host goes down unexpectedly, say due to a power outage, the next boot of that VM drops into the initramfs shell with a corrupted root disk. I have tried to fix this, but to no avail. Here are the errors I get when the automatic fsck runs and I then try to repair the filesystem manually:

done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... [ 7.760625] Btrfs loaded, crc32c=crc32c-intel, zoned=yes, fsverity=yes
Scanning for Btrfs filesystems
done.
Begin: Will now check root file system ... fsck from util-linux 2.37.2
[/usr/sbin/fsck.ext4 (1) -- /dev/vda1] fsck.ext4 -a -C0 /dev/vda1
[ 7.866954] blk_update_request: I/O error, dev vda, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
cloudimg-rootfs: recovering journal
[ 8.164279] blk_update_request: I/O error, dev vda, sector 227328 op 0x1:(WRITE) flags 0x800 phys_seg 24 prio class 0
[ 8.168272] Buffer I/O error on dev vda1, logical block 0, lost async page write
[ 8.170413] Buffer I/O error on dev vda1, logical block 1, lost async page write
[ 8.172545] Buffer I/O error on dev vda1, logical block 2, lost async page write
[ 8.174601] Buffer I/O error on dev vda1, logical block 3, lost async page write
[ 8.176651] Buffer I/O error on dev vda1, logical block 4, lost async page write
[ 8.178694] Buffer I/O error on dev vda1, logical block 5, lost async page write
[ 8.180601] Buffer I/O error on dev vda1, logical block 6, lost async page write
[ 8.182641] Buffer I/O error on dev vda1, logical block 7, lost async page write
[ 8.184710] Buffer I/O error on dev vda1, logical block 8, lost async page write
[ 8.186744] Buffer I/O error on dev vda1, logical block 9, lost async page write
[ 8.188748] blk_update_request: I/O error, dev vda, sector 229392 op 0x1:(WRITE) flags 0x800 phys_seg 8 prio class 0
[ 8.191433] blk_update_request: I/O error, dev vda, sector 229440 op 0x1:(WRITE) flags 0x800 phys_seg 32 prio class 0
[ 8.194204] blk_update_request: I/O error, dev vda, sector 229480 op 0x1:(WRITE) flags 0x800 phys_seg 16 prio class 0
[ 8.196976] blk_update_request: I/O error, dev vda, sector 229512 op 0x1:(WRITE) flags 0x800 phys_seg 8 prio class 0
[ 8.243612] blk_update_request: I/O error, dev vda, sector 229544 op 0x1:(WRITE) flags 0x800 phys_seg 8 prio class 0
[ 8.246068] blk_update_request: I/O error, dev vda, sector 229640 op 0x1:(WRITE) flags 0x800 phys_seg 32 prio class 0
[ 8.248668] blk_update_request: I/O error, dev vda, sector 229688 op 0x1:(WRITE) flags 0x800 phys_seg 8 prio class 0
[ 8.251174] blk_update_request: I/O error, dev vda, sector 229704 op 0x1:(WRITE) flags 0x800 phys_seg 8 prio class 0
fsck.ext4: Input/output error while recovering journal of cloudimg-rootfs
fsck.ext4: unable to set superblock flags on cloudimg-rootfs
cloudimg-rootfs: ********** WARNING: Filesystem still has errors **********
fsck exited with status code 12
done.
Failure: File system check of the root filesystem failed
The root filesystem on /dev/vda1 requires a manual fsck

BusyBox v1.30.1 (Ubuntu 1:1.30.1-7ubuntu3.1) built-in shell (ash)
Enter 'help' for a list of built-in commands.
(initramfs) fsck.ext4 -f -y /dev/vda1
e2fsck 1.46.5 (30-Dec-2021)
[ 24.286341] print_req_error: 174 callbacks suppressed
[ 24.286358] blk_update_request: I/O error, dev vda, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
cloudimg-rootfs: recovering journal
[ 24.552343] blk_update_request: I/O error, dev vda, sector 227328 op 0x1:(WRITE) flags 0x800 phys_seg 24 prio class 0
[ 24.556674] buffer_io_error: 5222 callbacks suppressed
[ 24.558925] Buffer I/O error on dev vda1, logical block 0, lost async page write
[ 24.562116] Buffer I/O error on dev vda1, logical block 1, lost async page write
[ 24.565161] Buffer I/O error on dev vda1, logical block 2, lost async page write
[ 24.567872] Buffer I/O error on dev vda1, logical block 3, lost async page write
[ 24.570586] Buffer I/O error on dev vda1, logical block 4, lost async page write
[ 24.573418] Buffer I/O error on dev vda1, logical block 5, lost async page write
[ 24.575940] Buffer I/O error on dev vda1, logical block 6, lost async page write
[ 24.578622] Buffer I/O error on dev vda1, logical block 7, lost async page write
[ 24.581386] Buffer I/O error on dev vda1, logical block 8, lost async page write
[ 24.583873] Buffer I/O error on dev vda1, logical block 9, lost async page write
[ 24.586410] blk_update_request: I/O error, dev vda, sector 229392 op 0x1:(WRITE) flags 0x800 phys_seg 8 prio class 0
[ 24.589821] blk_update_request: I/O error, dev vda, sector 229440 op 0x1:(WRITE) flags 0x800 phys_seg 32 prio class 0
[ 24.593380] blk_update_request: I/O error, dev vda, sector 229480 op 0x1:(WRITE) flags 0x800 phys_seg 16 prio class 0
[ 24.596615] blk_update_request: I/O error, dev vda, sector 229512 op 0x1:(WRITE) flags 0x800 phys_seg 8 prio class 0
[ 24.643829] blk_update_request: I/O error, dev vda, sector 229544 op 0x1:(WRITE) flags 0x800 phys_seg 8 prio class 0
[ 24.646924] blk_update_request: I/O error, dev vda, sector 229640 op 0x1:(WRITE) flags 0x800 phys_seg 32 prio class 0
[ 24.650051] blk_update_request: I/O error, dev vda, sector 229688 op 0x1:(WRITE) flags 0x800 phys_seg 8 prio class 0
[ 24.653128] blk_update_request: I/O error, dev vda, sector 229704 op 0x1:(WRITE) flags 0x800 phys_seg 8 prio class 0
fsck.ext4: Input/output error while recovering journal of cloudimg-rootfs
fsck.ext4: unable to set superblock flags on cloudimg-rootfs
cloudimg-rootfs: ********** WARNING: Filesystem still has errors **********

Note that every failing operation in the log is a write; reads appear to work, since fsck can scan the filesystem. This is what my RBD disk definition looks like:

<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <auth username='dove'>
    <secret type='ceph' uuid='b608caae-5eb4-45cc-bfd4-0b4ac11c7613'/>
  </auth>
  <source protocol='rbd' name='vms/wing-64700f1d-8c469a54-3f50-4d1e-9db2-2b6ea5f3d14a'>
    <host name='x.168.1.x' port='6789'/>
  </source>
  <target dev='vda' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>

So, my questions are:
- How do I prevent this from happening? I have already tried different options, such as changing the "cache" value in the disk definition, without success.
- How can a VM in this state be fixed?

Thanks
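Update: since all the failed operations are writes, I suspect (this is only my guess, not a confirmed diagnosis) that the image may still carry an exclusive lock or watcher left over from the crashed host, which would make writes fail while reads succeed. For reference, these are the checks I am running from the Ceph side; the pool/image name is taken from the disk XML above, and LOCK_ID/LOCKER are placeholders for whatever "rbd lock ls" actually prints:

```shell
# Show watchers and general state of the image (read-only checks).
rbd status vms/wing-64700f1d-8c469a54-3f50-4d1e-9db2-2b6ea5f3d14a

# List any locks still held on the image.
rbd lock ls vms/wing-64700f1d-8c469a54-3f50-4d1e-9db2-2b6ea5f3d14a

# If a stale lock shows up, it can be removed using the lock ID and
# locker shown by "rbd lock ls" (LOCK_ID and LOCKER are placeholders):
# rbd lock rm vms/wing-64700f1d-8c469a54-3f50-4d1e-9db2-2b6ea5f3d14a LOCK_ID LOCKER

# To run fsck outside the VM, the image can be mapped on a host via krbd
# (device name /dev/rbd0 is whatever "rbd map" prints on your host):
# rbd map vms/wing-64700f1d-8c469a54-3f50-4d1e-9db2-2b6ea5f3d14a
# fsck.ext4 -f /dev/rbd0p1
# rbd unmap /dev/rbd0
```

The destructive steps are commented out on purpose; I only run them after confirming a lock is actually stale and no live client holds it.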