[libvirt] can't start domain with a corrupted disk attached

Hi all,
I came across the issue below when testing:
1. Make a volume and attach it to domain A.
2. Unplug the VG from the host in order to emulate a volume failure.
3. Start domain A (fails).
In step 3, domain A can't start because the disk listed in its XML can't be found when the domain is created.
I'm not sure whether this is reasonable. By common sense, we can still start our system even if we have a corrupt data disk. Also, if we carelessly attach a corrupt volume to all the guests in a data center, all of them will fail to boot.
I suggest automatically detaching a disk that can't be found and just issuing a warning. Please let me know whether you think this is a bug or a feature. Thanks.

On Thu, Nov 03, 2011 at 05:33:56PM +0800, lvroyce wrote:
> Hi all,
> I came across the issue below when testing:
> 1. Make a volume and attach it to domain A.
> 2. Unplug the VG from the host in order to emulate a volume failure.
> 3. Start domain A (fails).
> In step 3, domain A can't start because the disk listed in its XML can't be found when the domain is created.
> I'm not sure whether this is reasonable. By common sense, we can still start our system even if we have a corrupt data disk. Also, if we carelessly attach a corrupt volume to all the guests in a data center, all of them will fail to boot.
> I suggest automatically detaching a disk that can't be found and just issuing a warning. Please let me know whether you think this is a bug or a feature. Thanks.
You are not simulating a corrupt data disk in your example; you are simulating a completely missing disk. This is simply an administrative error, and thus it is perfectly reasonable for the guest not to start when its disk is missing.
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 2011-11-3 18:12, Daniel P. Berrange wrote:
> On Thu, Nov 03, 2011 at 05:33:56PM +0800, lvroyce wrote:
> > Hi all,
> > I came across the issue below when testing:
> > 1. Make a volume and attach it to domain A.
> > 2. Unplug the VG from the host in order to emulate a volume failure.
> > 3. Start domain A (fails).
> > In step 3, domain A can't start because the disk listed in its XML can't be found when the domain is created.
> > I'm not sure whether this is reasonable. By common sense, we can still start our system even if we have a corrupt data disk. Also, if we carelessly attach a corrupt volume to all the guests in a data center, all of them will fail to boot.
> > I suggest automatically detaching a disk that can't be found and just issuing a warning. Please let me know whether you think this is a bug or a feature. Thanks.
> You are not simulating a corrupt data disk in your example; you are simulating a completely missing disk. This is simply an administrative error, and thus it is perfectly reasonable for the guest not to start when its disk is missing.
I think she was talking about a missing non-root disk. On a physical machine, if a non-root disk is gone, the system can still start up without it.
> Daniel

The 03/11/11, lvroyce wrote:
> Hi all,
> I came across the issue below when testing:
> 1. Make a volume and attach it to domain A.
> 2. Unplug the VG from the host in order to emulate a volume failure.
> 3. Start domain A (fails).
> In step 3, domain A can't start because the disk listed in its XML can't be found when the domain is created.
> I'm not sure whether this is reasonable. By common sense, we can still start our system even if we have a corrupt data disk. Also, if we carelessly attach a corrupt volume to all the guests in a data center, all of them will fail to boot.
> I suggest automatically detaching a disk that can't be found and just issuing a warning. Please let me know whether you think this is a bug or a feature. Thanks.
For most of my use cases, I'd rather the guest not start at all. The reason is that any attached disk is a mount point to somewhere. Not having the disk means that the dedicated space for this mount point is missing.
Think about a guest (a system with little disk space) keeping a local mirror of a distribution (data needing a lot of space). Starting the synchronisation script from cron without the data disk attached means that the synchronisation will restart from scratch and that the system will fail by running out of disk space.
Since attached disks are used in many different use cases, I think it would be a real gain to have a per-disk option stating whether the disk is critical for the guest to start.
--
Nicolas Sebrecht
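The per-disk option proposed above can be sketched in a few lines of Python. This is only an illustration of the idea, not libvirt code: the `Disk` class, its `critical` flag, `plan_startup` and `MissingDiskError` are all hypothetical names invented for this sketch, and the existence check is injectable so the logic can be tried without real volumes.

```python
import os
from dataclasses import dataclass
from typing import Callable, List, Tuple


class MissingDiskError(Exception):
    """Raised when a disk marked critical has no backing source."""


@dataclass
class Disk:
    target: str     # guest device name, e.g. "vdb"
    source: str     # host path of the backing volume
    critical: bool  # hypothetical flag: refuse to start the guest without it


def plan_startup(
    disks: List[Disk],
    exists: Callable[[str], bool] = os.path.exists,
) -> Tuple[List[Disk], List[str]]:
    """Return the disks to attach, plus one warning per skipped disk."""
    kept: List[Disk] = []
    warnings: List[str] = []
    for disk in disks:
        if exists(disk.source):
            kept.append(disk)
        elif disk.critical:
            # A critical disk is missing: abort the whole startup.
            raise MissingDiskError(f"{disk.target}: {disk.source} is missing")
        else:
            # A non-critical disk is missing: skip it and only warn.
            warnings.append(f"{disk.target}: {disk.source} is missing, skipping")
    return kept, warnings
```

With such a flag, the mirror guest described above would mark its data disk as critical and refuse to start without it, while a guest with a scratch disk could mark it non-critical and boot with a warning.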

On 03.11.2011 10:33, lvroyce wrote:
> Hi all,
> I came across the issue below when testing:
> 1. Make a volume and attach it to domain A.
> 2. Unplug the VG from the host in order to emulate a volume failure.
> 3. Start domain A (fails).
> In step 3, domain A can't start because the disk listed in its XML can't be found when the domain is created.
> I'm not sure whether this is reasonable. By common sense, we can still start our system even if we have a corrupt data disk. Also, if we carelessly attach a corrupt volume to all the guests in a data center, all of them will fail to boot.
> I suggest automatically detaching a disk that can't be found and just issuing a warning. Please let me know whether you think this is a bug or a feature. Thanks.
I think I should join this discussion, as I am the author of commit e5a84d74a2789a917bf394f15de9989ec48fded0, aka startupPolicy.
I think it is reasonable to allow users to drop any disk on startup, as one can do this with a real host. Users should, however, take special care not to remove the root/boot disk, but I don't think that is something libvirt should try to guard against. This is why we currently support this feature only on cdrom & floppy.
Modifying the libvirt code should be simple:
1) In src/conf/domain_conf.c:2740, change the check so that other disks can have *only* the 'optional' or 'mandatory' values assigned.
2) Change qemuDomainCheckDiskPresence (src/qemu/qemu_domain.c) so that, for a non-cdrom disk, it drops the whole disk instead of just disk->src.
There might be something more to be done, though; I have not tried to code this. Moreover, dropping the whole disk may result in a changed domain XML, so be careful.
Michal
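The behaviour described in step 2 can be sketched in Python on top of a domain XML string. This is an illustration only, not the libvirt C implementation: `drop_missing_optional_disks` is a hypothetical helper, treating a disk without a `startupPolicy` attribute as 'mandatory' is an assumption of the sketch, and only the 'optional' and 'mandatory' values mentioned above are handled. For cdrom and floppy devices only the source is dropped (leaving an empty drive); for any other disk the whole `<disk>` element is removed.

```python
import os
import xml.etree.ElementTree as ET
from typing import Callable, List, Tuple


def drop_missing_optional_disks(
    domain_xml: str,
    exists: Callable[[str], bool] = os.path.exists,
) -> Tuple[str, List[str]]:
    """Return the (possibly changed) domain XML and a list of warnings."""
    root = ET.fromstring(domain_xml)
    devices = root.find("devices")
    warnings: List[str] = []
    if devices is None:
        return domain_xml, warnings
    # Iterate over a copy, because disks may be removed while looping.
    for disk in list(devices.findall("disk")):
        source = disk.find("source")
        if source is None:
            continue  # already an empty drive
        path = source.get("file") or source.get("dev")
        if path is None or exists(path):
            continue
        # Assumption of this sketch: no attribute means 'mandatory'.
        policy = source.get("startupPolicy", "mandatory")
        if policy != "optional":
            raise FileNotFoundError(path)
        if disk.get("device") in ("cdrom", "floppy"):
            disk.remove(source)   # keep the drive, drop only its media
        else:
            devices.remove(disk)  # drop the whole disk from the domain
        warnings.append(f"{path} is missing, dropped")
    return ET.tostring(root, encoding="unicode"), warnings
```

Note that the returned XML differs from the input whenever a disk is dropped, which is the changed-domain-XML caveat raised above.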
participants (5)
- Daniel P. Berrange
- lvroyce
- Michal Privoznik
- Nicolas Sebrecht
- shu ming