>>> Hi all:
>>> Suppose we have a guest domain which is pvops, for example, rhel6.4.
>>>
>>> Steps to produce the problem:
>>> 1 start the guest by virDomainCreate()
>>> 2 the API returns before the guest domain is fully available, which means
>>> the disks, network interfaces and some important services are not yet
>>> available inside the guest.
>>> 3 we call virDomainShutdown() to shutdown the guest.
>>>
>>> Expected result:
>>> The guest got shutdown.
>>>
>>> The result in fact:
>>> Because the guest is not available when we call virDomainShutdown(),
>>> it cannot respond to our 'shutdown' xenstore request, so the guest
>>> finishes booting later, rather than shutting down.
>>
>> I don't think this is unique to a pvops guest kernel, or even a xen stack. I
>> see the same behavior with qemu. 'virsh create dom.xml && virsh shutdown dom'
>> results in the guest kernel missing the shutdown event and booting anyhow. I
>> guess SeaBIOS could still be loading when the shutdown event is issued :-).
>> The virDomainShutdownFlags documentation even states "that the guest OS may
>> ignore the request". In my example, the guest OS isn't even alive yet.
>>
>>> So, the question is:
>>> In libxl_driver (xen hypervisor environment), how can we tell whether the
>>> guest is available, and whether it is suitable to shut down the guest at
>>> that moment?
>>
>> libxl has no API to determine if a guest OS has booted. In a qemu/kvm stack, I
>> suppose qemu-ga is the preferred way to know when a guest OS has booted, or is
>> far enough along to respond to shutdown events.
>>
>> One possible approach in xen, which is not supported by libvirt, would be to
>> monitor the state of a device frontend in xenstore. E.g. when
>> /local/domain/<domid>/device/vif/<vifid>/state reaches 4 (connected), you'll
>> at least know the driver in the guest is up and running.
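For illustration, polling that frontend state from dom0 could look roughly like
the sketch below. It assumes the xenstore-read command-line tool is installed;
the function names, the timeout and the polling interval are made up for this
example and are not part of libvirt or libxl.

```python
import subprocess
import time

XENBUS_STATE_CONNECTED = 4  # "connected" in the xenbus state enumeration


def frontend_state_path(domid, vifid):
    """Build the xenstore path of a vif frontend's state node."""
    return "/local/domain/%d/device/vif/%d/state" % (domid, vifid)


def read_xenstore(path):
    """Read one node with the xenstore-read CLI; return None on any failure."""
    try:
        out = subprocess.run(["xenstore-read", path], capture_output=True,
                             text=True, check=True).stdout
        return int(out.strip())
    except (OSError, subprocess.CalledProcessError, ValueError):
        return None


def wait_for_frontend(domid, vifid, timeout=60.0, interval=0.5,
                      read=read_xenstore):
    """Poll until the frontend reports 'connected' or the timeout expires."""
    path = frontend_state_path(domid, vifid)
    deadline = time.monotonic() + timeout
    while True:
        if read(path) == XENBUS_STATE_CONNECTED:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A True result only says the frontend driver has connected; it says nothing
about whether the guest is already handling shutdown requests.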
> I've tried that, but even the device state is not trustworthy: inside the
> guest, it calls "add_disk" after the device state changes to 4 and before it
> can respond to the 'shutdown' xenstore request, and that takes a while to
> complete.
Yeah, I thought that was a longshot. Synchronization of the front and backend
drivers doesn't necessarily mean the OS is in a position to respond to the
shutdown event. Lacking a guest agent, another option would be to wait for the
guest's network stack to come alive, e.g. until it responds to pings or
connection requests.
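A minimal sketch of the "wait for a connection" idea, assuming the guest runs
some TCP service reachable from the host (e.g. sshd on port 22, which is only
an assumption here). wait_for_tcp is a hypothetical helper, not a libvirt API:

```python
import socket
import time


def wait_for_tcp(host, port, timeout=120.0, interval=1.0, connect_timeout=1.0):
    """Return True once a TCP connect to (host, port) succeeds, else False."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means the guest's network stack and the
            # listening service are up; the socket is closed right away.
            with socket.create_connection((host, port), timeout=connect_timeout):
                return True
        except OSError:
            pass  # refused, unreachable or timed out: guest not ready yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The caller would issue virDomainShutdown() only after this returns True. It is
still a heuristic: an open port does not prove the guest will act on the
shutdown request.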
In practice, the host and guest may not be able to reach each other because
their networks are separated.
So, does that mean only a guest agent could solve this problem?
Regards,
Jim