On Tue, Jan 05, 2016 at 10:23:59 +0100, Michal Privoznik wrote:
On 04.01.2016 16:28, Peter Krempa wrote:
> On Tue, Dec 22, 2015 at 15:41:16 +0100, Michal Privoznik wrote:
>>
https://bugzilla.redhat.com/show_bug.cgi?id=1293351
>>
>> I've came across a bit of a silly scenario, but long story short:
>> after domain was resumed, the virDomainSetTime() API hung for 5
>> seconds after which it failed with an error. Problem was, that
>> there is no guest agent installed in the domain. But this got me
>> thinking and digging into the code. So even though we do listen
>> to VSERPORT_CHANGE events and connect and disconnect ourselves to
>> guest agent, we still do connect to the guest agent at both
>> domain startup and resume. This is a bit silly as it produces the
>> delay - basically, because we connect to the guest agent,
>> priv->agent is not NULL. Therefore qemuDomainAgentAvailable()
>> will return true. And the place where we hang? Well,
>> historically, when there was no VSERPORT_CHANGE event, we used a
>> dummy ping command to the guest which has 5 seconds timeout. And
>> it's still there and effective. So there's where the delay comes
>> from.
>> What's the resolution? Well, I'm introducing new capability
>> QEMU_CAPS_VSERPORT_CHANGE that is enabled iff qemu is capable of
>> the event. And, during domain startup, reconnect after resume and
>> attach we do connect to the agent socket iff the capability is
>> NOT set.
>
[...]
>
>> + config->state == VIR_DOMAIN_CHR_DEVICE_STATE_DISCONNECTED) {
>> + /* With newer qemus capable of VSERPORT_CHANGE event, we are connecting
and
>> + * disconnecting to the guest agent as it connects or disconnects to
the
>> + * channel within the guest. So, it's important to connect here iff
we are
>> + * running qemu not capable of the event or we are called from the
event
>> + * callback (@force == true). */
>
> So basically this gets called with @force true just from the callback.
> But in the callback you call this function if and only if the serial
> port state is _CONNECTED. All other callers call this with unknown port
> state. The @force argument thus doesn't make much sense.
Not true. We set the @config->state to _CONNECTED if and only if
connecting to guest agent succeeded. Prior that @config->state is left
untouched, i.e. is in _DISCONNECTED or _DEFAULT state.
Okay, I had to actually look at the code at this point. So the above
statement is not entirely true either. There is a bug in the code that
would not set the state to _CONNECTED if qemuConnectAgent failed.
Otherwise the state is always updated and for non-agent VIRTIO channels
it even doesn't have a condition that would allow this to fail.
As the channel state doesn't necessarily have to denote that the agent
socket is connected due to various reasons the state should always be
updated.
Moreover, I realized we will need to format @config->state into domain
XML more often - e.g. during migration - we want to connect on the other
side iff we are connected on the source. I have v3 patches ready.
I don't think so. This state can be re-queried at the destination of the
migration once the migration is complete and thus it does not need to be
transported. The only reason why the virtio channel state is parsed from
the XML is to report correct transitions in the event. Other than that I
don't think that any situation would require us to actually remember or
transport the original state since it can always be queried and acted
upon at the point where it's needed.
Michal