On Jun 20, 2024, at 4:06 PM, Peter Xu <peterx(a)redhat.com> wrote:
On Thu, Jun 20, 2024 at 07:45:42PM +0000, Jon Kohler wrote:
>
>
>> On Jun 20, 2024, at 4:30 AM, Jiri Denemark <jdenemar(a)redhat.com> wrote:
>>
>> On Tue, Jun 18, 2024 at 16:14:29 +0100, Daniel P. Berrangé wrote:
>>> On Tue, Jun 18, 2024 at 08:06:06AM -0700, Jon Kohler wrote:
>>>> diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
>>>> index 2f5b01bbfe..9543629f30 100644
>>>> --- a/include/libvirt/libvirt-domain.h
>>>> +++ b/include/libvirt/libvirt-domain.h
>>>> @@ -1100,6 +1100,17 @@ typedef enum {
>>>> * Since: 8.5.0
>>>> */
>>>> VIR_MIGRATE_ZEROCOPY = (1 << 20),
>>>> +
>>>> + /* Use switchover ack migration capability to reduce downtime on VFIO
>>>> + * device migration. This prevents the source from stopping the VM and
>>>> + * completing the migration until an ACK is received from the destination
>>>> + * that it's OK to do so. Thus, a VFIO device can make sure that its
>>>> + * initial bytes were sent and loaded in the destination before the
>>>> + * source VM is stopped.
>>>> + *
>>>> + * Since: 10.5.0
>>>> + */
>>>> + VIR_MIGRATE_SWITCHOVER_ACK = (1 << 21),
>>>> } virDomainMigrateFlags;
>>>
>>> Do we really need a flag for this ? Is there a credible scenario
>>> in which this flag works, and yet shouldn't be used by libvirt ?
>>>
>>> IOW, can we just "do the right thing" and always enable this,
>>> except for TUNNELLED mode.
>>
>> I discussed this capability some time ago with Peter (I think) and
>> IIRC there was some downside when the capability is enabled for domains
>> that do not use VFIO. I don't remember exactly what it was about, but
>> perhaps introducing an extra delay in migration switchover? Peter, can
>> you add the details, please?
>
> Thanks - @Peter, if you have additional info on that, would love to know
> what the non-VFIO downsides are here.
So far, VFIO is the only one that registers this "ACK needed" hook.
When nobody registers with it, the ACK is sent up front, as soon as the
return path is established. That happens at the very beginning of a
migration, so the ACK is completely meaningless in that case.
That said, it may not be too bad to have that meaningless ACK if it
simplifies libvirt. It only happens once per migration, and once it has been
sent, migration should work exactly the same as when switchover-ack is not enabled.
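To make the timing described above concrete, here is a minimal sketch of the destination-side behaviour as a small state model. It is not QEMU code: the struct, field, and function names are all hypothetical, and the rule that the ACK waits for every registered device (rather than just one) is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the destination side; not QEMU's real structures. */
struct sketch_dst_state {
    int pending_acks; /* registered devices that have not loaded initial data */
    bool ack_sent;    /* the single switchover ACK goes out at most once */
};

/* Return path established: with no registered device the ACK goes out now. */
bool sketch_return_path_ready(struct sketch_dst_state *s, int registered)
{
    s->pending_acks = registered;
    s->ack_sent = (registered == 0);
    return s->ack_sent;
}

/* One registered device finished loading its initial data.
 * Returns true only for the call that actually sends the ACK. */
bool sketch_initial_data_loaded(struct sketch_dst_state *s)
{
    if (s->ack_sent || s->pending_acks == 0)
        return false;
    if (--s->pending_acks == 0) {
        s->ack_sent = true;
        return true;
    }
    return false;
}

/* Usage: non-VFIO guest (no hooks) versus a guest with one VFIO device. */
void sketch_ack_usage(void)
{
    struct sketch_dst_state plain, vfio;

    assert(sketch_return_path_ready(&plain, 0));  /* meaningless early ACK */
    assert(!sketch_return_path_ready(&vfio, 1));  /* ACK held back */
    assert(sketch_initial_data_loaded(&vfio));    /* ACK sent here */
}
```

In the no-hook case the ACK is out of the way before any guest data moves, which is why it should not change anything later in the migration.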
RE simplicity - that's exactly my thought here. If it is effectively a no-op, then just
enabling switchover-ack full-time on all migrations would make this quite easy.
Said another way, I take your comments to mean that there is no functional or
performance downside to enabling switchover-ack on non-VFIO migrations; therefore, we might
as well keep it simple and not special-case it on the libvirt side, right?
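A rough sketch of what "always enable, except for TUNNELLED mode" could look like as a decision helper. This is not the patch: the function name, the `qemu_has_cap` parameter, and the local `SKETCH_MIGRATE_TUNNELLED` macro are hypothetical stand-ins (a real implementation would use VIR_MIGRATE_TUNNELLED and libvirt's own capability checks), and skipping the capability when QEMU does not advertise it is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for VIR_MIGRATE_TUNNELLED; the bit value is only for this sketch. */
#define SKETCH_MIGRATE_TUNNELLED (1u << 2)

/* Decide whether to turn on switchover-ack without any user-visible flag.
 * flags:        migration flags requested by the caller
 * qemu_has_cap: whether QEMU reports the capability (assumed probe result) */
bool sketch_want_switchover_ack(unsigned int flags, bool qemu_has_cap)
{
    if (!qemu_has_cap)
        return false; /* nothing to enable on an older QEMU */
    if (flags & SKETCH_MIGRATE_TUNNELLED)
        return false; /* the one exception suggested in the review */
    return true;      /* VFIO or not, enable it */
}

/* Usage: only the tunnelled case and the missing-capability case opt out. */
void sketch_policy_usage(void)
{
    assert(sketch_want_switchover_ack(0, true));
    assert(!sketch_want_switchover_ack(SKETCH_MIGRATE_TUNNELLED, true));
    assert(!sketch_want_switchover_ack(0, false));
}
```

With a helper like this there would be no need for a new public VIR_MIGRATE_SWITCHOVER_ACK flag, and no check for whether the domain has VFIO devices.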
Thanks,
Jon
Thanks,
--
Peter Xu