On 8/24/20 8:01 PM, Laine Stump wrote:
On 8/24/20 6:28 PM, Daniel Henrique Barboza wrote:
>
>
> On 8/23/20 1:52 AM, Laine Stump wrote:
>> Back when macvtap support was added in commit 315baab9443 in Feb. 2010
>> (libvirt-0.7.7), it was setup to autogenerate a name for the device if
>> one wasn't supplied, in the pattern "macvtap%d" (or
"macvlan%d"),
>> similar to the way an unspecified standard tap device name will lead
>> to an autogenerated "vnet%d".
>>
>> As a matter of fact, in commit ca1b7cc8e45 added in May 2010, the code
>> was changed to *always* ignore a supplied device name for macvtap
>> interfaces by deleting any name immediately during the <interface>
>> parsing (this was intended to prevent one domain which had failed to
>> completely start from deleting the macvtap device of another domain
>> which had subsequently been provided the same device name (this will
>> seem mildly ironic later). This was later fixed to only clear the
>> device name when inactive XML was being parsed. HOWEVER - this was
>> only done if the xml was <interface type='direct'> - autogenerated
>> names were not cleared for <interface type='network'> (which could
>> also result in a macvtap device).
>>
>> Although the names of "vnetX" tap devices had always been
>> automatically cleared when parsing <interface> (see commit d1304583d
>> from July 2008 (!)), at the time macvtap support was added, both vnetX
>> and macvtapX device names were always included when formatting the
>> XML.
>>
>> Then in commit a8be259d0cc (July 2011, libvirt-0.9.4), <interface>
>> formatting was changed to also clear out "vnetX" device names during
>> XML formatting as well. However the same treatment wasn't given to
>> "macvtapX".
>>
>> The result of all this (among other things) was that when a running guest
>
> I guess this is a stray sentence ^
You are guessing correctly.
My ferret-like brain is easily distracted in the middle of a thought, and sometimes when
I pop my stack back to where I was before, I forget to clean up some loose ends. In this
case, the above fragment just needs to be removed.
>
>>
>> Now in 2020, there has been a report that a failed migration leads to
>> the macvtap device of some other unrelated guest on the destination
>> host losing its network connectivity. It was determined that this was
>> due to the domain XML in the migration containing a macvtap device
>> name, e.g. "macvtap0", that was already in use on the
>> destination. Normally this wouldn't be a problem, because libvirt
>> would see that the device was already in use, and then find a
>> different unused name. But in this case, other external problems were
>> causing the migration to fail prior to selecting a macvtap device and
>> successfully opening it, and during error recovery, qemuProcessStop()
>> was called, which went through all def->nets objects and (if they were
>> macvtap) deleted the device specified in net->ifname; since libvirt
>> hadn't gotten to the point of replacing the incoming "macvtap0"
with
>> the name of a device it actually created for this guest, that meant
>> that "macvtap0" was deleted, *even though it was currently in use by a
>> different guest*!
>>
>> Whew!
>>
>> So, it turns out that when formatting "migratable" XML,
"vnetX"
>> devices are omitted, just as when formatting "inactive" XML (I'm
sure
>> there's a bit of code somewhere that says "if (migratable) then set
>> inactive too", but found it was easier to just try it out with "virsh
>> dumpxml blah --migratable"). By making the code in both interface
>> parsing and formatting consistent for "vnetX", "macvtapX",
and
>> "macvlanX", we can thus make sure that the autogenerated (and unneeded
>> / completely *not* wanted) macvtap device name will not be sent with
>> the migration XML. This way when a migration fails, net->ifname will
>> be NULL, and libvirt won't have any device to try and delete.
>>
>> Signed-off-by: Laine Stump <laine(a)redhat.com>
>> ---
>
> Cool story and good patch.
Well, I live for cool stories, after all :-) (seriously, I don't know if anyone else
so frequently has commit log messages longer than the actual patch. My wife would say
I'm confusing the situation by giving too much information; I say I'm being
thorough :-))
I don't shy away from long commit messages as well:
https://github.com/qemu/qemu/commit/6ca080453ea403959ccde661030ca16264acc181
I'm keeping it under control nowadays, but sometimes a relapse hits and when
I notice it's a 100 line commit msg to explain why I'm removing an "IF"
clause :)
DHB