On 08/09/2012 09:42 AM, Shradha Shah wrote:
Hello All,
I have recently found a bug in my patches to support forward mode="hostdev".
The bug is that libvirtd restart forgets about the netdef->forwardIfs.
Steps to reproduce:
1) virsh net-define hostdev.xml
<network>
<name>hostdev</name>
<uuid>81ff0d90-c91e-6742-64da-4a736edb9a8f</uuid>
<forward mode='hostdev' managed='yes'>
<pf dev='eth4'/>
</forward>
</network>
2) virsh define guest.xml
3) virsh start guest
[root@c6100m libvirt]# virsh net-dumpxml hostdev
<network>
<name>hostdev</name>
<uuid>81ff0d90-c91e-6742-64da-4a736edb9a8f</uuid>
<forward managed='yes' mode='hostdev'>
<pf dev='eth4'/>
<address type='pci' domain='0x0000' bus='0x04'
slot='0x00' function='0x2'/>
<address type='pci' domain='0x0000' bus='0x04'
slot='0x00' function='0x4'/>
<address type='pci' domain='0x0000' bus='0x04'
slot='0x00' function='0x6'/>
<address type='pci' domain='0x0000' bus='0x04'
slot='0x01' function='0x0'/>
<address type='pci' domain='0x0000' bus='0x04'
slot='0x01' function='0x2'/>
<address type='pci' domain='0x0000' bus='0x04'
slot='0x01' function='0x4'/>
<address type='pci' domain='0x0000' bus='0x04'
slot='0x01' function='0x6'/>
<address type='pci' domain='0x0000' bus='0x04'
slot='0x02' function='0x0'/>
</forward>
</network>
4) service libvirtd restart
After libvirtd is restarted I observe that the guest is shutdown.
/var/log/libvirt/libvirtd.log complains with the following error message:
networkNotifyActualDevice:3299 : internal error network 'hostdev' doesn't
have dev in use by domain.
Using gdb when I breakpoint on networkNotifyActualDevice on libvirtd start I observe that
netdef->nForwardIfs has been reset to 0, but netdef->nForwardPfs is correctly
assigned to 1.
So my question is does libvirtd restart cause the network to remember only the inactive
XML and not the active XML?
(This is a very long answer to a very short question, but that's kind of
in my nature, and hopefully the extra background will be helpful :-)
Well, both are actually remembered, but in this case the definition of
"active XML" isn't quite what you would think. The active xml is
whatever was last written to /var/lib/libvirt/network/${net-name}.xml,
and the inactive XML is what was last written to
/etc/libvirt/qemu/network/${net-name}.xml. When a network is defined,
the config is put into /etc, (and saved in netobj->def) and when it's
started, it's put into /var. If you edit a network's config while it is
active, the new config is stored into /etc (and netobj->newDef) (i.e.
that will be the "new definition" the next time the network is started),
but /var (and netobj->def) are left unchanged.
When libvirtd is restarted, the "active XML" from /var is put into
netobj->def, and the "inactive (or next) XML" is again put in
netobj->newDef.
In the qemu driver, live modifications to the state of the domain are
saved in domobj->def as well as /var/lib/libvirt/qemu, but up until now
no live changes to the network state have been supported (that's about
to change, but that's another story). Even in the case of the
forwardPfs, auto-populating the forwardIfs isn't really changing the
state of the network, it's just filling in some state that's already there.
In a way, this is similar to the usageCount (renamed to "connections" in
my recent patchset which isn't yet reviewed/pushed) - it's incremented
by one for every guest interface using a particular forwardIf, but that
information isn't saved to the file in /var (partly because doing so
could lead to an erroneous count if a qemu process was terminated while
libvirtd was in the middle of restarting). Instead, when libvirtd is
restarted, the network driver reloads all the networks with the
usageCounts all set to 0. Then, as the qemu driver reconnects to all of
the running domains, it calls into networkNotifyActualDevice() for each
guest interface connected to a network. networkNotifyActualDevice() does
some things similar to networkAllocateActualDevice(), except that
instead of searching for a device to use, qemu tells the network driver
which device it's already using. If that device isn't available, then
the network driver throws up its hands in error (because that shouldn't
ever happen), but otherwise just increments the usageCount and returns
success. This way, by the time qemu is finished reconnecting all its
domains, the usageCounts are all proper. (Note that libvirtd will not
allow anyone to start a new domain, or attach a new interface until
after all of the domains have been reconnected, so there is no chance of
some new domain/interface taking over a physical interface previously
allocated to some other domain.)
You need to repopulate the forwardifs from forwardPfs in
networkNotifyActualDevice just as you do in networkAllocateActualDevice.
That's something we should have caught in the review of commit 42c81d18.
(An alternate fix would be to do that populating of the forwardIfs from
forwardPfs when the network is (re)started, rather than waiting until
the first time an allocation is attempted.)
As I alluded to above, in the future it will be possible to make some
changes to the network definition and have them take effect immediately
(without restarting the network). When this is done, the "def + /var"
and "newDef + /etc" sets of data will behave more like they do for
domains. But even then, I think it's proper to not save the
autogenerated forwardIfs to the XML file, and to regenerate them on the
fly when libvirtd restarts.