Re: [libvirt] [Qemu-devel] [RFC 0/7] Live Migration with Pass-through Devices proposal

22 Apr 2015


* Daniel P. Berrange (berrange@redhat.com) wrote:
...
On Wed, Apr 22, 2015 at 06:12:25PM +0100, Dr. David Alan Gilbert wrote:
...
* Daniel P. Berrange (berrange@redhat.com) wrote:
...
On Wed, Apr 22, 2015 at 06:01:56PM +0100, Dr. David Alan Gilbert wrote:
...
* Daniel P. Berrange (berrange@redhat.com) wrote:
...
On Fri, Apr 17, 2015 at 04:53:02PM +0800, Chen Fan wrote:
...
Background:
Live migration is one of the most important features of virtualization technology.
For recent virtualization techniques, network I/O performance is critical.
Current network I/O virtualization (e.g. para-virtualized I/O, VMDq) has a significant
performance gap compared with native network I/O. Pass-through network devices have near
native performance; however, they have thus far prevented live migration. No existing
method solves the problem of live migration with pass-through devices perfectly.
One idea for solving the problem is described at:
https://www.kernel.org/doc/ols/2008/ols2008v2-pages-261-267.pdf
Please refer to the above document for detailed information.
So I think this problem could perhaps be solved by combining existing
technologies. The following are the steps we are considering implementing:
-  before booting the VM, we expect to specify two NICs in the XML for creating a bonding
   device (one plugged and one virtual NIC). Here we can specify the NICs' MAC addresses
   in the XML, which would help qemu-guest-agent find the network interfaces in the guest.
-  when qemu-guest-agent starts up in the guest, it would send a notification to libvirt,
   and libvirt would then call the previously registered initialization callbacks. Through
   the callback functions, we can create the bonding device according to the XML
   configuration. Here we use the netcf tool, which makes it easy to create a bonding
   device.
I'm not really clear on why libvirt/guest agent needs to be involved in this.
I think configuration of networking is really something that must be left to
the guest OS admin to control. I don't think the guest agent should be trying
to reconfigure guest networking itself, as that is inevitably going to conflict
with configuration attempted by things in the guest like NetworkManager or
systemd-networkd.
IOW, if you want to do this setup where the guest is given multiple NICs connected
to the same host LAN, then I think we should just let the guest admin configure
bonding in whatever manner they decide is best for their OS install.
I disagree; there should be a way for the admin not to have to do this manually;
however it should interact well with existing management stuff.
At the simplest, something that marks the two NICs in a discoverable way
so that it can be seen that they're part of a set; with just that ID system,
an installer or setup tool can notice them and offer to put them into
a bond automatically; I'd assume it would be possible to add a rule somewhere
that said anything with the same ID would automatically be added to the bond.
I didn't mean the admin would literally configure stuff manually. I really
just meant that the guest OS itself should decide how it is done, whether
NetworkManager magically does the right thing, or the person building the
cloud disk image provides a magic udev rule, or $something else. I just
don't think that the QEMU guest agent should be involved, as that will
definitely trample all over other things that manage networking in the
guest.
OK, good, that's about the same level I was at.
...
I could see this being solved in the cloud disk images by using
cloud-init metadata to mark the NICs as being in a set, or perhaps there
is some magic you could define in SMBIOS tables, or something else again.
A cloud-init based solution wouldn't need any QEMU work, but an SMBIOS
solution might.
Would either of these work with hotplug though?   I guess as the VM starts
off with the pair of NICs, then when you remove one and add it back after
migration then you don't need any more information added; so yes
cloud-init or SMBIOS would do it.  (I was thinking SMBIOS stuff
in the way that you get device/slot numbering that NIC naming is sometimes based
off).
What about if we hot-add a new NIC later on (not during migration);
a normal hot-add of a NIC now turns into a hot-add of two new NICs; how
do we pass the information at hot-add time to provide that?
Hmm, yes, actually hotplug would be a problem with that.
An even simpler idea would be to just keep things real dumb and simply
use the same MAC address for both NICs. Once you put them in a bond
device, the kernel will be copying the MAC address of the first NIC
into the second NIC anyway, so unless I'm missing something, we might
as well just use the same MAC address for both right away. That makes
it easy for the guest to discover NICs in the same set and works with
hotplug trivially.
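
To illustrate the same-MAC idea above, here is a minimal sketch of how something in the guest might spot NICs that belong to one set. Nothing here was proposed in the thread: the function name and the example interface names and MAC addresses are hypothetical, and a real guest would more likely do this from a udev rule or in NetworkManager. The input is assumed to be a mapping of interface name to MAC address, which on Linux could be read from /sys/class/net/<iface>/address.

```python
from collections import defaultdict


def group_nics_by_mac(nics):
    """Return {mac: [interface names]} for MACs shared by two or more NICs.

    `nics` maps interface name -> MAC address string. A MAC seen on only
    one interface is dropped, since a lone NIC is not part of a set.
    """
    by_mac = defaultdict(list)
    for name, mac in nics.items():
        # Normalise case and whitespace so "AA:BB" and "aa:bb" compare equal.
        by_mac[mac.strip().lower()].append(name)
    return {mac: sorted(names) for mac, names in by_mac.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical guest: eth0/eth1 form a pair, eth2 is an ordinary NIC.
    example = {
        "eth0": "52:54:00:aa:bb:01",
        "eth1": "52:54:00:AA:BB:01",
        "eth2": "52:54:00:cc:dd:02",
    }
    print(group_nics_by_mac(example))
```

Because the grouping depends only on the MAC, it gives the same answer whether both NICs were present at boot or one was hot-added later.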
I bet you need to distinguish the two NICs though; you'd want the bond
to send all the traffic through the real NIC during normal use;
and how does the guest know, when it sees the hotplug of the 1st NIC in the pair,
that this is a special NIC whose sibling is about to arrive?
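
One possible way to tell the two NICs apart, sketched here purely as an assumption and not something settled in this thread, is to look at the kernel driver bound to each interface and prefer whichever is not a virtual one. The function name, the default list of virtual drivers, and the example driver names are all illustrative. Note that this only addresses which NIC the bond should prefer; it does not answer how the guest knows a sibling is still to come.

```python
def pick_primary(drivers, virtual_drivers=("virtio_net",)):
    """Pick the bond's preferred NIC: the first one not using a virtual driver.

    `drivers` maps interface name -> kernel driver name. Returns None when
    only virtual NICs are present (for example while the pass-through NIC is
    unplugged for migration), so the caller can fall back to the virtual NIC.
    """
    for name in sorted(drivers):
        if drivers[name] not in virtual_drivers:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical pair: eth0 is the virtual NIC, eth1 the pass-through NIC.
    print(pick_primary({"eth0": "virtio_net", "eth1": "ixgbevf"}))
    # During migration only the virtual NIC remains.
    print(pick_primary({"eth0": "virtio_net"}))
```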

Dave
...
Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK