On Tue, May 19, 2015 at 06:08:10PM +0200, Michael S. Tsirkin wrote:
> On Tue, May 19, 2015 at 04:45:03PM +0100, Daniel P. Berrange wrote:
> On Tue, May 19, 2015 at 05:39:05PM +0200, Michael S. Tsirkin wrote:
> > On Tue, May 19, 2015 at 04:35:08PM +0100, Daniel P. Berrange wrote:
> > > On Tue, May 19, 2015 at 04:03:04PM +0100, Dr. David Alan Gilbert wrote:
> > > > * Daniel P. Berrange (berrange(a)redhat.com) wrote:
> > > > > On Tue, May 19, 2015 at 10:15:17AM -0400, Laine Stump wrote:
> > > > > > On 05/19/2015 05:07 AM, Michael S. Tsirkin wrote:
> > > > > > > >> On Wed, Apr 22, 2015 at 10:23:04AM +0100, Daniel P. Berrange wrote:
> > > > > > > >>> On Fri, Apr 17, 2015 at 04:53:02PM +0800, Chen Fan wrote:
> > > > > > > >>> background:
> > > > > > > >>> Live migration is one of the most important features of
> > > > > > > >>> virtualization technology. With regard to recent
> > > > > > > >>> virtualization techniques, performance of network I/O is
> > > > > > > >>> critical. Current network I/O virtualization (e.g.
> > > > > > > >>> para-virtualized I/O, VMDq) has a significant performance
> > > > > > > >>> gap compared with native network I/O. Pass-through network
> > > > > > > >>> devices have near-native performance; however, they have
> > > > > > > >>> thus far prevented live migration. No existing methods
> > > > > > > >>> solve the problem of live migration with pass-through
> > > > > > > >>> devices perfectly.
> > > > > > > >>>
> > > > > > > >>> There was an idea to solve the problem described at:
> > > > > > > >>> https://www.kernel.org/doc/ols/2008/ols2008v2-pages-261-267.pdf
> > > > > > > >>> Please refer to the above document for detailed information.
> > > > > > > >>>
> > > > > > > >>> So I think this problem could perhaps be solved by combining
> > > > > > > >>> existing technologies. The following are the steps we are
> > > > > > > >>> considering for the implementation:
> > > > > > > >>>
> > > > > > > >>> - before booting the VM, we specify two NICs in the XML for
> > > > > > > >>>   creating a bonding device (one passed-through and one
> > > > > > > >>>   virtual NIC). Here we can specify the NICs' MAC addresses
> > > > > > > >>>   in the XML, which helps qemu-guest-agent find the network
> > > > > > > >>>   interfaces in the guest.
> > > > > > > >>>
> > > > > > > >>> - when qemu-guest-agent starts up in the guest, it sends a
> > > > > > > >>>   notification to libvirt, and libvirt then calls the
> > > > > > > >>>   previously registered initialization callbacks. Through
> > > > > > > >>>   those callback functions we can create the bonding device
> > > > > > > >>>   according to the XML configuration. Here we use the netcf
> > > > > > > >>>   tool, which makes it easy to create the bonding device.
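For illustration of the first step: two NICs with explicit MAC addresses can
already be described with existing libvirt domain XML, e.g. one pass-through
and one virtio interface (all addresses and the network name below are example
values only):

    <interface type='hostdev' managed='yes'>
      <mac address='52:54:00:6d:90:01'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x06' slot='0x10' function='0x0'/>
      </source>
    </interface>
    <interface type='network'>
      <mac address='52:54:00:6d:90:02'/>
      <source network='default'/>
      <model type='virtio'/>
    </interface>

The guest agent would then be expected to locate the two guest interfaces by
these MAC addresses and enslave them to a bond, as described in the second
step above.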
> > > > > > > >> I'm not really clear on why libvirt/guest agent needs to be
> > > > > > > >> involved in this. I think configuration of networking is
> > > > > > > >> really something that must be left to the guest OS admin to
> > > > > > > >> control. I don't think the guest agent should be trying to
> > > > > > > >> reconfigure guest networking itself, as that is inevitably
> > > > > > > >> going to conflict with configuration attempted by things in
> > > > > > > >> the guest like NetworkManager or systemd-networkd.
> > > > > > > There should not be a conflict.
> > > > > > > guest agent should just give NM the information, and have NM do
> > > > > > > the right thing.
> > > > > >
> > > > > > That assumes the guest will have NM running. Unless you want to
> > > > > > severely limit the scope of usefulness, you also need to handle
> > > > > > systems that have NM disabled, and among those the different
> > > > > > styles of system network config. It gets messy very fast.
> > > > >
> > > > > Also OpenStack already has a way to pass guest information about
> > > > > the required network setup, via cloud-init, so it would not be
> > > > > interested in anything that used the QEMU guest agent to configure
> > > > > NetworkManager. Which is really just another example of why this
> > > > > does not belong anywhere in libvirt or lower. The decision to use
> > > > > NM is a policy decision that will always be wrong for a
> > > > > non-negligible set of use cases, and as such does not belong in
> > > > > libvirt or QEMU. It is the job of higher level apps to make that
> > > > > kind of policy decision.
> > > >
> > > > This is exactly my worry though; why should every higher level
> > > > management system have its own way of communicating network config
> > > > for hotpluggable devices? You shouldn't need to reconfigure a VM to
> > > > move it between them.
> > > >
> > > > This just makes it hard to move it between management layers; there
> > > > needs to be some standardisation (or abstraction) of this; if libvirt
> > > > isn't the place to do it, then what is?
> > >
> > > NB, openstack isn't really defining a custom thing for networking
> > > here. It
> > > is actually integrating with the standard cloud-init guest tools for this
> > > task. Also note that OpenStack has defined a mechanism that works for
> > > guest images regardless of what hypervisor they are running on - ie does
> > > not rely on any QEMU or libvirt specific functionality here.
> >
> > I'm not sure what the implication is. No new functionality should be
> > implemented unless we also add it to vmware? People that don't want kvm
> > specific functionality, won't use it.
>
> I'm saying that standardization of virtualization policy in libvirt is the
> wrong solution, because different applications will have different viewpoints
> as to what "standardization" is useful / appropriate. Creating a standardized
> policy in libvirt for KVM does not help OpenStack; it may help people who
> only care about KVM, but that is not the entire ecosystem. OpenStack has a
> standardized solution for guest configuration information that works across
> all the hypervisors it targets. This is just yet another example of exactly
> why libvirt aims to design its APIs such that it exposes direct mechanisms
> and leaves usage policy decisions up to the management applications. Libvirt
> is not best placed to decide which policy all these mgmt apps must use for
> this task.
>
> Regards,
> Daniel
> I don't think we are pushing policy in libvirt here.
> What we want is a mechanism that lets users specify in the XML:
>     interface X is fallback for pass-through device Y
> Then when requesting migration, specify that it should use
> device Z on destination as replacement for Y.
> We are asking libvirt to automatically
> 1.- when migration is requested, request unplug of Y
> 2.- wait until Y is deleted
> 3.- start migration
> 4.- wait until migration is completed
> 5.- plug device Z on destination
> I don't see any policy above: libvirt is in control of migration and
> seems best placed to implement this.

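As a sketch only of the kind of markup being asked for here, and not existing
libvirt syntax, the pairing could perhaps be expressed as a sub-element on the
virtual NIC:

    <interface type='network'>
      <mac address='52:54:00:6d:90:02'/>
      <source network='default'/>
      <model type='virtio'/>
      <!-- hypothetical element, not part of the current libvirt schema -->
      <fallbackFor alias='hostdev0'/>
    </interface>

where 'hostdev0' would be the alias of pass-through device Y, and the
replacement device Z would be named when the migration is requested.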
Even this implies policy in libvirt about handling of failure conditions.
How long to wait for unplug. What to do when unplug fails. What to do if
plug fails on the target. It is hard to report these errors to the application,
and when multiple devices are to be plugged/unplugged, the application will
also have trouble determining whether some or all of the devices are still
present after failure. Even beyond that, this is pointless, as all 5 steps
you describe here are already possible to perform with existing functionality
in libvirt, with the application having direct control over what to do in the
failure scenarios.
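To make that last point concrete, below is a minimal sketch, using only
existing libvirt APIs via the Python bindings, of an application driving the
five steps itself; the connection URIs, domain name and device XML are
placeholders:

    # Sketch: application-driven unplug / migrate / replug using existing
    # libvirt APIs (Python bindings). URIs, names and XML are placeholders.
    import time
    import libvirt

    HOSTDEV_XML = "..."      # XML of pass-through device Y on the source
    REPLACEMENT_XML = "..."  # XML of replacement device Z for the destination
    DEST_URI = "qemu+ssh://dest/system"

    src = libvirt.open("qemu:///system")
    dom = src.lookupByName("guest")

    # 1. request unplug of Y
    dom.detachDeviceFlags(HOSTDEV_XML, libvirt.VIR_DOMAIN_AFFECT_LIVE)

    # 2. wait until Y is gone from the live XML (simplistic check, for
    #    illustration only); the application decides how long to wait
    #    and what to do on timeout
    for _ in range(30):
        if "<hostdev" not in dom.XMLDesc(0):
            break
        time.sleep(1)
    else:
        raise RuntimeError("unplug did not complete")

    # 3. + 4. start migration and wait; migrateToURI returns only once
    #    the migration has finished or failed
    dom.migrateToURI(DEST_URI,
                     libvirt.VIR_MIGRATE_LIVE | libvirt.VIR_MIGRATE_PEER2PEER,
                     None, 0)

    # 5. plug device Z on the destination
    dst = libvirt.open(DEST_URI)
    dst.lookupByName("guest").attachDeviceFlags(REPLACEMENT_XML,
                                                libvirt.VIR_DOMAIN_AFFECT_LIVE)

Each of these calls can fail independently, which is exactly where the
application gets to apply its own policy.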
Regards,
Daniel