Re: [libvirt] feature suggestion: migration network

On Thu, Jan 10, 2013 at 10:45:42AM +0800, Mark Wu wrote:
On 01/08/2013 10:46 PM, Yaniv Kaul wrote:
On 08/01/13 15:04, Dan Kenigsberg wrote:
There has been talk about this for ages, so it's time to have a proper discussion and a feature page about it: let us have a "migration" network role, and use such networks to carry migration data.
When Engine requests to migrate a VM from one node to another, the VM state (BIOS, IO devices, RAM) is transferred over a TCP/IP connection that is opened from the source qemu process to the destination qemu. Currently, the destination qemu listens for the incoming connection on the management IP address of the destination host. This has serious downsides: a "migration storm" may choke the destination's management interface; migration is plaintext; and ovirtmgmt includes Engine, which may sit outside the node cluster.
With this feature, a cluster administrator may grant the "migration" role to one of the cluster networks. Engine would use that network's IP address on the destination host when it requests a migration of a VM. With proper network setup, migration data would be confined to that network.
=== Benefit to oVirt ===
* Users would be able to define and dedicate a separate network for migration. Users that need quick migration would use nics with high bandwidth. Users who want to cap the bandwidth consumed by migration could define a migration network over nics with bandwidth limitation.
* Migration data can be limited to a separate network that has no layer-2 access from Engine.
=== Vdsm ===
The "migrate" verb should be extended with an additional parameter, specifying the address that the remote qemu process should listen on. A new argument is to be added to the currently-defined migration arguments (a hedged sketch of the resulting parameter set follows this list):
* vmId: UUID
* dst: management address of destination host
* dstparams: hibernation volumes definition
* mode: migration/hibernation
* method: rotten legacy
* ''New'': migration URI, according to http://libvirt.org/html/libvirt-libvirt.html#virDomainMigrateToURI2, such as tcp://<ip of migration network on remote node>
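For illustration only, here is a minimal Python sketch of what the extended parameter set could look like. The 'miguri' key name and all values are hypothetical placeholders; the proposal above only says that a new migration-URI argument is added.

# All values are placeholders; 'miguri' is a hypothetical key name for the
# new migration-URI argument proposed above.
migrate_params = {
    'vmId': '12345678-1234-1234-1234-123456789abc',  # UUID of the VM to migrate
    'dst': '192.0.2.10',                             # management address of the destination host
    'dstparams': {},                                 # hibernation volumes definition (empty for live migration)
    'mode': 'migration',                             # migration/hibernation
    'method': 'online',                              # the "rotten legacy" field
    'miguri': 'tcp://198.51.100.10',                 # NEW: IP of the migration network on the remote node
}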
=== Engine === As usual, complexity lies here, and several changes are required:
1. Network definition.
1.1 A new network role - not unlike "display network" - should be added. Only one migration network should be defined on a cluster.
1.2 If none is defined, the legacy "use ovirtmgmt for migration" behavior would apply.
1.3 A migration network is more likely to be a ''required'' network, but a user may opt for non-required. He may face unpleasant surprises if he wants to migrate his machine, but no candidate host has the network available.
1.4 The "migration" role can be granted or taken on-the-fly, when hosts are active, as long as there are no currently-migrating VMs.
2. Scheduler
2.1 When deciding which host should be used for automatic migration, take into account the existence and availability of the migration network on the destination host (see the sketch after this list).
2.2 For manual migration, let the user migrate a VM to a host with no migration network - if the admin wants to keep jamming the management network with migration traffic, let her.
3. VdsBroker migration verb.
3.1 For a modern cluster level, with a migration network defined on the destination host, an additional ''miguri'' parameter should be added to the "migrate" command.
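As referenced in 2.1, a hypothetical sketch of the scheduling check, written in Python for brevity (Engine itself is Java, and none of these names are real Engine classes): keep only candidate hosts where the migration network is operational, and fall back to the legacy behavior when no migration network is defined.

from collections import namedtuple

# Illustrative types only; not actual Engine classes.
Host = namedtuple('Host', ['name', 'operational_networks'])

def filter_hosts_for_migration(candidate_hosts, migration_network):
    """Return hosts eligible to receive the migrating VM (step 2.1)."""
    if migration_network is None:
        # 1.2: no migration network defined on the cluster, so the legacy
        # "use ovirtmgmt for migration" behavior applies.
        return list(candidate_hosts)
    # 2.1: require the migration network to be operational on the host.
    return [h for h in candidate_hosts
            if migration_network in h.operational_networks]

# Example: only host2 has the 'migration' network up.
hosts = [Host('host1', {'ovirtmgmt'}), Host('host2', {'ovirtmgmt', 'migration'})]
print(filter_hosts_for_migration(hosts, 'migration'))  # -> only host2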
How is the authentication of the peers handled? Do we need a cert per each source/destination logical interface? Y.

In my understanding, using a separate migration network doesn't change the current peer authentication. We still use the URI 'qemu+tls://remoteHost/system' to connect to the target libvirt service if SSL is enabled, and the remote host should be the IP address of the management interface. But we can choose an interface other than the management interface to transport the migration data. We only change the migrateURI, so the current authentication mechanism should still work for this new feature.
vdsm-vdsm and libvirt-libvirt communication is authenticated, but I am not sure at all that qemu-qemu communication is. After qemu is sprung up on the destination with -incoming <some ip>:<some port>, anything with access to that address could hijack the process. Our migrateURI starts with "tcp://", with all the consequences of this. That's a good reason to make sure <some ip> has as limited access as possible. But maybe I'm wrong here, and libvir-list can show me the light. Dan.
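For readers following along, a minimal libvirt-python sketch of the two URIs being discussed, under the assumption of peer-to-peer migration: the libvirt-to-libvirt control connection still targets the authenticated management address, while miguri points the qemu-to-qemu stream at the migration network. Host names and addresses are placeholders.

import libvirt

src = libvirt.open('qemu:///system')          # source host's libvirtd
dom = src.lookupByName('myvm')                # the VM being migrated

# libvirt-to-libvirt control connection: authenticated, to the management address.
dconnuri = 'qemu+tls://dst-mgmt.example.com/system'
# qemu-to-qemu data connection: plain TCP, to the migration network address.
miguri = 'tcp://198.51.100.10'

flags = libvirt.VIR_MIGRATE_LIVE | libvirt.VIR_MIGRATE_PEER2PEER
dom.migrateToURI2(dconnuri, miguri, None, flags, None, 0)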

On 01/10/2013 08:00 PM, Dan Kenigsberg wrote:
vdsm-vdsm and libvirt-libvirt communication is authenticated, but I am not sure at all that qemu-qemu communication is. After qemu is sprung up on the destination with -incoming <some ip>:<some port>, anything with access to that address could hijack the process. Our migrateURI starts with "tcp://".

AFAIK, there is no authentication between qemu and qemu. The destination libvirtd starts qemu listening on that address/port, and qemu closes the listening socket on <some ip>:<some port> as soon as the source host connects to it successfully. So qemu listens only for a very small window, but it is still possible to hijack it. To be safe, we could use iptables to open access from the source host dynamically, only while a migration is in progress.
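An illustrative Python sketch of that iptables idea, assuming the destination host runs it around an incoming migration. The 49152-49215 range is libvirt's usual default for qemu migration ports, so verify it against your configuration, and wait_for_migration_to_finish() is a hypothetical placeholder.

import subprocess
from contextlib import contextmanager

@contextmanager
def migration_window(src_ip, ports='49152:49215'):
    """Allow src_ip to reach the migration port range only while active."""
    rule = ['INPUT', '-p', 'tcp', '-s', src_ip, '--dport', ports, '-j', 'ACCEPT']
    subprocess.run(['iptables', '-I'] + rule, check=True)      # open the window
    try:
        yield
    finally:
        subprocess.run(['iptables', '-D'] + rule, check=True)  # close it again

# Usage on the destination host, around the incoming migration:
# with migration_window('192.0.2.20'):
#     wait_for_migration_to_finish()   # hypothetical placeholder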

On Thu, Jan 10, 2013 at 02:00:57PM +0200, Dan Kenigsberg wrote:
vdsm-vdsm and libvirt-libvirt communication is authenticated, but I am not sure at all that qemu-qemu communication is.
After qemu is sprung up on the destination with -incoming <some ip>:<some port>, anything with access to that address could hijack the process. Our migrateURI starts with "tcp://", with all the consequences of this. That's a good reason to make sure <some ip> has as limited access as possible.
The QEMU<->QEMU communication channel is neither authenticated nor encrypted, so if you are allowing migration directly over QEMU TCP channels you have a requirement for a trusted, secure management network for this traffic. If your network is not trusted, then currently the only alternative is to make use of libvirt tunnelled migration. I would like to see QEMU gain support for using TLS on its migration sockets, so that you can have a secure QEMU<->QEMU path without needing to tunnel via libvirtd. Daniel
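A short sketch of the tunnelled alternative mentioned above, using libvirt-python; host names are placeholders. With VIR_MIGRATE_TUNNELLED (which requires VIR_MIGRATE_PEER2PEER), the qemu stream is carried inside the authenticated libvirtd-to-libvirtd connection, so no separate migration URI is needed.

import libvirt

src = libvirt.open('qemu:///system')
dom = src.lookupByName('myvm')

# Tunnelled migration requires peer-to-peer mode; the qemu stream is then
# carried inside the qemu+tls libvirtd connection, and miguri is not used.
flags = (libvirt.VIR_MIGRATE_LIVE
         | libvirt.VIR_MIGRATE_PEER2PEER
         | libvirt.VIR_MIGRATE_TUNNELLED)

dom.migrateToURI2('qemu+tls://dst-mgmt.example.com/system', None, None, flags, None, 0)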
participants (3): Dan Kenigsberg, Daniel P. Berrange, Mark Wu