[libvirt-users] live migration via unix socket

David Vossel

28 Aug 2018

9:07 p.m.

Hey,

Over in KubeVirt we're investigating a use case where we'd like to perform a live migration within a network namespace that does not provide libvirtd with network access. In this scenario we would like to perform the live migration by proxying it through a unix socket to a process in another network namespace that does have network access. That external process would live on every node in the cluster and know how to correctly route connections between libvirtds.

Here is a virsh example of an attempted migration via unix socket:

virsh migrate --copy-storage-all --p2p --live --xml domain.xml my-vm qemu+unix:///system?socket=destination-host-proxy-sock

In this example, the source libvirtd is able to establish a connection to the destination libvirtd via the unix socket proxy. However, the migration URI appears to require either a tcp or rdma network connection. If I force the migration URI to be a unix socket, I receive an error [1] indicating that qemu+unix is not a valid transport.

Technically, with qemu+kvm I believe what we're attempting should be possible (even though it is inefficient). Please correct me if I'm wrong.

Is there a way to achieve this migration via unix socket using libvirt? Also, is there a reason why the migration URI is limited to tcp/rdma?

Thanks!
- David

[1] https://github.com/libvirt/libvirt/blob/master/src/qemu/qemu_migration.c#L27...
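For readers unfamiliar with the proxy arrangement described in this message, the following is a minimal sketch of what such a per-node relay might look like: a process that listens on a unix socket and forwards each connection to a TCP endpoint. It is not KubeVirt's actual proxy; the function names `pump` and `serve_proxy`, and the socket path, host and port passed to them, are hypothetical.

```python
import asyncio


async def pump(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while True:
            chunk = await reader.read(65536)
            if not chunk:
                break
            writer.write(chunk)
            await writer.drain()
    finally:
        writer.close()


async def serve_proxy(socket_path, remote_host, remote_port):
    """Listen on a unix socket and relay each connection to a TCP endpoint.

    socket_path, remote_host and remote_port are illustrative parameters;
    a real proxy would decide the destination per migration.
    """

    async def handle(local_reader, local_writer):
        try:
            remote_reader, remote_writer = await asyncio.open_connection(
                remote_host, remote_port
            )
        except OSError:
            # Destination unreachable: drop the local side and give up.
            local_writer.close()
            return
        # Relay both directions until either side closes.
        await asyncio.gather(
            pump(local_reader, remote_writer),
            pump(remote_reader, local_writer),
            return_exceptions=True,
        )

    return await asyncio.start_unix_server(handle, path=socket_path)
```

To run it, one would await `serve_proxy(...)` inside an event loop and then call `serve_forever()` on the returned server. The sketch only moves bytes; it says nothing about whether libvirt will accept a unix socket as a migration destination, which is the actual question in this thread.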

Daniel P. Berrangé

29 Aug

8:55 a.m.

On Tue, Aug 28, 2018 at 05:07:18PM -0400, David Vossel wrote:

...

Hey,

Over in KubeVirt we're investigating a use case where we'd like to perform a live migration within a network namespace that does not provide libvirtd with network access. In this scenario we would like to perform a live migration by proxying the migration through a unix socket to a process in another network namespace that does have network access. That external process would live on every node in the cluster and know how to correctly route connections between libvirtds.

virsh example of an attempted migration via unix socket.

virsh migrate --copy-storage-all --p2p --live --xml domain.xml my-vm qemu+unix:///system?socket=destination-host-proxy-sock

In this example, the src libvirtd is able to establish a connection to the destination libvirtd via the unix socket proxy. However, the migration-uri appears to require either tcp or rdma network connection. If I force the migration-uri to be a unix socket, I receive an error [1] indicating that qemu+unix is not a valid transport.

qemu+unix is a syntax for libvirt's URI format. The URI scheme for migration is not the same, so you can't simply plug in qemu+unix here.
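To illustrate the distinction, here is a rough Python sketch that treats the two URI kinds separately. It is not libvirt code: the helper names are hypothetical, and the accepted migration schemes are simply the two mentioned earlier in this thread (tcp and rdma).

```python
from urllib.parse import urlparse

# Schemes this thread says the migration URI accepts; an assumption taken
# from the discussion, not read out of libvirt's source.
MIGRATION_SCHEMES = {"tcp", "rdma"}


def split_connection_uri(uri):
    """Split a libvirt connection URI scheme into (driver, transport).

    "qemu+unix" is driver "qemu" over transport "unix"; a scheme without
    "+" has no explicit transport, reported here as None.
    """
    scheme = urlparse(uri).scheme
    driver, _, transport = scheme.partition("+")
    return driver, transport or None


def is_plausible_migration_uri(uri):
    """Return True when the URI scheme is one of the migration schemes."""
    return urlparse(uri).scheme in MIGRATION_SCHEMES
```

Under this model, `qemu+unix:///system?socket=...` is a well-formed connection URI (driver `qemu`, transport `unix`) but is not a plausible migration URI, which matches the error reported above.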

...

Technically with qemu+kvm I believe what we're attempting should be possible (even though it is inefficient). Please correct me if I'm wrong.

Is there a way to achieve this migration via unix socket using libvirt? Also, is there a reason why the migration URI is limited to tcp/rdma?

Internally libvirt does exactly this when using its TUNNELLED live migration mode. In this mode, QEMU is passed an anonymous UNIX socket, and the data is all copied over the libvirtd <-> libvirtd connection and then copied again back to QEMU on another UNIX socket. This was done because QEMU has long had no ability to encrypt live migration, so tunnelling over libvirtd's own TLS-secured connection was the only secure mechanism.

We've done work in QEMU to natively support TLS now so that we can get rid of this tunnelling, as this architecture decreased performance and consumed precious CPU memory bandwidth, which is particularly bad when libvirtd and QEMU were on different NUMA nodes. It is already a challenge to get live migration to successfully complete even with a direct network connection.

Although QEMU can do it at the low level, we've never exposed anything other than direct network transports at the API level.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

David Vossel

10 Sep

6:38 p.m.

On Wed, Aug 29, 2018 at 4:55 AM, Daniel P. Berrangé <berrange@redhat.com> wrote:

...

Internally libvirt does exactly this when using its TUNNELLED live migration mode. In this QEMU is passed an anonymous UNIX socket and the data is all copied over the libvirtd <-> libvirtd connection and then copied again back

Sorry for the delayed response here, I've only just picked this task back up again recently.

With the TUNNELLED and PEER2PEER migration flags set, libvirt won't allow the libvirtd <-> libvirtd connection over a unix socket.

Libvirt returns this error: "Attempt to migrate guest to the same host". The virDomainMigrateCheckNotLocal() function ensures that a peer2peer migration won't occur when the destination is a unix socket.

Is there any way around this? We'd like to tunnel the destination connection through a unix socket. The other side of the unix socket is a network proxy in a different network namespace which properly performs the remote connection.

Martin Kletzander

12 Sep

10:59 a.m.

On Mon, Sep 10, 2018 at 02:38:48PM -0400, David Vossel wrote:

...

Sorry for the delayed response here, I've only just picked this task back up again recently.

With the TUNNELLED and PEER2PEER migration flags set, Libvirt won't allow the libvirtd <-> libvirtd connection over a unix socket.

Libvirt returns this error "Attempt to migrate guest to the same host". The virDomainMigrateCheckNotLocal() function ensures that a peer2peer migration won't occur when the destination is a unix socket.

Is there anyway around this? We'd like to tunnel the destination connection through a unix socket. The other side of the unix socket is a network proxy in a different network namespace which properly performs the remote connection.

IMHO that is there just for additional safety, since a check which serves the same purpose is done again in a more sensible manner later on (checking that the hostnames and UUIDs are different). Actually it's just an older check from before the UUID and hostname were sent in the migration cookie. And that has been there for quite some time.

IMHO that check can go. In the worst case we can skip that check (!tempuri->server) if you ask for unsafe migration.

Also, just to try it out, you *might* be able to work around that check by using something like unix://localhost.localdomain/path/to/unix.socket (basically adding any hostname different than localhost there), but I might be wrong there.
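As a rough model of why a unix-socket destination trips this check, the Python sketch below mimics only the `!tempuri->server` condition quoted above: a destination URI with no server component is treated as the local host. The function name `looks_local` is hypothetical, and the real virDomainMigrateCheckNotLocal() may reject more than this sketch does.

```python
from urllib.parse import urlparse


def looks_local(dconnuri):
    """Return True when the destination URI has no server component.

    Models only the (!tempuri->server) condition mentioned in this thread;
    any further conditions in the real libvirt check are not represented.
    """
    return urlparse(dconnuri).hostname is None
```

With this model, `qemu+unix:///system?socket=/path/to/sock` has an empty server part and is flagged as local, while `qemu+tcp://dest.example.org/system` is not. A URI with a made-up hostname would pass this simplified test, but that does not guarantee it passes the real check.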

David Vossel

14 Sep

4:55 p.m.

On Wed, Sep 12, 2018 at 6:59 AM, Martin Kletzander <mkletzan@redhat.com> wrote:

...

IMHO that check can go. In the worst case we can skip that check (!tempuri->server) if you ask for unsafe migration.

Also, just to try it out, you *might* be able to work around that check by using something like unix://localhost.localdomain/path/to/unix.socket (basically adding any hostname different than localhost there), but I might be wrong there.

I tried a few variations of this and none of them worked :(

Any chance we can get the safety check removed for the next Libvirt release? Does there need to be an issue opened to track this?

Fabian Deutsch

17 Sep

12:17 p.m.

On Fri, Sep 14, 2018 at 6:55 PM David Vossel <dvossel@redhat.com> wrote:

...

I tried a few variations of this and none of them worked :(

Any chance we can get the safety check removed for the next Libvirt release? Does there need to be an issue opened to track this?

Regardless of Martin's answer :): Please file one. Please file an RFE requesting the change and stating the motivation.

- fabian

Martin Kletzander

12 Oct

8:50 a.m.

On Mon, Sep 17, 2018 at 02:17:39PM +0200, Fabian Deutsch wrote:

...

On Fri, Sep 14, 2018 at 6:55 PM David Vossel <dvossel@redhat.com> wrote:

...
Any chance we can get the safety check removed for the next Libvirt release? Does there need to be an issue opened to track this?

Regardless of Martin's answer :): Please file one. Please file an RFE requesting the change and stating the motivation.

Is there any BZ or issue created where I could post an update? I spent some time with this and I got stuck at what looks like the daemon not having a remote driver instantiated "at some times". Either it's something very peculiar or I'm missing something.

David Vossel

5:50 p.m.

On Fri, Oct 12, 2018 at 4:50 AM Martin Kletzander <mkletzan@redhat.com> wrote:

...

Is there any BZ or issue created where I could post an update? I spent some time with this and I got stuck at what looks like the daemon not having a remote driver instantiated "at some times". Either it's something very peculiar or I'm missing something.

Hey,

I've created a few BZs for issues we encountered while attempting to introduce live migrations into KubeVirt.

Unable to migrate between two libvirt environments with the same hostname
https://bugzilla.redhat.com/show_bug.cgi?id=1638882

Unable to perform tunnelled migration to destination libvirt over unix socket
https://bugzilla.redhat.com/show_bug.cgi?id=1638889

Libvirt Lifecycle events not firing after migration
https://bugzilla.redhat.com/show_bug.cgi?id=1638894
