[libvirt] Problems using netdev_del+netdev_add w/o corresponding device_del+device_add

I am attempting to enhance libvirt's virDomainUpdateDeviceFlags() API to support changing "just about anything" about the host side of a PCI network device without actually detaching the PCI device from the guest. Here is a patch I sent to the libvirt mailing list that I had thought would accomplish this task:
https://www.redhat.com/archives/libvir-list/2012-October/msg00546.html
I am using qemu-kvm-1.2-0.1.20120806git3e430569.fc17.x86_64 on Fedora 17 for my testing.
Since the host side and guest side are created (and deleted) with separate monitor commands ("netdev_(add|del)" vs. "device_(add|del)"), we had thought that it would be possible to use netdev_del to disconnect everything from the host side, [*not disconnect the guest side*], then create a new tap device and connect it with netdev_add(). And, actually, the netdev_del+netdev_add sequence does complete without error; unfortunately, no traffic is visible on the tap device (looking from the host with tcpdump).
When I modify the patch above to also include the device_del and device_add monitor calls (with a 3 second delay in between to allow for the guest's PCI detach to complete), then the device does work properly. Of course in this case (1) the guest sees the device completely disappear for a period, then reappear, which is more disruption than I want, and (2) because qemu has no asynchronous event to notify libvirt when the guest's PCI detach has actually completed, I have to stick in an arbitrary call to sleep() which is generally *way* too long, but may be too short in some cases of extremely high load.
The only comment I got from IRC on Friday afternoon (I know - not a good time to be looking for people) was that they would be "surprised if it did work".
So, I have the following questions:
1) Should this work?
If it's supposed to work now:
2) can you give hints (aside from watching the qemu monitor commands and responses with stap) on what I might need to change, or how to further debug my problem within qemu? (I'm pretty well convinced that the libvirt code is doing the tap device creation/etc correctly).
3) alternately can you verify that this is a known bug? Is fixing it on anyone's todo list?
If it's not supposed to work now:
4) Does it sound like a reasonable thing for qemu to support?
5) Is there some other formal way to request addition of this functionality (aside from figuring it out myself and posting a patch)?
********************************************
For reference, here is the sequence of qemu monitor commands sent by libvirt to fully detach, then fully reattach a network device. Note that fd is a newly opened TAP device. Also note the 3 second interval between the netdev_del and the next command:
96.671 > 0x7f8e20000c90 {"execute":"device_del","arguments":{"id":"net0"},"id":"libvirt-25"}
96.673 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-25"}
96.674 > 0x7f8e20000c90 {"execute":"netdev_del","arguments":{"id":"hostnet0"},"id":"libvirt-26"}
96.695 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-26"}
99.777 > 0x7f8e20000c90 {"execute":"getfd","arguments":{"fdname":"fd-net0"},"id":"libvirt-27"} (fd=27)
99.777 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-27"}
99.778 > 0x7f8e20000c90 {"execute":"netdev_add","arguments":{"type":"tap","fd":"fd-net0","id":"hostnet0"},"id":"libvirt-28"}
99.778 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-28"}
99.779 > 0x7f8e20000c90 {"execute":"device_add","arguments":{"driver":"virtio-net-pci","netdev":"hostnet0","id":"net0","mac":"52:54:00:d8:bd:b9","bus":"pci.0","addr":"0x4"},"id":"libvirt-29"}
99.780 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-29"}
After this sequence is done, the guest network device is fully functioning.
Here is the sequence sent to disconnect only the host side, then reconnect it with a new tap device. (although the fd is the same, this is because the old tap device had already been closed, so the number is just being used - the same thing happens when doing sequential full detach/attach cycles, and they all work with no problems):
168.750 > 0x7f8e20000c90 {"execute":"netdev_del","arguments":{"id":"hostnet0"},"id":"libvirt-30"}
168.762 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-30"}
168.800 > 0x7f8e20000c90 {"execute":"getfd","arguments":{"fdname":"fd-net0"},"id":"libvirt-31"} (fd=27)
168.801 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-31"}
168.801 > 0x7f8e20000c90 {"execute":"netdev_add","arguments":{"type":"tap","fd":"fd-net0","id":"hostnet0"},"id":"libvirt-32"}
168.802 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-32"}
168.802 > 0x7f8e20000c90 {"execute":"set_link","arguments":{"name":"net0","up":true},"id":"libvirt-33"}
168.803 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-33"}
After this sequence is done, everything about the network device *appears* normal on both the guest and host (at least the things I know to look at), but no traffic from the host shows up in a tcpdump of the interface on the guest, and no traffic from the guest shows up in a tcpdump of the tap device on the host.
Oh - the extra "set_link" command at the end is because I noticed that the flags shown in ifconfig in the guest switched from:
UP BROADCAST RUNNING MULTICAST
to
UP BROADCAST MULTICAST
when reconnecting in this way, so I was hoping that forcing the interface up would solve my problems. It didn't :-/
(Another note: I also tried adding a delay after the netdev_del, and that also did nothing.)
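The two monitor sequences above differ only in whether device_del and device_add bracket the netdev commands. As a minimal sketch (not libvirt's actual code), the host-side-only sequence can be rebuilt in Python. The helper names `qmp` and `host_side_swap` are made up for illustration; the command names, argument keys and "libvirt-N" id format are copied from the logs. Passing the file descriptor itself for getfd (done out of band over the monitor socket) is not modelled.

```python
import json
from itertools import count


def qmp(execute, arguments, serial):
    """Serialize one monitor command in the same shape as the logged ones."""
    message = {
        "execute": execute,
        "arguments": arguments,
        "id": f"libvirt-{next(serial)}",
    }
    # Compact separators reproduce the spacing seen in the libvirt log.
    return json.dumps(message, separators=(",", ":"))


def host_side_swap(netdev_id, fdname, serial):
    """Commands for the host-side-only swap attempted in the message above.

    This mirrors the logged sequence only; the message reports that it
    completes without error but leaves the NIC passing no traffic.
    """
    return [
        qmp("netdev_del", {"id": netdev_id}, serial),
        qmp("getfd", {"fdname": fdname}, serial),
        qmp("netdev_add", {"type": "tap", "fd": fdname, "id": netdev_id}, serial),
    ]


if __name__ == "__main__":
    for line in host_side_swap("hostnet0", "fd-net0", count(30)):
        print(line)
```

Run with a serial starting at 30, this reproduces the first three commands of the second log, which makes it easy to diff against a working full detach/attach trace.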

On Sat, Oct 13, 2012 at 04:47:14PM -0400, Laine Stump wrote:
Here is the sequence sent to disconnect only the host side, then reconnect it with a new tap device. (although the fd is the same, this is because the old tap device had already been closed, so the number is just being used - the same thing happens when doing sequential full detach/attach cycles, and they all work with no problems):
168.750 > 0x7f8e20000c90 {"execute":"netdev_del","arguments":{"id":"hostnet0"},"id":"libvirt-30"}
168.762 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-30"}
168.800 > 0x7f8e20000c90 {"execute":"getfd","arguments":{"fdname":"fd-net0"},"id":"libvirt-31"} (fd=27)
168.801 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-31"}
168.801 > 0x7f8e20000c90 {"execute":"netdev_add","arguments":{"type":"tap","fd":"fd-net0","id":"hostnet0"},"id":"libvirt-32"}
168.802 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-32"}
168.802 > 0x7f8e20000c90 {"execute":"set_link","arguments":{"name":"net0","up":true},"id":"libvirt-33"}
168.803 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-33"}
After this sequence is done, everything about the network device *appears* normal on both the guest and host (at least the things I know to look at), but no traffic from the host shows up in a tcpdump of the interface on the guest, and no traffic from the guest shows up in a tcpdump of the tap device on the host.
What you are trying to do isn't possible today.
The device associates with the netdev during initialization only - there is no command to associate at a later point in time. That is why your example works only when the device is deleted together with the netdev.
It is certainly possible to implement a command to switch netdevs but I'm curious what the use case is. Is this necessary just because QEMU doesn't provide a way to modify the existing netdev or because you really want to switch to a completely different netdev?
Stefan
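The behaviour described here, where the device binds to its netdev only at initialization, can be illustrated with a toy model. This is a sketch of the idea only, not QEMU's code: the classes `Netdev`, `Nic` and `ToyMonitor` and their fields are invented for illustration, and the real implementation may be organized quite differently.

```python
class Netdev:
    """Toy host-side backend."""

    def __init__(self, netdev_id):
        self.id = netdev_id
        self.peer = None  # frontend currently bound to this backend


class Nic:
    """Toy guest-side frontend; it binds to a backend only when created."""

    def __init__(self, nic_id, netdev):
        self.id = nic_id
        self.peer = netdev
        netdev.peer = self


class ToyMonitor:
    def __init__(self):
        self.netdevs = {}
        self.nics = {}

    def netdev_add(self, netdev_id):
        # A new backend starts out unconnected, whatever id it reuses.
        self.netdevs[netdev_id] = Netdev(netdev_id)

    def netdev_del(self, netdev_id):
        netdev = self.netdevs.pop(netdev_id)
        if netdev.peer is not None:
            netdev.peer.peer = None  # the frontend is left without a backend

    def device_add(self, nic_id, netdev_id):
        # The only place in this model where the association is made.
        self.nics[nic_id] = Nic(nic_id, self.netdevs[netdev_id])

    def device_del(self, nic_id):
        nic = self.nics.pop(nic_id)
        if nic.peer is not None:
            nic.peer.peer = None

    def can_pass_traffic(self, nic_id):
        return self.nics[nic_id].peer is not None


if __name__ == "__main__":
    mon = ToyMonitor()
    mon.netdev_add("hostnet0")
    mon.device_add("net0", "hostnet0")
    mon.netdev_del("hostnet0")
    mon.netdev_add("hostnet0")  # same id, but a different, unconnected object
    print(mon.can_pass_traffic("net0"))
```

In this model, reusing the id "hostnet0" does not restore the link, while a device_del followed by device_add does, which matches the two outcomes reported in the original message.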

On Mon, Oct 15, 2012 at 10:30:07AM +0200, Stefan Hajnoczi wrote:
What you are trying to do isn't possible today.
The device associates with the netdev during initialization only - there is no command to associate at a later point in time. That is why your example works only when the device is deleted together with the netdev.
It is certainly possible to implement a command to switch netdevs but I'm curious what the use case is. Is this necessary just because QEMU doesn't provide a way to modify the existing netdev or because you really want to switch to a completely different netdev?
We have end users who want to be able to dynamically change the guest's networking attachment, without restarting/hotplugging devices in the guest[1]. If it is just a case of changing from one bridge to another bridge, we can do that just by moving the TAP device from one to another. This doesn't work if we want to support more general changes in config, e.g. from a macvtap setup to a TAP setup, or vice versa.
Another requirement is to be able to start a guest with a "null" backend (akin to not plugging in the ethernet cable on a physical host), and then attach it to a bridge/macvtap device on the fly later on (akin to then plugging in the ethernet cable once running).
Regards,
Daniel
[1] Obviously the guest might need to reconfigure its IP or re-run DHCP though.
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 10/15/2012 05:25 AM, Daniel P. Berrange wrote:
On Mon, Oct 15, 2012 at 10:30:07AM +0200, Stefan Hajnoczi wrote:
What you are trying to do isn't possible today.
Well, at least it's good to know that I should stop trying to make it work :-)
Actually, it's a bit disconcerting that 1) the act of creating a guest device is split into two commands, implying that they don't necessarily have a hardwired a-->b relationship although that is the case, and that 2) netdev_add even returns success when you use it in this way. Although hindsight is 20/20 and all that, if both a and b are required, and must always be in the same order, wouldn't it have made more sense for the two steps to be a single command? I suppose this is a byproduct of the monitor commands being a direct reflection of the commandline options. (At the very least, though, I think netdev_add should report an error if the device name alias it uses is already in use by a device.)
The device associates with the netdev during initialization only - there is no command to associate at a later point in time. That is why your example works only when the device is deleted together with the netdev.
It is certainly possible to implement a command to switch netdevs
At this point yes, it would be better to have a new command rather than to make netdev_add work in the way I've attempted - this way there would be a new command whose presence libvirt could use to decide whether or not to support this functionality.
but I'm curious what the use case is. Is this necessary just because QEMU doesn't provide a way to modify the existing netdev or because you really want to switch to a completely different netdev?
We have end users who want to be able to dynamically change the guest's networking attachment, without restarting/hotplugging devices in the guest[1]. If it is just a case of changing from one bridge to another bridge, we can do that just by moving the TAP device from one to another. This doesn't work if we want to support more general changes in config, e.g. from a macvtap setup to a TAP setup, or vice versa.
Beyond that, I haven't determined it conclusively yet, but it so far looks to me like a macvtap device can only be linked to a physdev when it is created - there is no netlink message to re-link it to a different physdev (this is based on my naive examination of the relevant kernel source). So if you want to change the attach point for a macvtap-type connection, you again need to discard the old macvtap device and create a new one, implying that you need to do a new netdev_add.

Laine Stump <laine@redhat.com> writes:
On 10/15/2012 05:25 AM, Daniel P. Berrange wrote:
On Mon, Oct 15, 2012 at 10:30:07AM +0200, Stefan Hajnoczi wrote:
On Sat, Oct 13, 2012 at 04:47:14PM -0400, Laine Stump wrote:
Here is the sequence sent to disconnect only the host side, then reconnect it with a new tap device. (although the fd is the same, this is because the old tap device had already been closed, so the number is just being used - the same thing happens when doing sequential full detach/attach cycles, and they all work with no problems):
168.750 > 0x7f8e20000c90 {"execute":"netdev_del","arguments":{"id":"hostnet0"},"id":"libvirt-30"}
168.762 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-30"}
This deletes the backend, and leaves the frontend NIC without a backend. Such a NIC behaves / should behave like it's not connected to anything (link down).
168.800 > 0x7f8e20000c90 {"execute":"getfd","arguments":{"fdname":"fd-net0"},"id":"libvirt-31"} (fd=27)
168.801 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-31"}
168.801 > 0x7f8e20000c90 {"execute":"netdev_add","arguments":{"type":"tap","fd":"fd-net0","id":"hostnet0"},"id":"libvirt-32"}
168.802 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-32"}
168.802 > 0x7f8e20000c90
This creates a new backend, not connected to any frontend. The fact that it has the same ID as some deleted backend is completely immaterial.
{"execute":"set_link","arguments":{"name":"net0","up":true},"id":"libvirt-33"} 168.803 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-33"}
This orders the NIC to change the link status to "up". Can't work, because it's still not connected to anything. It succeeds anyway, which could be regarded as a bug.
After this sequence is done, everything about the network device *appears* normal on both the guest and host (at least the things I know to look at), but no traffic from the host shows up in a tcpdump of the interface on the guest, and no traffic from the guest shows up in a tcpdump of the tap device on the host.
What you are trying to do isn't possible today.
Well, at least it's good to know that I should stop trying to make it work :-)
Actually, it's a bit disconcerting that 1) the act of creating a guest device is split into two commands, implying that they don't necessarily have a hardwired a-->b relationship although that is the case, and that
It isn't really the case. Network frontend and backend are really separate things, but...
2) netdev_add even returns success when you use it in this way. Although hindsight is 20/20 and all that, if both a and b are required, and must always be in the same order, wouldn't it have made more sense for the two steps to be a single command? I suppose this is a byproduct of the monitor commands being a direct reflection of the commandline options. (At the very least, though, I think netdev_add should report an error if the device name alias it uses is already in use by a device.)
The device associates with the netdev during initialization only - there
... the connection between the two can only be made during frontend initialization. Not because of design limitations, just because more dynamic connecting hasn't been implemented.
is no command to associate at a later point in time. That is why your example works only when the device is deleted together with the netdev.
It is certainly possible to implement a command to switch netdevs
At this point yes, it would be better to have a new command rather than to make netdev_add work in the way I've attempted - this way there would be a new command whose presence libvirt could use to decide whether or not to support this functionality.
Besides, I'd oppose ID magic like making netdev_add behave differently when the ID matches some previously used ID.
but I'm curious what the use case is. Is this necessary just because QEMU doesn't provide a way to modify the existing netdev or because you really want to switch to a completely different netdev?
We have end users who want to be able to dynamically change the guest's networking attachment, without restarting/hotplugging devices in the guest[1]. If it is just a case of changing from one bridge to another bridge, we can do that just by moving the TAP device from one to another. This doesn't work if we want to support more general changes in config, e.g. from a macvtap setup to a TAP setup, or vice versa.
Beyond that, I haven't determined it conclusively yet, but it so far looks to me like a macvtap device can only be linked to a physdev when it is created - there is no netlink message to re-link it to a different physdev (this is based on my naive examination of the relevant kernel source). So if you want to change the attach point for a macvtap-type connection, you again need to discard the old macvtap device and create a new one, implying that you need to do a new netdev_add.
Wanting to connect a frontend NIC to a different backend seems entirely fair to me. Patches welcome :)

On Mon, Oct 15, 2012 at 11:15:30AM -0400, Laine Stump wrote:
On 10/15/2012 05:25 AM, Daniel P. Berrange wrote:
On Mon, Oct 15, 2012 at 10:30:07AM +0200, Stefan Hajnoczi wrote:
What you are trying to do isn't possible today.
Well, at least it's good to know that I should stop trying to make it work :-)
Actually, it's a bit disconcerting that 1) the act of creating a guest device is split into two commands, implying that they don't necessarily have a hardwired a-->b relationship although that is the case, and that 2) netdev_add even returns success when you use it in this way. Although hindsight is 20/20 and all that, if both a and b are required, and must always be in the same order, wouldn't it have made more sense for the two steps to be a single command? I suppose this is a byproduct of the monitor commands being a direct reflection of the commandline options. (At the very least, though, I think netdev_add should report an error if the device name alias it uses is already in use by a device.)
The commands are historic (at least to me) and we have to make the best of them.
but I'm curious what the use case is. Is this necessary just because QEMU doesn't provide a way to modify the existing netdev or because you really want to switch to a completely different netdev?
We have end users who want to be able to dynamically change the guest's networking attachment, without restarting/hotplugging devices in the guest[1]. If it is just a case of changing from one bridge to another bridge, we can do that just by moving the TAP device from one to another. This doesn't work if we want to support more general changes in config, e.g. from a macvtap setup to a TAP setup, or vice versa.
Beyond that, I haven't determined it conclusively yet, but it so far looks to me like a macvtap device can only be linked to a physdev when it is created - there is no netlink message to re-link it to a different physdev (this is based on my naive examination of the relevant kernel source). So if you want to change the attach point for a macvtap-type connection, you again need to discard the old macvtap device and create a new one, implying that you need to do a new netdev_add.
Yep, I just checked too. macvlan_dev->lowerdev is only set in macvlan_common_newlink(). There is no way to change it once the link has been created.
Stefan

On Mon, Oct 15, 2012 at 10:25:58AM +0100, Daniel P. Berrange wrote:
On Mon, Oct 15, 2012 at 10:30:07AM +0200, Stefan Hajnoczi wrote:
What you are trying to do isn't possible today.
The device associates with the netdev during initialization only - there is no command to associate at a later point in time. That is why your example works only when the device is deleted together with the netdev.
It is certainly possible to implement a command to switch netdevs but I'm curious what the use case is. Is this necessary just because QEMU doesn't provide a way to modify the existing netdev or because you really want to switch to a completely different netdev?
We have end users who want to be able to dynamically change the guest's networking attachment, without restarting/hotplugging devices in the guest[1]. If it is just a case of changing from one bridge to another bridge, we can do that just by moving the TAP device from one to another. This doesn't work if we want to support more general changes in config, e.g. from a macvtap setup to a TAP setup, or vice versa.
Another requirement is to be able to start a guest with a "null" backend (akin to not plugging in the ethernet cable on a physical host), and then attach it to a bridge/macvtap device on the fly later on (akin to then plugging in the ethernet cable once running).
virtio-net presents a challenge because checksum offload and other advanced features are announced in the virtio feature bits. Virtio feature bits don't change during the lifetime of the device and there's no way to notify existing guests to re-negotiate them besides taking down the device.
The offload feature bits are tied to the netdev in QEMU, especially the tap driver's vnet_hdr feature which allows the guest to pass through offload flags to the host network stack. QEMU does not emulate these today and only enables them when the netdev supports vnet_hdr.
In other words, virtio-net is tied to its netdev. Changing from -netdev tap,vhost=on setup to a -netdev user is difficult.
Two possibilities:
1. Add offload emulation code to QEMU.
2. Place sufficient checks in QEMU and libvirt so that only "safe" netdev changes can be made.
#1 is unattractive because this code path will rarely be used but is complex (a bunch of buffer munging and memory management).
Are you trying to change netdev without involving the guest? In that case the link must stay up and libvirt needs to ensure that the new netdev will have a compatible network configuration (subnet, gateway, IP address details).
Stefan
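Option 2 above, allowing only "safe" netdev changes, amounts to a subset check: every feature the guest has already negotiated must still be available on the new backend. A minimal sketch follows. The function names are invented for illustration, and the feature labels ("csum", "tso4") are placeholders rather than actual virtio feature bit names or QEMU data structures.

```python
def missing_features(negotiated, new_backend):
    """Features the guest already relies on that the new backend lacks."""
    return sorted(set(negotiated) - set(new_backend))


def is_safe_switch(negotiated, new_backend):
    """A switch is 'safe' here only if nothing negotiated would be lost."""
    return not missing_features(negotiated, new_backend)


if __name__ == "__main__":
    # Placeholder labels: a guest that negotiated offloads with a tap
    # backend, and a candidate backend that offers none of them.
    guest_negotiated = {"csum", "tso4"}
    print(is_safe_switch(guest_negotiated, {"csum", "tso4", "ufo"}))
    print(missing_features(guest_negotiated, set()))
```

The check is deliberately one-directional: a new backend that offers more than the guest negotiated is fine, because the feature bits seen by the guest do not change for the lifetime of the device.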

On Tue, Oct 16, 2012 at 10:08:21AM +0200, Stefan Hajnoczi wrote:
On Mon, Oct 15, 2012 at 10:25:58AM +0100, Daniel P. Berrange wrote:
On Mon, Oct 15, 2012 at 10:30:07AM +0200, Stefan Hajnoczi wrote:
What you are trying to do isn't possible today.
The device associates with the netdev during initialization only - there is no command to associate at a later point in time. That is why your example works only when the device is deleted together with the netdev.
It is certainly possible to implement a command to switch netdevs but I'm curious what the use case is. Is this necessary just because QEMU doesn't provide a way to modify the existing netdev or because you really want to switch to a completely different netdev?
We have end users who want to be able to dynamically change the guest's networking attachment, without restarting/hotplugging devices in the guest[1]. If it is just a case of changing from one bridge to another bridge, we can do that just by moving the TAP device from one to another. This doesn't work if we want to support more general changes in config, e.g. from a macvtap setup to a TAP setup, or vice versa.
Another requirement is to be able to start a guest with a "null" backend (akin to not plugging in the ethernet cable on a physical host), and then attach it to a bridge/macvtap device on the fly later on (akin to then plugging in the ethernet cable once running).
virtio-net presents a challenge because checksum offload and other advanced features are announced in the virtio feature bits. Virtio feature bits don't change during the lifetime of the device and there's no way to notify existing guests to re-negotiate them besides taking down the device.
The offload feature bits are tied to the netdev in QEMU, especially the tap driver's vnet_hdr feature which allows the guest to pass through offload flags to the host network stack. QEMU does not emulate these today and only enables them when the netdev supports vnet_hdr.
In other words, virtio-net is tied to its netdev. Changing from -netdev tap,vhost=on setup to a -netdev user is difficult.
Urgh, so much for there being a clean separation between frontend and backend :-(
Two possibilities:
1. Add offload emulation code to QEMU.
2. Place sufficient checks in QEMU and libvirt so that only "safe" netdev changes can be made.
#1 is unattractive because this code path will rarely be used but is complex (a bunch of buffer munging and memory management).
Agreed, sounds like 2 is the only practical option.
Are you trying to change netdev without involving the guest? In that case the link must stay up and libvirt needs to ensure that the new netdev will have a compatible network configuration (subnet, gateway, IP address details).
We'd leave that decision up to the management tool using these APIs. In some cases the subnet/IP/gateway/etc. might remain the same; in other cases, the mgmt tool may want to set the link down, change the backend, then set the link online again, to make NetworkManager (or whatever) redo DHCP in the guest.
Regards,
Daniel
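The "link down, change backend, link up" flow mentioned here could be sketched as below. Only set_link and its "name"/"up" arguments come from the logs earlier in the thread. The thread establishes that QEMU had no command for re-associating a NIC with a new backend, so the backend-change step is left as a caller-supplied list instead of being invented. The helper names are hypothetical.

```python
def set_link(nic_id, up):
    """Build a set_link command with the argument names seen in the logs."""
    return {"execute": "set_link", "arguments": {"name": nic_id, "up": up}}


def with_link_bounce(nic_id, backend_change):
    """Wrap caller-supplied backend-change commands in link down / link up.

    backend_change is whatever command list a management tool would use
    to replace the backend; no such mechanism is assumed to exist here.
    """
    return [set_link(nic_id, False), *backend_change, set_link(nic_id, True)]


if __name__ == "__main__":
    placeholder_change = [
        {"execute": "netdev_del", "arguments": {"id": "hostnet0"}},
    ]
    for command in with_link_bounce("net0", placeholder_change):
        print(command)
```

A tool that wants the guest's network configuration left alone would skip the wrapper and send only the backend-change commands, keeping the link up throughout.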