[libvirt] BUG: attaching - detaching network device only works 7 times

With the current tip: While extending a test case, I found that attaching and detaching the following network device works only 7 times with the script below:

  <interface type='bridge'>
    <source bridge='static'/>
    <mac address='52:54:00:4d:a2:58'/>
    <target dev='attach0'/>
  </interface>

  let c=1; while test 1; do virsh attach-device acl attach.xml ; virsh detach-device acl attach.xml; echo ${c}; let c=c+1; done

Then the following error occurs:

  error: Failed to attach device from attach.xml
  error: operation failed: parsing pci_add reply failed: Too Many NICs failed to add macaddr=52:54:00:4d:a2:58,vlan=1,name=net1

It looks like the detachment of the device is not done by qemu?

Regards,
Stefan

On Wed, Apr 14, 2010 at 08:16:22PM -0400, Stefan Berger wrote:
> It looks like the detachment of the device is not done by qemu?

Yeah, sounds like it - what version of QEMU do you have?

Regards,
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

"Daniel P. Berrange" <berrange@redhat.com> wrote on 04/15/2010 05:22:22 AM:
> Yeah, sounds like it - what version of QEMU do you have?

  rpm -q --whatprovides /usr/bin/qemu-kvm
  qemu-system-x86-0.12.3-6.fc13.x86_64

Regards,
Stefan

On Thu, Apr 15, 2010 at 07:34:54AM -0400, Stefan Berger wrote:
"Daniel P. Berrange" <berrange@redhat.com> wrote on 04/15/2010 05:22:22 AM:
>> Yeah, sounds like it - what version of QEMU do you have?
> rpm -q --whatprovides /usr/bin/qemu-kvm
> qemu-system-x86-0.12.3-6.fc13.x86_64

It looks to me like PCI device delete is completely fubar in qemu 0.12.x:

  # qemu-kvm -monitor stdio
  QEMU 0.12.1 monitor - type 'help' for more information
  (qemu) device_add lsi,id=foo
  (qemu) info qtree
  bus: main-system-bus
    type System
    dev: i440FX-pcihost, id ""
      bus: pci.0
        type PCI
        dev: lsi53c895a, id "foo"
          bus-prop: addr = 04.0
          bus-prop: romfile = <null>
          bus-prop: rombar = 1
          class SCSI controller, addr 00:04.0, pci id 1000:0012 (sub 1af4:1000)
          bar 0: i/o at 0xffffffffffffffff [0xfe]
          bar 1: mem at 0xffffffffffffffff [0x3fe]
          bar 2: mem at 0xffffffffffffffff [0x1ffe]
          bus: foo.0
            type SCSI
  ...snip...
  (qemu) device_del foo
  (qemu) info qtree
  bus: main-system-bus
    type System
    dev: i440FX-pcihost, id ""
      bus: pci.0
        type PCI
        dev: lsi53c895a, id "foo"
          bus-prop: addr = 04.0
          bus-prop: romfile = <null>
          bus-prop: rombar = 1
          class SCSI controller, addr 00:04.0, pci id 1000:0012 (sub 1af4:1000)
          bar 0: i/o at 0xffffffffffffffff [0xfe]
          bar 1: mem at 0xffffffffffffffff [0x3fe]
          bar 2: mem at 0xffffffffffffffff [0x1ffe]
          bus: foo.0
            type SCSI
  ...snip...

Notice that 'device_del' completed without error, but didn't actually delete the device. The same seems to be true of 'pci_del' :-(

Even if that wasn't broken though, I don't see how NIC hotplug would work in your scenario. That error message about Too Many NICs is because the 'nd_table' in QEMU's net.c has all fields set 'used = 1'. I don't see any code which ever sets 'used = 0'.

Regards,
Daniel
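The slot exhaustion described above can be modelled in a few lines of shell. This is only a toy model, not QEMU code: the table size of 8 is an assumption about that QEMU version, as is the premise that the guest booted with one NIC already configured, and all function names are made up for illustration.

```bash
#!/usr/bin/env bash
# Toy model of a fixed-size NIC table whose entries are marked used on
# attach but never released on detach. Not QEMU code.

MAX_NICS=8                      # assumed table size
declare -a used                 # used[i] = 1 once slot i has been taken

# Reset the table and mark the first $1 slots as taken at boot.
init_table() {
    local boot_nics=$1 i
    used=()
    for ((i = 0; i < MAX_NICS; i++)); do
        used[i]=$(( i < boot_nics ? 1 : 0 ))
    done
}

# Take the first free slot; fail when every slot is marked used.
attach_nic() {
    local i
    for ((i = 0; i < MAX_NICS; i++)); do
        if (( used[i] == 0 )); then
            used[i]=1
            return 0
        fi
    done
    echo "Too Many NICs" >&2
    return 1
}

# Detach that never clears the 'used' flag, as described above.
detach_nic() {
    return 0
}

# Count how many attach/detach cycles succeed before the table is full.
count_cycles() {
    local boot_nics=$1 n=0
    init_table "$boot_nics"
    while attach_nic 2>/dev/null; do
        detach_nic
        n=$((n + 1))
    done
    echo "$n"
}

count_cycles 1    # prints 7 under the assumptions above
```

Under these assumptions the model reproduces the 7 successful cycles from the report; with no NIC at boot (`count_cycles 0`) it would allow 8.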

"Daniel P. Berrange" <berrange@redhat.com> wrote on 04/15/2010 07:50:51 AM:
> Even if that wasn't broken though, I don't see how NIC hotplug would work in your scenario. That error message about Too Many NICs is because the 'nd_table' in QEMU's net.c has all fields set 'used = 1'. I don't see any code which ever sets 'used = 0'.

There is also no code there that ever decreases nb_nic, so unplug doesn't seem to be supported.

Would it be worth having such a simple test in the libvirt repository itself, or is that a case for the TCK project?

Stefan

On Thu, Apr 15, 2010 at 12:19:48PM -0400, Stefan Berger wrote:
"Daniel P. Berrange" <berrange@redhat.com> wrote on 04/15/2010 07:50:51 AM:
>> Even if that wasn't broken though, I don't see how NIC hotplug would work in your scenario. That error message about Too Many NICs is because the 'nd_table' in QEMU's net.c has all fields set 'used = 1'. I don't see any code which ever sets 'used = 0'.
> no code there that ever decreases nb_nic, so unplug doesn't seem to be supported
> Would it be worth having such a simple test in libvirt repository itself or is that a case for the TCK project?

Yep, this is a perfect candidate for a TCK test case. Take the 210-nic-hotplug.t test case, and make it attempt to plug+unplug a NIC 35 times in a row. This should test this particular bug, and also validate that PCI addresses are being reused correctly (there are only 31 PCI slots that can be used at any one time).

Daniel
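The bounded plug+unplug loop suggested above might look roughly like the following when driven through virsh. This is a shell sketch only, not the 210-nic-hotplug.t test itself; the `hotplug_cycles` function and the `VIRSH` override variable are hypothetical, and the domain name 'acl' and attach.xml are simply reused from the original report.

```bash
#!/usr/bin/env bash
# Sketch of a bounded attach/detach loop that stops at the first failure.
# VIRSH is a hypothetical override so the loop can be dry-run with a stub.

VIRSH=${VIRSH:-virsh}

# hotplug_cycles DOMAIN XMLFILE COUNT
# Prints the number of completed cycles; returns non-zero if any cycle failed.
hotplug_cycles() {
    local domain=$1 xml=$2 count=$3 done_cycles=0 i
    for ((i = 1; i <= count; i++)); do
        "$VIRSH" attach-device "$domain" "$xml" >/dev/null || break
        "$VIRSH" detach-device "$domain" "$xml" >/dev/null || break
        done_cycles=$i
    done
    echo "$done_cycles"
    (( done_cycles == count ))
}

# Example (domain 'acl' and attach.xml are taken from the report):
#   hotplug_cycles acl attach.xml 35
```

Stopping at the first failure and reporting the count makes the result easy to compare against both limits mentioned in the thread: a stop at 7 points at the NIC table, a stop around 31 at PCI address reuse.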

"Daniel P. Berrange" <berrange@redhat.com> wrote on 04/15/2010 12:48:59 PM:
> Yep, this is perfect candidate for a TCK test case. Take the 210-nic-hotplug.t test case, and make it attempt to plug+unplug a NIC 35 times in a row. This should test this particular bug, and also validate that PCI addresses are being reused correctly (there're only 31 pci slots that can be used at any 1 time)

Another idea: how about a daily 'weather report' from the TCK test suite sent to the mailing list?

Stefan

On Thu, Apr 15, 2010 at 01:00:08PM -0400, Stefan Berger wrote:
"Daniel P. Berrange" <berrange@redhat.com> wrote on 04/15/2010 12:48:59 PM:
>> Yep, this is perfect candidate for a TCK test case. Take the 210-nic-hotplug.t test case, and make it attempt to plug+unplug a NIC 35 times in a row. This should test this particular bug, and also validate that PCI addresses are being reused correctly (there're only 31 pci slots that can be used at any 1 time)
> another idea ... how about a daily 'weather report' from the Tck test suite sent to the mailing list?

We'd certainly like to get that going some time. The TCK can optionally output its results in both HTML and XML formats. Our long term plan is to have a wide variety of OS, hypervisor and libvirt versions being tested, and to collate reports for them all.

Daniel

On Wed, Apr 14, 2010 at 08:16:22PM -0400, Stefan Berger wrote:
> It looks like the detachment of the device is not done by qemu?

I've investigated a little more and it appears this is only a problem when using the legacy mode of setting up NICs using -net nic,... When using -device this does not appear to be triggered. There is a bug in libvirt though, which means that even though your QEMU 0.12 supports -device, we're only using that at boot time. I forgot to port hotplug of NICs to use device_add.

Daniel
participants (2): Daniel P. Berrange, Stefan Berger