[libvirt] [PATCH] util: increase libnl buffer size to 1M

nl_recv() returns the error "No buffer space available"
when using virsh destroy domain with 240 or more
passhthrough network interfaces.

The patch increases libnl sock receive buffer size to 1M,
and nl_recv() doesn't return error when destroying domain
with 512 network interfaces.

Signed-off-by: ZhiPeng Lu <lu.zhipeng@zte.com.cn>
---
 src/util/virnetlink.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/util/virnetlink.c b/src/util/virnetlink.c
index 92ecf77..bb56c54 100644
--- a/src/util/virnetlink.c
+++ b/src/util/virnetlink.c
@@ -189,10 +189,10 @@ virNetlinkCreateSocket(int protocol)
         goto error;
     }
 
-    if (virNetlinkSetBufferSize(nlhandle, 131702, 0) < 0) {
+    if (virNetlinkSetBufferSize(nlhandle, 1048576, 0) < 0) {
         virReportSystemError(errno, "%s",
                              _("cannot set netlink socket buffer "
-                               "size to 128k"));
+                               "size to 1M"));
         goto error;
     }
     nl_socket_enable_msg_peek(nlhandle);
-- 
1.8.3.1
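For readers who want to see what the size argument amounts to at the socket level, here is a minimal sketch. It assumes that virNetlinkSetBufferSize() ends up adjusting the socket's SO_RCVBUF option (its body is not shown in this thread); the helper name set_rcvbuf is hypothetical and not part of libvirt or libnl.

```c
#include <sys/socket.h>

/* Hypothetical helper (not libvirt code): ask the kernel for a receive
 * buffer of `bytes` on socket `fd`, then read back the size that was
 * actually granted. Returns the granted size, or -1 on error. */
int set_rcvbuf(int fd, int bytes)
{
    int granted = 0;
    socklen_t len = sizeof(granted);

    /* Request the new receive buffer size. */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;

    /* The kernel may adjust the request, so query the effective value. */
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &granted, &len) < 0)
        return -1;

    return granted;
}
```

Called as set_rcvbuf(fd, 1048576) on a socket obtained from socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE), the return value shows whether the full 1M was granted. On Linux, socket(7) documents that the kernel doubles the requested value and caps it at net.core.rmem_max unless the privileged SO_RCVBUFFORCE option is used, so the granted size can differ from the request.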

On 06/29/2017 02:05 PM, ZhiPeng Lu wrote:
> nl_recv() returns the error "No buffer space available"
> when using virsh destroy domain with 240 or more
> passhthrough network interfaces.

pass-through

> The patch increases libnl sock receive buffer size to 1M,
> and nl_recv() doesn't return error when destroying domain
> with 512 network interfaces.
>
> Signed-off-by: ZhiPeng Lu <lu.zhipeng@zte.com.cn>
> ---
>  src/util/virnetlink.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

This feels like something that perhaps should be configurable - that is
some /etc/libvirt/libvirtd.conf variable; otherwise, we'll keep hitting
some conflated maximum based on the size of something.

John

There's quite a bit of history in the archives from the original
implementation of this API... Not sure if you read it or not, but since
I was looking through the history, here

v1: https://www.redhat.com/archives/libvir-list/2015-December/msg00407.html
v1 followup: https://www.redhat.com/archives/libvir-list/2016-January/msg00107.html
v2: https://www.redhat.com/archives/libvir-list/2016-January/msg00339.html
v3: https://www.redhat.com/archives/libvir-list/2016-January/msg00342.html
v4: https://www.redhat.com/archives/libvir-list/2016-January/msg00866.html

> diff --git a/src/util/virnetlink.c b/src/util/virnetlink.c
> index 92ecf77..bb56c54 100644
> --- a/src/util/virnetlink.c
> +++ b/src/util/virnetlink.c
> @@ -189,10 +189,10 @@ virNetlinkCreateSocket(int protocol)
>          goto error;
>      }
>
> -    if (virNetlinkSetBufferSize(nlhandle, 131702, 0) < 0) {
> +    if (virNetlinkSetBufferSize(nlhandle, 1048576, 0) < 0) {
>          virReportSystemError(errno, "%s",
>                               _("cannot set netlink socket buffer "
> -                               "size to 128k"));
> +                               "size to 1M"));
>          goto error;
>      }
>      nl_socket_enable_msg_peek(nlhandle);

On Mon, Jul 10, 2017 at 02:51:34PM -0400, John Ferlan wrote:
> On 06/29/2017 02:05 PM, ZhiPeng Lu wrote:
> > nl_recv() returns the error "No buffer space available"
> > when using virsh destroy domain with 240 or more
> > passhthrough network interfaces.
>
> pass-through
>
> > The patch increases libnl sock receive buffer size to 1M,
> > and nl_recv() doesn't return error when destroying domain
> > with 512 network interfaces.
> >
> > Signed-off-by: ZhiPeng Lu <lu.zhipeng@zte.com.cn>
> > ---
> >  src/util/virnetlink.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
>
> This feels like something that perhaps should be configurable - that is
> some /etc/libvirt/libvirtd.conf variable; otherwise, we'll keep hitting
> some conflated maximum based on the size of something.

1 MB matches what systemd/udevd uses, so if we hit that limit, then the
system as a whole is going to struggle already. So I don't think we need
to make it configurable.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

From: <berrange@redhat.com>
To: <jferlan@redhat.com>
Cc: 芦志朋10108272 <libvir-list@redhat.com>
Date: 2017-07-11 17:11
Subject: Re: [libvirt] [PATCH] util: increase libnl buffer size to 1M

> > This feels like something that perhaps should be configurable - that is
> > some /etc/libvirt/libvirtd.conf variable otherwise, we'll keep hitting
> > some conflated maximum based on the size of something.
>
> 1 MB matches what systemd/udevd uses, so if we hit that limit, then the
> system as a whole is going to struggle already. So I don't think we need
> to make it configurable.

1M is just an empirical value, and I feel that adding a configuration
item can accommodate the needs of different users.

In addition, this is only the netlink socket receive buffer size; it
will not affect the entire system.

The default value can still be set to 128K.

芦志朋 luzhipeng
IT Development Engineer
OS Product Dept./Central R&D Institute/System Product
33/F, R&D Building, ZTE Corporation Hi-tech Road South,
Hi-tech Industrial Park Nanshan District, Shenzhen, P.R.China, 518057
T: +86 755 xxxxxxxx F: +86 755 xxxxxxxx
M: +86 xxxxxxxxxxx
E: lu.zhipeng@zte.com.cn
www.zte.com.cn

On 07/11/2017 05:06 AM, Daniel P. Berrange wrote:
> On Mon, Jul 10, 2017 at 02:51:34PM -0400, John Ferlan wrote:
> > On 06/29/2017 02:05 PM, ZhiPeng Lu wrote:
> > > nl_recv() returns the error "No buffer space available"
> > > when using virsh destroy domain with 240 or more
> > > passhthrough network interfaces.
> >
> > pass-through

(Actually, neither is correct. If you're talking about devices that are
assigned with VFIO, then they are "SRIOV VFs assigned with VFIO"
("passthrough" is a misnomer left over from Xen), and if you're talking
about macvtap, then according to the macvtap spec, the proper term is
"passthru", while the libvirt XML calls them "passthrough" (and in
either case, you should stipulate that you're talking about macvtap
network interfaces).)

> > > The patch increases libnl sock receive buffer size to 1M,
> > > and nl_recv() doesn't return error when destroying domain
> > > with 512 network interfaces.
> > >
> > > Signed-off-by: ZhiPeng Lu <lu.zhipeng@zte.com.cn>
> > > ---
> > >  src/util/virnetlink.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > This feels like something that perhaps should be configurable - that is
> > some /etc/libvirt/libvirtd.conf variable; otherwise, we'll keep hitting
> > some conflated maximum based on the size of something.
>
> 1 MB matches what systemd/udevd uses, so if we hit that limit, then the
> system as a whole is going to struggle already. So I don't think we need
> to make it configurable.
What bothers me about this (now that John's research has finally
reminded me of the complete gory history that I was deeply involved in
but had already totally forgotten) is that turning on netlink message
peeking (which we've done) is supposed to eliminate the need for a large
initial buffer (at the expense of making every read a bit less
efficient). *That* was supposed to be a permanent solution to the
problem, but apparently hasn't helped.

ZhiPeng - what version of libnl and kernel are you using? The message
history John points to reminded me that there are some versions of libnl
where message peeking doesn't work properly; maybe that's the issue and
the proper solution for you is to update your libnl. Or it's possible
that we need to do something else to make message peeking work properly?

(The latter doesn't seem likely - I just looked it up, and MSG_PEEK has
actually been enabled in libnl by default since this commit:

https://github.com/thom311/libnl/commit/55ea6e6b6cd805f441b410971c9dd7575e78...

which was in libnl 3.2.29 - even if we got it wrong, it should still be
enabled).

I *was* going to suggest that pushing this patch (rather than the one
making the initial buffer size configurable) was the best course of
action. But after reading through the history and remembering
everything, I've changed my mind - I think we need to figure out why
MSG_PEEK isn't working properly on your system. The first step is to
learn the version of libnl in use on the system that's failing.
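As an illustration of the message-peeking idea described above, the following is a minimal sketch at the plain socket level, not libnl's actual implementation. It assumes a Linux datagram-style socket, where recv() with MSG_PEEK | MSG_TRUNC reports the full length of the next message without consuming it; the function name recv_whole_datagram is hypothetical.

```c
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: read the next datagram from `fd` in full,
 * whatever its size. On success, stores a malloc()ed buffer in *out
 * (caller frees it) and returns the message length; returns -1 on error. */
ssize_t recv_whole_datagram(int fd, char **out)
{
    char probe;

    /* Peek first: MSG_TRUNC makes recv() return the real message length
     * even though only one byte of buffer is offered, and MSG_PEEK
     * leaves the message queued on the socket. */
    ssize_t need = recv(fd, &probe, 1, MSG_PEEK | MSG_TRUNC);
    if (need < 0)
        return -1;

    /* Allocate exactly what the queued message requires. */
    char *buf = malloc(need > 0 ? (size_t)need : 1);
    if (buf == NULL)
        return -1;

    /* Second read actually consumes the message. */
    ssize_t got = recv(fd, buf, (size_t)need, 0);
    if (got < 0) {
        free(buf);
        return -1;
    }

    *out = buf;
    return got;
}
```

The cost mentioned above is visible here: every message takes two recv() calls instead of one, in exchange for not having to guess a large read buffer up front.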
participants (5)
- Daniel P. Berrange
- John Ferlan
- Laine Stump
- lu.zhipeng@zte.com.cn
- ZhiPeng Lu