[libvirt] FW: Using dlls for Windows provided in http://libvirt.org/sources/win32_experimental/Libvirt-0.8.7-2.exe

Hello libvirt-list,

We have a problem. It concerns the CPU usage of the libvirt library on Windows. It is not a problem on Linux. See the attachment.

At the moment we have a workaround for item 1: we count the number of leaked handles and restart our service if the number exceeds 10,000.

As for item 2, we have no real workaround. In 99.99% of cases it should not happen, but there is still the remaining 0.01%.

In libvirt.log you can find more info, as suggested by Daniel.

================================================================
In the attached file you will find detailed information regarding the 100 percent CPU usage case.

Our test was performed on the following system:
Windows XP SP3;
Libvirt-0.8.8;

Run the following command:
virsh -c qemu+tcp://172.17.46.88:135/system

Port 135 was one of the ports to which our service tries to connect.
================================================================

Could you help us here?

Thanks
Alexander

-----Original Message-----
From: Aliaksandr Chabatar
Sent: Tuesday, March 15, 2011 3:52 PM
To: Ihar Smertsin
Subject: FW: [libvirt] Using dlls for Windows provided in http://libvirt.org/sources/win32_experimental/Libvirt-0.8.7-2.exe

Hi Ihar,

Could you provide more information (log files, see below) so we could address this issue to libvir-list@redhat.com?

Kind regards
Alexander

-----Original Message-----
From: Daniel P. Berrange [mailto:berrange@redhat.com]
Sent: Tuesday, March 15, 2011 3:08 PM
To: Aliaksandr Chabatar
Cc: Hempfer, Siegfried; Boehme, Alfred; Schnizer, Monika
Subject: Re: [libvirt] Using dlls for Windows provided in http://libvirt.org/sources/win32_experimental/Libvirt-0.8.7-2.exe

On Tue, Mar 15, 2011 at 04:01:31PM +0200, Aliaksandr Chabatar wrote:
> Dear Daniel,
> I have another question. It concerns the CPU usage of the libvirt library on Windows. It is not a problem on Linux. See the attachment.
> At the moment we have a workaround for item 1: we count the number of leaked handles and restart our service if the number exceeds 10,000.

It sounds like we have some crazy resource leak in a piece of code. I'm not too familiar with Windows, but if you re-send this mail of yours to libvir-list@redhat.com, I expect one of the community members who knows Windows will be able to advise.

> As for item 2, we have no real workaround. In 99.99% of cases it should not happen, but there is still the remaining 0.01%.

Yeah, that sounds like some piece of code is missing correct error checking. It would be useful to try and obtain a couple of stack traces when it is showing 100% CPU usage. Or capture a libvirt debug log, e.g. by setting these environment variables in your client application:

LIBVIRT_LOG_FILTERS="1:libvirt 1:util 1:remote"
LIBVIRT_LOG_OUTPUTS="1:file:libvirt.log"

Again, sending this log + the info from your mail to the libvir-list would be best.

Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
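
For anyone who wants to capture the debug log Daniel describes from a wrapper script rather than by editing the service itself, here is a minimal sketch. The two variable values are taken verbatim from the mail above; the helper names libvirt_debug_env and run_with_debug_log are hypothetical and exist only for this illustration. It assumes the client (for example virsh) is on PATH and honours these variables when it starts.

```python
import os
import subprocess


def libvirt_debug_env(log_path="libvirt.log", base_env=None):
    """Return a copy of the environment with libvirt debug logging enabled.

    Hypothetical helper: only the two variable names and their value
    format come from the mail; everything else is illustration.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Log level 1 (debug) for the libvirt, util and remote modules.
    env["LIBVIRT_LOG_FILTERS"] = "1:libvirt 1:util 1:remote"
    # Write messages of level 1 and above to the given file.
    env["LIBVIRT_LOG_OUTPUTS"] = f"1:file:{log_path}"
    return env


def run_with_debug_log(argv, log_path="libvirt.log"):
    """Run a libvirt client command with debug logging switched on."""
    # check=False: a failing connection is exactly the case we want logged.
    return subprocess.run(argv, env=libvirt_debug_env(log_path), check=False)
```

A call such as run_with_debug_log(["virsh", "-c", "qemu+tcp://172.17.46.88:135/system", "list"]) would then leave the log in libvirt.log in the current directory. The variables are set only for the child process, so the rest of the system is not affected.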

Quite a long mail, I'm skipping all but the original report here.
> From: Ihar Smertsin <I.Smertsin@sam-solutions.net>
> To: Aliaksandr Chabatar <A.Chabatar@sam-solutions.net>
> Date: Tue, 8 Feb 2011 15:05:59 +0200
> Subject: Libvirt issues in windows
>
> Hello Alexandr,
> We have detected the following errors in the client version of the library on Windows:
> 1. Calling some library functions makes the number of resource handles grow. The growth can be observed as follows:
> run the Task Manager;
> run the virsh utility;
> connect it to a server that is running the libvirtd service (virsh -c qemu+tcp://192.168.117.107/system);
> execute any command, for example list;
> keep calling the list command: the Task Manager shows the number of handles increasing by about 5-6 with each call.
> Exactly the same happens every time you call certain library functions, such as virConnectListDomains, virConnectNumOfDomains, virDomainLookupByID, virDomainLookupByName and others.

I can reproduce this problem and have fixed it. See https://www.redhat.com/archives/libvir-list/2011-March/msg00809.html for the patch. The next libvirt release, scheduled for the end of March (IIRC), will contain it. The problem was that libvirt uses a condition variable during remote communication. This condition variable wasn't freed correctly, resulting in one leaked handle per remote call. virsh commands like list do multiple remote calls, so they leak multiple handles at once.

> 2. In some cases CPU usage grows. It happened in the following situation: we have a service that detects different types of virtual systems, including KVM. This service uses the libvirt client library. Our network has a host running Windows 2008, which supports Hyper-V virtualization. Our service tries to determine the type of virtualization by connecting to the system with different protocols. When it is KVM's turn, the following happens: when we call virConnectOpen with the parameter qemu+tcp://192.168.117.178:135/system, an error occurs (Error: internal error received hangup / error event on socket), and then our service starts to take up to 20-25% of the CPU.

I haven't looked into this one in detail yet. It could be that something is not properly cleaned up when this error occurs, and that this results in the high CPU load. This will need some further investigation.

Matthias
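
To make the item 1 explanation concrete, here is a toy model of the leak pattern described above: each remote call creates a condition variable (one handle on Windows) and the buggy path never releases it. This is only an illustration under stated assumptions, not libvirt's actual code or the actual patch; HandleTable, remote_call_leaky, remote_call_fixed, run_command and commands_until_restart are all hypothetical names. The 5-6 handles per list command and the 10,000 restart threshold are the figures quoted in this thread, and the estimate ignores the handles a process holds at startup.

```python
class HandleTable:
    """Toy stand-in for the handle count of one process."""

    def __init__(self):
        self.open_handles = 0

    def create(self):
        self.open_handles += 1

    def close(self):
        self.open_handles -= 1


def remote_call_leaky(handles):
    """Buggy pattern: the per-call condition variable is never destroyed."""
    handles.create()  # condition variable used to wait for the reply
    # ... send request, wait for reply ...
    # missing: handles.close()


def remote_call_fixed(handles):
    """Fixed pattern: the condition variable is released on every exit path."""
    handles.create()
    try:
        pass  # ... send request, wait for reply ...
    finally:
        handles.close()


def run_command(handles, remote_call, calls_per_command):
    """Model one virsh command that issues several remote calls."""
    for _ in range(calls_per_command):
        remote_call(handles)


def commands_until_restart(leak_per_command, limit=10_000):
    """Number of commands after which the leaked count first exceeds limit."""
    return limit // leak_per_command + 1
```

With five remote calls per command, the leaky variant leaves five handles behind per command while the fixed variant leaves none. At that rate the restart workaround would trigger after roughly two thousand commands, which is why it only hides the problem rather than solving it.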

Hello Matthias,

Thanks a lot for your quick response. I am looking forward to the solution for item 2.

Alexander

-----Original Message-----
From: Matthias Bolte [mailto:matthias.bolte@googlemail.com]
Sent: Thursday, March 17, 2011 6:09 PM
To: Aliaksandr Chabatar
Cc: libvir-list@redhat.com; Ihar Smertsin
Subject: Re: [libvirt] FW: Using dlls for Windows provided in http://libvirt.org/sources/win32_experimental/Libvirt-0.8.7-2.exe