Note: A couple of times I have seen errors while closing the connection (the trace ending with virFree). Also, a few times I have seen backtraces which show the flow going via esx driver and finally failing to close connection. Unfortunately I dont have those logs anymore.

The execution results I have shared have been obtained using Libvirt built from source using the following config parameters:
# configure --prefix=/usr --localstatedir=/var --sysconfdir=/etc --with-esx=yes 

The libvirt java bindings I have taken from : https://github.com/libvirt/libvirt-java
I have added a sysout in the "finalize" method of Connect class in there, thats why the "Connect finalizing.." messages in execution results.

Thanks & Regards
Sachin Soman

 


On Thu, Apr 18, 2019 at 10:46 PM Sachin Soman <sachonline.soman@gmail.com> wrote:
I am attaching the execution results. At the top of each file I have mentioned the environment details.

Following is the test program I have used:

==================================================

package org.libvirt;


import org.libvirt.jna.Libvirt;


public class LibvirtCrashTest {

void createAndDestroyDefaultAuthConnection() {

ConnectAuth ca = new ConnectAuthDefault();

try {

System.out.println("Starting new connection with default auth");

Connect connect = new Connect("esx://x.x.x.x/?no_verify=1", ca, 0);

Thread.sleep(1000);

System.out.println("Explicit connection closure");

connect.close();

Thread.sleep(5000);

} catch (Exception e) {

e.printStackTrace();

} 

}


public static void main(String[] args) throws Exception {

LibvirtCrashTest testInstance = new LibvirtCrashTest();


for(int counter = 0; counter < 3; counter++) {

testInstance.createAndDestroyDefaultAuthConnection();

System.out.println("gc'ing");

System.gc(); 

System.out.println("gc'd");

int tCounter = 0;

while(tCounter++ < 20) {

System.out.println("waiting.. " + tCounter); 

Thread.sleep(1000);

}

}

System.out.println("Going down...");

}


}

==================================================


Thanks & Regards
Sachin Soman

 


On Thu, Apr 18, 2019 at 9:25 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
On Thu, Apr 18, 2019 at 05:51:06PM +0200, Michal Prívozník wrote:
> On 4/17/19 10:24 AM, Sachin Soman wrote:
> > Hi,
> >
> > Could you tell me if the following is some known issue?
> >
> > While performing the following simple test, I see my JVM crashing
> > (consistently):
> > 1. Open a connection to an ESXi driver/host (passing ConnectAuthDefault
> > instance).
> > 2. Close the connection.
> > 3. Invoke GC
> >
> > When GC is triggered, at some point, some unallocated native memory is
> > being tried to release. That's failing.
> >
> > The error thrown is:
> >
> > java(78745,0x70000241e000) malloc: *** error for object 0x7fd5df561390:
> > pointer being freed was not allocated
> >
> > *** set a breakpoint in malloc_error_break to debug
> >
> >
> > Frames from core dump:
> >
> >     frame #0: 0x00007fff5b274b66 libsystem_kernel.dylib`__pthread_kill + 10
> >
> >     frame #1: 0x00007fff5b43f080 libsystem_pthread.dylib`pthread_kill + 333
> >
> >     frame #2: 0x00007fff5b1d01ae libsystem_c.dylib`abort + 127
> >
> >     frame #3: 0x00007fff5b2ce8a6 libsystem_malloc.dylib`free + 521
> >
> >     frame #4: 0x00000001127f43a7
> >
> >     frame #5: 0x00000001127e3ffd
> >
> >     frame #6: 0x00000001127e3ffd
> >
> >     frame #7: 0x00000001127e3ffd
> >
> >     frame #8: 0x00000001127e3ffd
> >
> >     frame #9: 0x00000001127e4042
> >
> >     frame #10: 0x00000001127e3ffd
> >
> >     frame #11: 0x00000001127e3ffd
> >
> >     frame #12: 0x00000001127dc4e7
> >
> >     frame #13: 0x000000010c0e235e
> > libjvm.dylib`JavaCalls::call_helper(JavaValue*, methodHandle*,
> > JavaCallArguments*, Thread*) + 1710
> >
> >     frame #14: 0x000000010c0e2b02
> > libjvm.dylib`JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*,
> > Symbol*, JavaCallArguments*, Thread*) + 356
> >
> >     frame #15: 0x000000010c0e2cae
> > libjvm.dylib`JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle,
> > Symbol*, Symbol*, Thread*) + 74
> >
> >     frame #16: 0x000000010c1208ee libjvm.dylib`thread_entry(JavaThread*,
> > Thread*) + 124
> >
> >     frame #17: 0x000000010c33e84d
> > libjvm.dylib`JavaThread::thread_main_inner() + 155
> >
> >     frame #18: 0x000000010c33ff12 libjvm.dylib`JavaThread::run() + 448
> >
> >     frame #19: 0x000000010c26058a libjvm.dylib`java_start(Thread*) + 246
> >
> >     frame #20: 0x00007fff5b43c661 libsystem_pthread.dylib`_pthread_body +
> > 340
> >
> >     frame #21: 0x00007fff5b43c50d libsystem_pthread.dylib`_pthread_start +
> > 377
> >
> >     frame #22: 0x00007fff5b43bbf9 libsystem_pthread.dylib`thread_start + 13
> >
> >
> > I have installed Libvirt 5.2.0.
> > Java bindings libvirt-java 0.5.1
> > JNA 4.0.0
> > Tested Java environments: Oracle Java 8 and OpenJDK 8 on MAC, OpenJDK 11 on
> > Ubuntu 16
>
> The backtrace does not suggest it's libvirt related, but I wouldn't be
> surprised if our Java bindings mangled memory somewhere. They are
> heavily unmaintained.

It could just as easily be a memory corruption bug in the ESX libvirt
driver, since that runs directly in the applicatin process as it is a
stateless client side driver.

We would probably need to have an small demo program that can reproduce
the problem in an isolated fashion, in order to try to debug it, along
with full libvirt debug logs.


Regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|