[PATCH] REDSAP: Fix double exception possibility

Test failed with the following:

--------------------------------------------------------------------
KVMRedirectionSAP - 01_enum_KVMredSAP.py: FAIL
ERROR - Failed to enumerate the class of KVM_KVMRedirectionSAP
ERROR - Exception details: Got more than one record for: test_kvmredsap_dom
ERROR - Exception details: 'ElementName' Value Mismatch, Expected 5988:-1, Got 5988:0
ERROR - Exception: Failed to verify information for the defined dom:test_kvmredsap_dom
--------------------------------------------------------------------

There are two exceptions listed because the 'enum_redsap()' method was
perusing a list, finding a match, declaring success, and continuing to
peruse the list. It then found another match, raised an exception, and
still returned with status = PASS. The caller only checked the status
before continuing.

This patch doesn't resolve the underlying cause, but does avoid the call
to 'verify_redsap_values()', which will also fail...
---

NOTE: I was able to reproduce the initial exception if I run this test
very quickly two times in succession. I debated adding a 'sleep(5)' (or
similar) prior to the end of the test to give whatever teardown (by the
networking code) was not occurring in a timely manner a chance to
complete, but I believe that wouldn't necessarily fix the problem - it
would just make it less likely to happen. The resource in question is a
specific port in the underlying VNC technology. That code may have
built-in 'safeguards' to help with throttling, e.g. reusing a recently
used port, thus making it unreliable for the test to rely on getting the
"first" port. Since the test isn't designed to run twice in a row, I
don't think it's a real issue...

 .../libvirt-cim/cimtest/KVMRedirectionSAP/01_enum_KVMredSAP.py | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/suites/libvirt-cim/cimtest/KVMRedirectionSAP/01_enum_KVMredSAP.py b/suites/libvirt-cim/cimtest/KVMRedirectionSAP/01_enum_KVMredSAP.py
index 6001405..af9b5c6 100644
--- a/suites/libvirt-cim/cimtest/KVMRedirectionSAP/01_enum_KVMredSAP.py
+++ b/suites/libvirt-cim/cimtest/KVMRedirectionSAP/01_enum_KVMredSAP.py
@@ -50,7 +50,7 @@ test_dom = 'test_kvmredsap_dom'

 def enum_redsap(server, virt, classname):
     redsap_insts = { }
-    status = FAIL
+    status = PASS

     try:
         redsap_list = EnumInstances(server, classname)
@@ -58,11 +58,11 @@ def enum_redsap(server, virt, classname):
             if redsap.SystemName == test_dom:
                 if redsap.Classname not in redsap_insts.keys():
                     redsap_insts[redsap.Classname] = redsap
-                    status = PASS
                 else:
                     raise Exception("Got more than one record for: %s" \
                                     % test_dom)

     except Exception, details:
+        status = FAIL
         logger.error(CIM_ERROR_ENUMERATE, classname)
         logger.error("Exception details: %s", details)
@@ -139,7 +139,7 @@ def main():
             raise Exception("Failed to verify information for the defined "\
                             "dom:%s" % test_dom)

-        # For now verifying KVMRedirectoinSAP only for a defined LXC guest.
+        # For now verifying KVMRedirectionSAP only for a defined LXC guest.
         # Once complete Graphics support for LXC is in, we need to verify the
         # KVMRedirectionSAP for a running guest.
         if virt == 'LXC':
@@ -150,10 +150,10 @@ def main():
         status = vsxml.cim_start(server)
         if not ret:
             raise Exception("Failed to start the dom: %s" % test_dom)
+        action_start = True

         status, redsap_inst = enum_redsap(server, virt, classname)
         if status != PASS:
-            action_start = True
             raise Exception("Failed to get information for running dom:%s" \
                             % test_dom)

@@ -162,7 +162,6 @@ def main():

         status = verify_redsap_values(val_list, redsap_inst, classname)
         if status != PASS:
-            action_start = True
             raise Exception("Failed to verify information for running dom:%s" \
                             % test_dom)

--
1.8.3.1
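
The status handling that the patch moves around in 'enum_redsap()' can be tried in isolation with a minimal sketch. This is not the test's actual code: it is written for Python 3 (the patch itself uses Python 2 syntax), and the Instance tuple, the PASS/FAIL values, and passing the instance list in directly instead of calling EnumInstances() are all assumptions made only so the example runs on its own.

```python
from collections import namedtuple

PASS, FAIL = 0, 1  # assumed stand-ins for the cimtest status constants

# Hypothetical stand-in for a CIM instance; only the two attributes that
# the loop reads are modelled.
Instance = namedtuple("Instance", ["SystemName", "Classname"])


def enum_redsap(instances, test_dom):
    """Collect at most one instance per class name for test_dom."""
    redsap_insts = {}
    status = PASS
    try:
        for inst in instances:
            if inst.SystemName != test_dom:
                continue
            if inst.Classname in redsap_insts:
                raise Exception("Got more than one record for: %s" % test_dom)
            redsap_insts[inst.Classname] = inst
    except Exception as details:
        # Every failure, including the duplicate check, now reaches the caller.
        status = FAIL
        print("Exception details: %s" % details)
    return status, redsap_insts


dom = "test_kvmredsap_dom"
single = [Instance(dom, "KVM_KVMRedirectionSAP")]
duplicated = single * 2

status_single, insts_single = enum_redsap(single, dom)
status_dup, _ = enum_redsap(duplicated, dom)
status_empty, insts_empty = enum_redsap([], dom)
```

With the status set to FAIL only in the exception handler, a duplicate record can no longer be reported as PASS. One side effect that follows from the same arrangement: a list with no matching record also yields PASS together with an empty dictionary, so the caller has to cope with that case.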

On 09/06/2013 03:46 PM, John Ferlan wrote:
> [...]
> NOTE: I was able to reproduce the initial exception if I run this test
> very quickly two times in succession. [...]

The patch fixes the previously incorrect error handling, so I would
suggest pushing it anyway. The failure doesn't seem to happen because of
a lingering socket but rather because the guest wasn't destroyed and/or
undefined (returned port is 0). Did this failure ever reoccur *after*
you applied your change?

--
Kind Regards
   Viktor Mihajlovski

IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martina Köderitz
Management: Dirk Wittkopp
Registered office: Böblingen
Court of registration: Amtsgericht Stuttgart, HRB 243294

On 09/09/2013 07:12 AM, Viktor Mihajlovski wrote:
> On 09/06/2013 03:46 PM, John Ferlan wrote:
>> [...]
>
> [...] Did this failure ever reoccur *after* you have applied your
> change?

Yes - unfortunately. Although not very repeatable. Sometimes running
twice in a row works and sometimes it doesn't. Chasing after those kinds
of timing problems is never easy.

When I worked my way through the APIs, I ended up in the VNC port
allocation routines - I think get_vnc_sessions() - but I was never quite
sure exactly where the fault was, and chasing it seemed to be too
time-consuming a task.

John

On 09/09/2013 02:46 PM, John Ferlan wrote:
> Yes - unfortunately. Although not very repeatable. Sometimes running
> twice in a row works and sometimes it doesn't.
> [...]

It still looks to me as if there was a "dangling" running domain. This
looks suspicious:

    status = vsxml.cim_start(server)
    if not ret:
        raise Exception("Failed to start the dom: %s" % test_dom)

It looks as if it could trigger a false error condition, resulting in an
undestroyed test domain. It should probably be:

    ret = vsxml.cim_start(server)

--
Viktor Mihajlovski
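
To make the suspected failure mode concrete, here is a small sketch of how storing the start result under the wrong name could leave a guest running. It rests on assumptions not confirmed in this thread: that 'ret' still holds a stale falsy value when the check runs, that the start call returns a truthy value on success, and that cleanup destroys the guest only when 'action_start' is set. FakeDomain and its destroy() method are hypothetical stand-ins, not the real vsxml API.

```python
class FakeDomain:
    """Hypothetical stand-in for the test's vsxml object."""

    def __init__(self):
        self.running = False

    def cim_start(self):
        self.running = True
        return True  # assumed here to mean "start succeeded"

    def destroy(self):
        self.running = False


def guest_left_running(dom, use_fresh_result):
    """Run the start step and report whether the guest survives cleanup."""
    ret = False  # assumed stale value left over from an earlier step
    action_start = False
    try:
        if use_fresh_result:
            ret = dom.cim_start()
        else:
            status = dom.cim_start()  # result stored under the wrong name
        if not ret:
            raise Exception("Failed to start the dom")
        action_start = True
    except Exception as details:
        print("Exception: %s" % details)
    finally:
        # Cleanup destroys the guest only if the flag says it was started.
        if action_start:
            dom.destroy()
    return dom.running


stale_case = guest_left_running(FakeDomain(), use_fresh_result=False)
fresh_case = guest_left_running(FakeDomain(), use_fresh_result=True)
```

Under these assumptions the mismatched name raises a false "failed to start" error after the guest has actually started, and because the flag is never set the guest is not destroyed. Checking the value that the start call actually returned avoids both problems.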

On 09/09/2013 09:55 AM, Viktor Mihajlovski wrote:
> On 09/09/2013 02:46 PM, John Ferlan wrote:
>> [...]
>
> [...] This here looks suspicious ...
>
>     status = vsxml.cim_start(server)
>     if not ret:
>         raise Exception("Failed to start the dom: %s" % test_dom)
>
> [...] should probably be
>
>     ret = vsxml.cim_start(server)

Yep - that's what I first saw... I just happened to try a bit of
"extended testing" and tripped across the anomaly that I saw.

John

On 09/06/2013 09:46 AM, John Ferlan wrote:
> [...]

Thanks for the review - this is now pushed.

John

Participants (2): John Ferlan, Viktor Mihajlovski