[libvirt] Entering freeze for libvirt-4.6.0

As suggested yesterday, I have just tagged release candidate 1 in git, and pushed signed tarball and rpms to the usual place:
ftp://libvirt.org/libvirt/
It seems to work fine with my limited testing, and https://ci.centos.org/view/libvirt/ is all green (except for virt-viewer-master-rpm ?), so things look pretty good to me, but please try it out on different systems and OSes.
If everything goes well I will push rc2 on Tuesday, targeting Thursday for the final release (or Friday if I get stuck in travels).
thanks in advance for trying it out !
Daniel
--
Daniel Veillard | Red Hat Developers Tools http://developer.redhat.com/
veillard@redhat.com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | virtualization library http://libvirt.org/

On Sat, 2018-07-28 at 21:56 +0800, Daniel Veillard wrote:
As suggested yesterday, I have just tagged the release candidate 1 in git, and pushed signed tarball and rpms to the usual place:
ftp://libvirt.org/libvirt/
seems to work fine with my limited testing, and https://ci.centos.org/view/libvirt/ is all green (except for virt-viewer-master-rpm ?)
This was caused by virt-viewer recently bumping their minimum spice-gtk version to 0.35, which is not available on CentOS or Fedora older than 28. It's since been addressed, and all dots are back to green now :)
so things look pretty good for me but please try it out on different systems and OSes.
If everything goes well I will push rc2 on Tuesday targeting Thursday for the final release (or Friday if I get stuck in travels).
thanks in advance for trying it out !
Unfortunately I've spotted an issue during my testing of rc1 today: with the libvirt_guest NSS module enabled, Evolution would crash a few seconds after being started. Here's the stack trace:
#0 0x00007fffe7b69ba5 in json_object_iter_next () at /lib64/libjson-glib-1.0.so.0
#1 0x00007fffad8e757b in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2
#2 0x00007fffad8e75d8 in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2
#3 0x00007fffad8e8994 in virJSONValueFromString () at /lib64/libnss_libvirt_guest.so.2
#4 0x00007fffad8ecb5a in virMacMapNew () at /lib64/libnss_libvirt_guest.so.2
#5 0x00007fffad8cc140 in findLease () at /lib64/libnss_libvirt_guest.so.2
#6 0x00007fffad8ccb1c in _nss_libvirt_guest_gethostbyname4_r () at /lib64/libnss_libvirt_guest.so.2
#7 0x00007fffeb2599d2 in gaih_inet.constprop () at /lib64/libc.so.6
#8 0x00007fffeb25aab4 in getaddrinfo () at /lib64/libc.so.6
#9 0x00007ffff1d41a04 in do_lookup_by_name () at /lib64/libgio-2.0.so.0
#10 0x00007ffff1d3e937 in g_task_thread_pool_thread () at /lib64/libgio-2.0.so.0
#11 0x00007ffff5c39933 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0
#12 0x00007ffff5c38f2a in g_thread_proxy () at /lib64/libglib-2.0.so.0
#13 0x00007ffff6314594 in start_thread () at /lib64/libpthread.so.0
#14 0x00007fffeb2700df in clone () at /lib64/libc.so.6
I've talked about it with a few colleagues and we believe the issue to be caused by jansson and json-glib both exporting a symbol called json_object_iter_next: Evolution itself (indirectly?) links against the latter library, so when the libvirt_guest NSS module is loaded and attempts to process JSON using the former, it picks up the wrong implementation, leading to a crash. gnome-boxes also crashes with the same stack trace.
It seems like a similar issue could affect any application linking both to libvirt and json-glib, regardless of whether or not the NSS plugin has been enabled, which is of course pretty bad.
Unfortunately, I don't have any bright ideas on how to solve this, so anyone who might: please step forward! We're just a few days away from the next release, and if we can't figure out a way around this soon I'm afraid the only reasonable course of action would be to (temporarily) revert the switch from yajl to jansson.
--
Andrea Bolognani / Red Hat / Virtualization
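The collision described above can be observed from inside a process. What follows is a minimal C sketch and not code from libvirt: the helper name symbol_in_global_scope is made up for illustration. It only asks the dynamic loader whether a given name is already defined by the program or by a library loaded into the global scope. On glibc older than 2.34 it has to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libvirt, jansson or json-glib).
 * Returns 1 if `name` is already defined in the global symbol scope of
 * the running process, 0 otherwise.
 */
static int symbol_in_global_scope(const char *name)
{
    /* dlopen(NULL, ...) returns a handle for the main program; lookups
     * through it also search the libraries loaded at startup and those
     * opened with RTLD_GLOBAL. */
    void *self = dlopen(NULL, RTLD_LAZY);
    int found;

    if (self == NULL)
        return 0;

    found = dlsym(self, name) != NULL;
    dlclose(self);
    return found;
}
```

In a process that already has json-glib loaded, symbol_in_global_scope("json_object_iter_next") would be expected to return 1 before the NSS module ever calls into jansson, which is consistent with frame #0 of the stack trace landing in libjson-glib.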

On Mon, Jul 30, 2018 at 05:20:01PM +0200, Andrea Bolognani wrote:
On Sat, 2018-07-28 at 21:56 +0800, Daniel Veillard wrote:
As suggested yesterday, I have just tagged the release candidate 1 in git, and pushed signed tarball and rpms to the usual place:
ftp://libvirt.org/libvirt/
seems to work fine with my limited testing, and https://ci.centos.org/view/libvirt/ is all green (except for virt-viewer-master-rpm ?)
This was caused by virt-viewer recently bumping their minimum spice-gtk version to 0.35, which is not available on CentOS or Fedora older than 28. It's since been addressed, and all dots are back to green now :)
Ok, cool thanks for the update !
so things look pretty good for me but please try it out on different systems and OSes.
If everything goes well I will push rc2 on Tuesday targeting Thursday for the final release (or Friday if I get stuck in travels).
thanks in advance for trying it out !
Unfortunately I've spotted an issue during my testing of rc1 today: with the libvirt_guest NSS module enabled, Evolution would crash a few seconds after being started. Here's the stack trace:
#0 0x00007fffe7b69ba5 in json_object_iter_next () at /lib64/libjson-glib-1.0.so.0 #1 0x00007fffad8e757b in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2 #2 0x00007fffad8e75d8 in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2 #3 0x00007fffad8e8994 in virJSONValueFromString () at /lib64/libnss_libvirt_guest.so.2 #4 0x00007fffad8ecb5a in virMacMapNew () at /lib64/libnss_libvirt_guest.so.2 #5 0x00007fffad8cc140 in findLease () at /lib64/libnss_libvirt_guest.so.2 #6 0x00007fffad8ccb1c in _nss_libvirt_guest_gethostbyname4_r () at /lib64/libnss_libvirt_guest.so.2 #7 0x00007fffeb2599d2 in gaih_inet.constprop () at /lib64/libc.so.6 #8 0x00007fffeb25aab4 in getaddrinfo () at /lib64/libc.so.6 #9 0x00007ffff1d41a04 in do_lookup_by_name () at /lib64/libgio-2.0.so.0 #10 0x00007ffff1d3e937 in g_task_thread_pool_thread () at /lib64/libgio-2.0.so.0 #11 0x00007ffff5c39933 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0 #12 0x00007ffff5c38f2a in g_thread_proxy () at /lib64/libglib-2.0.so.0 #13 0x00007ffff6314594 in start_thread () at /lib64/libpthread.so.0 #14 0x00007fffeb2700df in clone () at /lib64/libc.so.6
I've talked about it with a few colleagues and we believe the issue to be caused by jansson and json-glib both exporting a symbol called json_object_iter_next: Evolution itself (indirectly?) links against the latter library, so when the libvirt_guest NSS module is loaded and attempts to process JSON using the former, it picks up the wrong implementation, leading to a crash. gnome-boxes also crashes with the same stack trace.
It seems like a similar issue could affect any application linking both to libvirt and json-glib, regardless of whether or not the NSS plugin has been enabled, which is of course pretty bad.
Unfortunately, I don't have any bright ideas on how to solve this, so anyone who might: please step forward! We're just a few days away from the next release, and if we can't figure out a way around this soon I'm afraid the only reasonable course of action would be to (temporarily) revert the switch from yajl to jansson.
OK, so far I'm not seeing any suggestion on this issue. Is there an entry in bugzilla about it ?
I will push RC2 today, but we can wait and push the final release on Friday or this weekend if no solution is found and we need to revert back to the old lib.
thanks !
Daniel
--
Daniel Veillard | Red Hat Developers Tools http://developer.redhat.com/
veillard@redhat.com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | virtualization library http://libvirt.org/

On 07/30/2018 05:20 PM, Andrea Bolognani wrote:
On Sat, 2018-07-28 at 21:56 +0800, Daniel Veillard wrote:
Unfortunately I've spotted an issue during my testing of rc1 today: with the libvirt_guest NSS module enabled, Evolution would crash a few seconds after being started. Here's the stack trace:
#0 0x00007fffe7b69ba5 in json_object_iter_next () at /lib64/libjson-glib-1.0.so.0 #1 0x00007fffad8e757b in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2 #2 0x00007fffad8e75d8 in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2 #3 0x00007fffad8e8994 in virJSONValueFromString () at /lib64/libnss_libvirt_guest.so.2 #4 0x00007fffad8ecb5a in virMacMapNew () at /lib64/libnss_libvirt_guest.so.2 #5 0x00007fffad8cc140 in findLease () at /lib64/libnss_libvirt_guest.so.2 #6 0x00007fffad8ccb1c in _nss_libvirt_guest_gethostbyname4_r () at /lib64/libnss_libvirt_guest.so.2 #7 0x00007fffeb2599d2 in gaih_inet.constprop () at /lib64/libc.so.6 #8 0x00007fffeb25aab4 in getaddrinfo () at /lib64/libc.so.6 #9 0x00007ffff1d41a04 in do_lookup_by_name () at /lib64/libgio-2.0.so.0 #10 0x00007ffff1d3e937 in g_task_thread_pool_thread () at /lib64/libgio-2.0.so.0 #11 0x00007ffff5c39933 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0 #12 0x00007ffff5c38f2a in g_thread_proxy () at /lib64/libglib-2.0.so.0 #13 0x00007ffff6314594 in start_thread () at /lib64/libpthread.so.0 #14 0x00007fffeb2700df in clone () at /lib64/libc.so.6
I've talked about it with a few colleagues and we believe the issue to be caused by jansson and json-glib both exporting a symbol called json_object_iter_next: Evolution itself (indirectly?) links against the latter library, so when the libvirt_guest NSS module is loaded and attempts to process JSON using the former, it picks up the wrong implementation, leading to a crash. gnome-boxes also crashes with the same stack trace.
Worse: querying Gentoo Portage, I've found some important packages requiring json-glib:
x11-libs/gtk
gnome-base/gnome-shell
So once users of these apps update to the latest libvirt, they will see the crashes.
It seems like a similar issue could affect any application linking both to libvirt and json-glib, regardless of whether or not the NSS plugin has been enabled, which is of course pretty bad.
Yes, any application can crash.
Unfortunately, I don't have any bright ideas on how to solve this, so anyone who might: please step forward! We're just a few days away from the next release, and if we can't figure out a way around this soon I'm afraid the only reasonable course of action would be to (temporarily) revert the switch from yajl to jansson.
Well, what if we linked with jansson statically? I'm not sure if it is possible (and have no idea how to achieve that), but what if the dynamic libraries we produce already contained jansson, so that the linker would not even try to resolve the json_* symbols?
Michal

On Tue, Jul 31, 2018 at 10:26:41AM +0200, Michal Privoznik wrote:
On 07/30/2018 05:20 PM, Andrea Bolognani wrote:
On Sat, 2018-07-28 at 21:56 +0800, Daniel Veillard wrote:
Unfortunately I've spotted an issue during my testing of rc1 today: with the libvirt_guest NSS module enabled, Evolution would crash a few seconds after being started. Here's the stack trace:
#0 0x00007fffe7b69ba5 in json_object_iter_next () at /lib64/libjson-glib-1.0.so.0 #1 0x00007fffad8e757b in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2 #2 0x00007fffad8e75d8 in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2 #3 0x00007fffad8e8994 in virJSONValueFromString () at /lib64/libnss_libvirt_guest.so.2 #4 0x00007fffad8ecb5a in virMacMapNew () at /lib64/libnss_libvirt_guest.so.2 #5 0x00007fffad8cc140 in findLease () at /lib64/libnss_libvirt_guest.so.2 #6 0x00007fffad8ccb1c in _nss_libvirt_guest_gethostbyname4_r () at /lib64/libnss_libvirt_guest.so.2 #7 0x00007fffeb2599d2 in gaih_inet.constprop () at /lib64/libc.so.6 #8 0x00007fffeb25aab4 in getaddrinfo () at /lib64/libc.so.6 #9 0x00007ffff1d41a04 in do_lookup_by_name () at /lib64/libgio-2.0.so.0 #10 0x00007ffff1d3e937 in g_task_thread_pool_thread () at /lib64/libgio-2.0.so.0 #11 0x00007ffff5c39933 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0 #12 0x00007ffff5c38f2a in g_thread_proxy () at /lib64/libglib-2.0.so.0 #13 0x00007ffff6314594 in start_thread () at /lib64/libpthread.so.0 #14 0x00007fffeb2700df in clone () at /lib64/libc.so.6
I've talked about it with a few colleagues and we believe the issue to be caused by jansson and json-glib both exporting a symbol called json_object_iter_next: Evolution itself (indirectly?) links against
For the libraries installed on my Fedora 28, objdump shows json_object_iter_next as the only conflicting symbol, maybe we can get away with using a different iterator as a quick fix.
the latter library, so when the libvirt_guest NSS module is loaded and attempts to process JSON using the former, it picks up the wrong implementation, leading to a crash. gnome-boxes also crashes with the same stack trace.
Worse: querying Gentoo Portage, I've found some important packages requiring json-glib:
x11-libs/gtk gnome-base/gnome-shell
So once users of these apps update to the latest libvirt, they will see the crashes.
It seems like a similar issue could affect any application linking both to libvirt and json-glib, regardless of whether or not the NSS plugin has been enabled, which is of course pretty bad.
Yes, any application can crash.
Unfortunately, I don't have any bright ideas on how to solve this, so anyone who might: please step forward! We're just a few days away from the next release, and if we can't figure out a way around this soon I'm afraid the only reasonable course of action would be to (temporarily) revert the switch from yajl to jansson.
Well, what if we linked with jansson statically? I'm not sure if it is possible (and have no idea how to achieve that), but what if our dynamic libraries we produce already contained jansson and thus linker would not even try to resolve json_* symbols.
For the client library, we can just compile out JSON - it should not be needed for anything. And we can generate the data for libvirt_nss in some simpler format.
Jano
Michal
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On Tue, Jul 31, 2018 at 12:18:16PM +0200, Ján Tomko wrote:
On Tue, Jul 31, 2018 at 10:26:41AM +0200, Michal Privoznik wrote:
On 07/30/2018 05:20 PM, Andrea Bolognani wrote:
On Sat, 2018-07-28 at 21:56 +0800, Daniel Veillard wrote:
Unfortunately I've spotted an issue during my testing of rc1 today: with the libvirt_guest NSS module enabled, Evolution would crash a few seconds after being started. Here's the stack trace:
#0 0x00007fffe7b69ba5 in json_object_iter_next () at /lib64/libjson-glib-1.0.so.0 #1 0x00007fffad8e757b in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2 #2 0x00007fffad8e75d8 in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2 #3 0x00007fffad8e8994 in virJSONValueFromString () at /lib64/libnss_libvirt_guest.so.2 #4 0x00007fffad8ecb5a in virMacMapNew () at /lib64/libnss_libvirt_guest.so.2 #5 0x00007fffad8cc140 in findLease () at /lib64/libnss_libvirt_guest.so.2 #6 0x00007fffad8ccb1c in _nss_libvirt_guest_gethostbyname4_r () at /lib64/libnss_libvirt_guest.so.2 #7 0x00007fffeb2599d2 in gaih_inet.constprop () at /lib64/libc.so.6 #8 0x00007fffeb25aab4 in getaddrinfo () at /lib64/libc.so.6 #9 0x00007ffff1d41a04 in do_lookup_by_name () at /lib64/libgio-2.0.so.0 #10 0x00007ffff1d3e937 in g_task_thread_pool_thread () at /lib64/libgio-2.0.so.0 #11 0x00007ffff5c39933 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0 #12 0x00007ffff5c38f2a in g_thread_proxy () at /lib64/libglib-2.0.so.0 #13 0x00007ffff6314594 in start_thread () at /lib64/libpthread.so.0 #14 0x00007fffeb2700df in clone () at /lib64/libc.so.6
I've talked about it with a few colleagues and we believe the issue to be caused by jansson and json-glib both exporting a symbol called json_object_iter_next: Evolution itself (indirectly?) links against
For the libraries installed on my Fedora 28, objdump shows json_object_iter_next as the only conflicting symbol, maybe we can get away with using a different iterator as a quick fix.
the latter library, so when the libvirt_guest NSS module is loaded and attempts to process JSON using the former, it picks up the wrong implementation, leading to a crash. gnome-boxes also crashes with the same stack trace.
Worse: querying Gentoo Portage, I've found some important packages requiring json-glib:
x11-libs/gtk gnome-base/gnome-shell
So once users of these apps update to the latest libvirt, they will see the crashes.
It seems like a similar issue could affect any application linking both to libvirt and json-glib, regardless of whether or not the NSS plugin has been enabled, which is of course pretty bad.
Yes, any application can crash.
Unfortunately, I don't have any bright ideas on how to solve this, so anyone who might: please step forward! We're just a few days away from the next release, and if we can't figure out a way around this soon I'm afraid the only reasonable course of action would be to (temporarily) revert the switch from yajl to jansson.
Well, what if we linked with jansson statically? I'm not sure if it is possible (and have no idea how to achieve that), but what if our dynamic libraries we produce already contained jansson and thus linker would not even try to resolve json_* symbols.
For the client library, we can just compile out JSON - it should not be needed for anything. And we can generate the data for libvirt_nss in some simpler format.
There's no such concept of 'client library'; our libvirt.so library is used on both the client and server sides, providing the shared code to both.
Changing the NSS format is doable, but not before release.
In fact, bearing this problem in mind, I tend to think we should perhaps make sure that the NSS library doesn't link to anything except glibc. It can be loaded into any process running on the host, so the less we load from it the safer we'll be.
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 07/31/2018 12:25 PM, Daniel P. Berrangé wrote:
On Tue, Jul 31, 2018 at 12:18:16PM +0200, Ján Tomko wrote:
On Tue, Jul 31, 2018 at 10:26:41AM +0200, Michal Privoznik wrote:
On 07/30/2018 05:20 PM, Andrea Bolognani wrote:
On Sat, 2018-07-28 at 21:56 +0800, Daniel Veillard wrote:
Unfortunately I've spotted an issue during my testing of rc1 today: with the libvirt_guest NSS module enabled, Evolution would crash a few seconds after being started. Here's the stack trace:
#0 0x00007fffe7b69ba5 in json_object_iter_next () at /lib64/libjson-glib-1.0.so.0 #1 0x00007fffad8e757b in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2 #2 0x00007fffad8e75d8 in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2 #3 0x00007fffad8e8994 in virJSONValueFromString () at /lib64/libnss_libvirt_guest.so.2 #4 0x00007fffad8ecb5a in virMacMapNew () at /lib64/libnss_libvirt_guest.so.2 #5 0x00007fffad8cc140 in findLease () at /lib64/libnss_libvirt_guest.so.2 #6 0x00007fffad8ccb1c in _nss_libvirt_guest_gethostbyname4_r () at /lib64/libnss_libvirt_guest.so.2 #7 0x00007fffeb2599d2 in gaih_inet.constprop () at /lib64/libc.so.6 #8 0x00007fffeb25aab4 in getaddrinfo () at /lib64/libc.so.6 #9 0x00007ffff1d41a04 in do_lookup_by_name () at /lib64/libgio-2.0.so.0 #10 0x00007ffff1d3e937 in g_task_thread_pool_thread () at /lib64/libgio-2.0.so.0 #11 0x00007ffff5c39933 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0 #12 0x00007ffff5c38f2a in g_thread_proxy () at /lib64/libglib-2.0.so.0 #13 0x00007ffff6314594 in start_thread () at /lib64/libpthread.so.0 #14 0x00007fffeb2700df in clone () at /lib64/libc.so.6
I've talked about it with a few colleagues and we believe the issue to be caused by jansson and json-glib both exporting a symbol called json_object_iter_next: Evolution itself (indirectly?) links against
For the libraries installed on my Fedora 28, objdump shows json_object_iter_next as the only conflicting symbol, maybe we can get away with using a different iterator as a quick fix.
the latter library, so when the libvirt_guest NSS module is loaded and attempts to process JSON using the former, it picks up the wrong implementation, leading to a crash. gnome-boxes also crashes with the same stack trace.
Worse: querying Gentoo Portage, I've found some important packages requiring json-glib:
x11-libs/gtk gnome-base/gnome-shell
So once users of these apps update to the latest libvirt, they will see the crashes.
It seems like a similar issue could affect any application linking both to libvirt and json-glib, regardless of whether or not the NSS plugin has been enabled, which is of course pretty bad.
Yes, any application can crash.
Unfortunately, I don't have any bright ideas on how to solve this, so anyone who might: please step forward! We're just a few days away from the next release, and if we can't figure out a way around this soon I'm afraid the only reasonable course of action would be to (temporarily) revert the switch from yajl to jansson.
Well, what if we linked with jansson statically? I'm not sure if it is possible (and have no idea how to achieve that), but what if our dynamic libraries we produce already contained jansson and thus linker would not even try to resolve json_* symbols.
For the client library, we can just compile out JSON - it should not be needed for anything. And we can generate the data for libvirt_nss in some simpler format.
There's no such concept of 'client library', our libvirt.so library is used in both client and server sides, providing the shared code to both.
Changing the NSS format is doable, but not before release.
Actually it isn't. A small history window - JSON was used when Nehal was implementing support for reporting domain IP addresses. He created a small binary (leasehelper) that dnsmasq runs every time a lease is given to a domain. The leasehelper then stores assigned IP address into a file so that virDomainInterfaceAddresses can report it. Back then JSON was chosen because we have a set of good APIs to work with the format from C. The NSS plugin just made use of the stored data for NSS (9 releases later). Fast forward to today. So if we change NSS plugin (which is not the one to blame) then we have to change the leasehelper too. But more importantly, we would need a script (run at %post possibly) that reworks JSON to the new format so that the data is preserved. It's not only NSS plugin in play here.
In fact bearing this problem in mind, I tend to think we should perhaps make sure that the NSS library doesn't link to anything except glibc. It can be loaded into any process running on the host, so the less we load from it the safer we'll be.
Sure. That's why src/libvirt_nss.la is built - disabling all other features. So far we've been quite successful with this: the NSS module links only with jansson (apart from standard libs like libc).
But we are losing the big picture here. Switching the format that the NSS plugin uses is not going to help, because we have apps (like gnome-boxes) that link both GTK and libvirt. And as I stated earlier, GTK drags in json-glib and libvirt.so drags in jansson. And I guess changing the qemu monitor format from JSON to something else is not going to happen ;-)
Michal

On Tue, Jul 31, 2018 at 10:26:41AM +0200, Michal Privoznik wrote:
On 07/30/2018 05:20 PM, Andrea Bolognani wrote:
On Sat, 2018-07-28 at 21:56 +0800, Daniel Veillard wrote:
Unfortunately I've spotted an issue during my testing of rc1 today: with the libvirt_guest NSS module enabled, Evolution would crash a few seconds after being started. Here's the stack trace:
#0 0x00007fffe7b69ba5 in json_object_iter_next () at /lib64/libjson-glib-1.0.so.0 #1 0x00007fffad8e757b in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2 #2 0x00007fffad8e75d8 in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2 #3 0x00007fffad8e8994 in virJSONValueFromString () at /lib64/libnss_libvirt_guest.so.2 #4 0x00007fffad8ecb5a in virMacMapNew () at /lib64/libnss_libvirt_guest.so.2 #5 0x00007fffad8cc140 in findLease () at /lib64/libnss_libvirt_guest.so.2 #6 0x00007fffad8ccb1c in _nss_libvirt_guest_gethostbyname4_r () at /lib64/libnss_libvirt_guest.so.2 #7 0x00007fffeb2599d2 in gaih_inet.constprop () at /lib64/libc.so.6 #8 0x00007fffeb25aab4 in getaddrinfo () at /lib64/libc.so.6 #9 0x00007ffff1d41a04 in do_lookup_by_name () at /lib64/libgio-2.0.so.0 #10 0x00007ffff1d3e937 in g_task_thread_pool_thread () at /lib64/libgio-2.0.so.0 #11 0x00007ffff5c39933 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0 #12 0x00007ffff5c38f2a in g_thread_proxy () at /lib64/libglib-2.0.so.0 #13 0x00007ffff6314594 in start_thread () at /lib64/libpthread.so.0 #14 0x00007fffeb2700df in clone () at /lib64/libc.so.6
I've talked about it with a few colleagues and we believe the issue to be caused by jansson and json-glib both exporting a symbol called json_object_iter_next: Evolution itself (indirectly?) links against the latter library, so when the libvirt_guest NSS module is loaded and attempts to process JSON using the former, it picks up the wrong implementation, leading to a crash. gnome-boxes also crashes with the same stack trace.
Worse: querying Gentoo Portage, I've found some important packages requiring json-glib:
x11-libs/gtk gnome-base/gnome-shell
So once users of these apps update to the latest libvirt, they will see the crashes.
It seems like a similar issue could affect any application linking both to libvirt and json-glib, regardless of whether or not the NSS plugin has been enabled, which is of course pretty bad.
Yes, any application can crash.
Unfortunately, I don't have any bright ideas on how to solve this, so anyone who might: please step forward! We're just a few days away from the next release, and if we can't figure out a way around this soon I'm afraid the only reasonable course of action would be to (temporarily) revert the switch from yajl to jansson.
Well, what if we linked with jansson statically? I'm not sure if it is possible (and have no idea how to achieve that), but what if our dynamic libraries we produce already contained jansson and thus linker would not even try to resolve json_* symbols.
It could "help" (quotes for all the disadvantages that approach has). Not because it would not try to resolve it, but because we would have the `json_` symbols as 'local' thanks to our src/libvirt.syms. If the lib was added to our dynamic lib we would still need to use `-Bsymbolic-functions` so that our `json_` symbols don't call `json_` symbols from the dynamic one in programs where it is loaded. However that has some issues with `LD_PRELOAD`.
Maybe we could utilize the `-Bgroup` linker option, although I'm not sure how that is supposed to be used.
In any case, this could be fixed in the respective libraries. The reasoning behind it is that since C doesn't support namespaces we namespace functions by a prefix (`vir` in libvirt); however, that "namespace" needs to be unique. They should switch to `jansson_` or `glib_json_` prefixes and maybe provide macros for the previous names:
#define json_auto_t jansson_auto_t
...
I know it sounds like too big of a deal, but that's what happens in the C world. The same would happen if libvirt used `json-glib` and some application linking with libvirt started using jansson (and also used some specific functions). Not that we were guarded against that now. I'm not saying the release can go on, of course not, just that the ultimate fix is not something *we* should do.
Querying the Fedora repositories I haven't found any similar situation. Projects that use jansson are separated from those that use json-c and those that use json-glib. If they want to use each other though, we'll be in the same mess as we are now.
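The prefix-plus-compatibility-macro idea can be sketched in a single C file. Everything here is hypothetical: the `foo_` prefix, the toy iterator and the legacy `json_iter_*` names are stand-ins and are not the jansson or json-glib API. The point is only that the exported symbol carries a unique prefix while old source code keeps compiling through macros.

```c
#include <stddef.h>

/* Toy iterator over an int array, standing in for a JSON object iterator. */
typedef struct foo_json_iter {
    const int *values;
    size_t len;
    size_t pos;
} foo_json_iter;

/*
 * New, uniquely prefixed entry point. It is `static` only so that this
 * sketch stays a single file; a real library would export this name, and
 * the prefixed name is what would appear in its dynamic symbol table.
 * Returns 1 and stores the next value in *out, or 0 at the end.
 */
static int foo_json_iter_next(foo_json_iter *it, int *out)
{
    if (it->pos >= it->len)
        return 0;
    *out = it->values[it->pos++];
    return 1;
}

/* Source-compatibility macros: the old generic names now exist only for
 * the preprocessor and never reach the linker. */
#define json_iter_t    foo_json_iter
#define json_iter_next foo_json_iter_next

/* Legacy caller, written against the old names, compiles unchanged. */
static int sum_all(const int *values, size_t len)
{
    json_iter_t it = { values, len, 0 };
    int v;
    int total = 0;

    while (json_iter_next(&it, &v))
        total += v;
    return total;
}
```

Because the old names are resolved by the preprocessor, two libraries following this scheme could be loaded into one process without their symbols clashing, although including both headers in the same source file would still conflict at the macro level.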
Michal

On Tue, Jul 31, 2018 at 02:04:40PM +0200, Martin Kletzander wrote:
On Tue, Jul 31, 2018 at 10:26:41AM +0200, Michal Privoznik wrote:
On 07/30/2018 05:20 PM, Andrea Bolognani wrote:
On Sat, 2018-07-28 at 21:56 +0800, Daniel Veillard wrote:
Unfortunately I've spotted an issue during my testing of rc1 today: with the libvirt_guest NSS module enabled, Evolution would crash a few seconds after being started. Here's the stack trace:
#0 0x00007fffe7b69ba5 in json_object_iter_next () at /lib64/libjson-glib-1.0.so.0 #1 0x00007fffad8e757b in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2 #2 0x00007fffad8e75d8 in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2 #3 0x00007fffad8e8994 in virJSONValueFromString () at /lib64/libnss_libvirt_guest.so.2 #4 0x00007fffad8ecb5a in virMacMapNew () at /lib64/libnss_libvirt_guest.so.2 #5 0x00007fffad8cc140 in findLease () at /lib64/libnss_libvirt_guest.so.2 #6 0x00007fffad8ccb1c in _nss_libvirt_guest_gethostbyname4_r () at /lib64/libnss_libvirt_guest.so.2 #7 0x00007fffeb2599d2 in gaih_inet.constprop () at /lib64/libc.so.6 #8 0x00007fffeb25aab4 in getaddrinfo () at /lib64/libc.so.6 #9 0x00007ffff1d41a04 in do_lookup_by_name () at /lib64/libgio-2.0.so.0 #10 0x00007ffff1d3e937 in g_task_thread_pool_thread () at /lib64/libgio-2.0.so.0 #11 0x00007ffff5c39933 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0 #12 0x00007ffff5c38f2a in g_thread_proxy () at /lib64/libglib-2.0.so.0 #13 0x00007ffff6314594 in start_thread () at /lib64/libpthread.so.0 #14 0x00007fffeb2700df in clone () at /lib64/libc.so.6
I've talked about it with a few colleagues and we believe the issue to be caused by jansson and json-glib both exporting a symbol called json_object_iter_next: Evolution itself (indirectly?) links against the latter library, so when the libvirt_guest NSS module is loaded and attempts to process JSON using the former, it picks up the wrong implementation, leading to a crash. gnome-boxes also crashes with the same stack trace.
Worse: querying Gentoo Portage, I've found some important packages requiring json-glib:
x11-libs/gtk gnome-base/gnome-shell
So once users of these apps update to the latest libvirt, they will see the crashes.
It seems like a similar issue could affect any application linking both to libvirt and json-glib, regardless of whether or not the NSS plugin has been enabled, which is of course pretty bad.
Yes, any application can crash.
Unfortunately, I don't have any bright ideas on how to solve this, so anyone who might: please step forward! We're just a few days away from the next release, and if we can't figure out a way around this soon I'm afraid the only reasonable course of action would be to (temporarily) revert the switch from yajl to jansson.
Well, what if we linked with jansson statically? I'm not sure if it is possible (and have no idea how to achieve that), but what if our dynamic libraries we produce already contained jansson and thus linker would not even try to resolve json_* symbols.
It could "help" (quotes for all the disadvantages that approach has). Not because it would not try to resolve it, but because we would have the `json_` symbols as 'local' thanks to our src/libvirt.syms. If the lib was added to our dynamic lib we would still need to use `-Bsymbolic-functions` so that our `json_` symbols don't call `json_` symbols from the dynamic one in programs where it is loaded. However that has some issues with `LD_PRELOAD`.
Note there is no jansson static build in Fedora or RHEL, so it is somewhat academic right now.
Maybe we could utilize the `-Bgroup` linker option, although I'm not sure how that is supposed to be used.
In any case, this could be fixed in the respective libraries. The reasoning behind it is that since C doesn't support namespaces we namespace functions by a prefix (`vir` in libvirt), however that "namespace" needs to be unique. They should switch to `jansson_` or `glib_json_` prefixes and maybe provide macros for the previous names:
#define json_auto_t jansson_auto_t ...
I know it sounds like too big of a deal, but that's what happens in C world. The same would happen if libvirt used `json-glib` and some application linking with libvirt would start using jansson (and also use some specific functions). Not that we were guarded against that now. I'm not saying the release can go on, of course not, just that the ultimate fix is not something *we* should do.
Changing their API names isn't required unless you need to have separation of namespaces at the #include level, so you can pull in both. For ELF libraries, it would be sufficient to have symbol versioning to get separation.
Querying the fedora repositories I haven't found any similar situation. Projects that use jansson are separated from those that use json-c and those that use json-glib. If they want to use each other though, we'll be in the same mess as we are now.
GNOME control center + NetworkManager hit this problem in Fedora. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 07/31/2018 02:04 PM, Martin Kletzander wrote:
The same would happen if libvirt used `json-glib` and some application linking with libvirt would start using jansson (and also use some specific functions). Not that we were guarded against that now. I'm not saying the release can go on, of course not, just that the ultimate fix is not something *we* should do.
Definitely. But we are the ones under pressure here. So we need some workaround until the time that two libraries come to a conclusion. I've reported the issue: https://groups.google.com/forum/#!topic/jansson-users/7Efx-RI45IU https://gitlab.gnome.org/GNOME/json-glib/issues/33 Michal

On Mon, Jul 30, 2018 at 05:20:01PM +0200, Andrea Bolognani wrote:
On Sat, 2018-07-28 at 21:56 +0800, Daniel Veillard wrote:
so things looks pretty good for me but please try it out on different systems and OSes.
If everything goes well I will push rc2 on Tuesday targetting Thursday for the final release (or Friday if I get stuck in travels).
thanks in advance for trying it out !
Unfortunately I've spotted an issue during my testing of rc1 today: with the libvirt_guest NSS module enabled, Evolution would crash a few seconds after being started. Here's the stack trace:
#0 0x00007fffe7b69ba5 in json_object_iter_next () at /lib64/libjson-glib-1.0.so.0
#1 0x00007fffad8e757b in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2
#2 0x00007fffad8e75d8 in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2
#3 0x00007fffad8e8994 in virJSONValueFromString () at /lib64/libnss_libvirt_guest.so.2
#4 0x00007fffad8ecb5a in virMacMapNew () at /lib64/libnss_libvirt_guest.so.2
#5 0x00007fffad8cc140 in findLease () at /lib64/libnss_libvirt_guest.so.2
#6 0x00007fffad8ccb1c in _nss_libvirt_guest_gethostbyname4_r () at /lib64/libnss_libvirt_guest.so.2
#7 0x00007fffeb2599d2 in gaih_inet.constprop () at /lib64/libc.so.6
#8 0x00007fffeb25aab4 in getaddrinfo () at /lib64/libc.so.6
#9 0x00007ffff1d41a04 in do_lookup_by_name () at /lib64/libgio-2.0.so.0
#10 0x00007ffff1d3e937 in g_task_thread_pool_thread () at /lib64/libgio-2.0.so.0
#11 0x00007ffff5c39933 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0
#12 0x00007ffff5c38f2a in g_thread_proxy () at /lib64/libglib-2.0.so.0
#13 0x00007ffff6314594 in start_thread () at /lib64/libpthread.so.0
#14 0x00007fffeb2700df in clone () at /lib64/libc.so.6
I've talked about it with a few colleagues and we believe the issue to be caused by jansson and json-glib both exporting a symbol called json_object_iter_next: Evolution itself (indirectly?) links against the latter library, so when the libvirt_guest NSS module is loaded and attempts to process JSON using the former, it picks up the wrong implementation, leading to a crash. gnome-boxes also crashes with the same stack trace.
Frustratingly both jansson and json-glib use the same 'json_' prefix for all their functions, which was a very unwise choice of namespaces. Despite this, by some miracle 'json_object_iter_next' is the only one that clashes. This is compounded by neither library making use of symbol versioning, which would have ensured they resolved to the correct libraries. We should talk to them about adding versioning to avoid this problem. I wondered if it is possible for libvirt to change its impl to avoid calling json_object_iter_next, but it doesn't look practical, as 'json_object_foreach' just uses the iter behind the scenes.
It seems like a similar issue could affect any application linking both to libvirt and json-glib, regardless of whether or not the NSS plugin has been enabled, which is of course pretty bad.
Sure, it affects applications for any library libvirt links to in fact. Most libraries use the library name as the base prefix for their methods to get good namespacing. Unfortunately this time both libraries used the generic 'json_' prefix. I'm not so worried about this though, as the code paths taken in the libvirt client shouldn't tickle the json parser. Of course if jansson gets resolved before json-glib, we could still break the apps own usage of json. The NSS module is the big worry since that's loadable by any process.
Unfortunately, I don't have any bright ideas on how to solve this, so anyone who might: please step forward! We're just a few days away from the next release, and if we can't figure out a way around this soon I'm afraid the only reasonable course of action would be to (temporarily) revert the switch from yajl to jansson.
Or just document it as a known issue in the NSS module for now and resolve in next release. Regards, Daniel

On Mon, Jul 30, 2018 at 05:20:01PM +0200, Andrea Bolognani wrote:
It seems like a similar issue could affect any application linking both to libvirt and json-glib, regardless of whether or not the NSS plugin has been enabled, which is of course pretty bad.
Unfortunately, I don't have any bright ideas on how to solve this, so anyone who might: please step forward! We're just a few days away from the next release, and if we can't figure out a way around this soon I'm afraid the only reasonable course of action would be to (temporarily) revert the switch from yajl to jansson.
It turns out we're not the first people to hit this problem. NetworkManager uses jansson in its libnm-core.so library, and that caused crashes [1] when it was loaded into GNOME control center, which uses json-glib. They came up with a clever but gross solution [2]. First stop linking to jansson at build time. Then have code that calls dlopen(jansson.so), passing RTLD_LAZY | RTLD_LOCAL, which avoids jansson symbols polluting the entire application. Now use dlsym() to resolve each jansson symbol they need to use and store them in function pointer variables. Their code can now call jansson indirectly via these saved pointers. This sounds like a doable approach for this release at least, while we consider whether there's a better option long term. Regards, Daniel
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1535905
[2] https://github.com/NetworkManager/NetworkManager/blob/master/libnm-core/nm-j...
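The dlopen()/dlsym() indirection described above can be sketched roughly as follows. This is only a minimal illustration, not the NetworkManager code and not an actual libvirt patch: the names `json_shim`, `json_shim_load()` and `json_loads_fn` are hypothetical, only a single jansson entry point (`json_loads`) is resolved, and jansson's `json_t` and `json_error_t` are treated as opaque pointers so that no jansson header is needed.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical typedef mirroring jansson's json_loads() prototype, with
 * json_t* and json_error_t* replaced by void* to stay header-free. */
typedef void *(*json_loads_fn)(const char *input, size_t flags, void *error);

/* Hypothetical container for the library handle and the resolved pointers. */
struct json_shim {
    void *handle;
    json_loads_fn loads;
};

/* Load the library privately and resolve the symbols we need.
 * Returns 0 on success, -1 if the library or a symbol is missing. */
int json_shim_load(struct json_shim *shim, const char *soname)
{
    assert(shim != NULL && soname != NULL);

    shim->loads = NULL;
    /* RTLD_LOCAL keeps the library's symbols out of the global namespace,
     * so they cannot clash with same-named symbols of another library. */
    shim->handle = dlopen(soname, RTLD_LAZY | RTLD_LOCAL);
    if (!shim->handle)
        return -1;

    /* Cast idiom documented by POSIX for storing a dlsym() result
     * in a function pointer. */
    *(void **)(&shim->loads) = dlsym(shim->handle, "json_loads");
    if (!shim->loads) {
        dlclose(shim->handle);
        shim->handle = NULL;
        return -1;
    }
    return 0;
}
```

A caller would run json_shim_load(&shim, "libjansson.so.4") once (the soname is an assumption about the installed jansson) and then call shim.loads(...) wherever it previously called json_loads(...) directly. On older glibc versions the program may need to be linked with -ldl.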

It is out, tagged in git and with signed tarball and rpms at the usual place: ftp://libvirt.org/libvirt/ In my limited testing it works, but we have that pending issue raised by Andrea. Worst case, someone reverts the patches and allows building against the old lib. Please give it some testing, I will watch the commits and if there isn't any solution 2 days from now I will postpone the GA by a couple of days. Hopefully that won't be needed :-) thanks in advance, Daniel -- Daniel Veillard | Red Hat Developers Tools http://developer.redhat.com/ veillard@redhat.com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | virtualization library http://libvirt.org/

Daniel Veillard <veillard@redhat.com> [2018-07-31, 12:19PM +0800]:
It is out, tagged in git and with signed tarball and rpms at the usual place:
ftp://libvirt.org/libvirt/
in my limited testing it works but we have that pending issue raised by Andrea. Worst case, someone reverts the patches and allows building against the old lib.
Please give it some testing, I will watch the commits and if there isn't any solution 2 days from now I will postpone the GA by a couple of days. Hopefully that won't be needed :-)
thanks in advance,
Daniel
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
We have a test failure for the viriscsitest on s390x. I have not found out the reason but bisect showed this commit: commit e0b46ad62337ae770e22272ec8ee8ad8cebefde7 (HEAD, refs/bisect/good-e0b46ad62337ae770e22272ec8ee8ad8cebefde7) Author: Michal Privoznik <mprivozn@redhat.com> Date: Mon Jul 30 11:04:26 2018 +0200 Revert "util: cgroup: define cleanup function using VIR_DEFINE_AUTOPTR_FUNC" This reverts commit 4da4a9fe0c0956feefe3d592b4ba2b92b2a9a2f9. Turns out, our code relies on virCgroupFree(&var) setting var = NULL. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com> -- IBM Systems Linux on Z & Virtualization Development ------------------------------------------------------------------------ IBM Deutschland Research & Development GmbH Schönaicher Str. 220, 71032 Böblingen Phone: +49 7031 16 1819 ------------------------------------------------------------------------ Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

Bjoern Walk <bwalk@linux.ibm.com> [2018-07-31, 02:21PM +0200]:
Daniel Veillard <veillard@redhat.com> [2018-07-31, 12:19PM +0800]:
It is out, tagged in git and with signed tarball and rpms at the usual place:
ftp://libvirt.org/libvirt/
in my limited testing it works but we have that pending issue raised by Andrea. Worst case, someone reverts the patches and allows building against the old lib.
Please give it some testing, I will watch the commits and if there isn't any solution 2 days from now I will postpone the GA by a couple of days. Hopefully that won't be needed :-)
thanks in advance,
Daniel
We have a test failure for the viriscsitest on s390x. I have not found out the reason but bisect showed this commit:
Woops, my bisect-foo was off in the heat. This is the bad commit:
149f0c4e00f738a0a444325db9e2e81979256d65 is the first bad commit
commit 149f0c4e00f738a0a444325db9e2e81979256d65
Author: Michal Privoznik <mprivozn@redhat.com>
Date: Wed Jul 4 10:41:54 2018 +0200
    viriscsitest: Extend virISCSIConnectionLogin test
    Extend this existing test so that a case when IQN is provided is tested too. Since a special iSCSI interface is created and its name is randomly generated at runtime we need to link with virrandommock to have predictable names.
    Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
    Reviewed-by: John Ferlan <jferlan@redhat.com>
:040000 040000 ad6c77fb880d4b1bbbaac1ab2cd42c1ef4012ffa 6bdebc77e74bd3497422259049a4670d3e9d31e0 M tests
I have not yet had the time to figure out what goes wrong, any ideas are welcome. Bjoern

Bjoern Walk <bwalk@linux.ibm.com> [2018-07-31, 03:16PM +0200]:
I have not yet had the time to figure out what goes wrong, any ideas are welcome.
Ah, classic. The mocked virRandomBytes function is not endian-agnostic, generating a different interface name than the one expected by the test. Bjoern

On 07/31/2018 03:54 PM, Bjoern Walk wrote:
Bjoern Walk <bwalk@linux.ibm.com> [2018-07-31, 03:16PM +0200]:
I have not yet had the time to figure out what goes wrong, any ideas are welcome.
Ah, classic. The mocked virRandomBytes function is not endian-agnostic, generating a different interface name as expected by the test.
Ugrh. Are you working on a fix or should I give it a try? Michal

Michal Privoznik <mprivozn@redhat.com> [2018-07-31, 04:10PM +0200]:
On 07/31/2018 03:54 PM, Bjoern Walk wrote:
Bjoern Walk <bwalk@linux.ibm.com> [2018-07-31, 03:16PM +0200]:
I have not yet had the time to figure out what goes wrong, any ideas are welcome.
Ah, classic. The mocked virRandomBytes function is not endian-agnostic, generating a different interface name as expected by the test.
Ugrh. Are you working on a fix or should I give it a try?
I didn't find a quick solution and can probably only work on this tomorrow, so depending on the time frame for the release, if you find a fix, go for it.
Michal

On Tue, Jul 31, 2018 at 03:54:43PM +0200, Bjoern Walk wrote:
Bjoern Walk <bwalk@linux.ibm.com> [2018-07-31, 03:16PM +0200]:
I have not yet had the time to figure out what goes wrong, any ideas are welcome.
Ah, classic. The mocked virRandomBytes function is not endian-agnostic, generating a different interface name as expected by the test.
I'm not seeing why virRandomBytes is affected by endian-ness. It is simply populating an array of bytes, so there's no endian issues to consider there. Can you elaborate on the actual error messages you are getting from the tests, and what aspect makes you think virRandomBytes is the problem ? Seems more likely that it is something higher up the call stack. Regards, Daniel

Daniel P. Berrangé <berrange@redhat.com> [2018-08-01, 11:51AM +0100]:
On Tue, Jul 31, 2018 at 03:54:43PM +0200, Bjoern Walk wrote:
Bjoern Walk <bwalk@linux.ibm.com> [2018-07-31, 03:16PM +0200]:
I have not yet had the time to figure out what goes wrong, any ideas are welcome.
Ah, classic. The mocked virRandomBytes function is not endian-agnostic, generating a different interface name as expected by the test.
I'm not seeing why virRandomBytes is affected by endian-ness. It is simply populating an array of bytes, so there's no endian issues to consider there.
Ahem, we are writing linearly to a byte array, of course this is dependent on the endianness of the architecture. The actual problem is that the mocked version does _not_ provide a random value, but a deterministic one, which is byte-order reversed on big endian.
Can you elaborate on the actual error messages you are getting from the tests, and what aspect makes you think virRandomBytes is the problem ?
The name of the interface is wrong compared to what is explicitly expected in the test case:
200 if (virAsprintf(&temp_ifacename,
(gdb)
205 VIR_DEBUG("Attempting to create interface '%s' with IQN '%s'",
(gdb) p temp_ifacename
$1 = 0x1014320 "libvirt-iface-04050607"
Seems more likely that it is something higher up the call stack.
Everything else looks fine.
Regards, Daniel
Best, Bjoern

On Wed, Aug 01, 2018 at 01:41:48PM +0200, Bjoern Walk wrote:
Daniel P. Berrangé <berrange@redhat.com> [2018-08-01, 11:51AM +0100]:
On Tue, Jul 31, 2018 at 03:54:43PM +0200, Bjoern Walk wrote:
Bjoern Walk <bwalk@linux.ibm.com> [2018-07-31, 03:16PM +0200]:
I have not yet had the time to figure out what goes wrong, any ideas are welcome.
Ah, classic. The mocked virRandomBytes function is not endian-agnostic, generating a different interface name as expected by the test.
I'm not seeing why virRandomBytes is affected by endian-ness. It is simply populating an array of bytes, so there's no endian issues to consider there.
Ahem, we are writing linearly to a byte array, of course this is dependent on the endianness of the architecture. The actual problem is that the mocked version does _not_ provide a random value, but a deterministic one, which is byte-order reversed on big endian.
There's no concept of reversed byte order when you're accessing an array of bytes. If you write elements 0, 1, 2, ..., 7 and then the caller reads elements 0, 1, 2, ..., 7 everything is fine. Endianness only comes into play if you take that array of bytes and then cast/assign it to a larger type where endianness is relevant (eg int16/int32/int64), eg
    unsigned char bytes[8];
    virRandomBytes(bytes, 8);
    uint64_t val = *(uint64_t *)bytes;
The problem in such a case isn't virRandomBytes, it is the cast to the larger sized integer type.
Can you elaborate on the actual error messages you are getting from the tests, and what aspect makes you think virRandomBytes is the problem ?
The name of the interface is wrong compared to what is explicitly expected in the test case:
200 if (virAsprintf(&temp_ifacename, (gdb) 205 VIR_DEBUG("Attempting to create interface '%s' with IQN '%s'", (gdb) p temp_ifacename $1 = 0x1014320 "libvirt-iface-04050607"
Seems more likely that it is something higher up the call stack.
That looks like it uses virRandomBits rather than virRandomBytes. Regards, Daniel

On Wed, Aug 01, 2018 at 01:10:08PM +0100, Daniel P. Berrangé wrote:
On Wed, Aug 01, 2018 at 01:41:48PM +0200, Bjoern Walk wrote:
Daniel P. Berrangé <berrange@redhat.com> [2018-08-01, 11:51AM +0100]:
On Tue, Jul 31, 2018 at 03:54:43PM +0200, Bjoern Walk wrote:
Bjoern Walk <bwalk@linux.ibm.com> [2018-07-31, 03:16PM +0200]:
I have not yet had the time to figure out what goes wrong, any ideas are welcome.
Ah, classic. The mocked virRandomBytes function is not endian-agnostic, generating a different interface name as expected by the test.
I'm not seeing why virRandomBytes is affected by endian-ness. It is simply populating an array of bytes, so there's no endian issues to consider there.
Ahem, we are writing linearly to a byte array, of course this is dependent on the endianness of the architecture. The actual problem is that the mocked version does _not_ provide a random value, but a deterministic one, which is byte-order reversed on big endian.
There's no concept of reversed byte order when you're accessing an array of bytes. If you write elements 0, 1, 2, ...7, etc and then the caller reads elements 0, 1,2, ..., 7 everything is fine.
Endianness only comes into play if you take that array of bytes and then cast/assign it to a larger type where endianness is relevant (eg int16/int32/int64)
eg
char bytes[8];
virRandomBytes(bytes, 8);
uint64_t val = *(uint64_t *)bytes;
the problem in such a case isn't virRandomBytes, it is the cast to the larger sized integer type.
Can you elaborate on the actual error messages you are getting from the tests, and what aspect makes you think virRandomBytes is the problem ?
The name of the interface is wrong compared to what is explicitly expected in the test case:
200 if (virAsprintf(&temp_ifacename, (gdb) 205 VIR_DEBUG("Attempting to create interface '%s' with IQN '%s'", (gdb) p temp_ifacename $1 = 0x1014320 "libvirt-iface-04050607"
Seems more likely that it is something higher up the call stack.
That looks like it uses virRandomBits rather than virRandomBytes.
Yes, it is virRandomBits that is the problem - it is taking a uint64_t variable and casting it to an "unsigned char *", which is not an endian-safe thing to do. We need to rewrite virRandomBits to do
    unsigned char val[8];
    virRandomBytes(val, sizeof(val));
    /* val[0] is always the least significant byte, whatever the host byte order */
    uint64_t ret = (uint64_t)val[0] |
                   ((uint64_t)val[1]) << 8 |
                   ((uint64_t)val[2]) << 16 |
                   ((uint64_t)val[3]) << 24 |
                   ((uint64_t)val[4]) << 32 |
                   ((uint64_t)val[5]) << 40 |
                   ((uint64_t)val[6]) << 48 |
                   ((uint64_t)val[7]) << 56;
    return ret & ((1ULL << nbits) - 1);
to make it endian-safe. Regards, Daniel
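To make the difference concrete, here is a small self-contained sketch comparing the two ways of turning the byte array into an integer. It is not the libvirt code: `mock_random_bytes()`, `mask_bits()`, `random_bits_native()` and `random_bits_portable()` are hypothetical names, and the mock is assumed to fill the buffer with 0, 1, 2, ... (an assumption, although it is consistent with the "04050607" suffix seen on s390x above).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the mocked virRandomBytes(): deterministic
 * bytes 0, 1, 2, ... (an assumption about what the mock produces). */
void mock_random_bytes(unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)i;
}

/* Keep only the low nbits bits; nbits >= 64 keeps the whole value. */
static uint64_t mask_bits(uint64_t v, unsigned nbits)
{
    return nbits >= 64 ? v : v & ((UINT64_C(1) << nbits) - 1);
}

/* Endian-dependent: the bytes are reinterpreted in host byte order. */
uint64_t random_bits_native(unsigned nbits)
{
    unsigned char val[8];
    uint64_t ret;

    mock_random_bytes(val, sizeof(val));
    memcpy(&ret, val, sizeof(ret));
    return mask_bits(ret, nbits);
}

/* Endian-safe: val[0] is always the least significant byte. */
uint64_t random_bits_portable(unsigned nbits)
{
    unsigned char val[8];
    uint64_t ret = 0;

    mock_random_bytes(val, sizeof(val));
    for (size_t i = 0; i < sizeof(val); i++)
        ret |= (uint64_t)val[i] << (8 * i);
    return mask_bits(ret, nbits);
}
```

With these mock bytes, random_bits_portable(32) yields 0x03020100 on every host, while random_bits_native(32) yields 0x03020100 on a little-endian host and 0x04050607 on a big-endian one.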

On Tue, Jul 31, 2018 at 12:19:28PM +0800, Daniel Veillard wrote:
It is out, tagged in git and with signed tarball and rpms at the usual place:
ftp://libvirt.org/libvirt/
in my limited testing it works but we have that pending issue raised by Andrea. Worst case, someone reverts the patches and allows building against the old lib.
Please give it some testing, I will watch the commits and if there isn't any solution 2 days from now I will postpone the GA by a couple of days. Hopefully that won't be needed :-)
Since commit 50edca1331298bfcb2622e8fe588d493aff9ab68 qemu: monitor: Add the 'query-nodes' argument for query-blockstats https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=50edca133129 'virsh domblkstat' shows nothing for me with QEMU 2.9.0 (works with v3.0.0-rc0-66-gccf02d73d1) And I'm still unsure about leaving in commit 55ce65646348884656fd7bf3f109ebf8f7603494 qemu: Use the correct vm def on cold attach https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=55ce6564634 Which means attach-device --live --config will attach an interface with a different MAC address in live and persistent definition. Laine? Jano
thanks in advance,
Daniel

On Tue, Jul 31, 2018 at 19:19:33 +0200, Ján Tomko wrote:
On Tue, Jul 31, 2018 at 12:19:28PM +0800, Daniel Veillard wrote:
It is out, tagged in git and with signed tarball and rpms at the usual place:
ftp://libvirt.org/libvirt/
in my limited testing it works but we have that pending issue raised by Andrea. Worst case, someone reverts the patches and allows building against the old lib.
Please give it some testing, I will watch the commits and if there isn't any solution 2 days from now I will postpone the GA by a couple of days. Hopefully that won't be needed :-)
Since commit 50edca1331298bfcb2622e8fe588d493aff9ab68 qemu: monitor: Add the 'query-nodes' argument for query-blockstats https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=50edca133129
'virsh domblkstat' shows nothing for me with QEMU 2.9.0 (works with v3.0.0-rc0-66-gccf02d73d1)
This is probably a red herring. The 'B' modifier used in the patch does not put the argument to the monitor unless it's true. I traced the problem to a bug in commit 8d9ca6cdb3a58414 where I've changed 'nstats' to a pointer but did not fix the usage in the macro which gathers the stats. This means that the expanded code was incrementing the pointer, which was not dereferenced afterwards, rather than the value itself. I'll post a patch soon.
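The class of bug described here can be reproduced in a few lines. The sketch below is not the libvirt macro; `COUNT_STAT_BUGGY`, `COUNT_STAT_FIXED`, `gather_buggy()` and `gather_fixed()` are hypothetical names that only illustrate what happens when a counter becomes a pointer but the macro still increments its argument directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro written for the time when the counter was a plain
 * integer; applied to a pointer it advances the pointer instead. */
#define COUNT_STAT_BUGGY(nstats) ((nstats)++)

/* Hypothetical fixed variant: dereference first, then increment the value. */
#define COUNT_STAT_FIXED(nstats) ((*(nstats))++)

void gather_buggy(size_t *nstats)
{
    assert(nstats != NULL);
    /* Only the local copy of the pointer moves; the caller's counter
     * is never touched. */
    COUNT_STAT_BUGGY(nstats);
}

void gather_fixed(size_t *nstats)
{
    assert(nstats != NULL);
    /* The value the caller passed in by address is incremented. */
    COUNT_STAT_FIXED(nstats);
}
```

After gather_buggy(&n) the caller's counter n is unchanged, so the reported number of stats stays at zero; gather_fixed(&n) increments it as intended.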

On 07/31/2018 07:19 PM, Ján Tomko wrote:
And I'm still unsure about leaving in commit 55ce65646348884656fd7bf3f109ebf8f7603494 qemu: Use the correct vm def on cold attach https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=55ce6564634 Which means attach-device --live --config will attach an interface with a different MAC address in live and persistent definition. Laine?
Well, It's not only MAC address that can change. The device address might change too. This points to a broader problem. When we are parsing a device XML we fill in the blanks in postParse callbacks. However, those look only at either live or at inactive XML. Not at both at the same time. So how can we fill in the blanks that would be valid for both XMLs? Michal

On Wed, Aug 01, 2018 at 12:40:14 +0200, Michal Privoznik wrote:
On 07/31/2018 07:19 PM, Ján Tomko wrote:
And I'm still unsure about leaving in commit 55ce65646348884656fd7bf3f109ebf8f7603494 qemu: Use the correct vm def on cold attach https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=55ce6564634 Which means attach-device --live --config will attach an interface with a different MAC address in live and persistent definition. Laine?
Well, It's not only MAC address that can change. The device address might change too. This points to a broader problem. When we are parsing a device XML we fill in the blanks in postParse callbacks. However, those look only at either live or at inactive XML. Not at both at the same time. So how can we fill in the blanks that would be valid for both XMLs?
We can fill in the definition and then copy it and validate it afterwards. That way the blanks are filled and we can then validate that it fits into the other definition. The description here sounds that in this release we made things worse than it was before though.

On 08/01/2018 06:47 AM, Peter Krempa wrote:
On Wed, Aug 01, 2018 at 12:40:14 +0200, Michal Privoznik wrote:
On 07/31/2018 07:19 PM, Ján Tomko wrote:
And I'm still unsure about leaving in commit 55ce65646348884656fd7bf3f109ebf8f7603494 qemu: Use the correct vm def on cold attach https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=55ce6564634 Which means attach-device --live --config will attach an interface with a different MAC address in live and persistent definition. Laine?
Well, It's not only MAC address that can change. The device address might change too. This points to a broader problem. When we are parsing a device XML we fill in the blanks in postParse callbacks. However, those look only at either live or at inactive XML. Not at both at the same time. So how can we fill in the blanks that would be valid for both XMLs?
We can fill in the definition and then copy it and validate it afterwards. That way the blanks are filled and we can then validate that it fits into the other definition. The description here sounds that in this release we made things worse than it was before though.
Like I said in response to Jano's note after the push: https://www.redhat.com/archives/libvir-list/2018-July/msg01736.html if the patch is reverted that's fine. The referenced bz is for someone that doesn't define an <address> for a <disk> or <hostdev> (I would think a common occurrence) and the counter concern is for someone that doesn't supply a <mac> for an <interface> (a perhaps less common, but equally concerning condition). In one instance (duplicated <address>) the subsequent guest boot fails. In the other instance (different <mac>) the subsequent guest boot has a different mac. In both instances the <address> could be different and most likely will be.
The docs are not precise on what happens when adding a <mac> to both --live and --config. The network docs indicate to allow libvirt to generate the mac to "assure that it is compatible with the idiosyncrasies of the platform where libvirt is running". Whether someone uses --config or --live or --config and --live, it's perhaps very difficult, if not impossible, to provide a solution that will make "everyone" happy. To say we can "copy" from one or the other I would think is problematic since who's to say which is "more correct" once the guest is running. Someone could add 4 devices to --config, then 2 to --live, and then decide to add 1 to both - if they don't supply specific addresses, then IIRC even before this patch it's not possible to "match" the two.
Having to have <address> code look through @def and @newDef to find an unused address that could be used for both could be a challenging algorithm, especially since SCSI <disk>'s and <hostdev>'s can live on the same adapter. Furthermore, SCSI <disk>'s have this preference related to the @dst target name that really could get complicated with looking at both live and config. Perhaps this just becomes one of those "hypervisor defined" activities (regardless of whether the patch is removed or not) and we just move on.
John [hopefully this makes it to the list - given the unreliability to receive libvir-list traffic over the last 24 hours here at Red Hat /-|]

On 08/01/2018 12:47 PM, Peter Krempa wrote:
On Wed, Aug 01, 2018 at 12:40:14 +0200, Michal Privoznik wrote:
On 07/31/2018 07:19 PM, Ján Tomko wrote:
And I'm still unsure about leaving in commit 55ce65646348884656fd7bf3f109ebf8f7603494 qemu: Use the correct vm def on cold attach https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=55ce6564634 Which means attach-device --live --config will attach an interface with a different MAC address in live and persistent definition. Laine?
Well, It's not only MAC address that can change. The device address might change too. This points to a broader problem. When we are parsing a device XML we fill in the blanks in postParse callbacks. However, those look only at either live or at inactive XML. Not at both at the same time. So how can we fill in the blanks that would be valid for both XMLs?
We can fill in the definition and then copy it and validate it afterwards. That way the blanks are filled and we can then validate that it fits into the other definition. The description here sounds that in this release we made things worse than it was before though.
But that might fail. For instance, we generate a PCI for a hot plugged device based on say live XML, but the address is already taken in config XML. I'm not sure if there's an easy way out of this. And regarding the revert - I guess it's just a matter of somebody posting the patch. Michal

(Let's see now.... which people were the ones who believe it's a huge insult to Cc them in list replies, and which are the ones who appreciate the Cc and use it as a flag to raise the visibility of the message in their mail client ..... ?????. Ah, I give up, just leaving in the Cc's and let the complaints fly)(For the record, I like explicit Cc's when you want to make sure I see something - that sends a copy to my phone, which can be mildly annoying if it's a series of 40 patches, but also very effective in getting my attention :-P) On 08/01/2018 06:40 AM, Michal Privoznik wrote:
On 07/31/2018 07:19 PM, Ján Tomko wrote:
And I'm still unsure about leaving in commit 55ce65646348884656fd7bf3f109ebf8f7603494 qemu: Use the correct vm def on cold attach https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=55ce6564634 Which means attach-device --live --config will attach an interface with a different MAC address in live and persistent definition. Laine?
(Let's see, you said my name once in the Cc, and once here. You really need to say it *three* times to get my attention ala Beetlejuice or Biggy Smalls. (and it helps to recite it into a mirror in a dark room, apparently)).
Well, it's not only the MAC address that can change. The device address might change too. This points to a broader problem. When we are parsing a device XML we fill in the blanks in postParse callbacks. However, those look only at either the live or the inactive XML, not at both at the same time. So how can we fill in the blanks so that they are valid for both XMLs?
Yeah, this has been a problem for a long time (it's also problematic for, e.g., the index of PCI controllers, which can cascade into differences in the PCI addresses of devices that are connected to those controllers; similar problem for other types of controllers of course). I think I even tried fixing it once, but the solution was inadequate.
The problem comes when someone uses a mixture of --live only, --config only, and --live+--config device attaches/detaches. At the base of the problem is the fact that the live and config domaindef objects are stored completely independently of each other, and the code that assigns PCI addresses "seeds" the list of in-use addresses from a single domaindef object. Perhaps we need to merge these into a single object where each device is marked as "live", "config" or "both" - then any new device added would be guaranteed to not conflict with any existing device in live or config - the unified device list would be used to determine what addresses are used, and the new device would just be added once to one domaindef object (just with live and config flags set) so there would be no need to copy it.
Or, well..., after about 30 seconds of thought I realize this would lead to different behavior if someone e.g. deleted a currently-active device only from the config, then wanted to add that same device back again - the re-added device would end up with a different PCI address. So maybe that isn't the right solution either.
At any rate, there is no perfect solution in sight for the current release, so the question is whether the new (bad) behavior is better or worse than the old (also bad) behavior. My understanding is that the old behavior could lead to a config that had two devices at the same PCI address, which is definitely undesirable. The new behavior could lead to the PCI address of a newly-added device being different the next time the guest is shut down and restarted. I would tend to prefer the latter, with the caveat that this new behavior provides a config that works (from libvirt's domain parsing POV), but might create a strange error in the guest that would be extremely difficult to troubleshoot (especially 6 months from now after we've all forgotten about this patch (and forgotten about the idea that a more complete fix was needed)).
So I'm undecided about my opinion. And when undecided I tend toward inaction. Now *that's* helpful, isn't it?
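A rough sketch of the unified-list idea, and of the counterexample raised right after it, might look like this. It is a toy model under stated assumptions, not libvirt's data structures: `Device`, `attach` and `detach` are hypothetical names, and a "PCI address" is reduced to a single slot number.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    slot: int
    scopes: frozenset  # subset of {"live", "config"}


def first_free_slot(used, first=1, last=31):
    """Return the lowest slot number in [first, last] that is not in `used`."""
    for slot in range(first, last + 1):
        if slot not in used:
            return slot
    raise RuntimeError("no free slot")


def attach(devices, name, scopes):
    """Seed the in-use set from every device, whether live, config or both."""
    slot = first_free_slot({d.slot for d in devices})
    devices.append(Device(name, slot, frozenset(scopes)))
    return slot


def detach(devices, name, scope):
    """Remove `name` from one scope; drop the entry once no scope is left."""
    kept = []
    for d in devices:
        if d.name == name and scope in d.scopes:
            remaining = d.scopes - {scope}
            if remaining:
                kept.append(Device(d.name, d.slot, remaining))
        else:
            kept.append(d)
    devices[:] = kept


devices = []
attach(devices, "net0", {"live", "config"})
live_slot = attach(devices, "net1", {"live", "config"})
detach(devices, "net1", "config")             # delete from the config only
config_slot = attach(devices, "net1", {"config"})  # add it back to the config
print(live_slot, config_slot)  # the two slots for net1 now differ
```

Because the still-live copy of the device keeps its slot occupied, the re-added config copy is pushed to the next free slot, which is the different-address behavior described above.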

On 08/01/2018 04:44 PM, Laine Stump wrote:
(Let's see now.... which people were the ones who believe it's a huge insult to Cc them in list replies, and which are the ones who appreciate the Cc and use it as a flag to raise the visibility of the message in their mail client ..... ?????. Ah, I give up, just leaving in the Cc's and let the complaints fly)
(For the record, I like explicit Cc's when you want to make sure I see something - that sends a copy to my phone, which can be mildly annoying if it's a series of 40 patches, but also very effective in getting my attention :-P)
On 08/01/2018 06:40 AM, Michal Privoznik wrote:
On 07/31/2018 07:19 PM, Ján Tomko wrote:
And I'm still unsure about leaving in commit 55ce65646348884656fd7bf3f109ebf8f7603494 qemu: Use the correct vm def on cold attach https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=55ce6564634 Which means attach-device --live --config will attach an interface with a different MAC address in live and persistent definition. Laine?
(Let's see, you said my name once in the Cc, and once here. You really need to say it *three* times to get my attention ala Beetlejuice or Biggy Smalls. (and it helps to recite it into a mirror in a dark room, apparently)).
Well, it's not only the MAC address that can change. The device address might change too. This points to a broader problem. When we are parsing a device XML we fill in the blanks in postParse callbacks. However, those look only at either the live or the inactive XML, not at both at the same time. So how can we fill in the blanks so that they are valid for both XMLs?
Yeah, this has been a problem for a long time (it's also problematic for, e.g. the index of PCI controllers, which can cascade into differences in the PCI addresses of devices that are connected to those controllers; similar problem for other types of controllers of course). I think I even tried fixing it once, but the solution was inadequate.
The problem comes when someone uses a mixture of --live only, --config only, and --live+--config device attaches/detaches. At the base of the problem is the fact that the live and config domaindef objects are stored completely independently of each other, and the code that assigns PCI addresses "seeds" the list of in-use addresses from a single domaindef object. Perhaps we need to merge these into a single object where each device is marked as "live", "config" or "both" - then any new device added would be guaranteed to not conflict with any existing device in live or config - the unified device list would be used to determine what addresses are used, and the new device would just be added once to one domaindef object (just with live and config flags set) so there would be no need to copy it.
Or, well..., after about 30 seconds of thought I realize this would lead to different behavior if someone e.g. deleted a currently-active device only from the config, then wanted to add that same device back again - the re-added device would end up with a different PCI address. So maybe that isn't the right solution either.
At any rate, there is no perfect solution in sight for the current release, so the question is whether the new (bad) behavior is better or worse than the old (also bad) behavior. My understanding is that the old behavior could lead to a config that had two devices at the same PCI address, which is definitely undesirable. The new behavior could lead to the PCI address of a newly-added device being different the next time the guest is shut down and restarted. I would tend to prefer the latter, with the caveat that this new behavior provides a config that works (from libvirt's domain parsing POV), but might create a strange error in the guest that would be extremely difficult to troubleshoot (especially 6 months from now after we've all forgotten about this patch (and forgotten about the idea that a more complete fix was needed)).
So I'm undecided about my opinion. And when undecided I tend toward inaction. Now *that's* helpful, isn't it?
Laine is stumped ;-).
I'm still stumped over what strange error could be created. How is the virtual {PCI|SCSI} address changing any different from "real hardware" if someone adds something new into their physical system? Or the other fun adventure, a physical move from location A to location B where someone "forgot" to properly label what went where and upon reassembly things are ordered differently.
Consider some (perhaps not so) long ago SCSI devices where you'd need to grab a small, sharp, pin like object to toggle switches to define the address for the device when it got plugged in. That means the user had to guess and figure out what address to use. Good luck if you conflicted with something else. After some time it was also difficult to tell which end was the low address toggle/bits.
Still, while SCSI/PCI addresses were one part of the question - I think the second concern was that for the "live & config" case - the unprovided MAC would be different as well. What kind of problems could that lead to?
John

On Wed, Aug 01, 2018 at 17:44:56 -0400, John Ferlan wrote:
On 08/01/2018 04:44 PM, Laine Stump wrote:
[...]
At any rate, there is no perfect solution in sight for the current release, so the question is whether the new (bad) behavior is better or worse than the old (also bad) behavior. My understanding is that the old behavior could lead to a config that had two devices at the same PCI address, which is definitely undesirable. The new behavior could lead to the PCI address of a newly-added device being different the next time the guest is shut down and restarted. I would tend to prefer the latter, with the caveat that this new behavior provides a config that works (from libvirt's domain parsing POV), but might create a strange error in the guest that would be extremely difficult to troubleshoot (especially 6 months from now after we've all forgotten about this patch (and forgotten about the idea that a more complete fix was needed)).
So I'm undecided about my opinion. And when undecided I tend toward inaction. Now *that's* helpful, isn't it?
Laine is stumped ;-).
I'm still stumped over what strange error could be created. How is the virtual {PCI|SCSI} address changing any different from "real hardware" if someone adds something new into their physical system? Or the other
Well, the issue is that you issue an API to add the device, which in real life would translate into plugging it into the machine. Afterwards you turn the machine off and back on. Without any API (or physical contact) you expect that the hardware will not move places by itself.
If the API call includes both AFFECT_LIVE and AFFECT_CONFIG we should make sure that the device plugged in is exactly the same. If they are issued separately we don't care at all though.
Btw, having this analogy, specifying only AFFECT_CONFIG is probably similar to putting a post-it note on top of the power button for the next person to attach the hardware prior to powering it up.
fun adventure, a physical move from location A to location B where someone "forgot" to properly label what went where and upon reassembly things are ordered differently.
Consider some (perhaps not so) long ago SCSI devices where you'd need to grab a small, sharp, pin like object to toggle switches to define the address for the device when it got plugged in.
Note that in that era it would be very bad in some cases when the hardware would change the jumper configuration by itself as sometimes the drivers could not autoconfigure. The addresses were put in config files.
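On the caller's side, one way to keep the live and persistent copies of an interface identical is to avoid leaving blanks in the first place, for example by putting an explicit MAC address into the device XML. The sketch below only builds such XML; `random_kvm_mac` and `interface_xml` are hypothetical helper names, and the `52:54:00` prefix is the one libvirt conventionally uses for generated QEMU/KVM MAC addresses. Whether this fully sidesteps the behavior discussed in this thread is an assumption, not something the thread confirms.

```python
import random
import xml.etree.ElementTree as ET


def random_kvm_mac(rng=random):
    """Generate a MAC address with the 52:54:00 prefix used for QEMU/KVM guests."""
    tail = [rng.randint(0, 255) for _ in range(3)]
    return "52:54:00:" + ":".join(f"{b:02x}" for b in tail)


def interface_xml(network, mac=None):
    """Build <interface> XML with the MAC filled in by the caller, so nothing
    is left for the live and config definitions to generate separately."""
    iface = ET.Element("interface", type="network")
    ET.SubElement(iface, "mac", address=mac or random_kvm_mac())
    ET.SubElement(iface, "source", network=network)
    return ET.tostring(iface, encoding="unicode")


print(interface_xml("default"))

# Possible usage with the libvirt Python bindings (not executed here):
#   flags = libvirt.VIR_DOMAIN_AFFECT_LIVE | libvirt.VIR_DOMAIN_AFFECT_CONFIG
#   dom.attachDeviceFlags(interface_xml("default"), flags)
```

The same approach would apply to an explicit `<address>` element, at the cost of the caller having to pick a free address.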

On 08/02/2018 03:57 AM, Peter Krempa wrote:
On Wed, Aug 01, 2018 at 17:44:56 -0400, John Ferlan wrote:
On 08/01/2018 04:44 PM, Laine Stump wrote:
[...]
At any rate, there is no perfect solution in sight for the current release, so the question is whether the new (bad) behavior is better or worse than the old (also bad) behavior. My understanding is that the old behavior could lead to a config that had two devices at the same PCI address, which is definitely undesirable. The new behavior could lead to the PCI address of a newly-added device being different the next time the guest is shut down and restarted. I would tend to prefer the latter, with the caveat that this new behavior provides a config that works (from libvirt's domain parsing POV), but might create a strange error in the guest that would be extremely difficult to troubleshoot (especially 6 months from now after we've all forgotten about this patch (and forgotten about the idea that a more complete fix was needed)).
So I'm undecided about my opinion. And when undecided I tend toward inaction. Now *that's* helpful, isn't it?
Laine is stumped ;-).
I'm still stumped over what strange error could be created. How is the virtual {PCI|SCSI} address changing any different from "real hardware" if someone adds something new into their physical system? Or the other
Well, the issue is that you issue an API to add the device, which in real life would translate into plugging it into the machine.
Afterwards you turn the machine off and back on. Without any API (or physical contact) you expect that the hardware will not move places by itself.
If you chose the address yourself, then it wouldn't change. IOW: If I physically plugged into a specific spot, then no change. But in this case, the consumer said, I don't care, choose one for me and we did. If the consumer said LIVE and CONFIG every time, then I'd venture to guess/assume since the algorithm is the same that the resulting libvirt chosen address would be the same. The problem comes when the same customer chooses CONFIG (or LIVE) at least once before choosing CONFIG & LIVE in a followup.
If the API call includes both AFFECT_LIVE and AFFECT_CONFIG we should make sure that the device plugged in is exactly the same. If they are issued separately we don't care at all though.
Btw, having this analogy, specifying only AFFECT_CONFIG is probably similar to putting a post-it note on top of the power button for the next person to attach the hardware prior to powering it up.
Given the same physical situation of leaving a post-it note to plug this thing in later into slot 3, but someone comes after the note is written but before the power button is pressed and plugs something else into slot 3 because it was just "next", then when it comes time to act upon the post-it note, where does said person plug this in, since slot 3 is taken? Do they unplug whatever was plugged into slot 3, plug the post-it note thing into slot 3 and then plug the other thing into slot 4, or do they just plug the post-it note device into slot 4? I'd say it's a coin flip and no worse than what we'd be doing.
Conversely, if that person plugging in live read the note and then chose slot 4, then when it comes time to act upon the post-it note to plug at slot 3, everyone is happy. Of course in this case, we had a human deciding to "assign" the addresses to specific devices or, in XML terms, provide the <address> to assign the device rather than deciding upon the next available slot.
John
fun adventure, a physical move from location A to location B where someone "forgot" to properly label what went where and upon reassembly things are ordered differently.
Consider some (perhaps not so) long ago SCSI devices where you'd need to grab a small, sharp, pin like object to toggle switches to define the address for the device when it got plugged in.
Note that in that era it would be very bad in some cases when the hardware would change the jumper configuration by itself as sometimes the drivers could not autoconfigure. The addresses were put in config files.
Still trying to picture how "hardware would change the jumper configuration by itself" without any human interaction. Those toggle switches and connectors were a pain to deal with if you didn't have the right tools. If only the hardware would have done it by itself, then we wouldn't need the jumpers or toggle switches.
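The point made earlier in this message, that the same algorithm gives the same address as long as every attach uses both LIVE and CONFIG, and only diverges once a single-scope attach is mixed in, can be checked with a small simulation. This is a toy model with hypothetical helpers (`first_free_slot`, `attach`), where each definition is a dict and assigns its own lowest free slot independently.

```python
def first_free_slot(used, first=1, last=31):
    """Return the lowest slot number in [first, last] that is not in `used`."""
    for slot in range(first, last + 1):
        if slot not in used:
            return slot
    raise RuntimeError("no free slot")


def attach(name, **defs):
    """Attach `name` to each definition passed in; every definition picks
    its own address based only on its own contents."""
    chosen = {}
    for label, definition in defs.items():
        slot = first_free_slot(set(definition.values()))
        definition[name] = slot
        chosen[label] = slot
    return chosen


live, config = {}, {}
print(attach("net0", live=live, config=config))   # both scopes: same slot
print(attach("disk1", config=config))             # config only: definitions drift
print(attach("net1", live=live, config=config))   # both scopes again: slots differ
```

In this model the third attach lands on different slots in the two definitions, purely because of the earlier config-only attach.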
participants (11)
- Andrea Bolognani
- Bjoern Walk
- Daniel P. Berrangé
- Daniel Veillard
- John Ferlan
- Ján Tomko
- Laine Stump
- Martin Kletzander
- Michal Privoznik
- Michal Prívozník
- Peter Krempa