[libvirt] [RFC PATCH 0/2] Clear capabilities cache options

Posting this as an RFC since I'm trying to get upstream feedback on whether either approach is better or whether perhaps both approaches should be used.

The problem occurred primarily for downstream installations, where it seems the build order of libvirtd caused issues regarding whether certain capabilities bits were present. Although the history of the case doesn't "prove" my theory exactly, other issues around the same time that were part of a private bz or email stream (I forget now) seem to indicate that the order of libvirtd builds and installations is very important as it relates to when the cache is rebuilt.

In the instances seen there was lots of testing going on related to whether patches were found in a 'maintenance release' and the 'mainline release'. Since the process is to build the mainline first and then backport the patch(es) to the maintenance release, it's possible the date of the maintenance libvirtd would be later than that of the mainline release. If testing installs the maintenance release first, the cache gets updated because the 'ctime' of that libvirtd is later than what was previously installed; if the mainline release is then installed with a libvirtd 'ctime' possibly prior to that of the maintenance release, the cache wouldn't be updated. The fix for testing has been just to delete the cache.

In looking at the code and history, I see a couple of possible solutions. The first seems to be a "big hammer": on every libvirtd installation we delete the entire cache, which forces the reread. I believe I got that right by using %post of the install to clear out anything in the capabilities directory (whether just created or already existing from some prior installation). The other solution is to adjust the libvirtd 'ctime' check to fire if the value differs from what is found in the cache. Looking at the original commit, this seems to be what the commit message intended, but the actual code was slightly different. Upstream review comments for the changes didn't mention the difference either, see:

http://www.redhat.com/archives/libvir-list/2014-March/msg00629.html
https://www.redhat.com/archives/libvir-list/2014-March/msg00301.html

John Ferlan (2):
  spec: Remove capabilities cache during %post install
  qemu: Force capabilities cache read if libvirtd date is different

 libvirt.spec.in              | 11 +++++++++++
 src/qemu/qemu_capabilities.c |  6 ++++--
 2 files changed, 15 insertions(+), 2 deletions(-)

-- 
2.1.0

https://bugzilla.redhat.com/show_bug.cgi?id=1195882

Force re-reading of the capabilities cache upon new installations.

Since re-reading of the capabilities is based on the 'ctime' of the QEMU binary, or on whether the new libvirtd's ctime is greater than the one that created the cache file, it's possible - depending on the order of installations - that a "backported" libvirtd with a later creation date could create the cache file, while a follow-up installation of a mainline release with an earlier creation date wouldn't cause the cache to be updated. This results in the possibility that feature(s) in the mainstream release wouldn't be available unless someone deletes the cache by hand.

Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 libvirt.spec.in | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/libvirt.spec.in b/libvirt.spec.in
index 4195518..e28f737 100644
--- a/libvirt.spec.in
+++ b/libvirt.spec.in
@@ -1705,6 +1705,10 @@ fi
     /sbin/chkconfig --add virtlockd
 %endif
 
+# Remove any files in the capabilities cache to force the daemon
+# to start fresh
+rm -rf %{_localstatedir}/cache/libvirt/qemu/capabilities
+
 %preun daemon
 %if %{with_systemd}
     %if %{with_systemd_macros}
@@ -1828,6 +1832,13 @@ exit 0
 %endif
 %endif
 %endif
+
+%post daemon-driver-qemu
+
+# Remove any files in the capabilities cache to force the daemon
+# to start fresh
+rm -rf %{_localstatedir}/cache/libvirt/qemu/capabilities
+
 %endif # %{with_libvirtd}
 
 %preun client
-- 
2.1.0

On Wed, May 20, 2015 at 08:52:56AM -0400, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1195882
Force re-reading of the capabilities cache upon new installations.
Since re-reading of the capabilities is based on the 'ctime' of the QEMU binary, or on whether the new libvirtd's ctime is greater than the one that created the cache file, it's possible - depending on the order of installations - that a "backported" libvirtd with a later creation date could create the cache file, while a follow-up installation of a mainline release with an earlier creation date wouldn't cause the cache to be updated. This results in the possibility that feature(s) in the mainstream release wouldn't be available unless someone deletes the cache by hand.
Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 libvirt.spec.in | 11 +++++++++++
 1 file changed, 11 insertions(+)
diff --git a/libvirt.spec.in b/libvirt.spec.in
index 4195518..e28f737 100644
--- a/libvirt.spec.in
+++ b/libvirt.spec.in
@@ -1705,6 +1705,10 @@ fi
     /sbin/chkconfig --add virtlockd
 %endif
 
+# Remove any files in the capabilities cache to force the daemon
+# to start fresh
+rm -rf %{_localstatedir}/cache/libvirt/qemu/capabilities
This is going to be racy. You are removing the cache while the existing libvirtd is still running. Some time later RPM will tell systemd to restart libvirtd. Between those points in time the old libvirtd may have re-generated the cache.

Regards,
Daniel

https://bugzilla.redhat.com/show_bug.cgi?id=1195882

Original commit id 'cbde3589' indicates that the cache file would be discarded if either the QEMU binary or libvirtd 'ctime' changes; however, the code only discarded it if the QEMU binary time didn't match or if the new libvirtd ctime was later than the one that created the cache file.

This could lead to issues with the order in which libvirtd images are created for maintenance or patch branches: if someone had a libvirtd created on 'date x' from (a) backported patch(es), followed by a subsequent install of the primary release with 'date y', and 'date x' was greater than 'date y', then features in the primary release branch may not be available.

Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 src/qemu/qemu_capabilities.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index 2757636..51bbf55 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -2979,9 +2979,11 @@ virQEMUCapsInitCached(virQEMUCapsPtr qemuCaps, const char *cacheDir)
         goto cleanup;
     }
 
-    /* Discard if cache is older that QEMU binary */
+    /* Discard if cache is older that QEMU binary or
+     * if libvirtd changed
+     */
     if (qemuctime != qemuCaps->ctime ||
-        selfctime < virGetSelfLastChanged()) {
+        selfctime != virGetSelfLastChanged()) {
         VIR_DEBUG("Outdated cached capabilities '%s' for '%s' "
                   "(%lld vs %lld, %lld vs %lld)",
                   capsfile, qemuCaps->binary,
-- 
2.1.0
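[Editor's note: for illustration, a minimal stand-alone sketch - not libvirt code, all names invented - of the validity predicate before and after this patch, under the commit message's assumption that a later-stamped libvirtd can write the cache before an earlier-stamped one is installed:]

#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* cachedQemuCtime/cachedSelfCtime are the timestamps recorded in the
 * cache file; qemuCtime/selfCtime describe the binaries now installed. */

static bool
staleWithLessThan(time_t cachedQemuCtime, time_t qemuCtime,
                  time_t cachedSelfCtime, time_t selfCtime)
{
    /* pre-patch: only a strictly newer libvirtd invalidates the cache */
    return cachedQemuCtime != qemuCtime || cachedSelfCtime < selfCtime;
}

static bool
staleWithNotEqual(time_t cachedQemuCtime, time_t qemuCtime,
                  time_t cachedSelfCtime, time_t selfCtime)
{
    /* post-patch: any libvirtd change, in either direction */
    return cachedQemuCtime != qemuCtime || cachedSelfCtime != selfCtime;
}

int main(void)
{
    /* Scenario from the commit message: the cache was written by a
     * libvirtd stamped 'date x' (2000); a libvirtd stamped with the
     * earlier 'date y' (1000) is then installed.  QEMU is unchanged. */
    printf("'<'  flags cache stale: %d\n",
           staleWithLessThan(500, 500, 2000, 1000));   /* 0 - stale kept */
    printf("'!=' flags cache stale: %d\n",
           staleWithNotEqual(500, 500, 2000, 1000));   /* 1 - refreshed  */
    return 0;
}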

On Wed, May 20, 2015 at 08:52:57AM -0400, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1195882
Original commit id 'cbde3589' indicates that the cache file would be discarded if either the QEMU binary or libvirtd 'ctime' changes; however, the code only discarded it if the QEMU binary time didn't match or if the new libvirtd ctime was later than the one that created the cache file.
This could lead to issues with the order in which libvirtd images are created for maintenance or patch branches: if someone had a libvirtd created on 'date x' from (a) backported patch(es), followed by a subsequent install of the primary release with 'date y', and 'date x' was greater than 'date y', then features in the primary release branch may not be available.
Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 src/qemu/qemu_capabilities.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index 2757636..51bbf55 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -2979,9 +2979,11 @@ virQEMUCapsInitCached(virQEMUCapsPtr qemuCaps, const char *cacheDir)
         goto cleanup;
     }
 
-    /* Discard if cache is older that QEMU binary */
+    /* Discard if cache is older that QEMU binary or
+     * if libvirtd changed
+     */
This comment is actually wrong for QEMU already
if (qemuctime != qemuCaps->ctime ||
As we're re-generating if QEMU changed, not if it is older
-        selfctime < virGetSelfLastChanged()) {
+        selfctime != virGetSelfLastChanged()) {
         VIR_DEBUG("Outdated cached capabilities '%s' for '%s' "
                   "(%lld vs %lld, %lld vs %lld)",
                   capsfile, qemuCaps->binary,
At first glance this looks reasonable, because it aligns behaviour for QEMU and libvirtd timestamp checks. I'm trying to understand why I used < for the libvirtd check, but != for the QEMU check when I first wrote this, in case there was some edge case I've forgotten though.

Tentative ACK if others agree.

Regards,
Daniel

On 05/20/2015 09:28 AM, Daniel P. Berrange wrote:
On Wed, May 20, 2015 at 08:52:57AM -0400, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1195882
Original commit id 'cbde3589' indicates that the cache file would be discarded if either the QEMU binary or libvirtd 'ctime' changes; however, the code only discarded it if the QEMU binary time didn't match or if the new libvirtd ctime was later than the one that created the cache file.
This could lead to issues with the order in which libvirtd images are created for maintenance or patch branches: if someone had a libvirtd created on 'date x' from (a) backported patch(es), followed by a subsequent install of the primary release with 'date y', and 'date x' was greater than 'date y', then features in the primary release branch may not be available.
Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 src/qemu/qemu_capabilities.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index 2757636..51bbf55 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -2979,9 +2979,11 @@ virQEMUCapsInitCached(virQEMUCapsPtr qemuCaps, const char *cacheDir)
         goto cleanup;
     }
 
-    /* Discard if cache is older that QEMU binary */
+    /* Discard if cache is older that QEMU binary or
+     * if libvirtd changed
+     */
This comment is actually wrong for QEMU already
if (qemuctime != qemuCaps->ctime ||
As we're re-generating if QEMU changed, not if it is older
Re-reading it, the comment is odd - I can change it to: "Discard cache if QEMU binary or libvirtd changed"
-        selfctime < virGetSelfLastChanged()) {
+        selfctime != virGetSelfLastChanged()) {
         VIR_DEBUG("Outdated cached capabilities '%s' for '%s' "
                   "(%lld vs %lld, %lld vs %lld)",
                   capsfile, qemuCaps->binary,
At first glance this looks reasonable, because it aligns behaviour for QEMU and libvirtd timestamp checks.
I'm trying to understand why I used < for the libvirtd check, but != for the QEMU check when I first wrote this, in case there was some edge case I've forgotten though.
I assumed it was perhaps related to in-release upstream work, which generally wouldn't need the update unless of course someone updated QEMU as well. The v1 check was mtime based - perhaps that's the reason? I dunno!

John

s/read/refresh/ in the commit message?

On Wed, May 20, 2015 at 08:52:57AM -0400, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1195882
Original commit id 'cbde3589' indicates that the cache file would be discarded if either the QEMU binary or libvirtd 'ctime' changes; however, the code only discarded it if the QEMU binary time didn't match or if the new libvirtd ctime was later than the one that created the cache file.
This could lead to issues with the order in which libvirtd images are created for maintenance or patch branches: if someone had a libvirtd created on 'date x' from (a) backported patch(es), followed by a subsequent install of the primary release with 'date y', and 'date x' was greater than 'date y', then features in the primary release branch may not be available.
I can see how there can be two daemons with different ctimes on the same system during development, so ACK to the change.

However, when you install the daemon (even from an older package), ctime should only move forward, so I'm sceptical about its relevance to the referenced bug.

Jan

On 05/20/2015 10:22 AM, Ján Tomko wrote:
s/read/refresh/ in the commit message?
On Wed, May 20, 2015 at 08:52:57AM -0400, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1195882
Original commit id 'cbde3589' indicates that the cache file would be discarded if either the QEMU binary or libvirtd 'ctime' changes; however, the code only discarded it if the QEMU binary time didn't match or if the new libvirtd ctime was later than the one that created the cache file.
This could lead to issues with the order in which libvirtd images are created for maintenance or patch branches: if someone had a libvirtd created on 'date x' from (a) backported patch(es), followed by a subsequent install of the primary release with 'date y', and 'date x' was greater than 'date y', then features in the primary release branch may not be available.
I can see how there can be two daemons with different ctimes on the same system during development, so ACK to the change.
But they'd use different cache paths, right?
However, when you install the daemon (even from an older package), ctime should only move forward, so I'm sceptical about its relevance to the referenced bug.
Perhaps that's my misinterpretation or misunderstanding of ctime - certainly in a different OS I worked on many years ago, ctime was when the image was created by a build, not the inode or file change time as I (now) read...

So hmmm... that blows my theory to shreds - unless you account for that window of time during an install where files are being replaced while libvirtd is still running an older release. As Dan notes in my 1/2, removing the cache in between would be racy. So would it also be racy if something caused a cache read: the code finds an updated libvirtd, asks qemu for current capabilities, gets an answer, and creates the file based on the current (or old) understanding. Then when libvirtd restarts it finds a matching ctime for itself, doesn't update the cache, and of course doesn't have the latest bits.

John

On Wed, May 20, 2015 at 01:35:43PM -0400, John Ferlan wrote:
On 05/20/2015 10:22 AM, Ján Tomko wrote:
s/read/refresh/ in the commit message?
On Wed, May 20, 2015 at 08:52:57AM -0400, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1195882
Original commit id 'cbde3589' indicates that the cache file would be discarded if either the QEMU binary or libvirtd 'ctime' changes; however, the code only discarded it if the QEMU binary time didn't match or if the new libvirtd ctime was later than the one that created the cache file.
This could lead to issues with the order in which libvirtd images are created for maintenance or patch branches: if someone had a libvirtd created on 'date x' from (a) backported patch(es), followed by a subsequent install of the primary release with 'date y', and 'date x' was greater than 'date y', then features in the primary release branch may not be available.
I can see how there can be two daemons with different ctimes on the same system during development, so ACK to the change.
But they'd use different cache paths, right?
For privileged daemons, this depends on the LOCALSTATEDIR constant, which can be chosen at configure time. For non-privileged ones, it depends on the XDG_CACHE_HOME environment variable. I use ./autogen.sh --system for both my libvirt repos, so both store the cache under /var/.

Jan
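[Editor's note: a rough sketch of the path selection Jan describes - the helper name is invented and "/var" stands in for a configure-time LOCALSTATEDIR; this is not the actual libvirt code:]

#include <stdio.h>
#include <stdlib.h>

/* Pick the capabilities cache directory: privileged daemons use the
 * configure-time LOCALSTATEDIR, unprivileged ones honour XDG_CACHE_HOME
 * (falling back to ~/.cache). */
static void
capsCacheDir(int privileged, char *buf, size_t len)
{
    const char *xdg = getenv("XDG_CACHE_HOME");
    const char *home = getenv("HOME");

    if (privileged)
        snprintf(buf, len, "/var/cache/libvirt/qemu/capabilities");
    else if (xdg)
        snprintf(buf, len, "%s/libvirt/qemu/capabilities", xdg);
    else
        snprintf(buf, len, "%s/.cache/libvirt/qemu/capabilities",
                 home ? home : "");
}

int main(void)
{
    char path[1024];

    capsCacheDir(1, path, sizeof(path));
    printf("privileged:   %s\n", path);
    capsCacheDir(0, path, sizeof(path));
    printf("unprivileged: %s\n", path);
    return 0;
}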

On 20.05.2015 19:35, John Ferlan wrote:
On 05/20/2015 10:22 AM, Ján Tomko wrote:
s/read/refresh/ in the commit message?
On Wed, May 20, 2015 at 08:52:57AM -0400, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1195882
Original commit id 'cbde3589' indicates that the cache file would be discarded if either the QEMU binary or libvirtd 'ctime' changes; however, the code only discarded it if the QEMU binary time didn't match or if the new libvirtd ctime was later than the one that created the cache file.
This could lead to issues with the order in which libvirtd images are created for maintenance or patch branches: if someone had a libvirtd created on 'date x' from (a) backported patch(es), followed by a subsequent install of the primary release with 'date y', and 'date x' was greater than 'date y', then features in the primary release branch may not be available.
I can see how there can be two daemons with different ctimes on the same system during development, so ACK to the change.
But they'd use different cache paths, right?
However, when you install the daemon (even from an older package), ctime should only move forward, so I'm sceptical about its relevance to the referenced bug.
Perhaps that's my misinterpretation or misunderstanding of ctime - certainly in a different OS I worked on many years ago, ctime was when the image was created by a build, not the inode or file change time as I (now) read...
So hmmm... that blows my theory to shreds - unless you account for that window of time during an install where files are being replaced while libvirtd is still running an older release. As Dan notes in my 1/2, removing the cache in between would be racy. So would it also be racy if something caused a cache read: the code finds an updated libvirtd, asks qemu for current capabilities, gets an answer, and creates the file based on the current (or old) understanding. Then when libvirtd restarts it finds a matching ctime for itself, doesn't update the cache, and of course doesn't have the latest bits.
How about computing a hash of the qemu binary, and if the hashes do not match, regenerating the cache?

Michal
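[Editor's note: for concreteness, a hedged sketch of Michal's suggestion as a stand-alone program - not libvirt code; FNV-1a stands in for whatever real digest would actually be used. Note that it must read the whole binary, which is the cost Daniel raises in the next reply:]

#include <stdint.h>
#include <stdio.h>

/* Fingerprint a file's contents with FNV-1a; the cache would store this
 * value and be regenerated whenever it no longer matches the binary. */
static int
hashFile(const char *path, uint64_t *hash)
{
    FILE *fp = fopen(path, "rb");
    unsigned char buf[8192];
    size_t n, i;
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV-1a offset basis */

    if (!fp)
        return -1;
    while ((n = fread(buf, 1, sizeof(buf), fp)) > 0) {
        for (i = 0; i < n; i++) {
            h ^= buf[i];
            h *= 0x100000001b3ULL;        /* FNV-1a prime */
        }
    }
    fclose(fp);
    *hash = h;
    return 0;
}

int main(int argc, char **argv)
{
    uint64_t h;

    if (argc < 2 || hashFile(argv[1], &h) < 0) {
        fprintf(stderr, "usage: %s /path/to/qemu-binary\n", argv[0]);
        return 1;
    }
    printf("fingerprint: %016llx\n", (unsigned long long)h);
    return 0;
}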

On Fri, May 22, 2015 at 11:56:24AM +0200, Michal Privoznik wrote:
On 20.05.2015 19:35, John Ferlan wrote:
On 05/20/2015 10:22 AM, Ján Tomko wrote:
s/read/refresh/ in the commit message?
On Wed, May 20, 2015 at 08:52:57AM -0400, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1195882
Original commit id 'cbde3589' indicates that the cache file would be discarded if either the QEMU binary or libvirtd 'ctime' changes; however, the code only discarded it if the QEMU binary time didn't match or if the new libvirtd ctime was later than the one that created the cache file.
This could lead to issues with the order in which libvirtd images are created for maintenance or patch branches: if someone had a libvirtd created on 'date x' from (a) backported patch(es), followed by a subsequent install of the primary release with 'date y', and 'date x' was greater than 'date y', then features in the primary release branch may not be available.
I can see how there can be two daemons with different ctimes on the same system during development, so ACK to the change.
But they'd use different cache paths, right?
However, when you install the daemon (even from an older package), ctime should only move forward, so I'm sceptical about its relevance to the referenced bug.
Perhaps that's my misinterpretation or misunderstanding of ctime - certainly in a different OS I worked on many years ago, ctime was when the image was created by a build, not the inode or file change time as I (now) read...
So hmmm... that blows my theory to shreds - unless you account for that window of time during an install where files are being replaced while libvirtd is still running an older release. As Dan notes in my 1/2, removing the cache in between would be racy. So would it also be racy if something caused a cache read: the code finds an updated libvirtd, asks qemu for current capabilities, gets an answer, and creates the file based on the current (or old) understanding. Then when libvirtd restarts it finds a matching ctime for itself, doesn't update the cache, and of course doesn't have the latest bits.
How about computing a hash of the qemu binary, and if the hashes do not match, regenerating the cache?
That'd be significantly slower than today's code, and I don't think it is needed. We just need to change the check from '< date' to '!= date'.

The current '< date' check was written based on the (mistaken) assumption that the ctime would be set based on the time at which the RPM package was installed, i.e. even if you downgraded libvirt, I thought you'd get a newer ctime due to install time. Since RPM preserves timestamps from the build environment this assumption was invalid. Using '!= date' should deal with it adequately, as the chance of two separate RPM builds having been done at precisely the same time is minimal.

Regards,
Daniel

On Fri, May 22, 2015 at 11:02:57AM +0100, Daniel P. Berrange wrote:
On Fri, May 22, 2015 at 11:56:24AM +0200, Michal Privoznik wrote:
On 20.05.2015 19:35, John Ferlan wrote:
On 05/20/2015 10:22 AM, Ján Tomko wrote:
s/read/refresh/ in the commit message?
On Wed, May 20, 2015 at 08:52:57AM -0400, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1195882
Original commit id 'cbde3589' indicates that the cache file would be discarded if either the QEMU binary or libvirtd 'ctime' changes; however, the code only discarded it if the QEMU binary time didn't match or if the new libvirtd ctime was later than the one that created the cache file.
This could lead to issues with the order in which libvirtd images are created for maintenance or patch branches: if someone had a libvirtd created on 'date x' from (a) backported patch(es), followed by a subsequent install of the primary release with 'date y', and 'date x' was greater than 'date y', then features in the primary release branch may not be available.
I can see how there can be two daemons with different ctimes on the same system during development, so ACK to the change.
But they'd use different cache paths, right?
However, when you install the daemon (even from an older package), ctime should only move forward, so I'm sceptical about its relevance to the referenced bug.
Perhaps that's my misinterpretation or misunderstanding of ctime - certainly in a different OS I worked on many years ago, ctime was when the image was created by a build, not the inode or file change time as I (now) read...
So hmmm... that blows my theory to shreds - unless you account for that window of time during an install where files are being replaced while libvirtd is still running an older release. As Dan notes in my 1/2, removing the cache in between would be racy. So would it also be racy if something caused a cache read: the code finds an updated libvirtd, asks qemu for current capabilities, gets an answer, and creates the file based on the current (or old) understanding. Then when libvirtd restarts it finds a matching ctime for itself, doesn't update the cache, and of course doesn't have the latest bits.
How about computing a hash of the qemu binary, and if the hashes do not match, regenerating the cache?
That'd be significantly slower than today's code, and I don't think it is needed. We just need to change the check from '< date' to '!= date'. The current '< date' check was written based on the (mistaken) assumption that the ctime would be set based on the time at which the RPM package was installed, i.e. even if you downgraded libvirt, I thought you'd get a newer ctime due to install time. Since RPM preserves timestamps from the build environment this assumption was invalid. Using '!= date' should deal with it adequately, as the chance of two separate RPM builds having been done at precisely the same time is minimal.
RPM preserves mtimes, not ctimes. We use ctimes since:

commit f5059a929e0db6689e908e354b13a60661c143e1
    Change QEMU capabilities cache to check ctime instead of mtime
CommitDate: 2014-03-11 10:52:29 +0000

http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=f5059a92

Jan
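[Editor's note: a small stand-alone program - nothing libvirt-specific - makes Jan's point easy to verify on an installed binary: rpm restores the build-time mtime, while st_ctime is stamped by the kernel when the file changes on this system and cannot be set back by the installer:]

#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    struct stat sb;

    if (argc < 2 || stat(argv[1], &sb) < 0) {
        fprintf(stderr, "usage: %s /usr/sbin/libvirtd\n", argv[0]);
        return 1;
    }
    /* mtime: preserved from the RPM build environment */
    printf("mtime: %lld\n", (long long)sb.st_mtime);
    /* ctime: inode change time, set at install on this machine */
    printf("ctime: %lld\n", (long long)sb.st_ctime);
    return 0;
}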

On Fri, May 22, 2015 at 02:23:12PM +0200, Ján Tomko wrote:
On Fri, May 22, 2015 at 11:02:57AM +0100, Daniel P. Berrange wrote:
On Fri, May 22, 2015 at 11:56:24AM +0200, Michal Privoznik wrote:
On 20.05.2015 19:35, John Ferlan wrote:
On 05/20/2015 10:22 AM, Ján Tomko wrote:
s/read/refresh/ in the commit message?
On Wed, May 20, 2015 at 08:52:57AM -0400, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1195882
Original commit id 'cbde3589' indicates that the cache file would be discarded if either the QEMU binary or libvirtd 'ctime' changes; however, the code only discarded it if the QEMU binary time didn't match or if the new libvirtd ctime was later than the one that created the cache file.
This could lead to issues with the order in which libvirtd images are created for maintenance or patch branches: if someone had a libvirtd created on 'date x' from (a) backported patch(es), followed by a subsequent install of the primary release with 'date y', and 'date x' was greater than 'date y', then features in the primary release branch may not be available.
I can see how there can be two daemons with different ctimes on the same system during development, so ACK to the change.
But they'd use different cache paths, right?
However, when you install the daemon (even from an older package), ctime should only move forward, so I'm sceptical about its relevance to the referenced bug.
Perhaps that's my misinterpretation or misunderstanding of ctime - certainly in a different OS I worked on many years ago, ctime was when the image was created by a build, not the inode or file change time as I (now) read...
So hmmm... that blows my theory to shreds - unless you account for that window of time during an install where files are being replaced while libvirtd is still running an older release. As Dan notes in my 1/2, removing the cache in between would be racy. So would it also be racy if something caused a cache read: the code finds an updated libvirtd, asks qemu for current capabilities, gets an answer, and creates the file based on the current (or old) understanding. Then when libvirtd restarts it finds a matching ctime for itself, doesn't update the cache, and of course doesn't have the latest bits.
How about computing a hash of the qemu binary, and if the hashes do not match, regenerating the cache?
That'd be significantly slower than today's code, and I don't think it is needed. We just need to change the check from '< date' to '!= date'. The current '< date' check was written based on the (mistaken) assumption that the ctime would be set based on the time at which the RPM package was installed, i.e. even if you downgraded libvirt, I thought you'd get a newer ctime due to install time. Since RPM preserves timestamps from the build environment this assumption was invalid. Using '!= date' should deal with it adequately, as the chance of two separate RPM builds having been done at precisely the same time is minimal.
RPM preserves mtimes, not ctimes. We use ctimes since:

commit f5059a929e0db6689e908e354b13a60661c143e1
    Change QEMU capabilities cache to check ctime instead of mtime
CommitDate: 2014-03-11 10:52:29 +0000
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=f5059a92
Oh, so it does. I thought it preserved both, but I was comparing the wrong fields. In that case, I'm not clear on just what problem John is seeing. Even if he installs an older maint release RPM, or switches to an old maint branch and rebuilds, the ctime should always get changed. So the cache should invalidate.

Regards,
Daniel

On 05/22/2015 08:26 AM, Daniel P. Berrange wrote:
RPM preserves mtimes, not ctimes. We use ctimes since:

commit f5059a929e0db6689e908e354b13a60661c143e1
    Change QEMU capabilities cache to check ctime instead of mtime
CommitDate: 2014-03-11 10:52:29 +0000
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=f5059a92
Oh, so it does. I thought it preserved both, but I was comparing the wrong fields. In that case, I'm not clear on just what problem John is seeing. Even if he installs an older maint release RPM, or switches to an old maint branch and rebuilds, the ctime should always get changed. So the cache should invalidate.
The "history" of this issue also includes some direct (internal Red Hat) email regarding using beaker systems where testing was having problems. I suppose while trying to consider both and my assumptions about ctime being creation time not change time, my theory has been this installation order processing that could cause the problem, but it seems that notion is no longer valid. Still given the various ways ctime can be changed - is it perhaps reasonable or desirable to add checks for versions as well? That is save both qemu & libvirtd versions and if they don't match, cause a refresh? John FWIW: Here's a cut-n-paste of a part of the email that I have. It doesn't give all the context, but it does show the process used to "work around" what was seen. ... Sojust worked this much out. This flow stops the error from happening: - Installing libvirt and the various other dependencies - setting memerships etc. - rm -rf /var/cache/libvirt/qemu/capabilities - systemctl restart libvirtd - <run code to create and start vm> ie: after installing libvirt I had to clear the cache and restart the service. Clearing the cache before, uninstalling libvirt and reinstalling libvirt didn't help. The cache had to be cleared right after installing libvirt but before launching the vm. The problem only showed up if we installed, launched a vm, removed the vm, uninstalled, reinstalled, launched a new vm. ... There was also a link to the following: https://www.mail-archive.com/search?l=debian-bugs-dist@lists.debian.org&q=subject:%22Bug%23731815%3A+virsh%3A+Unable+to+start+any+VM+with+custom+cpu+model%22&o=newest&f=1

On Fri, May 22, 2015 at 09:00:27AM -0400, John Ferlan wrote:
On 05/22/2015 08:26 AM, Daniel P. Berrange wrote:
RPM preserves mtimes, not ctimes. We use ctimes since:

commit f5059a929e0db6689e908e354b13a60661c143e1
    Change QEMU capabilities cache to check ctime instead of mtime
CommitDate: 2014-03-11 10:52:29 +0000
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=f5059a92
Oh, so it does. I thought it preserved both, but I was comparing the wrong fields. In that case, I'm not clear on just what problem John is seeing. Even if he installs an older maint release RPM, or switches to an old maint branch and rebuilds, the ctime should always get changed. So the cache should invalidate.
The "history" of this issue also includes some direct (internal Red Hat) email regarding using beaker systems where testing was having problems. I suppose while trying to consider both and my assumptions about ctime being creation time not change time, my theory has been this installation order processing that could cause the problem, but it seems that notion is no longer valid. Still given the various ways ctime can be changed - is it perhaps reasonable or desirable to add checks for versions as well? That is save both qemu & libvirtd versions and if they don't match, cause a refresh?
John
FWIW: Here's a cut-n-paste of a part of the email that I have. It doesn't give all the context, but it does show the process used to "work around" what was seen.
...
So just worked this much out. This flow stops the error from happening:
- Installing libvirt and the various other dependencies
- setting memberships etc.
- rm -rf /var/cache/libvirt/qemu/capabilities
- systemctl restart libvirtd
- <run code to create and start vm>
ie: after installing libvirt I had to clear the cache and restart the service. Clearing the cache before, uninstalling libvirt and reinstalling libvirt didn't help. The cache had to be cleared right after installing libvirt but before launching the vm.
The problem only showed up if we installed, launched a vm, removed the vm, uninstalled, reinstalled, launched a new vm.
I'd be interested to see the precise steps to reproduce the problem, as I'm still struggling to see why that could cause an issue, given that we've confirmed RPM changes the libvirtd timestamp at time of install. So if libvirtd is restarted after installation, as is done in %post, then it should pick up the new ctime.

Regards,
Daniel

On Fri, May 22, 2015 at 11:56:24AM +0200, Michal Privoznik wrote:
On 20.05.2015 19:35, John Ferlan wrote:
On 05/20/2015 10:22 AM, Ján Tomko wrote:
s/read/refresh/ in the commit message?
On Wed, May 20, 2015 at 08:52:57AM -0400, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1195882
Original commit id 'cbde3589' indicates that the cache file would be discarded if either the QEMU binary or libvirtd 'ctime' changes; however, the code only discarded it if the QEMU binary time didn't match or if the new libvirtd ctime was later than the one that created the cache file.
This could lead to issues with the order in which libvirtd images are created for maintenance or patch branches: if someone had a libvirtd created on 'date x' from (a) backported patch(es), followed by a subsequent install of the primary release with 'date y', and 'date x' was greater than 'date y', then features in the primary release branch may not be available.
I can see how there can be two daemons with different ctimes on the same system during development, so ACK to the change.
But they'd use different cache paths, right?
However, when you install the daemon (even from an older package), ctime should only move forward, so I'm sceptical about its relevance to the referenced bug.
Perhaps that's my misinterpretation or misunderstanding of ctime - certainly in a different OS I worked on many years ago, ctime was when the image was created by a build, not the inode or file change time as I (now) read...
So hmmm... that blows my theory to shreds - unless you account for that window of time during an install where files are being replaced while libvirtd is still running an older release. As Dan notes in my 1/2, removing the cache in between would be racy. So would it also be racy if something caused a cache read: the code finds an updated libvirtd, asks qemu for current capabilities, gets an answer, and creates the file based on the current (or old) understanding. Then when libvirtd restarts it finds a matching ctime for itself, doesn't update the cache, and of course doesn't have the latest bits.
How about computing a hash of the qemu binary, and if the hashes do not match, regenerating the cache?
That would catch the unlikely case of someone moving the system time back and changing the QEMU binary at the exact time when the old one was changed.

Jan
participants (4):
- Daniel P. Berrange
- John Ferlan
- Ján Tomko
- Michal Privoznik