Cc: libvir-list for review of the compatibility argument below.
Daniel Henrique Barboza <danielhb413(a)gmail.com> writes:
Hey David,
On 9/21/18 9:29 AM, Dr. David Alan Gilbert wrote:
> * Daniel Henrique Barboza (danielhb413(a)gmail.com) wrote:
>> changes in v2:
>> - removed the "RFC" marker;
>> - added a new patch (patch 2) that removes
>> bdrv_snapshot_delete_by_id_or_name from the code;
>> - made changes in patch 1 as suggested by Murilo;
>> - previous patch set link:
>>
https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg04658.html
>>
>>
>> It is not uncommon to see bugs being opened by testers that attempt to
>> create VM snapshots using HMP. It turns out that "0" and "1"
are quite
>> common snapshot names and they trigger a lot of bugs. I gave an example
>> in the commit message of patch 1, but to sum up here: QEMU treats the
>> input of savevm/loadvm/delvm sometimes as 'ID', sometimes as
'name'. It
>> is documented as such, but this can lead to strange situations.
>>
>> Given that it is strange for an API to consider a parameter to be 2 fields
>> at the same time, and inadvently treating them as one or the other, and
>> that removing the ID field is too drastic, my idea here is to keep the
>> ID field for internal control, but do not let the user set it.
>>
>> I guess there's room for discussion about considering this change an API
>> change or not. It doesn't affect users of HMP and it doesn't affect
Libvirt,
>> but simplifying the meaning of the parameters of savevm/loadvm/delvm.
> Can you clarify a couple of things:
> a) What is it about libvirt's use that means it's OK ? Does it never
> use numeric tags?
I am glad you asked because my understanding in how Libvirt was dealing
with snapshots was wrong, and I just looked into it further to answer your
question. Luckily, this series fixes the situation for Libvirt as well.
I was misled by the fact that Libvirt does not show the same symptoms
we see in
QEMU of this problem, but the bug is there. Here's a quick test with
Libvirt with
"0" and "1" as snapshot names, considering a VM named with no
snapshots,
using QEMU 2.12 and Libvirt 4.0.0:
- create the "0" snapshot:
$ sudo virsh snapshot-create-as --name 0 dhb
Domain snapshot 0 created
$ sudo virsh snapshot-list dhb
Name Creation Time State
------------------------------------------------------------
0 2018-09-24 15:47:56 -0400 running
$ sudo virsh qemu-monitor-command dhb --hmp info snapshots
List of snapshots present on all disks:
ID TAG VM SIZE DATE VM CLOCK
-- 0 405M 2018-09-24 15:47:56 00:04:20.867
- created the "1" snapshot. Here, Libvirt shows both snapshots with
snapshot-list,
but the snapshot was erased inside QEMU:
$ sudo virsh snapshot-create-as --name 1 dhb
Domain snapshot 1 created
$
$ sudo virsh snapshot-list dhb
Name Creation Time State
------------------------------------------------------------
0 2018-09-24 15:47:56 -0400 running
1 2018-09-24 15:50:09 -0400 running
$ sudo virsh qemu-monitor-command dhb --hmp info snapshots
List of snapshots present on all disks:
ID TAG VM SIZE DATE VM CLOCK
-- 1 404M 2018-09-24 15:50:10 00:05:36.226
This is where I stopped checking out Libvirt at first, concluding
wrongly that it
was immune to the bug.
Libvirt does not throw an error when trying to apply snapshot 0. In
fact, it acts
like everything went fine:
$ sudo virsh snapshot-revert dhb 0
$ echo $?
0
Is that a libvirt bug?
Reverting back to snapshot "1" works as intended, restoring
the VM
state when it
was created.
(perhaps this is something we should let Libvirt be aware of ...)
This series fixes this behavior because the snapshot 0 isn't erased
from QEMU, allowing
Libvirt to successfully restore it.
> b) After this series are you always guaranteed to be able to fix
> any existing oddly named snapshots?
The oddly named snapshots that already exists can be affected by the
semantic
change in loadvm and delvm, in a way that the user can't load/del
using the snap
ID, just the tag. Aside from that, I don't see any side effects with
existing
snapshots and this patch series.
Do all snapshots have a tag that is unique within their image? Even
snapshots created by old versions of QEMU?