Re: [libvirt] [PATCH 17/16] snapshot: Store qemu snapshot state in bulk

25 Mar 2019

      On 3/25/19 12:19 PM, Daniel P. Berrangé wrote:
...
On Fri, Mar 22, 2019 at 01:07:04PM -0500, Eric Blake wrote:
...
[keeping context, because of adding Dan in cc]
...
To some degree we could be 'stuck with' the model for snapshots, but
that doesn't mean checkpoints couldn't make use of some newer
functionality that stores everything in one file and not allow the
storage into multiple files.
I think it would be a really bad idea to use a different approach for
snapshots vs checkpoints given how similar they are in every other
respect.
Agreed, which is why deciding what to do here will impact what code I
end up using for checkpoints.

I will either end up reverting the patches adding low-level bulk code
for snapshots in commits 86c0ed6f and 1b57269c (it's been refactored a
bit in the meantime, but since we dropped the API additions, there is no
current client to the low-level code), or I will have similar low-level
bulk code for checkpoints.
...
...
...
and even more.  I think when I first reviewed this I thought of the RPC
limit stuff - at least w/r/t how the stats code is the other place where
we've hit this limit. But for some reason I felt that perhaps some under
the covers code in that code we solved something that allowed for things
to work, but now I'm not so sure.
I don't know if virXML has some way of visiting a seekable stream, and
parsing in only chunks at a time. You still have to PARSE the entire
file to learn where the chunks are, but if the parser is stream based
and keeps track of offsets where interesting chunks start, it is perhaps
feasable that you could cap your memory usage by revisiting chunks as
needed. (But doing that to save memory ALSO means reparsing portions as
you reload them into memory, so you're slower than if you had kept the
full file in memory anyway).
As soon as we talk about making the stream seekable & parsing chunks
for efficiency, this is a red flag to me saying that we should not
go down this route. We already have a filesystem layout that lets us
read just the individual snapshots that we care about one at a time.
Creating this problem by putting all snapshots into a single file
and then trying to solve it by creating indexes for chunks inside the
file is basically creating a filesystem within a file.
...
Or maybe we need to think about ways to compress the information (most
of the <domain>s in the overall <snapshotlist> will share a lot of
commonality, even if the user hot-plugged devices between snapshots).
But that's a big project, which I don't want to tackle.
On disk storage space is practically free, so I don't see compression
being neccessary, especially if you compare the XML size to the actual
disk image snapshot data.
Overall, I'm struggling to be convinced that putting all snapshots into
a single file is a compelling enough think todo. I can see there are
some benefits, but there are also benefits to the current per-file
approach too. I worry that the bulk file will create efficiency problems
later that will require us to go back to per-file impl later, or reinvent
filesystems-within-a-file.
One other thing I just thought of - one of the potential advantages I
could think of for bulk code was that I only had to write the file
system once per API call (rather than multiple times), and in doing so,
I was able to track the current snapshot via <snapshots current='name'>
rather than via <domainsnapshot>...<active>1</active></domainsnapshot>,
so that I no longer could run into an inconsistent situation where
various file system fiascos could end up reporting more than one current
snapshot.  But if maintaining <active> correctly is painful during
per-snapshot files, it would be possible to instead store the name
(only) of the active snapshot in <domain> on disk, where individual
snapshots don't need a per-snapshot <active>.  But that still doesn't
require a low-level bulk parse/format.

Okay, I think that for 5.2, I'm best off reverting the bulk functions as
not having any client, and just making sure that checkpoints copy the
style of snapshots in tracking one xml file per entity, rather than
pursuing this bulk code any further for now.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org