On 2011-08-17 22:12, Eric Blake wrote:
On 08/17/2011 07:10 AM, Osier Yang wrote:
> If one tries to restore a domain from a corrupt save image, we blindly
> go ahead and restore from it. This can cause many different errors,
> depending on how much of the image was saved. E.g.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=730750
>
> So I'm wondering whether we can introduce a new field to struct
> qemud_save_header, such as "bool complete;", and set it to true if the
> image was saved successfully, false if not. We could then do some
> checking while trying to open the image (qemuDomainSaveImageOpen),
> and quit early if "complete" is false, with a sensible error message.
Almost. I think we can reuse one of the existing spare fields in the
header (that is, change unused[15] to instead be unused[14] and make
the new field a uint32_t), and I also think we need to have a
tri-state value:

0 - save image was created with older libvirt, no idea if image is sane
1 - save image created by current libvirt, but not yet marked
    complete; attempts to restore from this image should fail with a
    sensible message suggesting nuking the save image since it is
    broken - value written at start of save process
2 - save image created by current libvirt and completed - value
    written at end of save process

And of course, we have to update the bswap_header routine to treat it
with the same endianness as the rest of the header.
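
To make the tri-state proposal concrete, here is a rough sketch of what the
header change, the byte-swap update, and the restore-time check could look
like. The struct is a simplified stand-in for qemud_save_header, not its
real layout, and the enum names, bswap32() and check_save_state() are
hypothetical. Rejecting values other than 0/1/2 is also just an assumption
of this sketch, not something decided in this thread.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names for the three states discussed above. */
enum save_state {
    SAVE_STATE_UNKNOWN  = 0,  /* older libvirt: field was zeroed spare space */
    SAVE_STATE_PARTIAL  = 1,  /* written at the start of the save process */
    SAVE_STATE_COMPLETE = 2,  /* written at the end of the save process */
};

/* Simplified stand-in for struct qemud_save_header (layout is assumed). */
struct save_header {
    char magic[16];
    uint32_t version;
    uint32_t xml_len;
    uint32_t was_running;
    uint32_t state;       /* new field, carved out of the spare array */
    uint32_t unused[14];  /* was unused[15], so the header size is unchanged */
};

/* Portable 32-bit byte swap. */
static uint32_t bswap32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* Swap every integer field, including the new one. */
static void bswap_header(struct save_header *hdr)
{
    hdr->version     = bswap32(hdr->version);
    hdr->xml_len     = bswap32(hdr->xml_len);
    hdr->was_running = bswap32(hdr->was_running);
    hdr->state       = bswap32(hdr->state);
}

/* Returns 0 if restore may proceed, -1 if the image should be rejected. */
static int check_save_state(const struct save_header *hdr)
{
    switch (hdr->state) {
    case SAVE_STATE_UNKNOWN:   /* cannot tell, behave as before */
    case SAVE_STATE_COMPLETE:
        return 0;
    case SAVE_STATE_PARTIAL:
        fprintf(stderr, "save image is incomplete; remove it and save again\n");
        return -1;
    default:                   /* assumption: treat unknown values as corrupt */
        fprintf(stderr, "unrecognized save state %u\n", (unsigned)hdr->state);
        return -1;
    }
}
```

Since an image from older libvirt has zeros in the spare array, it lands in
the "unknown" case without any version bump.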
>
> Thoughts?
Sounds reasonable. I don't even think we have to bump the
qemud_save_header version (that is, older libvirt will ignore the
field, and the size is not changing, and newer libvirt will correctly
handle the situation where the field was not set).
Agreed on using "unused[14]"; with this we don't need to worry
about the C/S communications (between older and newer, or newer
and older). Thanks for this good idea. :)
The toughest part will be figuring out how to modify the field when
using an O_DIRECT fd. I guess here, we write the entire file O_DIRECT,
then when qemu completes the save, we close the fd, and open a new one
that is not O_DIRECT, since we will be polluting the cache with only
one sector of the file, rather than having to worry about O_DIRECT
pickiness (the main point of O_DIRECT was for the benefit to the much
larger portion of the file written by qemu).
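
A minimal sketch of that last step, assuming the O_DIRECT fd has already
been closed after qemu finished writing. The function name
mark_save_complete() and its parameters are hypothetical; the caller is
assumed to pass the byte offset of the new header field and a value that
is already in the header's byte order.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Reopen the save file WITHOUT O_DIRECT and overwrite the 32-bit state
 * field in place. Only the sector holding the header goes through the
 * page cache, so no alignment rules apply to this small write.
 * Returns 0 on success, -1 on failure.
 */
int mark_save_complete(const char *path, off_t state_offset, uint32_t state)
{
    int fd = open(path, O_WRONLY);  /* deliberately no O_DIRECT here */
    if (fd < 0)
        return -1;

    /* pwrite() leaves the rest of the file untouched. */
    ssize_t n = pwrite(fd, &state, sizeof(state), state_offset);
    if (n != (ssize_t)sizeof(state)) {
        close(fd);
        return -1;
    }

    /* Make sure the "complete" marker reaches the disk before reporting success. */
    if (fsync(fd) < 0) {
        close(fd);
        return -1;
    }

    return close(fd);
}
```

The same helper could write the "not yet complete" value at the start of
the save, provided the header is written before the file is handed to qemu.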
I don't quite understand this part. What I understand is that opening
the file with O_DIRECT causes the dirty cache to be flushed to disk
before every following read/write. What pollutes the cache?
Agreed on reopening the save file without O_DIRECT to modify the
field, though.
Osier