On 03/25/2011 03:00 AM, Hu Tao wrote:
On Thu, Mar 24, 2011 at 03:45:26PM -0600, Eric Blake wrote:
> On 03/24/2011 02:46 AM, Hu Tao wrote:
>> If two or more disks are mapped to the same image file, operating
>> on these disks at the same time may corrupt data stored in the
>> image file.
>>
>> changes:
>>
>> v2:
>>
>> - allow it for read-only disks
>> - compare source files by inode number
>>
>> +
>> + if (stat(disk->src,&stat1)) {
>> + if (errno != ENOENT) {
>> + /* Can't stat file, for safety treate it as conflicted */
> s/treate/treat/
>
> Won't this will fail on root-squash NFS from qemu:///system? (Or does
> root-squash meant that root can still stat() but just not open() a file?)
I think it won't. man stat says:
No permissions are required on the file itself (to stat it)
But needs 'r' bit to open().
root-squash means that root can only do what user "nobody" could do with
the filesystem. Although it doesn't need read access on the file itself,
stat() *will* fail if the current user isn't able to read the directory
containing the file being stat'ed. So, if user nobody doesn't have
access to the parent directory (ie if permissions are xx0, which is very
common), then root will not be able to stat the file - it will just
return -1.
To make this work properly for root-squash, you'll need to either 1)
fork a child that does setuid to the qemu user:group and does the stat()
(a real pain, since you'll need to pass the results back to the parent,
and can't reasonably log errors while in the child == not recommended),
or 2) use virFileOpenAs() to open the file as the qemu user
(virFileOpenAs() uses SCM_RIGHTS and a child process to do this), then
do fstat() of the file to get the info you need, and close it.
Keeping libvirt working properly with save images, disk images, and
pools on root-squash NFS is a never-ending battle (and can lead to an
intense hatred of NFS!) Basically any time new code is added that needs
to access any of these things, operation on root-squash is broken. Now
that we have virFileOpenAs(), it should be much more straightforward to
code so that root-squash scenarios keep working.
If anyone is adding some code that does anything with files on disk,
they don't have a root-squash NFS setup, and they want some testing to
make sure they haven't broken it, please point out the patch to me, and
I'll make the time to apply it locally and try it on my NFS setup. (I
know I should automatically just see it, but the volume on libvir-list
has increased *a lot* lately, and I frequently find myself several days
behind).
> Overall, I'm worried that this patch is repeating some of
danpb's bigger
> efforts to integrate a sanlock disk contention avoidance [1]. If a
> resource manager is properly hooked to all disks, then we can prevent
> contention between domains (and not limit ourself to just single-domain
> contention, as in this patch). On the other hand, this seems like an
> easy enough check to do for a single domain
My opinion is that it's probably much more likely that someone would
mistakenly use the same disk in two different domains (due to cut-paste
of XML) rather than using it twice in the same domain (that's much
easier to notice since the definitions are close to each other in the
same file).
So beyond the fact that this patch can't eliminate all erroneous
duplicate usage, it's not looking for the category that is most likely,
and thus it will probably have more an effect of providing a false sense
of security, rather than of catching any duplicate usage. That makes me
think this may actually do more harm than good.
> whether or not we get the
> sanlock code completed (that is, timing wise this looks like it could be
> ready prior to 0.9.0 while Dan's work is bigger in scope and probably
> missed the feature freeze for this month's release). So I'm not sure
> whether to ack this.
Assuming this patch (or something similar) was accepted, would anybody
(aside from those of us using upstream builds directly) ever get a
release that contained this patch, but didn't contain Dan's? If the
answer is no, then I think that's another reason to NACK this and wait
for the Dan's patch.