Re: [libvirt] [PATCH v1 09/23] lock_protocol: Add two new remote procedures

19 Oct 2015

      On 16.10.2015 21:58, John Ferlan wrote:
...
On 10/16/2015 10:54 AM, Michal Privoznik wrote:
...
On 16.10.2015 09:17, Peter Krempa wrote:
...
On Mon, Oct 12, 2015 at 12:25:54 +0200, Michal Privoznik wrote:
...
These procedures will be used to store and bring back security
labels. So far, the idea is that tuple (path, model, label) is
enough. Well, certainly for DAC and SELinux. The functions are:
I'm afraid that a 'path' per-se isn't enough to express storage
locations in a unique way. Said that a string is fine though, but I'd
rather call it 'identifier' or something similar and explicitly
document some formats for possibly other storage systems that might use
security labeling but don't have a local representation in the host.
I'd imagine something like:
STORAGE_SYSTEM_UUID:VOLUME_UUID
I can imagine this, okay.
...
and perhaps
HOST_UUID:/path for local files
Hm... what good it will have to store UUID among with the path? I mean,
we can't know if two paths from two different hosts is the same file or
not anyway.
...
One other thing to consider is that if this approach will be used across
multiple hosts the paths although equal string-wise might not result
being the same file. Not honoring that would result into security
implications.
What security implications do you have in mind?
On the other hand, I just realized that this approach will not fly. I
mean, virtlockd is running per host. So consider a file on NFS share,
and to make things more complicated assume it's accessible under
different paths from two different hosts. There is still a race between
those two virtlockd-s - they will not mutually exclude each other and
therefore can store already chown()-ed owership info. For instance:
host1: stat()
host1: store ownership (e.g. 100:20)
host1: chown (e.g. to 20:20)
host2: stat()
host2: store ownership (20:20)
host2: chown (e.g. to 20:40)
...
...
host1: recall ownership (100:20)
host1: restore ownership (100:20)
(this will cut off host2)
...
...
host2: recall ownership (20:20)
host2: restore ownership (20:20)
Both hosts think that they are the last ones to use the file, therefore
they restore ownerships. However, after all the original ownership is
not preserved.
Ah - the age old problem of networked and shared storage...  And it's
compounded by two hosts that don't know they are modifying the same
resource. If realistically the server is the 'only' one to know, then
who's problem is this to solve?
Say nothing of choosing NFS...  For this one you have to also consider
root_squash vs. non root squash environment. The attempt to change
anything is managed by the server, but for the sake of your argument
it's set up to allow "no_root_squash" - meaning clients can have more
control...
In your scenario, H2 never chown()'s... So was it trying set to (20:20)
- in which case, why would it do that since the stat it would get would
show (20:20)... Or does it try to set say (40:20).  If the latter, then
H1 is going to have access problems as soon as H2 is successful. If H1
then goes to restore and finds the current stat of (40:20), but it
believes it set (20:20), then should it restore?  Or does it "try to"
restore anyway and then let virtlockd hold onto the restore until H2 is
done? Eventually H2 will restore to (20:20) at which time whatever is
waiting to restore H1's value finds the (20:20) and can finally
successfully restore back to (100:20)...  Conversely if H2 did save and
chown (20:20), but when it goes to restore (20:20) and finds (100:20),
you wouldn't want some thread hanging around...  However, in this case
if the "store" == "restore" and the current is something else, then
perhaps the restore gets dropped.
Oh, yeah. H2 should chown() too. But not necessarily cutting off H1 (it
still can have access via user while H2 via group part of the permission).
And we should not consider just DAC labels here. The framework should be
capable of storing SELinux labels too. In that world, different labels
(or context ranges) do not necessarily mean EPERM like in DAC. Therefore
I think we should not check if  current label is the one we set earlier.

Moreover, even if we do check on restore if the seclabel has changed
since the time we set it and if it has do not restore - in my scenario
the file would has the wrong ownership at the end anyway, wouldn't it? I
mean, at restore, h1 finds the seclabel has changed (it's 20:40 and it
has set it to 20:20), so it leaves the file alone. Then, h2 comes around
and finds the seclabel has NOT changed, so it restores it to what it
thinks was the original label (20:20).
...
Maybe a few drinks will help the thought process more ;-)
John
...
One way out of this could be that we already require virtlockd in order
to work properly that lockspace is on a shared mountpoint accessible by
both hosts. So instead of keeping seclabel space in memory, we can load
& store it within a file on that mountpoint and guard accesses via
flock(). But this is rather unpleasant as we would have to parse and
format the whole seclabel space just to update a refcounter to some entry.
The other approach would be to have a seclabel space as a directory on
the shared mountpoint and create some file (with internal structure) per
path remembered. The internal structure of the file would then contain
tuple (model, seclabel, refcount) for each model. When updating the
refcount, only the file would need to be flock()-ed getting us much
higher throughput. But we would still need to parse and format a file on
each access (even though the file would be much smaller).
I've given this some thoughts and even after few beers (reaching Ballmer
peak) it still seems like good idea.

Michal