On 7/18/19 8:30 AM, Michal Privoznik wrote:
On 7/18/19 11:28 AM, Daniel P. Berrangé wrote:
> On Thu, Jul 18, 2019 at 11:14:49AM +0200, Michal Privoznik wrote:
>> If there are two paths on the list that are the same we need to
>> lock it only once. Because when we try to lock it the second time
>> then open() fails. And if it didn't, locking it the second time
>> would fail for sure. After all, it is sufficient to lock all
>> paths just once satisfy the caller.
>>
>> Reported-by: Daniel Henrique Barboza <danielhb413(a)gmail.com>
>> Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
>> ---
>> src/security/security_manager.c | 23 +++++++++++++++++++++--
>> 1 file changed, 21 insertions(+), 2 deletions(-)
>
> Reviewed-by: Daniel P. Berrangé <berrange(a)redhat.com>
>
>
>>
>> diff --git a/src/security/security_manager.c
>> b/src/security/security_manager.c
>> index ade2c96141..7c905f0785 100644
>> --- a/src/security/security_manager.c
>> +++ b/src/security/security_manager.c
>> @@ -1294,16 +1294,35 @@
>> virSecurityManagerMetadataLock(virSecurityManagerPtr mgr
>> ATTRIBUTE_UNUSED,
>> * paths A B and there's another that is trying to lock them
>> * in reversed order a deadlock might occur. But if we sort
>> * the paths alphabetically then both processes will try lock
>> - * paths in the same order and thus no deadlock can occur. */
>> + * paths in the same order and thus no deadlock can occur.
>> + * Lastly, it makes searching for duplicate paths below
>> + * simpler. */
>> qsort(paths, npaths, sizeof(*paths), cmpstringp);
>> for (i = 0; i < npaths; i++) {
>> const char *p = paths[i];
>> struct stat sb;
>> + size_t j;
>> int retries = 10 * 1000;
>> int fd;
>> - if (!p || stat(p, &sb) < 0)
>> + if (!p)
>> + continue;
>> +
>> + /* If there's a duplicate path on the list, skip it over.
>> + * Not only we would fail open()-ing it the second time,
>> + * we would deadlock with ourselves trying to lock it the
>> + * second time. After all, we've locked it when iterating
>> + * over it the first time. */
>
> Presumably this deals with the problem at cold boot. Is it possible
> to hit the same problem when we have cold plugged one device and
> then later try to hotplug another device using the same resource,
> or do the SRIOV assignment grouping requirements make the hotplug
> impossible ?
Okay, so digging deeper I think I know what the problem is. In
general, it is of course possible to open a file multiple times, but
that is not true for "/dev/vfio/N" which can be opened exactly one
time. Indeed, looking into vfio_group_fops_open() in
kernel.git/drivers/vfio/vfio.c if a group is already opened, then
-EBUSY is returned (what we've seen in the error message). In fact,
git log blames v3.11-rc1~46^2~1 which explicitly disallowed multiple
open()-s.
So, in theory, our code works, because we open() the files we are
chown()-ing and close them immediatelly after. So the problem in which
we try to lock /dev/vfio/N multiple times won't happen. However, since
qemu already has the device opened, we will fail opening it.
I can see two options here:
1) ignore this specific error in virSecurityManagerMetadataLock().
This will of course mean that the caller
(virSecurityDACTransactionRun()) will chown() the /dev/vfio/N file
without a lock, but then again, we have namespaces and there can't be
any other domain using the path. And also, we don't really chown() if
the file already has the correct owner - which is going to be 99.99%
cases unless there would be different seclabels for two PCI devices
(do we even allow <seclabel/> for <hostdev/>?).
This option seems better to me because it already establishes an
exception rule for
files that can be opened only once. For now we found out about
/dev/vfio/N files, but
there might be more in the future, and then it is just a matter of
adding more files
to this exception rule.
Thanks,
DHB
2) Make virSecurityDACSetHostdevLabel() and
virSecurityDACRestoreHostdevLabel() be NO-OP if a domain already/still
has a device from the same IOMMU group like the one we're
attaching/detaching. If these are NO-OPs then no relabel is done and
thus the whole code I'm modifying in 1) is not even called.
I have a preference for 1) because the code will be cleaner IMO.
Michal