On Mon, Mar 26, 2018 at 17:22:01 +0200, Michal Privoznik wrote:
On 03/26/2018 05:17 PM, Peter Krempa wrote:
> On Mon, Mar 26, 2018 at 16:43:02 +0200, Michal Privoznik wrote:
>>
https://bugzilla.redhat.com/show_bug.cgi?id=1557769
>>
>> Problem with device mapper targets is that there can be several
>> other devices 'hidden' behind them. For instance, /dev/dm-1 can
>> consist of /dev/sda, /dev/sdb and /dev/sdc. Therefore, when
>> setting up devices CGroup and namespaces we have to take this
>> into account.
>>
>> Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
>> ---
[...]
>> + for (i = 0; i < nmaj; i++) {
>> + if (virAsprintf(&devPath, "/dev/block/%u:%u", maj[i],
min[i]) < 0)
>> + goto cleanup;
>> +
>> + if (qemuDomainCreateDevice(devPath, data, false) < 0)
>> + goto cleanup;
>
> So now that I see this new version, this part starts looking suspicious
> to me. Since this did not care much that the path changed, is it really
> necessary to create the /dev/ entries in the container?
>
> Looks like even device mapper is returning them as the node
> specificator, so I'd presume it really does not matter if they are
> present.
>
> More specifically we can't really reverse engineer from the major:minor
> numbers which actual path the user used so it should not really be
> necessary for it to be present in the container.
Yes, looks like I was too eager trying to fix this bug. I've rebuilt
libvirt without qemu_domain.c change (so only CGroup code was modified)
and the bug still did not reproduce. So I guess namespace changes are
not necessary after all. I'll drop them.
Okay. Apart from that I thought about this for a while and also tested
various configurations. Unfortunately I was not able to reproduce the
issue. I realized that this code is allowing the backing devices for all
device-mapper based storage. While I don't think that it is a very big
problem we still might have security implications (e.g. give access to
the whole device, while the device mapper makes the device accessible
only partially). This means that I'd like to have an explanation why it
is necessary in that case to allow the backend devices so that we can
asses whether it's worth doing it always or we can limit the amount of
devices allowed in cgroups.
Also yet another implication is that in the hotunplug code for disks
we'd remove the cgroup for the main device, but this implementation does
not remove the slave devices.