On Tue, Aug 3, 2021 at 6:24 AM Boris Fiuczynski
<fiuczy(a)linux.ibm.com> wrote:
>
> On 8/2/21 5:30 PM, Jonathon Jongsma wrote:
>> On Fri, Jul 30, 2021 at 8:01 AM Boris Fiuczynski <fiuczy(a)linux.ibm.com>
wrote:
>>>
>>> On 7/30/21 9:48 AM, Michal Prívozník wrote:
>>>> On 7/29/21 9:27 PM, Jonathon Jongsma wrote:
>>>>> On Thu, Jul 29, 2021 at 1:35 PM Boris Fiuczynski
<fiuczy(a)linux.ibm.com> wrote:
>>>>>>
>>>>>> On 7/27/21 4:09 PM, Jonathon Jongsma wrote:
>>>>>>> On Tue, Jul 27, 2021 at 3:02 AM Michal Prívozník
<mprivozn(a)redhat.com> wrote:
>>>>>>>>
>>>>>>>> On 7/27/21 12:08 AM, Jonathon Jongsma wrote:
>>>>>>>>> On Mon, Jul 26, 2021 at 9:47 AM Michal Prívozník
<mprivozn(a)redhat.com> wrote:
>>>>>>>>>>
>>>>>>>>>> On 7/23/21 6:40 PM, Jonathon Jongsma wrote:
>>>>>>>>>>> Unfortunately, mdevctl supports defining more
than one mdev with the
>>>>>>>>>>> same UUID as long as they have different
parent devices. (Only one of
>>>>>>>>>>> these devices can be active at any given
time).
>>>>>>>>>>>
>>>>>>>>>>> This means that we can't use the UUID
alone as a way to uniquely
>>>>>>>>>>> identify mdev node devices. Append the parent
address to ensure
>>>>>>>>>>> uniqueness. For example:
>>>>>>>>>>>
>>>>>>>>>>> Before:
mdev_88a6b868_46bd_4015_8e5b_26107f82da38
>>>>>>>>>>> After:
mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0
>>>>>>>>>>>
>>>>>>>>>>> Related:
https://bugzilla.redhat.com/show_bug.cgi?id=1979440
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Jonathon Jongsma
<jjongsma(a)redhat.com>
>>>>>>>>>>> ---
>>>>>>>>>>> src/node_device/node_device_driver.c
| 3 ++-
>>>>>>>>>>> src/node_device/node_device_udev.c
| 2 +-
>>>>>>>>>>>
tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++----
>>>>>>>>>>> 3 files changed, 7 insertions(+), 6
deletions(-)
>>>>>>>>>>
>>>>>>>>>> The patch looks good, but for some reason it
leaves API breakage
>>>>>>>>>> aftertaste. But I guess we don't document
anywhere what MDEV <name/>
>>>>>>>>>> looks like, do we? IOW - we are free to change it
if needed.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> That is true -- it does have a bit of a bad
aftertaste. As far as I
>>>>>>>>> know, we haven't documented or promised any
specific naming format for
>>>>>>>>> mdevs. But it could certainly cause some pain for
people that are
>>>>>>>>> using mdevs already, which might be reason enough to
reject or adjust
>>>>>>>>> this patch. If a person already has assigned a mdev
device to their vm
>>>>>>>>> with the old name, the name would change when
upgrading libvirt. That
>>>>>>>>> could make the domain definition become invalid.
>>>>>>>>>
>>>>>>>>> But, we have to deal with this situation somehow. We
don't have a
>>>>>>>>> choice but to handle the case where mdevctl might
return multiple
>>>>>>>>> defined devices with the same UUID. I considered a
couple of other
>>>>>>>>> approaches to handling this that I rejected, such as
>>>>>>>>> - only add a suffix to the second such device with
the same UUID
>>>>>>>>> - rejected because the name of a given device
depends on the order
>>>>>>>>> in which it was observed and the presence of other
devices. For
>>>>>>>>> example, if we have two devices: mdev_$UUID and
mdev_$UUID_$PARENT,
>>>>>>>>> and the first device was undefined, then when libvirt
was restarted,
>>>>>>>>> mdev_$UUID_$PARENT would change its name to
mdev_$UUID because it is
>>>>>>>>> now the only device with that UUID
>>>>>>>>
>>>>>>>> Agreed, this would not make situation any better, in fact
worse.
>>>>>>>>
>>>>>>>>> - name all *active* mdevs mdev_$UUID and use a
different scheme for
>>>>>>>>> inactive, defined mdevs
>>>>>>>>> - rejected because the device would change
names when it changed
>>>>>>>>> from inactive to active.
>>>>>>>>
>>>>>>>> Yeah, this is equally horrible.
>>>>>>>>
>>>>>>>>> - ignore this situation because probably almost
nobody uses multiple
>>>>>>>>> devices with the same UUID
>>>>>>>>> - but do they?
>>>>>>>>
>>>>>>>> Frankly, I don't have enough experience with MDEVs,
but since it's
>>>>>>>> possible to define an MDEV with an existing UUID, it is
possible to
>>>>>>>> define it also for the same parent? I mean, if I'd
have an MDEV capable
>>>>>>>> graphics card and I'd define two MDEVs with the same
UUID they would
>>>>>>>> both have the same parent, wouldn't they? Because in
that case,
>>>>>>>> appending PCI address to libvirt's name is not
enough.
>>>>>>>
>>>>>>>
>>>>>>> Nope, it's the opposite in fact. mdevctl allows you to
define two or
>>>>>>> more mdevs with the same UUID, but only if they have
different
>>>>>>> parents. Only a single mdev with a given UUID can be
activated at the
>>>>>>> same time. So among *active* devices, the UUID is indeed
unique. But
>>>>>>> not among persistent device definitions. For defined devices,
the UUID
>>>>>>> is guaranteed unique per parent.
>>>>>>>
>>>>>>> I discussed this with Alex a while ago and he thought that
the reason
>>>>>>> that mdevctl supported this was to allow people to assign an
mdev with
>>>>>>> a specific UUID to a vm, but that UUID could represent one of
several
>>>>>>> (equivalent?) devices, depending on which parent device was
present in
>>>>>>> the host or which device was instantiated. That feature
doesn't seem
>>>>>>> very compelling to me, and Alex wasn't sure that anybody
was actually
>>>>>>> using it that way. But the fact remains that mdevctl does
allow people
>>>>>>> to configure things this way and it seems like we need to be
prepared
>>>>>>> to deal with it.
>>>>>>
>>>>>>
>>>>>> I am questioning how a user is supposed to correlate the nodedev
>>>>>> representation with mdev_{uuid}_{parent} to a domain using the
mdev like
>>>>>> <devices>
>>>>>> <hostdev mode='subsystem' type='mdev'
model='vfio-ccw'>
>>>>>> <source>
>>>>>> <address uuid='{uuid}'/>
>>>>>> </source>
>>>>>> </hostdev>
>>>>>> </devices>
>>>>>
>>>>> Sorry for potentially confusing the impact of this change with my
>>>>> previous comments. I had said that domains with mdevs assigned might
>>>>> become invalid if the nodedev name changes. But that's not
actually
>>>>> true. The naming that we're discussing is only related to the
name of
>>>>> the device within the libvirt nodedev driver. When adding a mediated
>>>>> device to a domain, you don't use this nodedev name, you use the
UUID
>>>>> directly. This is similar to the way you add a physical passthrough
>>>>> PCI device to a domain using its PCI address rather than its
>>>>> 'pci_$domain_$bus_$slot_$fn' nodedev name.
>>>>>
>>>>>> Reading uuid it tells me its universal unique
>>>>>> (
https://en.wikipedia.org/wiki/Universally_unique_identifier) but
now
>>>>>> this is not true for definitions which breaks the term universal
in my
>>>>>> small universe since for the defined state of an mdev it becomes
>>>>>> "parental" unique...
>>>>>
>>>>> Yes, that's true. Which is why I think this is a questionable
feature. But:
>>>>> A) defined devices are only *potential* devices.This is not all that
>>>>> different from the fact that a user can manually instantiate a mdev
on
>>>>> parent A with a specific UUID and then later instantiate a totally
>>>>> different mdev on parent B with the exact same UUID. One UUID refers
>>>>> to different devices at different times.
>>>>> B) the question we're discussing is not whether mdevctl should
allow
>>>>> duplicate UUIDs or not. mdevctl *already* allows it, so we need to
>>>>> figure out how to deal with it in libvirt.
>>>>>
>>>>> I suppose one potential possibility might be to change upstream
>>>>> mdevctl to disallow duplicate UUIDs and then require this new
mdevctl
>>>>> version for libvirt. But that's a discussion we'd have to
have in
>>>>> upstream mdevctl first.
>>>>
>>>> Maybe we can start the discussion, but even in libvirt you are allowed
>>>> to have two guests with the same name and UUID if they are in different
>>>> drivers (e.g. one in QEMU and one in LXC).
>>>>
>>>> Meanwhile, I think these can be pushed as we don't have much options
anyway.
>>>>
>>>> Michal
>>>>
>>>
>>> I think that we have at least one option.
>>> We could rethink the approach to create nodedev objects based on mdevctl
>>> mdev definitions which actually are not a usable host resource for domains.
>>
>> If I understand your suggestion, I don't think it solves any problems.
>> It seems to me that you're saying that libvirt should ignore all mdev
>> definitions from mdevctl if the parent does not (yet?) exist on the
>> host. (At least that's my interpretation of your phrase "definitions
>> which are not a usable host resource"). But it's still possible to
>> have definitions with duplicate UUIDs for devices which have parents
>> that *do* exist on the host but are simply not instantiated yet. If
>> I'm misinterpreting your comment, feel free to clarify.
>
> I used to think of the nodedev objects as libvirt showing the user its
> view of devices it can understand in the sysfs. This changed when mdev
> nodedev objects got created based on mdevctl's mediated device definition.
> These definition objects are different from the sysfs objects.
Yes, but in a similar way, a defined storage pool is different from an
active storage pool. Or a defined domain that is different from an
active domain. There is plenty of precedent in libvirt of this kind of
distinction.
> One difference is state active/inactive which caused e.g. the
> introduction of the option "inactive" on list operation.
> When we now talk about an nodedev mdev object it could be
> a) an instantiated mediated devices without mdev definition,
> b) an instantiated mediated devices with mdev definition or
> c) just an mdev definition,
> but just an instantiated mediated device [part of a) and b)] is actually
> a host resource one can use in a domain.
> So what I meant by "...not usable host resources" was the definition
> itself is not usable for a domain. The sysfs based host resources are.
Those three options are also true of pretty much every other libvirt
object as well. A domain can be transient or defined, active or
inactive. Same for pools. I can define a domain in libvirt that uses
devices that are not (currently) present on the host system. I can
define a storage pool that refers to a nonexistent directory. When I
attempt to instantiate these objects, libvirt will return an error,
but I am allowed to define objects that refer to unavailable host
resources. So all of those things mentioned above seem pretty
analogous to existing libvirt objects.
> I agree that mdev definitions and operations on these definitions have a
> value but I am getting the feeling that they do not fit so well into the
> nodedev objects especially with the new twist of the loss of identity
> uniqueness which results from my perspective in a very messy nodedev
> object space.
I don't think we've lost any identity uniqueness. It's just that we
initially chose a device name schema for mdevs that turned out to not
be unique when mdevctl is used as a backend.
Also, I don't think this nodedev object space is any more "messy" than
the other libvirt objects (e.g. pools and domains mentioned above).
Are you arguing that we should not provide a way to define new
persistent nodedevs in libvirt?
> There is one other thing I noticed. I can change an mdev definition
> after I initiated/started the mdev definition. This is kind of starting
> a domain and afterwards change/edit its persisted definition. So when
> dumping the xml one would need to be able to chose running/started or
> defined. This currently is not a big issue since there are not many
> actually no attributes that can be retrieved via mdevctl from a
> running/started mdev but it just adds to the disparity since currently
> the user of nodedev gets the mdev definition and running/started mdev in
> a merged nodedev object.
Again, this is not all that different from existing libvirt objects. I
can also edit a persistent definition of a storage pool after it is
started, for example. It's possible that there are some issues unique
to mdevs that I need to fix in this area, but those are just details.
In general it seems pretty consistent with other libvirt objects.
Jonathon
My arguing was based on looking at the node device driver.
I know that the concepts exist elsewhere in libvirt but apparently not
in the node device driver. I always though that the reason for it is/was
that it is based on sysfs.
--
Mit freundlichen Grüßen/Kind regards
Boris Fiuczynski
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294