On 5/6/20 1:48 PM, Andrea Bolognani wrote:
On Mon, 2020-04-20 at 21:55 +0200, Boris Fiuczynski wrote:
> On 4/10/20 2:06 PM, Andrea Bolognani wrote:
>> On Thu, 2020-04-09 at 12:30 +0200, Shalini Chellathurai Saroja wrote:
>>> The ZPCI address validation or autogeneration does not work as
>>> expected in the following scenarios
>>> 1. uid = 0 and fid = 0
>>> 2. uid = 0 and fid > 0
>>> 3. uid = 0 and fid not specified
>>> 4. uid not specified and fid > 0
>>> 5. 2 ZPCI devices with uid > 0 and fid not specified.
>>>
>>> This is because of the following reasons
>>> 1. If uid = 0 or fid = 0 the code assumes that user has not specified
>>> the corresponding address
>>> 2. If either uid or fid is provided, the code assumes that both uid
>>> and fid addresses are specified by the user.
>>
>> I'd have to dig up the old threads, but based on what I remember the
>> behaviors you describe are entirely intentional.
>>
>> For PCI addresses, setting all parts of the address to zero or not
>> setting it at all is equivalent, and we wanted to be consistent with
>> that behavior for ZPCI; additionally, zero is not a valid value for
>> uid so of course neither is the address uid=0 fid=0, which means that
>> we're not preventing the user from specifying a valid address by
>> conflating the all-zero address with the unspecified address.
>>
>> For partially-specified addresses, the behavior is also the same as
>> PCI: any part you don't specify is considered to be zero, which
>> results in
>>
>> uid=0 fid=0 -> uid=0 fid=0 -> address gets autogenerated
>> uid=0 fid=x -> uid=0 fid=x -> address is rejected as invalid
>> uid=0 -> uid=0 fid=0 -> address gets autogenerated
>> fid=x -> uid=0 fid=x -> address is rejected as invalid
>> uid=x -> uid=x fid=0 -> address is accepted
>>
>> So, just like for PCI addresses, you have basically two reasonable
>> options: either don't specify any zPCI address and leave allocation
>> entirely up to libvirt, or specify all of the addresses completely:
>> anything in between will likely not work as you'd expect or want.
>>
>> Again, this is based purely on my recollection of design discussions
>> that happened one and a half years ago, so I might have gotten some
>> of the details wrong - in which case by all means call me out on
>> that O:-)
>>
> Hi Andrea,
> sorry for the delayed answer. I (and some others as well) lost some
> emails on my IMAP account and I just found your answer today.
No apologies needed: I also took a long time to reply to your
message, and in my case there's no mail server malfunction that I
can assign the blame to O:-)
> I can remember that you had a discussion with the original author of the
> zpci code. There are a few issues with the currently implemented "rules"
> which partially are not even working as you outlined above in more
> complex scenarios.
I disagree with this assessment - they work exactly as designed and
as described above. Whether we *want* them to behave that way... Now
that's a different topic :)
I think the disconnect lies in what the user's expectations are and
what libvirt actually implements. Basically the user expects that
* if either one of uid and fid is explictly assigned a value by
the user, then the guest will use that value - unless such value
is invalid, in which case libvirt will report an error;
* if either one of uid and fid is absent from the user-provided
configuration, then libvirt will automatically pick a valid value
for the attribute.
This is not how the current zPCI implementation works, or how PCI
address assignment works in libvirt; on the other hand, I think these
expectations are in fact completely reasonable, as the examples you
provide illustrate quite well.
I think you successfully convinced me that the current approach is
not good for users and we should fix it; my only doubt at this point
is whether can we safely do that without breaking libvirt's backwards
compatibility guarantees.
Dan, Laine, what's your take?
It sounds like the same semantics were applied to the zPCI address
element as are applied to <address type='pci'> (if I understand
correctly, while an PCI address is considered "valid and complete" if
any of its attributes is non-0, for the zPCI extensions, each attribute
is taken on its own. Is that correct?
In any case, it sounds like current behavior for zPCI is broken for some
use cases, and imo this is new enough and usage is narrow enough that if
the maintainers (who I would guess represent the actual users) think
fixing the bug is more important than 100% backward compatibility in a
corner case, then they should be able to change it.
Now if you wanted to change the way regular PCI address attributes are
handled, *then* I would have a totally different opinion :-)