On 12/8/22 11:15 AM, Julia Suvorova wrote:
> On Thu, Nov 3, 2022 at 9:26 AM Amnon Ilan <ailan(a)redhat.com> wrote:
>>
>>
>>
>> On Thu, Nov 3, 2022 at 12:13 AM Amnon Ilan <ailan(a)redhat.com> wrote:
>>>
>>>
>>>
>>> On Wed, Nov 2, 2022 at 6:47 PM Laine Stump <laine(a)redhat.com> wrote:
>>>>
>>>> On 11/2/22 11:58 AM, Igor Mammedov wrote:
>>>>> On Wed, 2 Nov 2022 15:20:39 +0000
>>>>> Daniel P. Berrangé <berrange(a)redhat.com> wrote:
>>>>>
>>>>>> On Wed, Nov 02, 2022 at 04:08:43PM +0100, Igor Mammedov wrote:
>>>>>>> On Wed, 2 Nov 2022 10:43:10 -0400
>>>>>>> Laine Stump <laine(a)redhat.com> wrote:
>>>>>>>
>>>>>>>> On 11/1/22 7:46 AM, Igor Mammedov wrote:
>>>>>>>>> On Mon, 31 Oct 2022 14:48:54 +0000
>>>>>>>>> Daniel P. Berrangé <berrange(a)redhat.com> wrote:
>>>>>>>>>
>>>>>>>>>> On Mon, Oct 31, 2022 at 04:32:27PM +0200, Edward Haas wrote:
>>>>>>>>>>> Hi Igor and Laine,
>>>>>>>>>>>
>>>>>>>>>>> I would like to revive a 2-year-old discussion [1] about consistent network
>>>>>>>>>>> interfaces in the guest.
>>>>>>>>>>>
>>>>>>>>>>> That discussion mentioned that a guest PCI address may change in two cases:
>>>>>>>>>>> - The PCI topology changes.
>>>>>>>>>>> - The machine type changes.
>>>>>>>>>>>
>>>>>>>>>>> Usually, the machine type is not expected to change, especially if one
>>>>>>>>>>> wants to allow migrations between nodes.
>>>>>>>>>>> I would hope to argue this should not be problematic in practice, because
>>>>>>>>>>> guest images would be made per a specific machine type.
>>>>>>>>>>>
>>>>>>>>>>> Regarding the PCI topology, I am not sure I understand what changes
>>>>>>>>>>> need to occur to the domxml for a defined guest PCI address to change.
>>>>>>>>>>> The only thing that I can think of is a scenario where hotplug/unplug is
>>>>>>>>>>> used,
>>>>>>>>>>> but even then I would expect existing devices to preserve their PCI address
>>>>>>>>>>> and the plug/unplug device to have a reserved address managed by the one
>>>>>>>>>>> acting on it (the management system).
>>>>>>>>>>>
>>>>>>>>>>> Could you please help clarify in which scenarios the PCI topology can cause
>>>>>>>>>>> a mess to the naming of interfaces in the guest?
>>>>>>>>>>>
>>>>>>>>>>> Are there any plans to add the acpi_index support?
>>>>>>>>>>
>>>>>>>>>> This was implemented a year & a half ago
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> https://libvirt.org/formatdomain.html#network-interfaces
>>>>>>>>>>
>>>>>>>>>> though due to QEMU limitations this only works for the old
>>>>>>>>>> i440fx chipset, not Q35 yet.
>>>>>>>>>
>>>>>>>>> Q35 should work partially too. In its case acpi-index support
>>>>>>>>> is limited to hotplug enabled root-ports and PCIe-PCI bridges.
>>>>>>>>> One also has to enable ACPI PCI hotplug (it's enabled by default
>>>>>>>>> on recent machine types) for it to work (i.e. it's not supported
>>>>>>>>> in native PCIe hotplug mode).
>>>>>>>>>
>>>>>>>>> So if mgmt can put nics on root-ports/bridges, then acpi-index
>>>>>>>>> should just work on Q35 as well.
>>>>>>>>
>>>>>>>> With only a few exceptions (e.g. the first ich9 audio device, which is
>>>>>>>> placed directly on the root bus at 00:1B.0 because that is where the
>>>>>>>> ich9 audio device is located on actual Q35 hardware), libvirt will
>>>>>>>> automatically put all PCI devices (including network interfaces) on a
>>>>>>>> pcie-root-port.
>>>>>>>>
>>>>>>>> After seeing reports that "acpi index doesn't work with Q35
>>>>>>>> machinetypes" I just assumed that was correct and didn't try it. But
>>>>>>>> after seeing the "should work partially" statement above, I tried it
>>>>>>>> just now and an <interface> of a Q35 guest that had its PCI address
>>>>>>>> auto-assigned by libvirt (and so was placed on a pcie-root-port) and
>>>>>>>> had <acpi index='4'/> was given the name "eno4". So what exactly is it
>>>>>>>> that *doesn't* work?
>>>>>>>
>>>>>>> From QEMU side:
>>>>>>> acpi-index requires:
>>>>>>> 1. acpi pci hotplug enabled (which is default on relatively new q35 machine types)
>>>>>>> 2. hotpluggable pci bus (root-port, various pci bridges)
>>>>>>> 3. NIC can be cold or hotplugged, guest should pick up acpi-index of the device
>>>>>>>    currently plugged into slot
>>>>>>> what doesn't work:
>>>>>>> 1. device attached to host-bridge directly (work in progress)
>>>>>>>    (q35)
>>>>>>> 2. devices attached to any PXB port and any hierarchy hanging off it (there are no plans to make it work)
>>>>>>>    (q35, pc)
>>>>>>
>>>>>> I'd say this is still relatively important, as the PXBs are needed
>>>>>> to create a NUMA placement aware topology for guests, and I'd say it
>>>>>> is undesirable to lose acpi-index if a guest is updated to be NUMA
>>>>>> aware, or if a guest image can be deployed in either normal or NUMA
>>>>>> aware setups.
>>>>>
>>>>> it's not only Q35 but also PC.
>>>>> We basically do not generate ACPI hierarchy for PXBs at all,
>>>>> so neither ACPI hotplug nor the dependent acpi-index would work.
>>>>> It's been so for many years and no one has asked to enable
>>>>> ACPI hotplug on them so far.
>>>>
>>>> I'm guessing (based on absolutely 0 information :-)) that there would be
>>>> more demand for acpi-index (and the resulting predictable interface
>>>> names) than for acpi hotplug for NUMA-aware setup.
>>>
>>>
>>> My guess is similar, but it is still desirable to have both (i.e. support
>>> ACPI-indexing/hotplug with NUMA-aware setups).
>>> Adding @Peter Xu to check if our setups for SAP require NUMA-aware topology
>>>
>>> How big of a project would it be to enable ACPI-indexing/hotplug with PXB?
>
> Why would you need to add acpi hotplug on pxb?
>
>> Adding +Julia Suvorova and +Tsirkin, Michael to help answer this question
>>
>> Thanks,
>> Amnon
>>
>>>
>>> Since native PCI was improved, we can still compromise on switching to
>>> native-PCI-hotplug when PXB is required (and no fixed indexing)
>
> Native hotplug works on pxb as is, without disabling acpi hotplug.
Are you saying you can add an acpi-index to a device plugged into a pxb,
that index will be recognized (and used to name the device), but it will
still do native hotplug?
nope, acpi-index won't work on a pxb hierarchy, it works only on the PCI tree
hanging off the main host bridge.
That sounds okay to me, since it ticks all the functional marks
(hotplug, consistent device names, NUMA-aware). It's possible there are
some things I'm misunderstanding or haven't thought of though...
>
>>> Thanks,
>>> Amnon
>>>
>>>
>>>>
>>>>
>>>> Anyway, it sounds like (*within the confines of how libvirt constructs
>>>> the PCI topology*) we actually have functional parity of acpi-index
>>>> between 440fx and Q35.
>>>>
>