On Thu, Dec 8, 2022 at 5:44 PM Laine Stump <laine@redhat.com> wrote:
>
> On 12/8/22 11:15 AM, Julia Suvorova wrote:
> > On Thu, Nov 3, 2022 at 9:26 AM Amnon Ilan <ailan@redhat.com> wrote:
> >>
> >>
> >>
> >> On Thu, Nov 3, 2022 at 12:13 AM Amnon Ilan <ailan@redhat.com> wrote:
> >>>
> >>>
> >>>
> >>> On Wed, Nov 2, 2022 at 6:47 PM Laine Stump <laine@redhat.com> wrote:
> >>>>
> >>>> On 11/2/22 11:58 AM, Igor Mammedov wrote:
> >>>>> On Wed, 2 Nov 2022 15:20:39 +0000
> >>>>> Daniel P. Berrangé <berrange@redhat.com> wrote:
> >>>>>
> >>>>>> On Wed, Nov 02, 2022 at 04:08:43PM +0100, Igor Mammedov wrote:
> >>>>>>> On Wed, 2 Nov 2022 10:43:10 -0400
> >>>>>>> Laine Stump <laine@redhat.com> wrote:
> >>>>>>>
> >>>>>>>> On 11/1/22 7:46 AM, Igor Mammedov wrote:
> >>>>>>>>> On Mon, 31 Oct 2022 14:48:54 +0000
> >>>>>>>>> Daniel P. Berrangé <berrange@redhat.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> On Mon, Oct 31, 2022 at 04:32:27PM +0200, Edward Haas wrote:
> >>>>>>>>>>> Hi Igor and Laine,
> >>>>>>>>>>>
> >>>>>>>>>>> I would like to revive a two-year-old discussion [1] about consistent network
> >>>>>>>>>>> interfaces in the guest.
> >>>>>>>>>>>
> >>>>>>>>>>> That discussion mentioned that a guest PCI address may change in two cases:
> >>>>>>>>>>> - The PCI topology changes.
> >>>>>>>>>>> - The machine type changes.
> >>>>>>>>>>>
> >>>>>>>>>>> Usually, the machine type is not expected to change, especially if one
> >>>>>>>>>>> wants to allow migrations between nodes.
> >>>>>>>>>>> I would hope to argue this should not be problematic in practice, because
> >>>>>>>>>>> guest images would be made per a specific machine type.
> >>>>>>>>>>>
> >>>>>>>>>>> Regarding the PCI topology, I am not sure I understand what changes
> >>>>>>>>>>> need to occur to the domxml for a defined guest PCI address to change.
> >>>>>>>>>>> The only thing that I can think of is a scenario where hotplug/unplug is used,
> >>>>>>>>>>> but even then I would expect existing devices to preserve their PCI address
> >>>>>>>>>>> and the plug/unplug device to have a reserved address managed by the one
> >>>>>>>>>>> acting on it (the management system).
> >>>>>>>>>>>
> >>>>>>>>>>> Could you please help clarify in which scenarios the PCI topology can cause
> >>>>>>>>>>> a mess to the naming of interfaces in the guest?
> >>>>>>>>>>>
> >>>>>>>>>>> Are there any plans to add the acpi_index support?
> >>>>>>>>>>
> >>>>>>>>>> This was implemented a year & a half ago
> >>>>>>>>>>
> >>>>>>>>>> https://libvirt.org/formatdomain.html#network-interfaces
> >>>>>>>>>>
> >>>>>>>>>> though due to QEMU limitations this only works for the old
> >>>>>>>>>> i440fx chipset, not Q35 yet.
> >>>>>>>>>
> >>>>>>>>> Q35 should work partially too. In its case acpi-index support
> >>>>>>>>> is limited to hotplug enabled root-ports and PCIe-PCI bridges.
> >>>>>>>>> One also has to enable ACPI PCI hotplug (it's enabled by default
> >>>>>>>>> on recent machine types) for it to work (i.e. it's not supported
> >>>>>>>>> in native PCIe hotplug mode).
> >>>>>>>>>
> >>>>>>>>> So if mgmt can put nics on root-ports/bridges, then acpi-index
> >>>>>>>>> should just work on Q35 as well.
> >>>>>>>>
> >>>>>>>> With only a few exceptions (e.g. the first ich9 audio device, which is
> >>>>>>>> placed directly on the root bus at 00:1B.0 because that is where the
> >>>>>>>> ich9 audio device is located on actual Q35 hardware), libvirt will
> >>>>>>>> automatically put all PCI devices (including network interfaces) on a
> >>>>>>>> pcie-root-port.
> >>>>>>>>
> >>>>>>>> After seeing reports that "acpi index doesn't work with Q35
> >>>>>>>> machinetypes" I just assumed that was correct and didn't try it. But
> >>>>>>>> after seeing the "should work partially" statement above, I tried it
> >>>>>>>> just now and an <interface> of a Q35 guest that had its PCI address
> >>>>>>>> auto-assigned by libvirt (and so was placed on a pcie-root-port), and
> >>>>>>>> had <acpi index='4'/> was given the name "eno4". So what exactly is it
> >>>>>>>> that *doesn't* work?
> >>>>>>>
> >>>>>>> From QEMU side:
> >>>>>>> acpi-index requires:
> >>>>>>> 1. acpi pci hotplug enabled (which is default on relatively new q35 machine types)
> >>>>>>> 2. hotpluggable pci bus (root-port, various pci bridges)
> >>>>>>> 3. NIC can be cold or hotplugged, guest should pick up acpi-index of the device
> >>>>>>> currently plugged into slot
> >>>>>>> what doesn't work:
> >>>>>>> 1. device attached to host-bridge directly (work in progress)
> >>>>>>> (q35)
> >>>>>>> 2. devices attached to any PXB port and any hierarchy hanging off it (there are no plans to make it work)
> >>>>>>> (q35, pc)
> >>>>>>
> >>>>>> I'd say this is still relatively important, as the PXBs are needed
> >>>>>> to create a NUMA placement aware topology for guests, and I'd say it
> >>>>>> is undesirable to lose acpi-index if a guest is updated to be NUMA
> >>>>>> aware, or if a guest image can be deployed in either normal or NUMA
> >>>>>> aware setups.
> >>>>>
> >>>>> it's not only Q35 but also PC.
> >>>>> We basically do not generate ACPI hierarchy for PXBs at all,
> >>>>> so neither ACPI hotplug nor the acpi-index that depends on it would work.
> >>>>> It's been so for many years and no one has asked to enable
> >>>>> ACPI hotplug on them so far.
> >>>>
> >>>> I'm guessing (based on absolutely 0 information :-)) that there would be
> >>>> more demand for acpi-index (and the resulting predictable interface
> >>>> names) than for acpi hotplug for NUMA-aware setup.
> >>>
> >>>
> >>> My guess is similar, but it is still desirable to have both (i.e. support ACPI-indexing/hotplug with NUMA-aware).
> >>> Adding @Peter Xu to check if our setups for SAP require NUMA-aware topology
> >>>
> >>> How big of a project would it be to enable ACPI-indexing/hotplug with PXB?
> >
> > Why would you need to add acpi hotplug on pxb?
> >
> >> Adding +Julia Suvorova and +Tsirkin, Michael to help answer this question
> >>
> >> Thanks,
> >> Amnon
> >>
> >>>
> >>> Since native PCI was improved, we can still compromise on switching to native-PCI-hotplug when PXB is required (and no fixed indexing)
> >
> > Native hotplug works on pxb as is, without disabling acpi hotplug.
>
> Are you saying you can add an acpi-index to a device plugged into a pxb,
> that index will be recognized (and used to name the device), but it will
> still do native hotplug?
Nope, acpi-index won't work on a pxb hierarchy; it works only on the PCI tree
hanging off the main host bridge.
>
> That sounds okay to me, since it ticks all the functional marks
> (hotplug, consistent device names, NUMA-aware). It's possible there are
> some things I'm misunderstanding or haven't thought of though...
>
>
> >
> >>> Thanks,
> >>> Amnon
> >>>
> >>>
> >>>>
> >>>>
> >>>> Anyway, it sounds like (*within the confines of how libvirt constructs
> >>>> the PCI topology*) we actually have functional parity of acpi-index
> >>>> between 440fx and Q35.
> >>>>
> >
>