Hi Igor,
I have tried some scenarios and recorded the status in this document[1].
Could you please check the test results?
Is my test matrix enough? (I will test again once QEMU is ready.)
Thank you!
BTW, current test results for pxb:
Q35 + pcie-expander-bus - works
PC + pci-expander-bus - does not work
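
For reference, the kind of topology I am testing looks roughly like this
in the domain XML (the controller indexes, busNr/node and the acpi index
are just example values):

  <controller type='pci' index='1' model='pcie-expander-bus'>
    <target busNr='180'>
      <node>0</node>
    </target>
  </controller>
  <controller type='pci' index='2' model='pcie-root-port'>
    <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
  </controller>
  <interface type='network'>
    <source network='default'/>
    <model type='virtio'/>
    <acpi index='1'/>
    <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
  </interface>
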
[1]
Yalan
On Fri, Dec 9, 2022 at 5:39 AM Igor Mammedov <imammedo(a)redhat.com> wrote:
On Thu, Dec 8, 2022 at 5:44 PM Laine Stump <laine(a)redhat.com> wrote:
>
> On 12/8/22 11:15 AM, Julia Suvorova wrote:
> > On Thu, Nov 3, 2022 at 9:26 AM Amnon Ilan <ailan(a)redhat.com> wrote:
> >>
> >>
> >>
> >> On Thu, Nov 3, 2022 at 12:13 AM Amnon Ilan <ailan(a)redhat.com> wrote:
> >>>
> >>>
> >>>
> >>> On Wed, Nov 2, 2022 at 6:47 PM Laine Stump <laine(a)redhat.com> wrote:
> >>>>
> >>>> On 11/2/22 11:58 AM, Igor Mammedov wrote:
> >>>>> On Wed, 2 Nov 2022 15:20:39 +0000
> >>>>> Daniel P. Berrangé <berrange(a)redhat.com> wrote:
> >>>>>
> >>>>>> On Wed, Nov 02, 2022 at 04:08:43PM +0100, Igor Mammedov wrote:
> >>>>>>> On Wed, 2 Nov 2022 10:43:10 -0400
> >>>>>>> Laine Stump <laine(a)redhat.com> wrote:
> >>>>>>>
> >>>>>>>> On 11/1/22 7:46 AM, Igor Mammedov wrote:
> >>>>>>>>> On Mon, 31 Oct 2022 14:48:54 +0000
> >>>>>>>>> Daniel P. Berrangé <berrange(a)redhat.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> On Mon, Oct 31, 2022 at 04:32:27PM +0200, Edward Haas wrote:
> >>>>>>>>>>> Hi Igor and Laine,
> >>>>>>>>>>>
> >>>>>>>>>>> I would like to revive a two-year-old discussion [1] about
> >>>>>>>>>>> consistent network interfaces in the guest.
> >>>>>>>>>>>
> >>>>>>>>>>> That discussion mentioned that a guest PCI address may change
> >>>>>>>>>>> in two cases:
> >>>>>>>>>>> - The PCI topology changes.
> >>>>>>>>>>> - The machine type changes.
> >>>>>>>>>>>
> >>>>>>>>>>> Usually, the machine type is not expected to change, especially
> >>>>>>>>>>> if one wants to allow migrations between nodes.
> >>>>>>>>>>> I would hope to argue this should not be problematic in practice,
> >>>>>>>>>>> because guest images would be made for a specific machine type.
> >>>>>>>>>>>
> >>>>>>>>>>> Regarding the PCI topology, I am not sure I understand what
> >>>>>>>>>>> changes need to occur to the domxml for a defined guest PCI
> >>>>>>>>>>> address to change.
> >>>>>>>>>>> The only thing that I can think of is a scenario where
> >>>>>>>>>>> hotplug/unplug is used, but even then I would expect existing
> >>>>>>>>>>> devices to preserve their PCI address and the plugged/unplugged
> >>>>>>>>>>> device to have a reserved address managed by the one acting on
> >>>>>>>>>>> it (the management system).
> >>>>>>>>>>>
> >>>>>>>>>>> Could you please help clarify in which scenarios the PCI
> >>>>>>>>>>> topology can cause a mess in the naming of interfaces in the
> >>>>>>>>>>> guest?
> >>>>>>>>>>>
> >>>>>>>>>>> Are there any plans to add acpi_index support?
> >>>>>>>>>>
> >>>>>>>>>> This was implemented a year and a half ago:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> https://libvirt.org/formatdomain.html#network-interfaces
> >>>>>>>>>>
> >>>>>>>>>> though due to QEMU limitations this only works for the old
> >>>>>>>>>> i440fx chipset, not Q35 yet.
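> >>>>>>>>>>
> >>>>>>>>>> Usage is just an extra element inside <interface> (the index
> >>>>>>>>>> value here is arbitrary):
> >>>>>>>>>>
> >>>>>>>>>>   <interface type='network'>
> >>>>>>>>>>     <source network='default'/>
> >>>>>>>>>>     <model type='virtio'/>
> >>>>>>>>>>     <acpi index='4'/>
> >>>>>>>>>>   </interface>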
> >>>>>>>>>
> >>>>>>>>> Q35 should work partially too. In its case acpi-index support
> >>>>>>>>> is limited to hotplug-enabled root-ports and PCIe-PCI bridges.
> >>>>>>>>> One also has to enable ACPI PCI hotplug (it's enabled by default
> >>>>>>>>> on recent machine types) for it to work (i.e. it's not supported
> >>>>>>>>> in native PCIe hotplug mode).
> >>>>>>>>>
> >>>>>>>>> So if mgmt can put NICs on root-ports/bridges, then acpi-index
> >>>>>>>>> should just work on Q35 as well.
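> >>>>>>>>>
> >>>>>>>>> on machine types where it's off by default it can be enabled
> >>>>>>>>> explicitly, e.g. on q35:
> >>>>>>>>>
> >>>>>>>>>   -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=on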
> >>>>>>>>
> >>>>>>>> With only a few exceptions (e.g. the first ich9 audio device,
> >>>>>>>> which is placed directly on the root bus at 00:1B.0 because that
> >>>>>>>> is where the ich9 audio device is located on actual Q35 hardware),
> >>>>>>>> libvirt will automatically put all PCI devices (including network
> >>>>>>>> interfaces) on a pcie-root-port.
> >>>>>>>>
> >>>>>>>> After seeing reports that "acpi index doesn't work with Q35
> >>>>>>>> machinetypes" I just assumed that was correct and didn't try it.
> >>>>>>>> But after seeing the "should work partially" statement above, I
> >>>>>>>> tried it just now, and an <interface> of a Q35 guest that had its
> >>>>>>>> PCI address auto-assigned by libvirt (and so was placed on a
> >>>>>>>> pcie-root-port) and had <acpi index='4'/> was given the name
> >>>>>>>> "eno4". So what exactly is it that *doesn't* work?
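> >>>>>>>>
> >>>>>>>> (For anyone wanting to double-check from inside the guest: the
> >>>>>>>> kernel exposes the index via sysfs and udev's net_id builtin
> >>>>>>>> reads it; the PCI address here is just an example:)
> >>>>>>>>
> >>>>>>>>   cat /sys/bus/pci/devices/0000:01:00.0/acpi_index
> >>>>>>>>   udevadm test-builtin net_id /sys/class/net/eno4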
> >>>>>>>
> >>>>>>> From QEMU side:
> >>>>>>> acpi-index requires:
> >>>>>>> 1. acpi pci hotplug enabled (which is the default on relatively
> >>>>>>>    new q35 machine types)
> >>>>>>> 2. hotpluggable pci bus (root-port, various pci bridges)
> >>>>>>> 3. NIC can be cold- or hotplugged; guest should pick up the
> >>>>>>>    acpi-index of the device currently plugged into the slot
> >>>>>>> what doesn't work:
> >>>>>>> 1. device attached to host-bridge directly (work in progress)
> >>>>>>>    (q35)
> >>>>>>> 2. devices attached to any PXB port and any hierarchy hanging off
> >>>>>>>    it (there are no plans to make it work) (q35, pc)
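> >>>>>>>
> >>>>>>> a minimal sketch of the two cases in QEMU terms (ids, chassis and
> >>>>>>> bus_nr values here are made up):
> >>>>>>>
> >>>>>>>   # acpi-index honored: NIC behind a root port on the main host bridge
> >>>>>>>   -device pcie-root-port,id=rp0,bus=pcie.0,chassis=1
> >>>>>>>   -device virtio-net-pci,bus=rp0,acpi-index=3
> >>>>>>>
> >>>>>>>   # acpi-index ignored: NIC anywhere behind a PXB
> >>>>>>>   -device pxb-pcie,id=pxb0,bus=pcie.0,bus_nr=128
> >>>>>>>   -device pcie-root-port,id=rp1,bus=pxb0,chassis=2
> >>>>>>>   -device virtio-net-pci,bus=rp1,acpi-index=4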
> >>>>>>
> >>>>>> I'd say this is still relatively important, as the PXBs are needed
> >>>>>> to create a NUMA-placement-aware topology for guests, and I'd say
> >>>>>> it is undesirable to lose acpi-index if a guest is updated to be
> >>>>>> NUMA aware, or if a guest image can be deployed in either normal or
> >>>>>> NUMA-aware setups.
> >>>>>
> >>>>> it's not only Q35 but also PC.
> >>>>> We basically do not generate an ACPI hierarchy for PXBs at all,
> >>>>> so neither ACPI hotplug nor the dependent acpi-index would work.
> >>>>> It's been so for many years and no one has asked to enable
> >>>>> ACPI hotplug on them so far.
> >>>>
> >>>> I'm guessing (based on absolutely 0 information :-)) that there
> >>>> would be more demand for acpi-index (and the resulting predictable
> >>>> interface names) than for acpi hotplug in a NUMA-aware setup.
> >>>
> >>>
> >>> My guess is similar, but it is still desirable to have both (i.e.
> >>> support ACPI indexing/hotplug with NUMA awareness).
> >>> Adding @Peter Xu to check if our setups for SAP require a NUMA-aware
> >>> topology.
> >>>
> >>> How big of a project would it be to enable ACPI indexing/hotplug
> >>> with PXB?
> >
> > Why would you need to add acpi hotplug on pxb?
> >
> >> Adding +Julia Suvorova and +Tsirkin, Michael to help answer this
> >> question.
> >>
> >> Thanks,
> >> Amnon
> >>
> >>>
> >>> Since native PCIe hotplug was improved, we can still compromise on
> >>> switching to native PCIe hotplug when PXB is required (and no fixed
> >>> indexing).
> >
> > Native hotplug works on pxb as is, without disabling acpi hotplug.
>
> Are you saying you can add an acpi-index to a device plugged into a pxb,
> that index will be recognized (and used to name the device), but it will
> still do native hotplug?
nope, acpi-index won't work on a pxb hierarchy; it works only on the PCI
tree hanging off the main host bridge.
>
> That sounds okay to me, since it ticks all the functional boxes
> (hotplug, consistent device names, NUMA awareness). It's possible there
> are some things I'm misunderstanding or haven't thought of, though...
>
>
> >
> >>> Thanks,
> >>> Amnon
> >>>
> >>>
> >>>>
> >>>>
> >>>> Anyway, it sounds like (*within the confines of how libvirt
> >>>> constructs the PCI topology*) we actually have functional parity of
> >>>> acpi-index between 440fx and Q35.
> >>>>
> >
>