Now for a slightly more serious overview; more details can
be obtained by reading the bug report[1].
There are two main goals behind this series:
1) allow the use of multiple PHBs for pSeries guests;
2) isolate hostdevs from emulated devices.
With these changes, pSeries guests running on QEMU/KVM
become more similar to those running on PowerVM, which is
great for admins. The changes also make better error
detection / recovery possible.
The first part involves convincing libvirt that a guest
can have more than a single pci-root controller, which is
something that was explicitly forbidden until now. Only
pSeries guests get to follow the new rules, of course.
Once PHBs are usable, it makes sense to just use them
for pretty much everything on pSeries guests. pci-bridge
is no longer needed, which is good because it probably
never really worked on such guests.
To match PowerVM's behavior and provide better isolation,
each hostdev should be assigned its own PHB. Well,
almost: the exact requirement is one PHB per host IOMMU
group, with the default PHB being reserved for emulated
devices.
To achieve this, the concept of "isolation group"[2] is
introduced: each PHB is assigned one, and only devices
that match it will be automatically assigned to that
PHB. For hostdevs, the isolation group is the same as
the host IOMMU group; emulated devices are assigned the
default isolation group, same as the default PHB. It all
works out.
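
To make the rule above concrete, here is a minimal sketch
in Python. It is not the code in this series: the names
(Device, DEFAULT_GROUP, assign_phbs) and the choice of a
string sentinel for the default isolation group are
assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, List, Optional

# Assumption: a sentinel that can never collide with a host
# IOMMU group number stands in for the default isolation group.
DEFAULT_GROUP = "default"


@dataclass
class Device:
    name: str
    # Host IOMMU group for hostdevs; None for emulated devices.
    iommu_group: Optional[int] = None

    @property
    def isolation_group(self) -> Hashable:
        # Hostdevs inherit the host IOMMU group, emulated devices
        # get the default isolation group.
        if self.iommu_group is None:
            return DEFAULT_GROUP
        return self.iommu_group


def assign_phbs(devices: List[Device]) -> Dict[int, List[str]]:
    """Map PHB index to device names, one PHB per isolation group."""
    # PHB 0 plays the role of the default PHB and carries the
    # default isolation group.
    phb_for_group: Dict[Hashable, int] = {DEFAULT_GROUP: 0}
    placement: Dict[int, List[str]] = {0: []}
    for dev in devices:
        group = dev.isolation_group
        if group not in phb_for_group:
            # First device of a new isolation group: add a PHB.
            phb_for_group[group] = len(phb_for_group)
            placement[phb_for_group[group]] = []
        placement[phb_for_group[group]].append(dev.name)
    return placement
```

Given two emulated devices, two hostdevs sharing one host
IOMMU group and a third hostdev in a different group, this
sketch places the emulated devices on PHB 0 and creates
one additional PHB per IOMMU group.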
The implementation is basically complete, minus the
documentation. I'm debating whether the current handling
of the default isolation group is good or not, but the
ideas are all there and I expect I'll need a couple of
respins to get all the details right, so feel free to
start reviewing :)
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1280542
[2] Named like that because, at least for me, "IOMMU" is
basically a tongue twister.
--
Andrea Bolognani / Red Hat / Virtualization