On Wed, Jun 24, 2015 at 12:04:33PM -0400, Laine Stump wrote:
I think the only votes were for option 1 and 4 (interesting how the
only
ones that were chosen were those that I *didn't* pick personally :-).
See comments below. In the meantime the other issue Alex pointed out may
cause this to take a slightly different direction.
On 06/22/2015 02:44 PM, Laine Stump wrote:
> ===
> Idea 1: multiplex the meaning of the "model" attribute, so we currently
have:
>
> <controller type='pci' model='dmi-to-pci-bridge'/>
>
> which means "add an i82801b11-bridge device" and when we add a generic
> version of this type of controller, we would do it with something like:
>
> <controller type='pci' model='generic-dmi-to-pci-bridge'/>
>
> and for another vendor's mythical controller:
>
> <controller type='pci' model='xyz-dmi-to-pci-bridge'/>
>
> Cons: This will make for ugliness in switch statements where a new
> case will have to be added whenever a different controller with
> similar behavior/usage is supported. And it's generally not a good idea to
> have a single attribute be used for two different functions.
jtomko advocated this one because it would make it easier for an older
libvirt to notice an unsupported class+model of controller during a
migration attempt from a newer libvirt. An example would be if a newer
libvirt had a guest with
<controller type='pci' model='xyz'/>
(where xyz is some qemu device that implements a dmi-to-pci-bridge)
That would fail on an attempt to migrate to older libvirt, but
<controller type='pci' model='dmi-to-pci-bridge'/>
would succeed. On the other hand, if we did this:
<controller type='pci' model='dmi-to-pci-bridge
submodel='xyz'/>
that would succeed (and create the wrong device) because submodel would
be ignored.
Since there is currently no alternate implementation of a
dmi-to-pci-bridge available in qemu, and it will likely be awhile until
that happens, I think if we start filling out the XML now as:
<controller type='pci' model='dmi-to-pci-bridge'
submodel='i82801b11-bridge'/>
by the time we get to the point that we do have an alternate 'xyz'
controller, any version of libvirt that doesn't understand "submodel"
will be so far in the past that we wouldn't want to be able to migrate
back to it anyway.
> ===
> Idea 4: Unlike other uses of "model" in libvirt, for pci controllers,
> continue to use "model" to denote the subtype/class/whatever of
> controller, and create a new attribute to denote the different
> specific implementations of each one. So for example:
>
> [4] <controller type='pci' model='dmi-to-pci-bridge'/>
>
> would become:
>
> [5] <controller type='pci' model='dmi-to-pci-bridge'
> implementation='i82801b11-bridge'/>
>
> (or some other name in place of "implementation" - ideas? I'm horrible
> at thinkin up names)
>
> Pros: wouldn't create compatibility problems when downgrading or
> migrating cross version.
>
> Cons: Is inconsistent with every other use of "model" attribute in
> libvirt, and each new addition of a PCI controller further propagates
> this misuse.
>
mkletzan preferred this one, and danpb was amenable to it in IRC. I
think I now agree, but Alex's comments about needing to keep track of
what we put in the qemu commandline for port and chassis have me
thinking this isn't enough.
The problem is that the "ioh3420" device needs its "port" and
"chassis"
options set; Alex recommends setting port to:
(slot << 3)+function
Likewise, he recommends setting "port" for the xio3130-downstream device
to the same value as slot. So, for the following:
<controller type='pci model='pci-root-port' index='3'>
<address type='pci' bus='0' slot='4'
function='1'>
</controller>
we would end up with the following commandline:
-device ioh3420,chassis=3,port=0x21,id='pcie.3',bus='pcie.0',
addr=0x4.0x1
(chassis is the same as "index" in the xml (again per Alex's
recommendation), and port is (4 << 3) + 1 = 0x21)
*However* he also says that we may change our mind on these in the
future, so chassis and port may end up being something different. Since
those values are visible to the guest, we can't allow them to change on
an existing machine, as it breaks the guest ABI. So we need to store
them in the XML. They aren't part of the PCI address though, they are
options. So I need to figure out the best way to represent that in the
XML, and fill it in when it is auto-generated when the controller is
defined. A few possible ways to solve both problems at once:
[1] <controller type='pci model='pci-root-port' index='3'>
<address type='pci' bus='0' slot='4'
function='1'>
<target type='ioh3420' chassis='3' port='0x21'/>
</controller>
Precedent: <target> is used to store the port number for serial/console
devices, specify guest-side bus and device name for disks, specify
guest-side mount location for filesystems, specify the *host*-side name
of tap devices for network interfaces, memory size for <memory>. So it
seems kind of proper. (slight variation - <target model='ioh3420' .../>
:-/ )
[2] <controller type='pci model='pci-root-port' index='3'>
<address type='pci' bus='0' slot='4'
function='1'>
<model type='ioh3420' chassis='3' port='0x21'/>
</controller>
(How's *that* for confusing?!? Both a model attribute *and* a model
subelement.)
Precedent: This is exactly patterned after the use of <model> in <video>
devices though, where the specific card being emulated along with any
memory/etc options are put in <model>, so at least there is precedence)
[3] <controller type='pci model='pci-root-port' index='3'>
<address type='pci' bus='0' slot='4'
function='1'>
<device type='ioh3420' chassis='3' port='0x21'/>
</controller>
(This one is completely new, just for a fresh start in case people think
neither of the others are exactly right)
At this point you can see that it all comes down to what name we want to
give the subelement; within that, we give the exact name of the qemu
device, and the exact name/value of any qemu options that we set to
non-default values.
Somebody *please* have an opinion on the name for this, because none of
these strikes me as better or worse (well, I think I dislike <driver>
To be honest, I kinds dislike all of them. Not that they would be
chosen poorly, no, it's simply because the good sensible choice is
unavailable due to another poor decision in the past (this may be
another point for Michal's talk on KVM Forum). Thinking about it I
must say I don't like how target (which is supposed to match a place
where the device appears for the guest) is used for the model
specification, on the other hand (ab)using 'model' element for the
specification of an "address" in guest (that's what I understand
chassis and port are) doesn't feel any better. What if we go with two
of those elements? Would that be too much pain? E.g.:
<controller type='pci model='pci-root-port' index='3'>
<address type='pci' bus='0' slot='4'
function='1'>
<model type='ioh3420'/>
<target chassis='3' port='0x21'/>
</controller>
I understand this might look like an overkill, but I think it's better
safe then sorry, I guess I just see us not so far in the future
regretting any decision made now.
Another idea I briefly thought about, which would also deal with the
migration to older libvirt versions was inventing a new address
type ... and then I realized how stupid^Wugly was that.
Anyway, I must say that I don't fully get the idea of how some of
these controllers are supposed to be specified and what particular HW
they should describe, so I might not had the best idea in town.
<rant> Mainly I don't understand why there's no pci switch. Or is
<controller type='pci'/> actually a pci switch (I usderstand it like
that)? Then why is there a downstream and upstream device when
{down,up}stream should just indicate whether the port aims towards the
root complex or the endpoints, respectively. </rant>
(P.S. There are other "automagic" qemu options being specified by
libvirt that maybe could use the same treatment as these. Two that come
to mind are chassis_nr for pci-bridge controllers (set to the
controller's index), and "vectors" for virtio-net multiqueue support
(set to (queues*2)+2).
I guess 'chassis_nr' is something exposed to guest, similarly to
'chassis' in the pci-root-port case, so that might be worth saving in
the XML. Maybe we can use 'chassis' in the target/model specification
for storing the chassis_nr as a level of abstraction on libvirt level.
But the number of vectors (to my knowledge) is not something exposed
to guest and is only advised to being set according to the formula you
wrote, above.