[libvirt] RFC: exposing a config setting to force vhost-net support on/off

There's a request to allow libvirt to explicitly turn on/off the new vhost-net feature of virtio network cards. I see a few ways to do it, and am looking for opinions on which is best.
(For the uninitiated, vhost-net is a new kernel-based virtio implementation that saves the overhead of having all network traffic be handled by the userlevel qemu process.)
The original implementation of vhost-net support (what's been in libvirt for a few releases now) doesn't expose any knobs to the user - if the vhost-net kernel module is loaded, libvirt uses it for all virtio network devices of all newly created guests, and if the kernel module isn't loaded, it reverts to using the old user-level virtio.
It's simple enough to put a bit of extra logic at the point where we make that decision. I see 3 possibilities:
1) default - use vhost-net if it's loaded, don't if it isn't (current behavior)
2) require - use vhost-net if it's loaded, and refuse to start the guest if it isn't (for those who want to be 100% sure they're using it)
3) disable - don't use vhost-net, whether or not it's loaded (to disable it, eg in case a compatibility problem is found between vhost-net and some particular guest)
The question is how to describe that in the XML. Here's what a virtio network interface might look like currently:
  <devices>
    <interface type='network'>
      <source network='default'/>
      <model type='virtio'/>
    </interface>
  </devices>
1) One possibility (the simplest) would be to add an optional attribute to <model>:
  <devices>
    <interface type='network'>
      <source network='default'/>
      <model type='virtio' vhost='default|require|disable'/>  (or 'default|on|off' ?)
    </interface>
  </devices>
or maybe:
  <devices>
    <interface type='network'>
      <source network='default'/>
      <model type='virtio' mode='default|kernel|user'/>
    </interface>
  </devices>
2) Another possibility would be to define a new sub-element of <interface>, called "<driver>", similar to what's done in the storage device XML. In the future, other backend driver-related items could be placed there:
  <devices>
    <interface type='network'>
      <source network='default'/>
      <model type='virtio'/>
      <driver vhost='default|require|disable'/>  (or "mode='default|kernel|user'")
    </interface>
  </devices>
3) A third method, which might make the XML file look less ugly if, in the future, there were *many* more configurable items for interface backends (this one was inspired by the <memtune> element):
  <devices>
    <interface type='network'>
      <source network='default'/>
      <model type='virtio'/>
      <driver>
        <vhost>default|require|disable</vhost>
      </driver>
    </interface>
  </devices>
(we *might* want to consider naming it something other than "<driver>", eg "<tune>" or something like that, so that more things could be put in there. The problem with that is I don't really consider vhost a true "tunable", since as far as I'm aware, it's always faster to use kernel-level virtio than to bounce everything up to userspace; the setting in question is intended more for testing purposes, or to alleviate unforeseen compatibility problems).
So, any opinions on these three possibilities (or an undescribed 4th?) What about the naming? I'm open to suggestions!
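The three modes proposed above (default, require, disable) boil down to a small decision at guest start. A minimal sketch of that decision, assuming the caller already knows whether the vhost-net module is loaded; the names `VhostMode`, `VhostUnavailable` and `use_vhost` are hypothetical and are not libvirt code:

```python
from enum import Enum


class VhostMode(Enum):
    """The three settings proposed in this RFC (names are illustrative)."""
    DEFAULT = "default"
    REQUIRE = "require"
    DISABLE = "disable"


class VhostUnavailable(Exception):
    """Raised when vhost-net is required but the module is not loaded."""


def use_vhost(mode: VhostMode, vhost_loaded: bool) -> bool:
    """Return True if the virtio NIC should be backed by vhost-net."""
    if mode is VhostMode.DISABLE:
        # Never use vhost-net, whether or not it is loaded.
        return False
    if mode is VhostMode.REQUIRE and not vhost_loaded:
        # Refuse to start the guest rather than silently fall back.
        raise VhostUnavailable("vhost-net required but not loaded")
    # DEFAULT, or REQUIRE with the module present: follow availability.
    return vhost_loaded
```

Only "require" can fail; "default" keeps the current behavior of following whatever the host offers.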

On 01/04/2011 12:37 PM, Laine Stump wrote:
It's simple enough to put a bit of extra logic at the point where we make that decision. I see 3 possibilities:
1) default - use vhost-net if it's loaded, don't if it isn't (current behavior)
2) require - use vhost-net if it's loaded, and refuse to start the guest if it isn't (for those who want to be 100% sure they're using it)
3) disable - don't use vhost-net, whether or not it's loaded (to disable it, eg in case a compatibility problem is found between vhost-net and some particular guest)
I agree with these three modes.
2) Another possibility would be to define a new sub-element of <interface>, called "<driver>", similar to what's done in the storage device XML. In the future, other backend driver-related items could be placed there:
  <devices>
    <interface type='network'>
      <source network='default'/>
      <model type='virtio'/>
      <driver vhost='default|require|disable'/>  (or "mode='default|kernel|user'")
    </interface>
  </devices>
Lumping it all as attributes of <driver> makes the most sense to me, given the similarity to the comparable number of attributes under <disk>/<driver> (name, type, cache, error_policy; and the current talk of adding another attribute for qemu aio policy).
So, any opinions on these three possibilities (or an undescribed 4th?) What about the naming? I'm open to suggestions!
So count my vote for option 2 for where to stick the choice. As for the option names, my preference is in this order:
mode=default|kernel|user - either way, there is some virtio going on, but this makes it clear when a kernel module is needed, leaving the name 'vhost' out of the picture (since it is merely an implementation detail of how the kernel provides virtio)
vhost=default|require|disable - makes it clear that we are requiring vhost support, but the use of a kernel module becomes implicit (you have to know that vhost implies kernel)
vhost=default|on|off - shorter names, but creates confusion (vhost=on means the guest will fail to start if vhost is _off_ in the kernel).
But I'm not hard-set on my choice for option 2 out of the 3, or for the option naming of default|kernel|user, so feel free to use a different option if you get votes in a different direction.
--
Eric Blake eblake@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org

* Laine Stump (laine@laine.org) wrote:
2) Another possibility would be to define a new sub-element of <interface>, called "<driver>", similar to what's done in the storage device XML. In the future, other backend driver-related items could be placed there:
  <devices>
    <interface type='network'>
      <source network='default'/>
      <model type='virtio'/>
      <driver vhost='default|require|disable'/>  (or "mode='default|kernel|user'")
    </interface>
  </devices>
3) A third method, which might make the XML file look less ugly if, in the future, there were *many* more configurable items for interface backends (this one was inspired by the <memtune> element):
  <devices>
    <interface type='network'>
      <source network='default'/>
      <model type='virtio'/>
      <driver>
        <vhost>default|require|disable</vhost>
      </driver>
    </interface>
  </devices>
Either of these seems useful, perhaps with the name <parameters>. There are already other tunable parameters that could conceivably be exposed here.

On Tue, Jan 04, 2011 at 02:37:15PM -0500, Laine Stump wrote:
There's a request to allow libvirt to explicitly turn on/off the new vhost-net feature of virtio network cards. I see a few ways to do it, and am looking for opinions on which is best.
(For the uninitiated, vhost-net is a new kernel-based virtio implementation that saves the overhead of having all network traffic be handled by the userlevel qemu process.)
The original implementation of vhost-net support (what's been in libvirt for a few releases now) doesn't expose any knobs to the user - if the vhost-net kernel module is loaded, libvirt uses it for all virtio network devices of all newly created guests, and if the kernel module isn't loaded, it reverts to using the old user-level virtio.
It's simple enough to put a bit of extra logic at the point where we make that decision. I see 3 possibilities:
1) default - use vhost-net if it's loaded, don't if it isn't (current behavior)
2) require - use vhost-net if it's loaded, and refuse to start the guest if it isn't (for those who want to be 100% sure they're using it)
3) disable - don't use vhost-net, whether or not it's loaded (to disable it, eg in case a compatibility problem is found between vhost-net and some particular guest)
The question is how to describe that in the XML. Here's what a virtio network interface might look like currently:
  <devices>
    <interface type='network'>
      <source network='default'/>
      <model type='virtio'/>
    </interface>
  </devices>
1) One possibility (the simplest) would be to add an optional attribute to <model>:
  <devices>
    <interface type='network'>
      <source network='default'/>
      <model type='virtio' vhost='default|require|disable'/>  (or 'default|on|off' ?)
    </interface>
  </devices>
or maybe:
  <devices>
    <interface type='network'>
      <source network='default'/>
      <model type='virtio' mode='default|kernel|user'/>
    </interface>
  </devices>
2) Another possibility would be to define a new sub-element of <interface>, called "<driver>", similar to what's done in the storage device XML. In the future, other backend driver-related items could be placed there:
  <devices>
    <interface type='network'>
      <source network='default'/>
      <model type='virtio'/>
      <driver vhost='default|require|disable'/>  (or "mode='default|kernel|user'")
    </interface>
  </devices>
We should try to keep terminology matching the disk <driver> so I think
  <driver name='qemu|vhost'/>
with omission of <driver> resulting in us automatically adding either 'qemu' or 'vhost' to the XML. We don't want to have an explicit 'default' value in the XML, because users should be able to see what the guest is running with.
Regards, Daniel
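To make the "automatically adding either 'qemu' or 'vhost'" idea concrete, here is a rough sketch of filling in a missing <driver name='...'/> once and leaving it alone afterwards. It uses Python's standard xml.etree.ElementTree; `pin_driver` and the `vhost_loaded` flag are hypothetical names for illustration, and nothing here is libvirt's actual implementation:

```python
import xml.etree.ElementTree as ET


def pin_driver(interface_xml: str, vhost_loaded: bool) -> str:
    """Fill in <driver name='qemu|vhost'/> if the config omits it.

    An existing name is left untouched, so the first choice sticks.
    """
    iface = ET.fromstring(interface_xml)
    driver = iface.find("driver")
    if driver is None:
        driver = ET.SubElement(iface, "driver")
    if driver.get("name") is None:
        # Hypothetical policy: pick vhost when the module is loaded.
        driver.set("name", "vhost" if vhost_loaded else "qemu")
    return ET.tostring(iface, encoding="unicode")
```

Note the consequence: calling it a second time on a host without vhost-net still returns name='vhost', because the earlier choice is now part of the config.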

On 01/05/2011 05:19 AM, Daniel P. Berrange wrote:
We should try to keep terminology matching the disk <driver> so I think
  <driver name='qemu|vhost'/>
with omission of <driver> resulting in us automatically adding either 'qemu' or 'vhost' to the XML. We don't want to have an explicit 'default' value in the XML, because users should be able to see what the guest is running with.
Do you mean to add it to the XML that's saved in the config? If so, that would mean that it would only be possible to configure it as "use whatever is best for the current situation" for the first startup of the domain. Once that happened, it would be stuck on whichever was used the first time (qemu or vhost), so if the domain was first started when vhost-net was loaded, then later restarted when vhost-net wasn't loaded (or maybe migrated to another host that didn't have vhost support), it would fail to start.
If you mean adding it only to the dumpxml output (when the domain is running, and --inactive isn't specified), I suppose that would be okay, as long as there's an easy way for the low level functions to understand that's the case.
Aside from that, I'd been thinking that the "backend" driver in this case is virtio, not qemu or vhost; qemu(userland) vs vhost seems like just a setting within that driver. So it doesn't seem appropriate to me to have the name decide whether to use userland or vhost.
One other twist - there's already another request for something else to be set for each network device: sndbuf.
https://bugzilla.redhat.com/show_bug.cgi?id=665293
The sndbuf setting is applicable to any network device that connects to the real world using a tap device (ie, not just virtio). If we want to add that setting via the same scheme, we would need something like:
  <driver name='qemu|vhost' sndbuf='0'/>
(0 can't be the default, because 0 is actually one of the settings that they want to explicitly specify; if sndbuf isn't given on the commandline, qemu defaults to 1048576.)
But what of the case where the device isn't virtio? Would you then specify a <driver> with no name attribute? (eg "<driver sndbuf='0'/>")
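The point about 0 being a legitimate explicit value means the parser has to keep "not specified" distinct from "0". A small sketch of one way to do that, assuming the sndbuf attribute lives on <driver> as in the example above; `parse_sndbuf` and `tap_options` are hypothetical helpers, and the "sndbuf=N" string only stands in for whatever would actually be passed on the commandline:

```python
import xml.etree.ElementTree as ET
from typing import List, Optional


def parse_sndbuf(interface_xml: str) -> Optional[int]:
    """Return the configured sndbuf, or None if it was not specified."""
    driver = ET.fromstring(interface_xml).find("driver")
    if driver is None:
        return None
    raw = driver.get("sndbuf")
    if raw is None:
        return None
    value = int(raw)
    if value < 0:
        raise ValueError("sndbuf must be non-negative")
    return value


def tap_options(sndbuf: Optional[int]) -> List[str]:
    """Emit an option only when sndbuf was explicitly set.

    None means "say nothing", which leaves the hypervisor default
    in place; 0 is passed through as a real value.
    """
    return [] if sndbuf is None else [f"sndbuf={sndbuf}"]
```

The same helper works for a <driver> with no name attribute, since it only looks at sndbuf.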

On Wed, Jan 05, 2011 at 09:57:42AM -0500, Laine Stump wrote:
On 01/05/2011 05:19 AM, Daniel P. Berrange wrote:
We should try to keep terminology matching the disk <driver> so I think
  <driver name='qemu|vhost'/>
with omission of <driver> resulting in us automatically adding either 'qemu' or 'vhost' to the XML. We don't want to have an explicit 'default' value in the XML, because users should be able to see what the guest is running with.
Do you mean to add it to the XML that's saved in the config? If so, that would mean that it would only be possible to configure it as "use whatever is best for the current situation" for the first startup of the domain. Once that happened, it would be stuck on whichever was used the first time (qemu or vhost), so if the domain was first started when vhost-net was loaded, then later restarted when vhost-net wasn't loaded (or maybe migrated to another host that didn't have vhost support), it would fail to start.
Yes, I *did* actually mean to set it in the permanent XML config, so once a choice is made, that choice is preserved thereafter. This gives us better reliability in the future if further possible 'default' options are introduced and we want to avoid existing guests accidentally getting the new option for some reason.
Aside from that, I'd been thinking that the "backend" driver in this case is virtio, not qemu or vhost; qemu(userland) vs vhost seems like just a setting within that driver. So it doesn't seem appropriate to me to have the name decide whether to use userland or vhost.
Hmm, I thought vhost was a property of any tap device based backend, rather than virtio ?
One other twist - there's already another request for something else to be set for each network device: sndbuf.
https://bugzilla.redhat.com/show_bug.cgi?id=665293
The sndbuf setting is applicable to any network device that connects to the real world using a tap device (ie, not just virtio). If we want to add that setting via the same scheme, we would need something like:
  <driver name='qemu|vhost' sndbuf='0'/>
(0 can't be the default, because 0 is actually one of the settings that they want to explicitly specify; if sndbuf isn't given on the commandline, qemu defaults to 1048576.)
sndbuf is much more like a true "tunable" than vhost is, so I think it makes sense to have a generic representation for NIC tunables. Daniel

On 01/07/2011 10:55 AM, Daniel P. Berrange wrote:
On Wed, Jan 05, 2011 at 09:57:42AM -0500, Laine Stump wrote:
On 01/05/2011 05:19 AM, Daniel P. Berrange wrote:
We should try to keep terminology matching the disk <driver> so I think
  <driver name='qemu|vhost'/>
with omission of <driver> resulting in us automatically adding either 'qemu' or 'vhost' to the XML. We don't want to have an explicit 'default' value in the XML, because users should be able to see what the guest is running with.
Do you mean to add it to the XML that's saved in the config? If so, that would mean that it would only be possible to configure it as "use whatever is best for the current situation" for the first startup of the domain. Once that happened, it would be stuck on whichever was used the first time (qemu or vhost), so if the domain was first started when vhost-net was loaded, then later restarted when vhost-net wasn't loaded (or maybe migrated to another host that didn't have vhost support), it would fail to start.
Yes, I *did* actually mean to set it in the permanent XML config, so once a choice is made, that choice is preserved thereafter. This gives us better reliability in the future if further possible 'default' options are introduced and we want to avoid existing guests accidentally getting the new option for some reason.
And if that auto-selected option isn't possible (esp. if "vhost" is selected), should it fail? Or fall back? It seems friendlier to me to have a mode that tries to do the best it can with the current situation (similar to how it works now without any config).
Aside from that, I'd been thinking that the "backend" driver in this case is virtio, not qemu or vhost; qemu(userland) vs vhost seems like just a setting within that driver. So it doesn't seem appropriate to me to have the name decide whether to use userland or vhost.
Hmm, I thought vhost was a property of any tap device based backend, rather than virtio ?
Nope. According to cdub, vhost-net is only for virtio (although he points out that macvtap will work with any tap device, which I didn't know). At any rate, I'm not really comfortable putting this in the "name" attribute.
One other twist - there's already another request for something else to be set for each network device: sndbuf.
https://bugzilla.redhat.com/show_bug.cgi?id=665293
The sndbuf setting is applicable to any network device that connects to the real world using a tap device (ie, not just virtio). If we want to add that setting via the same scheme, we would need something like:
  <driver name='qemu|vhost' sndbuf='0'/>
(0 can't be the default, because 0 is actually one of the settings that they want to explicitly specify; if sndbuf isn't given on the commandline, qemu defaults to 1048576.)
sndbuf is much more like a true "tunable" than vhost is, so I think it makes sense to have a generic representation for NIC tunables.
Should these tunables be formatted like <memtune>?
  <memtune>
    <hard_limit>1048576</hard_limit>
    <soft_limit>131072</soft_limit>
    <swap_hard_limit>2097152</swap_hard_limit>
    <min_guarantee>65536</min_guarantee>
  </memtune>
So something like this?
  <interface type='network'>
    <source network='default'/>
    <model type='virtio'/>
    <tune>
      <sndbuf>0</sndbuf>
      ...
    </tune>
  </interface>
or some other name? Or possibly even forget about the <tune>, and just put them all at the top level of <interface>? (probably not, nesting it makes it more obvious what they are).
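For comparison, a nested <tune> block in the style of <memtune> is easy to generate from a plain name/value mapping. This is only a sketch of the formatting side, assuming the element is called <tune> as floated above; `add_tune` is a hypothetical helper and the element name is not settled in this thread:

```python
import xml.etree.ElementTree as ET
from typing import Dict


def add_tune(interface_xml: str, tunables: Dict[str, int]) -> str:
    """Append a <tune> child with one sub-element per tunable.

    An empty mapping adds nothing, so unset tunables never show up
    in the XML and cannot be confused with an explicit 0.
    """
    iface = ET.fromstring(interface_xml)
    if tunables:
        tune = ET.SubElement(iface, "tune")
        for name, value in tunables.items():
            ET.SubElement(tune, name).text = str(value)
    return ET.tostring(iface, encoding="unicode")
```

Adding a new tunable later only means adding another key to the mapping, which is the main attraction of the nested form over a growing attribute list.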
participants (4)
- Chris Wright
- Daniel P. Berrange
- Eric Blake
- Laine Stump