On 7/28/20 12:03 PM, Paulo de Rezende Pinatti wrote:
Context:
Libvirt can already detect the active VFs of an SRIOV PF device specified in a network
definition and automatically assign these VFs to guests via an <interface> entry
referring to that network in the domain definition. This functionality, however, depends
on the system administrator having activated in advance the desired number of VFs outside
of libvirt (either manually or through system scripts).
It would be more convenient if VF activation could also be managed inside libvirt, so
that the whole management of the VF pool is done exclusively by libvirt and in only one
place (the network definition) rather than spread across different components of the system.
Proposal:
We can extend the existing network definition by adding a new tag <vf> as a child
of the tag <pf> in order to allow the user to specify how many VFs they wish to have
activated for the corresponding SRIOV device when the network is started. That would look
like the following:
<network>
  <name>sriov-pool</name>
  <forward mode='hostdev' managed='yes'>
    <pf dev='eth1'>
      <vf num='10'/>
    </pf>
  </forward>
</network>
At XML definition time nothing gets changed on the system, as it is today. When the
network is started with 'virsh net-start sriov-pool', libvirt will activate
the desired number of VFs as specified in the tag <vf> of the network definition.
The operation might require resetting 'sriov_numvfs' to zero first in case the
number of VFs currently active differs from the desired value. In order to avoid the
situation where the user tries to start the network when a VF is already assigned to a
running guest, the implementation will have to ensure all existing VFs of the target PF
are not in use, otherwise VFs would be inadvertently hot-unplugged from guests upon
network start. In such cases, trying to start the network will then result in an error.
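The start-time sequence described above can be sketched roughly as follows. This is only
an illustration under stated assumptions, not libvirt code: set_num_vfs, device_dir and
the vfs_in_use callback are hypothetical names, and device_dir stands for the PF's sysfs
device directory (for example /sys/class/net/eth1/device). Whether writing 0 first is
appropriate for every SR-IOV card is exactly the open question in this thread.

```python
from pathlib import Path
from typing import Callable


def set_num_vfs(device_dir: Path, desired: int,
                vfs_in_use: Callable[[], bool]) -> bool:
    """Bring sriov_numvfs under device_dir to `desired`.

    Returns True if the VF count was changed, False if it already matched.
    `vfs_in_use` is a hypothetical callback that reports whether any VF of
    this PF is currently assigned to a running guest.
    """
    numvfs = device_dir / "sriov_numvfs"
    current = int(numvfs.read_text().strip())

    if current == desired:
        return False  # nothing to do, existing VFs are left untouched

    if current > 0:
        # Changing the count removes the existing VFs, so refuse if a guest
        # would lose its device.
        if vfs_in_use():
            raise RuntimeError(
                "refusing to change VF count while a VF is assigned to a guest")
        numvfs.write_text("0")  # reset before setting the new value

    numvfs.write_text(str(desired))
    return True
```

With a network definition like the one above, the call would be along the lines of
set_num_vfs(Path("/sys/class/net/eth1/device"), 10, some_in_use_check), and an error at
this point would translate into net-start failing.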
I'm not sure about the 'echo 0 > sriov_numvfs' part. It works like that for Mellanox
CX-4 and CX-5 cards, but I can't say it works like that for every other SR-IOV card out
there. Sooner or later we'll have to handle card-specific behavior to create
the VFs. Perhaps Laine can comment on this.
About the whole idea, it kind of changes the design of this network pool. As it is today,
at least from my reading of [1], Libvirt will use any available VF from the pool and
allocate it to the guest, coping with the existing host VF settings. With this new option,
Libvirt is now setting the VFs to a specific number, which might well be less than the
actual setting, disrupting the host for no apparent reason.
I would be on board with this idea if:
1 - The attribute is changed to "minimal VFs required for this pool" rather than
"change the host to match this VF number". This means that we wouldn't tamper with the
created VFs if the host already has more VFs than specified. In your example above,
setting 10 VFs, what if the host has 20 VFs? Why should Libvirt care about taking down
10 VFs that it wouldn't use in the first place?
2 - we find a universal way (or as close to universal as possible) to handle the
creation of VFs.
3 - we guarantee that the process of VF creation, which will take down all existing VFs
in the case of CX-5 cards with 'echo 0 > sriov_numvfs' for example, wouldn't disrupt the
host in any way.
(1) is an easier sell. Rename the attribute to "vf minimalNum" or something like
that, then check sriov_numvfs and refuse to net-start if the host has fewer VFs than the
set amount. Start the network if sriov_numvfs >= minimal. This would bring immediate
value to the existing design, allowing the user to specify the minimum number of VFs they
intend to consume from the pool.
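As a rough sketch of what the check in (1) amounts to, assuming the PF's current VF
count is read from sriov_numvfs in its sysfs device directory (has_minimum_vfs and
device_dir are hypothetical names, and "minimal" stands for the value of the suggested
attribute):

```python
from pathlib import Path


def has_minimum_vfs(device_dir: Path, minimal: int) -> bool:
    """Return True if the host already has at least `minimal` VFs active.

    Nothing is written to sysfs: the host's existing VF configuration is
    only inspected, never changed.
    """
    current = int((device_dir / "sriov_numvfs").read_text().strip())
    return current >= minimal
```

net-start would go ahead when this returns True and fail with an error otherwise, so a
host with 20 VFs and a pool asking for 10 is left exactly as it is.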
(2) and (3) are more complicated. Especially (2).
Thanks,
DHB
[1] https://wiki.libvirt.org/page/Networking#Assignment_from_a_pool_of_SRIOV_...
Stopping the network with 'virsh net-destroy' will cause all VFs to be removed.
As when starting the network, the implementation will also need to check for
running guests in order to prevent inadvertent hot-unplugging.
Is the functionality proposed above desirable?