
[...]
So, just to clarify the concept: the device with ^this UUID will have to have been defined through the nodedev API by the time we start editing the domain XML in this manner, in which case the only thing autocreate=yes would do is actually create the mdev according to the nodedev config, right? Continuing with that thought, if the UUID doesn't refer to any of the inactive configs, it will be an error, I suppose? What about the fact that only one vGPU type can live on the GPU? Even if you can successfully identify a device using the UUID in this way, you'll still face the problem that other types might currently be occupying the GPU and need to be torn down first. Will this be automated as well in what you suggest? I assume not.
Technically we shouldn't need the node device to exist at the time we define the XML; the node device only has to exist at the time we start the guest. E.g. it's the same way you list a virtual network as the source of a guest NIC, but that virtual network doesn't have to have been defined & started until the guest starts.
If there are constraints that a pGPU can only support a certain combination of vGPUs at any single point in time, doesn't the kernel already enforce that when you try to create the vGPU in sysfs? IOW, we merely need to try to create the vGPU, and if the kernel mdev driver doesn't allow you to mix that with the other vGPUs that already exist, then we'd just report an error from virNodeDeviceCreate, and that'd get propagated back as the error for the virDomainCreate call.
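As a rough illustration of "just try it and let the kernel say no", here is a minimal sketch of the sysfs side. The helper name mdev_try_create is hypothetical, and this is not how libvirt implements it; it only assumes the usual mdev interface, where a UUID is written to the "create" attribute under /sys/class/mdev_bus/<parent>/mdev_supported_types/<type>/. Any refusal by the kernel simply comes back as a negative errno that the caller can turn into an error message.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel to create an mdev by writing its UUID
 * to the "create" attribute of the chosen vGPU type. Returns 0 on success,
 * or a negative errno value if the open or the write is refused. */
int mdev_try_create(const char *create_path, const char *uuid)
{
    int fd = open(create_path, O_WRONLY);
    if (fd < 0)
        return -errno;

    size_t len = strlen(uuid);
    ssize_t written = write(fd, uuid, len);
    int ret = 0;

    if (written < 0)
        ret = -errno;          /* e.g. the mdev driver rejected this type */
    else if ((size_t)written != len)
        ret = -EIO;            /* short write: treat as a failure */

    close(fd);
    return ret;
}
```

No pre-checking of which vGPU types are currently active is done here; the write either succeeds or fails, and the failure is what would be reported upwards.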
</source> </hostdev> </devices>
In the QEMU driver, the only change then required is
if (def->autocreate) virNodeDeviceCreate(dev)
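Spelled out a little further, the start-up path could look something like the sketch below. Everything in it is a stand-in: node_device, hostdev_def, node_device_create and domain_prepare_hostdev are hypothetical names modelling the idea, not libvirt's actual structures or functions, and the "creatable" flag merely simulates whether the kernel would accept the creation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real libvirt structures. */
typedef struct {
    bool active;      /* true once the mdev exists on the host */
    bool creatable;   /* simulates whether the kernel accepts the creation */
} node_device;

typedef struct {
    bool autocreate;  /* mirrors the proposed autocreate attribute */
    node_device *dev; /* vGPU referenced by the hostdev UUID, or NULL */
} hostdev_def;

/* Stand-in for virNodeDeviceCreate: 0 on success, -1 on failure. */
int node_device_create(node_device *dev)
{
    if (!dev->creatable)
        return -1;
    dev->active = true;
    return 0;
}

/* Stand-in for the hostdev part of guest start-up. A failure here would be
 * what the caller of the domain start call ends up seeing. */
int domain_prepare_hostdev(hostdev_def *def)
{
    if (def->dev == NULL)
        return -1;             /* UUID matches no defined node device */
    if (def->dev->active)
        return 0;              /* already created, nothing to do */
    if (!def->autocreate)
        return -1;             /* inactive and not allowed to create it */
    return node_device_create(def->dev);
}
```

The lookup only happens at guest start, so the domain XML can be defined before the node device is.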
Aha, so if a device gets torn down on shutdown, we won't face the problem with some other devices being active, all of them will have to be in the inactive state because they got torn down during the last shutdown - that would work.
I'm not sure how the relationship with other active devices is relevant here. The virNodeDevicePtr we're accessing here is a single vGPU - if other running guests have further vGPUs on the same pGPU, that's not really relevant. Each vGPU is created/deleted as required.
I think he's talking about devices that were previously used by other domains that are no longer active. Since they're also automatically destroyed, they're not a problem.
Yes, that was exactly my point. Anyhow, it seems I've got a grasp of Dan's proposal then, great. Erik