On Mon, Nov 28, 2016 at 11:15:41 +0300, Nikolay Shirokovskiy wrote:
Instead I suggest to serialize qemu monitor commands.
That won't be possible. There are a few APIs that call multiple commands
and expect the VM not to change in between. Thus they rely on nested
jobs. See migration and snapshot code (btw, the two most complex parts
of libvirt).
My original motivation is to prepare job exclusion code to add agent
jobs. This
job is a special job to operate on guest agent only so first it can be run in
parallel with sync, async and nested jobs and second it should not block any
That is possible. The async jobs allow for query jobs to be run while
they are executed you need to properly configure them.
qemu monitor job at all. Eventually I want to move from a lot of
different
types of jobs to just a job with some kind of matrix which describes what jobs
can run in a parallel. As a first step I want to drop nested jobs.
I don't think you will be able to do this in steps. Nested jobs have a
vital functions and unless you replace the mechanism or overhaul all the
code that is doing nested jobs you won't be able to guarantee that the
state of the objects are as the code expects.
Doin a ful N*N matrix of possible parallel operations will be very hard
to maintain. You will need to group them together in a way so that we
don't have to deal with the quadratic complexity. How are you expecting
to do this?
AFAIU *main* nested job purpuse is to serialize requests to monitor
from async
jobs and allowed sync jobs as monitor does not do it by himself now. Strictly
No, the main purpose is to allow to call multiple monitor commands while
you are guaranteed that something did not change the VM state and thus
invalidate the complex operation you are trying to execute.
speaking this assertion is rather rough because nested jobs and sync
jobs
serialized at job level and thus chunks of code that serialized differ from
case when only monitor requests are serialized. Nevertheless IFAIU this part of
libvirt is not strict too. We have a lot of query commands and async commands
that can execute in parallel and have no definition of what query job or async
job are. Thus it is not clear is this code correct or not. So the same argument
Certainly it's more correct this way than dropping it without
replacement.
is applicable to new approach: query jobs are safe to be called
during async
jobs. (Except that new approach opens situation when async job interrupts
Some of qemu monitor commands block on execution. The job mechanism
prevents the second query call to get stuck waiting forever for the job
to finish by doing a timed wait.
query job itself but it generally not clear how this can break query
job).
On this path job became high level concept which I think it should be. Also we
drop boring passing of current async job which we need only for safety check
deep down below the stack. We drop driver passing for nested jobs which
I guess we really don't need to pass at all as we need it only to check queue
size and nested job probably should skip this check.
I did not get what you wanted to say here.
The diff stat looks large but most of code is dumb renaming or
dropping unused
parameter in patches 4 and 5. To reduce risk of introducing an error on this
tedious task I used sed and perl where I can.
I ran only a simple test with this series of quering a domain during its
migration.
That is a weak test for doing such a complex change. I'd suggest you run
parallel migrations and snapshot to see if it works. You need to run
operations that change the VM configuration.
I don't think we can do this with a complete overhaul of snapshots,
migration and a few others that actually have an expectation of the VM
object not changing while we are in an async job.
Peter