Re: [libvirt] [Qemu-devel] QMP: RFC: I/O error info & query-stop-reason

2 Jun 2011


      On 06/02/2011 02:13 PM, Luiz Capitulino wrote:
...
On Thu, 02 Jun 2011 13:33:52 -0500
Anthony Liguori<anthony@codemonkey.ws>  wrote:
...
On 06/02/2011 01:09 PM, Luiz Capitulino wrote:
...
On Thu, 02 Jun 2011 13:00:04 -0500
Anthony Liguori<anthony@codemonkey.ws>   wrote:
...
On 06/02/2011 12:57 PM, Luiz Capitulino wrote:
...
On Wed, 01 Jun 2011 16:35:03 -0500
Anthony Liguori<anthony@codemonkey.ws>    wrote:
...
On 06/01/2011 04:12 PM, Luiz Capitulino wrote:
> Hi there,
>
> There are people who want to use QMP for thin provisioning. That is, the VM is
> started with a small amount of storage, and when a no-space error is triggered,
> more space is allocated and the VM is resumed.
>
> QMP has two limitations that prevent people from doing this today:
>
> 1. The BLOCK_IO_ERROR doesn't contain error information
>
> 2. Considering we solve item 1, we still have to provide a way for clients
>    to query why a VM stopped. This is needed because clients may miss the
>    BLOCK_IO_ERROR event or may connect to the VM while it's already stopped
>
> A proposal to solve both problems follows.
>
> A. BLOCK_IO_ERROR information
> -----------------------------
>
> We have already discussed this a lot but didn't reach a consensus. My solution
> is quite simple: add a stringified errno name to the BLOCK_IO_ERROR event,
> for example (see the "reason" key):
>
> { "event": "BLOCK_IO_ERROR",
>   "data": { "device": "ide0-hd1",
>             "operation": "write",
>             "action": "stop",
>             "reason": "enospc" } }
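For illustration only, here is a minimal Python sketch of how a management client might react to the event above. It assumes the proposed "reason" key is adopted with the value "enospc"; that key is only a proposal in this thread. The function name device_needing_space is hypothetical and is not part of QMP or libvirt.

```python
import json


def device_needing_space(line):
    """Return the device name if `line` is a BLOCK_IO_ERROR event that
    stopped the VM for lack of space, otherwise None."""
    msg = json.loads(line)
    if msg.get("event") != "BLOCK_IO_ERROR":
        return None
    data = msg.get("data", {})
    # ASSUMPTION: "reason" is the key proposed in this thread, with the
    # stringified errno name "enospc" as its value.
    if data.get("action") == "stop" and data.get("reason") == "enospc":
        return data.get("device")
    return None


# The event from the proposal, written as one line of JSON.
sample = ('{"event": "BLOCK_IO_ERROR", "data": {"device": "ide0-hd1", '
          '"operation": "write", "action": "stop", "reason": "enospc"}}')
print(device_needing_space(sample))
```

A client following this pattern would allocate more storage for the returned device and then resume the VM. It would still need a query command for the case where the event was missed.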
You can call the reason whatever you want, but don't call it a stringified
errno name :-)
In fact, just make reason "no space".
You mean, we should do:
"reason": "no space"
Or that we should make it a boolean, like:
"no space": true
Do we need reason in BLOCK_IO_ERROR if query-block returns this information?
True, no.
...
...
I'm ok with either way. But in case you meant the second one, I guess
we should make "reason" a dictionary so that we can group related
information when we extend the field, for example:
"reason": { "no space": false, "no permission": true }
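A rough Python sketch of what the dictionary form could look like from the client side, assuming "reason" becomes a dictionary of boolean flags as in the line above. The helper names active_reasons and should_grow_storage are hypothetical. The point is that a client can ignore flags it does not know about, which is what makes the field extensible.

```python
def active_reasons(reason):
    """Return the sorted names of all flags set to true in a reason dict."""
    return sorted(name for name, flag in reason.items() if flag is True)


def should_grow_storage(reason):
    """True only when the "no space" flag is present and set.

    ASSUMPTION: the flag name "no space" comes from this thread's proposal;
    any other flags are simply ignored.
    """
    return reason.get("no space", False) is True


example = {"no space": False, "no permission": True}
print(active_reasons(example))
print(should_grow_storage(example))
```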
Why would we ever have "no permission"?
Why did it happen?  It's not clear to me when read/write would return
EPERM.  open() should fail.  In fact, EPERM is not mentioned in man 2 read.
Actually, the error was an EACCES, which might sound more bizarre :)
What happened was that the device file in question had its permission
changed during VM execution due to a bug somewhere else. I'm not sure if
the error was returned in a read() or write() (Kevin might have more details).
Strange, EACCES should only happen on open().  Is it possible that 
somehow a reopen was happening?
...
This is a bit extreme and I'd agree it's arguable whether or not we should
report EACCES, but I had this in mind and ended up mentioning it...
If we can't explain why an error would occur, we shouldn't make it part 
of the protocol :-)
...
Maybe libvirt guys could provide more input wrt the error reason usage.
If we don't have valid use cases for other errors, then I'll agree that
providing only "no space" is enough.
Definitely!  Adding libvirt to the CC to help encourage their input.

Regards,

Anthony Liguori