
On 01/30/2012 09:00 AM, Jiri Denemark wrote:
We already provide ways to detect when a domain has been paused as a result of I/O error, but there was no way of getting the exact error or even the device that experienced it. This new API may be used for both. --- daemon/remote.c | 40 ++++++++++++++++++++++++++++++++ include/libvirt/libvirt.h.in | 32 +++++++++++++++++++++++++ python/generator.py | 3 +- src/driver.h | 7 +++++ src/libvirt.c | 52 ++++++++++++++++++++++++++++++++++++++++++ src/libvirt_public.syms | 1 + src/remote/remote_driver.c | 34 +++++++++++++++++++++++++++ src/remote/remote_protocol.x | 22 +++++++++++++++++- src/remote_protocol-structs | 16 +++++++++++++ src/rpc/gendispatch.pl | 47 +++++++++++++++++++++++++++++++++++++ 10 files changed, 252 insertions(+), 2 deletions(-)
This is big enough that I might have split it into public API (include, python, src/{driver.h,libvirt.c,libvirt_public.syms}) and RPC implementation (daemon, src/remote, src/rpc). But I'll go ahead and review this, whether or not you split it.
+++ b/include/libvirt/libvirt.h.in @@ -1967,6 +1967,38 @@ virDomainGetBlockIoTune(virDomainPtr dom, int *nparams, unsigned int flags);
+/** + * virDomainDiskErrorCode: + * + * Disk I/O error. + */ +typedef enum { + VIR_DOMAIN_DISK_ERROR_NONE = 0, /* no error */ + VIR_DOMAIN_DISK_ERROR_UNSPEC = 1, /* unspecified I/O error */ + VIR_DOMAIN_DISK_ERROR_NO_SPACE = 2, /* no space left on the device */ + +#ifdef VIR_ENUM_SENTINELS + VIR_DOMAIN_DISK_ERROR_LAST +#endif +} virDomainDiskErrorCode;
Looks reasonable and extensible.
+ +/** + * virDomainDiskError: + * + */ +typedef struct _virDomainDiskError virDomainDiskError; +typedef virDomainDiskError *virDomainDiskErrorPtr; + +struct _virDomainDiskError { + char *disk; /* disk target */ + int error; /* virDomainDiskErrorCode */ +}; + +int virDomainGetDiskErrors(virDomainPtr dom, + virDomainDiskErrorPtr errors, + int maxerrors,
I guess inertia is on our side for using int maxerrors, even though the value should never be negative.
+++ b/python/generator.py @@ -423,7 +423,8 @@ skip_impl = ( 'virDomainGetBlockIoTune', 'virDomainSetInterfaceParameters', 'virDomainGetInterfaceParameters', - 'virDomainGetCPUStats' # not implemented now. + 'virDomainGetCPUStats', # not implemented now. + 'virDomainGetDiskErrors',
Oops - blame me for not using a trailing comma, so that your hunk had to touch more lines than one :)
+/** + * virDomainGetDiskErrors: + * @dom: a domain object + * @errors: array to populate on output + * @maxerrors: size of @errors array + * @flags: extra flags; not used yet, so callers should always pass 0 + * + * The function populates @errors array with all disks that encountered an + * I/O error. Each disk is identified by its target (the dev attribute of + * target subelement in domain XML), such as "vda", and accompanied with + * the error that was seen on it.
Suppose I have a domain with 3 disks, and disk A and C have both encountered an error. It is not clear to me whether calling virDomainGetDiskErrors(dom, array, 3, 0) will return 2 (info on A and C, since those were the only ones with errors) or 3 (info on A, B, and C, with B explicitly stating that there were no errors). I guess virDomainDiskErrorPtr has a reserved value for no error, so I'm assuming the latter, but I'm wondering how best to word this. Maybe: s~array with all disks that encountered an I/O error~array with information on all disks, and whether they have encountered an I/O error~ Or, if you really did mean to return only the disks with errors while omitting working disks, then we should add a shortcut, where calling with errors==NULL and maxerrors==0 returns how many errors would be returned, in order for the caller to know how big to allocate things. For that matter, even if we return no error, it would _still_ be handy to support the special case as a way to count how many disks are being tracked for errors.
+ * Returns number of disks with errors filled in the @errors array or -1 on + * error.
Please also document that the caller is responsible for calling free() on each disk name returned.
+int +virDomainGetDiskErrors(virDomainPtr dom, + virDomainDiskErrorPtr errors, + int maxerrors, + unsigned int flags) +{ + VIR_DOMAIN_DEBUG(dom, "errors=%p, maxerrors=%d, flags=%x", + errors, maxerrors, flags); + + virResetLastError(); + + if (!VIR_IS_DOMAIN(dom)) { + virLibDomainError(VIR_ERR_INVALID_DOMAIN, __FUNCTION__); + virDispatchError(NULL); + return -1; + } + + if (dom->conn->flags & VIR_CONNECT_RO) { + virLibDomainError(VIR_ERR_OPERATION_DENIED, __FUNCTION__); + goto error; + }
Missing a sanity check that errors is non-NULL and maxerrors is > 0. It's a common check, so it should be here rather than making each driver repeat it, and we have precedence for that sort of sanity check in this file.
+struct remote_domain_get_disk_errors_ret { + remote_domain_disk_error errors<REMOTE_DOMAIN_DISK_ERRORS_MAX>; /* insert@1 */ +};
If we implement the shorthand of virDomainGetDiskErrors(dom, NULL, 0, 0) as a way to tell how big to allocate an array for the next call, then this struct also needs to have a return length independent from the array. Quite a few of the virTypedParameters functions should serve as a model.
+
/*----- Protocol. -----*/
@@ -2708,7 +2727,8 @@ enum remote_procedure { REMOTE_PROC_STORAGE_VOL_RESIZE = 260, /* autogen autogen */
REMOTE_PROC_DOMAIN_PM_SUSPEND_FOR_DURATION = 261, /* autogen autogen */ - REMOTE_PROC_DOMAIN_GET_CPU_STATS = 262 /* skipgen skipgen */ + REMOTE_PROC_DOMAIN_GET_CPU_STATS = 262, /* skipgen skipgen */ + REMOTE_PROC_DOMAIN_GET_DISK_ERRORS = 263 /* autogen autogen */
Nice that you figured out how to autogen the serialize/deserialize calls. We should probably do the same for a lot of the virTypedParameter clients someday (then again, I think the reason the virTypedParameter clients don't autogen is because we don't yet have the shorthand for determining allocation limits wired up in the generator, so I may have just broken that for you by requesting the shorthand). Since adding a shorthand for returning the allocation size would imply an ABI break for RPC, we need to do it now, before the 0.9.10 freeze, so I think it's best to post a v3. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org