[Libvir] [PATCH] Enhanced stats for fullvirt domains

This patch does a couple of primary things: Firstly it allows you to use "hda", etc. as a path for getting block device stats from fullvirt domains. Secondly it separates out the stats code into a new file called 'stats_linux.c'. The reasoning behind the name is that this code can be shared between Xen & QEMU, and that the code is Linux-specific (it never worked on Solaris, but now this is explicit). I anticipate a 'stats_solaris.c' file once I can get Solaris + Xen going on a test machine. Also we try to detect the case where the block dev stats of a fullvirt domain are stuck at 0 -- caused by there being no frontend driver connected. We detect the condition by a query to xenstore. XENVBD_MAJOR is no longer hard-coded if we can get it instead from Linux header files. This patch adds bytes written/read to block devices for Xen PV domains if available. This also corrects a bug where stats from xvdb, xvdc, .. could not be read out because the device number was being miscalculated. Rich. -- Emerging Technologies, Red Hat - http://et.redhat.com/~rjones/ Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 03798903

Hi Rich, I know this patch just moved the code below, and the probability of data corruption and file I/O errors here is low, but... "Richard W.M. Jones" <rjones@redhat.com> wrote:
+static int64_t +read_stat (const char *path) +{ + char str[64]; + int64_t r; + int i; + FILE *fp; + + fp = fopen (path, "r"); + if (!fp) return -1; + /* stupid GCC warning */ i = fread (str, sizeof str, 1, fp); + r = strtoll (str, NULL, 10); + fclose (fp); + return r; +}
Since all of fread, strtoll, and fclose can fail, and since the 64 bytes from fread might be a valid prefix, but not terminated (i.e., strtoll could overrun the STR buffer -- yeah, it's far-fetched, but still) the above should probably be rewritten something like e.g., WARNING: the following may not even compile /* Convert NUL-or-NL-terminated string to int64_t, detecting overflow, invalid string (i.e., non-digit), or a long long value that doesn't fit in int64_t (probably only theoretical). */ static int xstrtoint64 (char const *s, int base, int64_t *result) { long long int lli; char *p; errno = 0; lli = strtoll (s, &p, base); if (errno || !(*p == 0 || *p == '\n') || p == s || (int64_t) lli != lli) return -1; *result = lli; return 0; } static int64_t read_stat (const char *path) { char str[64]; int64_t r; int i; FILE *fp; fp = fopen (path, "r"); if (!fp) return -1; /* read, but don't bail out before closing */ i = fread (str, sizeof str, 1, fp); if (fclose (fp) != 0 || i < 2 /* ensure we read at least two bytes */ || str[i - 1] != 0 /* the last byte must be zero */ || xstrtoint64 (str, 10, &r) != 0) return -1; return r; }

Thanks Jim. Here is an updated patch which fixes this issue, and is tested against Xen 3.1. Rich. -- Emerging Technologies, Red Hat - http://et.redhat.com/~rjones/ Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 03798903

On Fri, Sep 28, 2007 at 10:26:05AM +0100, Richard W.M. Jones wrote:
Thanks Jim. Here is an updated patch which fixes this issue, and is tested against Xen 3.1.
The principle of the patch sounds good to me, I didn't spot anything obvious, but I didn't try to test it. +1 Once it's in CVS and in a release we will see more testing of those APIs Daniel -- Red Hat Virtualization group http://redhat.com/virtualization/ Daniel Veillard | virtualization library http://libvirt.org/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

Hi Rich, [I didn't look through the whole patch, initially. Here's the result of a more thorough review.]
+/* In Xenstore, /local/domain/0/backend/vbd/<domid>/<device>/state, + * if available, must be XenbusStateConnected (= 4), otherwise there + * is no connected device. + */ +static int +check_bd_connected (xenUnifiedPrivatePtr priv, int device, int domid) +{ + char s[256], *rs; + int r; + unsigned len = 0; + + /* This code assumes we're connected if we can't get to + * xenstore, etc. + */ + if (!priv->xshandle) return 1; + snprintf (s, sizeof s, "/local/domain/0/backend/vbd/%d/%d/state", + domid, device); + s[sizeof s - 1] = '\0'; + + rs = xs_read (priv->xshandle, 0, s, &len); + if (!rs) return 1; + + r = STREQ (rs, "4"); + free (rs); + return r; +}
That function should also check LEN (i.e., what if it's 0 or 1?). Otherwise, STREQ might read uninitialized memory. The "unsigned len = 0;" initialization looks unnecessary. ==================================================================
+int +xenLinuxDomainBlockStats (xenUnifiedPrivatePtr priv, + virDomainPtr dom, + const char *path, + struct _virDomainBlockStats *stats) +{ + int minor, device; + + /* Paravirt domains: + * Paths have the form "xvd[a-]" and map to paths + * /sys/devices/xen-backend/(vbd|tap)-domid-major:minor/ + * statistics/(rd|wr|oo)_req. + * The major:minor is in this case fixed as 202*256 + minor*16 + * where minor is 0 for xvda, 1 for xvdb and so on. + */ + if (strlen (path) == 4 && + STREQLEN (path, "xvd", 3)) { + if ((minor = path[3] - 'a') < 0 || minor > 26) { + statsErrorFunc (VIR_ERR_INVALID_ARG, __FUNCTION__, + "invalid path, should be xvda, xvdb, etc.", + dom->id); + return -1; + } + device = XENVBD_MAJOR * 256 + minor * 16; + + return read_bd_stats (priv, device, dom->id, stats); + } + /* Fullvirt domains: + * hda, hdb etc map to major = HD_MAJOR*256 + minor*16. + */ + else if (strlen (path) == 3 && + STREQLEN (path, "hd", 2)) { + if ((minor = path[2] - 'a') < 0 || minor > 26) { + statsErrorFunc (VIR_ERR_INVALID_ARG, __FUNCTION__, + "invalid path, should be hda, hdb, etc.", + dom->id); + return -1; + } + device = HD_MAJOR * 256 + minor * 16;
Probably won't ever happen, but it looks like the code should require that minor "letters" be in [a-o] (aka ['a'..'a'+15]). If it were to allow larger ones, the "minor * 16" part would be large enough to modify the major number bits of "device". That sounds undesirable. So, if this is something to worry about, you could change "minor > 26" to "minor > 15" in the two tests above. ==================================================================
+static void +statsErrorFunc(virErrorNumber error, const char *func, const char *info, + int value) +{ + char fullinfo[1000]; + const char *errmsg; + + errmsg = __virErrorMsg(error, info); + if (func != NULL) { + snprintf(fullinfo, 999, "%s: %s", func, info); + fullinfo[999] = 0; + __virRaiseError(NULL, NULL, NULL, VIR_FROM_XEN, error, VIR_ERR_ERROR, + errmsg, fullinfo, NULL, value, 0, errmsg, fullinfo, + value); + } else { + __virRaiseError(NULL, NULL, NULL, VIR_FROM_XEN, error, VIR_ERR_ERROR, + errmsg, info, NULL, value, 0, errmsg, info, + value);
Use "sizeof fullinfo - 1", not 999. Rather than duplicating the __virRaiseError call, how about calling it just once, with info or fullinfo? ==================================================================
+static int +read_bd_stats (xenUnifiedPrivatePtr priv, ... + /* 'Bytes' was really sectors when we read it. Scale up by + * an assumed sector size. + */ + if (stats->rd_bytes > 0) stats->rd_bytes *= 512; + if (stats->wr_bytes > 0) stats->wr_bytes *= 512;
For large starting values of rd_bytes and wr_bytes, it'd be nice if rather than wrapping around, the result were the maximum value for that type -- or maybe a sentinel value.

Jim Meyering wrote:
+/* In Xenstore, /local/domain/0/backend/vbd/<domid>/<device>/state, + * if available, must be XenbusStateConnected (= 4), otherwise there + * is no connected device. + */ +static int +check_bd_connected (xenUnifiedPrivatePtr priv, int device, int domid) +{ + char s[256], *rs; + int r; + unsigned len = 0; + + /* This code assumes we're connected if we can't get to + * xenstore, etc. + */ + if (!priv->xshandle) return 1; + snprintf (s, sizeof s, "/local/domain/0/backend/vbd/%d/%d/state", + domid, device); + s[sizeof s - 1] = '\0'; + + rs = xs_read (priv->xshandle, 0, s, &len); + if (!rs) return 1; + + r = STREQ (rs, "4"); + free (rs); + return r; +}
That function should also check LEN (i.e., what if it's 0 or 1?). Otherwise, STREQ might read uninitialized memory.
I've addressed that by doing something, hopefully the right thing, if len == 0. (We would expect len == 1).
The "unsigned len = 0;" initialization looks unnecessary.
I left that because I can't see it doing any harm in this case.
==================================================================
+int +xenLinuxDomainBlockStats (xenUnifiedPrivatePtr priv, + virDomainPtr dom, + const char *path, + struct _virDomainBlockStats *stats) +{ + int minor, device; + + /* Paravirt domains: + * Paths have the form "xvd[a-]" and map to paths + * /sys/devices/xen-backend/(vbd|tap)-domid-major:minor/ + * statistics/(rd|wr|oo)_req. + * The major:minor is in this case fixed as 202*256 + minor*16 + * where minor is 0 for xvda, 1 for xvdb and so on. + */ + if (strlen (path) == 4 && + STREQLEN (path, "xvd", 3)) { + if ((minor = path[3] - 'a') < 0 || minor > 26) { + statsErrorFunc (VIR_ERR_INVALID_ARG, __FUNCTION__, + "invalid path, should be xvda, xvdb, etc.", + dom->id); + return -1; + } + device = XENVBD_MAJOR * 256 + minor * 16; + + return read_bd_stats (priv, device, dom->id, stats); + } + /* Fullvirt domains: + * hda, hdb etc map to major = HD_MAJOR*256 + minor*16. + */ + else if (strlen (path) == 3 && + STREQLEN (path, "hd", 2)) { + if ((minor = path[2] - 'a') < 0 || minor > 26) { + statsErrorFunc (VIR_ERR_INVALID_ARG, __FUNCTION__, + "invalid path, should be hda, hdb, etc.", + dom->id); + return -1; + } + device = HD_MAJOR * 256 + minor * 16;
Probably won't ever happen, but it looks like the code should require that minor "letters" be in [a-o] (aka ['a'..'a'+15]). If it were to allow larger ones, the "minor * 16" part would be large enough to modify the major number bits of "device". That sounds undesirable.
So, if this is something to worry about, you could change "minor > 26" to "minor > 15" in the two tests above.
Oh dear, that's a mess isn't it. I can't find out how the devices above xvdo should be numbered, so I took your advice and limited the minor to <= 15. The same for devices >= hdo too.
==================================================================
+static void +statsErrorFunc(virErrorNumber error, const char *func, const char *info, + int value) +{ + char fullinfo[1000]; + const char *errmsg; + + errmsg = __virErrorMsg(error, info); + if (func != NULL) { + snprintf(fullinfo, 999, "%s: %s", func, info); + fullinfo[999] = 0; + __virRaiseError(NULL, NULL, NULL, VIR_FROM_XEN, error, VIR_ERR_ERROR, + errmsg, fullinfo, NULL, value, 0, errmsg, fullinfo, + value); + } else { + __virRaiseError(NULL, NULL, NULL, VIR_FROM_XEN, error, VIR_ERR_ERROR, + errmsg, info, NULL, value, 0, errmsg, info, + value);
Use "sizeof fullinfo - 1", not 999. Rather than duplicating the __virRaiseError call, how about calling it just once, with info or fullinfo?
Yes, the perils of copying and pasting code. Fixed both.
==================================================================
+static int +read_bd_stats (xenUnifiedPrivatePtr priv, ... + /* 'Bytes' was really sectors when we read it. Scale up by + * an assumed sector size. + */ + if (stats->rd_bytes > 0) stats->rd_bytes *= 512; + if (stats->wr_bytes > 0) stats->wr_bytes *= 512;
For large starting values of rd_bytes and wr_bytes, it'd be nice if rather than wrapping around, the result were the maximum value for that type -- or maybe a sentinel value.
It would have to be really really large (these are 64 bit numbers), but yes I have fixed this too. Updated patch is attached. Thanks, Rich. -- Emerging Technologies, Red Hat - http://et.redhat.com/~rjones/ Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 03798903

"Richard W.M. Jones" <rjones@redhat.com> wrote: ...
That function should also check LEN (i.e., what if it's 0 or 1?). Otherwise, STREQ might read uninitialized memory.
I've addressed that by doing something, hopefully the right thing, if len == 0. (We would expect len == 1).
Yeah, you're right. len==0 was the only missed case. I didn't realize at the time that "len" doesn't include the trailing NUL byte.
The "unsigned len = 0;" initialization looks unnecessary.
I left that because I can't see it doing any harm in this case.
It's not a big deal. True, it causes no run-time harm, but anything unnecessary can make the reviewer/maintainer wonder "why?". ...
It would have to be really really large (these are 64 bit numbers), but yes I have fixed this too.
Updated patch is attached.
Looks fine, now.

This patch has been applied. Rich. -- Emerging Technologies, Red Hat - http://et.redhat.com/~rjones/ Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 03798903
participants (3)
-
Daniel Veillard
-
Jim Meyering
-
Richard W.M. Jones