On Wed, Aug 23, 2017 at 04:47:03PM -0400, John Ferlan wrote:
On 08/23/2017 07:47 AM, Martin Kletzander wrote:
> We always truncated the name at 20 bytes instead of characters. In
> case 20 bytes were in the middle of a multi-byte character, then the
> string became invalid and various parts of the code would error
> out (e.g. XML parsing of that string). Let's instead properly
> truncate it after 20 characters instead.
>
> In some cases the truncation didn't even work (as can be seen from the
> modified tests), which is fixed by this as well.
Didn't work as well? Try at all...
As a test I changed the name of pcie-expander-bus to 80 characters and
all 80 were printed. I think this goes against your original intention
of a shortname...
>
> Resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1448766
>
> Signed-off-by: Martin Kletzander <mkletzan(a)redhat.com>
> ---
> src/conf/domain_conf.c | 30 +++++++++++++++++++---
> .../qemuxml2argv-aarch64-virt-default-nic.args | 2 +-
> .../qemuxml2argv-hugepages-pages2.args | 4 +--
> .../qemuxml2argv-hugepages-pages3.args | 5 ++--
> .../qemuxml2argv-hugepages-pages5.args | 4 +--
> .../qemuxml2argv-hugepages-pages6.args | 2 +-
> .../qemuxml2argv-pcie-expander-bus.args | 2 +-
> 7 files changed, 37 insertions(+), 12 deletions(-)
>
Can we add a multibyte name test? These just modify existing tests (I
think wrongly too).
Good point, I don't know why I didn't add one. I created one especially
for testing this, but didn't add it to the repo it seems. However I'm
not sure we will be able to force the characters to be multibyte, we
would have to modify the locale of the test, but also skip the test if
the NLS for the filesystem is something like ISO 8859-1 or so. I can
add one anyway, at least on some systems it might check for something.
> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> index 47eba4dbb315..9a46ceece2d6 100644
> --- a/src/conf/domain_conf.c
> +++ b/src/conf/domain_conf.c
> @@ -27131,6 +27131,8 @@ virDomainDefHasMemballoon(const virDomainDef *def)
> }
>
>
> +#define VIR_DOMAIN_SHORT_NAME_MAX 20
> +
> /**
> * virDomainObjGetShortName:
> * @vm: Machine for which to get a name
> @@ -27141,15 +27143,37 @@ virDomainDefHasMemballoon(const virDomainDef *def)
> char *
> virDomainObjGetShortName(const virDomainDef *def)
> {
> - const int dommaxlen = 20;
> + wchar_t wshortname[VIR_DOMAIN_SHORT_NAME_MAX + 1] = {0};
> + size_t len = 0;
> + char *shortname = NULL;
> char *ret = NULL;
>
It would seem that we could figure there were multibyte chars in the
name "if strlen(def->name) != mbstowcs(NULL, def->name, 0)" and not do
any conversion, leaving the old code "as is" (mostly) and then
implementing something for these wide characters.
Good point, we can do that.
Of course I'm not getting anything other than -1 returned for the
mbstowcs and I didn't really want to dig any further, so I leave it up
to you!
What the errno in that case?
I tried to modify the same test and change the name to
"ÄèÎØÛ", but the
mbstowcs kept returning -1, until I added:
if (setlocale(LC_ALL, "en_US.UTF-8") == NULL) {
What is the output of `locale` on your machine?
But how does one know which locale to use? From my quick read of the
mbstowcs man page it seems a locale needs to be set first. I also tried
the man page example of "Grüße!" and that worked with UTF-8.
It should use the locale of the system. Maybe if you have e.g. 8859-1
set, there will be no multibyte characters, so it does not do anything.
If we can gather that info from the errno, then we can safely skip the
conversion as well.
> - ignore_value(virAsprintf(&ret, "%d-%.*s",
> - def->id, dommaxlen, def->name));
> + if (mbstowcs(wshortname, def->name, VIR_DOMAIN_SHORT_NAME_MAX) == (size_t)
-1) {
> + virReportError(VIR_ERR_INVALID_ARG, "%s",
> + _("Cannot convert domain name to wide character
string"));
> + return NULL;
> + }
> +
> + len = wcstombs(NULL, wshortname, 0);
> + if (len == (size_t) -1)
> + return NULL;
>
> + if (VIR_ALLOC_N(shortname, len + 1) < 0)
> + return NULL;
> +
> + if (wcstombs(shortname, wshortname, len) == (size_t) -1) {
> + virReportError(VIR_ERR_INVALID_ARG, "%s",
> + _("Cannot convert domain name from wide character
string"));
> + goto cleanup;
> + }
> +
> + ignore_value(virAsprintf(&ret, "%d-%s", def->id,
def->name));
You go through all this trouble and still write def->name! I didn't
realize that at first either - kept wondering why the whole damn name
was being printed!
What the fsck have I sent?!?!?
I see what I did wrong, interesting. So I ran the tests with
VIR_TEST_DEBUG=1 and I've seen some truncation not happening and I (by
mistake) though that new truncation *is* happening now, so I re-ran with
VIR_TEST_REGENERATE_OUTPUT=1. And it fixed the issue because there was
no more truncation happening.
Anyway, thanks for seeing that, that's what the reviews are for ;) v2 on
the way.