On 08/25/2017 02:30 AM, Martin Kletzander wrote:
On Wed, Aug 23, 2017 at 04:47:03PM -0400, John Ferlan wrote:
>
>
> On 08/23/2017 07:47 AM, Martin Kletzander wrote:
>> We always truncated the name at 20 bytes instead of characters. In
>> case 20 bytes were in the middle of a multi-byte character, then the
>> string became invalid and various parts of the code would error
>> out (e.g. XML parsing of that string). Let's instead properly
>> truncate it after 20 characters instead.
>>
>> In some cases the truncation didn't even work (as can be seen from the
>> modified tests), which is fixed by this as well.
>
> Didn't work as well? Try at all...
>
> As a test I changed the name of pcie-expander-bus to 80 characters and
> all 80 were printed. I think this goes against your original intention
> of a shortname...
>
>>
>> Resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1448766
>>
>> Signed-off-by: Martin Kletzander <mkletzan(a)redhat.com>
>> ---
>> src/conf/domain_conf.c | 30
>> +++++++++++++++++++---
>> .../qemuxml2argv-aarch64-virt-default-nic.args | 2 +-
>> .../qemuxml2argv-hugepages-pages2.args | 4 +--
>> .../qemuxml2argv-hugepages-pages3.args | 5 ++--
>> .../qemuxml2argv-hugepages-pages5.args | 4 +--
>> .../qemuxml2argv-hugepages-pages6.args | 2 +-
>> .../qemuxml2argv-pcie-expander-bus.args | 2 +-
>> 7 files changed, 37 insertions(+), 12 deletions(-)
>>
>
> Can we add a multibyte name test? These just modify existing tests (I
> think wrongly too).
>
Good point, I don't know why I didn't add one. I created one especially
for testing this, but didn't add it to the repo it seems. However I'm
not sure we will be able to force the characters to be multibyte, we
would have to modify the locale of the test, but also skip the test if
the NLS for the filesystem is something like ISO 8859-1 or so. I can
add one anyway, at least on some systems it might check for something.
>
>> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
>> index 47eba4dbb315..9a46ceece2d6 100644
>> --- a/src/conf/domain_conf.c
>> +++ b/src/conf/domain_conf.c
>> @@ -27131,6 +27131,8 @@ virDomainDefHasMemballoon(const virDomainDef
>> *def)
>> }
>>
>>
>> +#define VIR_DOMAIN_SHORT_NAME_MAX 20
>> +
>> /**
>> * virDomainObjGetShortName:
>> * @vm: Machine for which to get a name
>> @@ -27141,15 +27143,37 @@ virDomainDefHasMemballoon(const
>> virDomainDef *def)
>> char *
>> virDomainObjGetShortName(const virDomainDef *def)
>> {
>> - const int dommaxlen = 20;
>> + wchar_t wshortname[VIR_DOMAIN_SHORT_NAME_MAX + 1] = {0};
>> + size_t len = 0;
>> + char *shortname = NULL;
>> char *ret = NULL;
>>
>
> It would seem that we could figure there were multibyte chars in the
> name "if strlen(def->name) != mbstowcs(NULL, def->name, 0)" and not
do
> any conversion, leaving the old code "as is" (mostly) and then
> implementing something for these wide characters.
>
Good point, we can do that.
> Of course I'm not getting anything other than -1 returned for the
> mbstowcs and I didn't really want to dig any further, so I leave it up
> to you!
>
What the errno in that case?
"Invalid or incomplete multibyte or wide character"
Same message even after adding a virGettextInitialize call in the
function as a test... It was one of those annoying end of the day
things too - as in this *should* work, why is it not working.
> I tried to modify the same test and change the name to
"ÄèÎØÛ", but the
> mbstowcs kept returning -1, until I added:
>
> if (setlocale(LC_ALL, "en_US.UTF-8") == NULL) {
>
What is the output of `locale` on your machine?
en_US.UTF-8
> But how does one know which locale to use? From my quick read of
the
> mbstowcs man page it seems a locale needs to be set first. I also tried
> the man page example of "Grüße!" and that worked with UTF-8.
>
It should use the locale of the system. Maybe if you have e.g. 8859-1
set, there will be no multibyte characters, so it does not do anything.
If we can gather that info from the errno, then we can safely skip the
conversion as well.
>> - ignore_value(virAsprintf(&ret, "%d-%.*s",
>> - def->id, dommaxlen, def->name));
>> + if (mbstowcs(wshortname, def->name, VIR_DOMAIN_SHORT_NAME_MAX)
>> == (size_t) -1) {
>> + virReportError(VIR_ERR_INVALID_ARG, "%s",
>> + _("Cannot convert domain name to wide
>> character string"));
>> + return NULL;
>> + }
>> +
>> + len = wcstombs(NULL, wshortname, 0);
>> + if (len == (size_t) -1)
>> + return NULL;
>>
>> + if (VIR_ALLOC_N(shortname, len + 1) < 0)
>> + return NULL;
>> +
>> + if (wcstombs(shortname, wshortname, len) == (size_t) -1) {
>> + virReportError(VIR_ERR_INVALID_ARG, "%s",
>> + _("Cannot convert domain name from wide
>> character string"));
>> + goto cleanup;
>> + }
>> +
>> + ignore_value(virAsprintf(&ret, "%d-%s", def->id,
def->name));
>
> You go through all this trouble and still write def->name! I didn't
> realize that at first either - kept wondering why the whole damn name
> was being printed!
>
What the fsck have I sent?!?!?
Yeah well I only saw it after spending more than a few minutes messing
around with the code.
I see what I did wrong, interesting. So I ran the tests with
VIR_TEST_DEBUG=1 and I've seen some truncation not happening and I (by
mistake) though that new truncation *is* happening now, so I re-ran with
VIR_TEST_REGENERATE_OUTPUT=1. And it fixed the issue because there was
no more truncation happening.
Ahhh... If I run the test directly:
VIR_TEST_DEBUG=1 ./run tests/qemuxml2argvtest
then things work as (mostly) expected as long as locale is set via a
virGettextInitialize in virDomainObjGetShortName; however, if I run via:
VIR_TEST_DEBUG=1 make -C tests check TESTS=qemuxml2argvtest
then I get the mbtowcs failure util I explicitly set the locale to
en_US.UTF-8 in virDomainObjGetShortName.
It's very odd/strange...
Since I only was messing with pcie-expander-bus-test I can also use
VIR_TEST_RANGE=565 in order to see just that output...
Anyway, thanks for seeing that, that's what the reviews are for ;) v2 on
the way.
I see v2 hit the list - I'll look at that...
John