On Mon, Sep 19, 2016 at 08:43:17 +0200, Michal Privoznik wrote:
On 19.09.2016 08:32, Peter Krempa wrote:
> On Mon, Sep 19, 2016 at 07:55:47 +0200, Michal Privoznik wrote:
>> When trying to migrate a huge page enabled guest, I've noticed
>> the following crash. Apparently, if no specific hugepages are
>> requested:
>>
>> <memoryBacking>
>> <hugepages/>
>> </memoryBacking>
>>
>> and there are no hugepages configured on the destination, we try
>> to dereference a NULL pointer.
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x00007fcc907fb20e in qemuGetHugepagePath (hugepage=0x0) at
qemu/qemu_conf.c:1447
>> 1447 if (virAsprintf(&ret, "%s/libvirt/qemu",
hugepage->mnt_dir) < 0)
>> (gdb) bt
>> #0 0x00007fcc907fb20e in qemuGetHugepagePath (hugepage=0x0) at
qemu/qemu_conf.c:1447
>> #1 0x00007fcc907fb2f5 in qemuGetDefaultHugepath (hugetlbfs=0x0, nhugetlbfs=0)
at qemu/qemu_conf.c:1466
>> #2 0x00007fcc907b4afa in qemuBuildMemoryBackendStr (size=4194304, pagesize=0,
guestNode=0, userNodeset=0x0, autoNodeset=0x0, def=0x7fcc70019070,
qemuCaps=0x7fcc70004000, cfg=0x7fcc5c011800, backendType=0x7fcc95087228,
backendProps=0x7fcc95087218,
>> force=false) at qemu/qemu_command.c:3297
>> #3 0x00007fcc907b4f91 in qemuBuildMemoryCellBackendStr (def=0x7fcc70019070,
qemuCaps=0x7fcc70004000, cfg=0x7fcc5c011800, cell=0, auto_nodeset=0x0,
backendStr=0x7fcc70020360) at qemu/qemu_command.c:3413
>> #4 0x00007fcc907c0406 in qemuBuildNumaArgStr (cfg=0x7fcc5c011800,
def=0x7fcc70019070, cmd=0x7fcc700040c0, qemuCaps=0x7fcc70004000, auto_nodeset=0x0) at
qemu/qemu_command.c:7470
>> #5 0x00007fcc907c5fdf in qemuBuildCommandLine (driver=0x7fcc5c07b8a0,
logManager=0x7fcc70003c00, def=0x7fcc70019070, monitor_chr=0x7fcc70004bb0,
monitor_json=true, qemuCaps=0x7fcc70004000, migrateURI=0x7fcc700199c0 "defer",
snapshot=0x0,
>> vmop=VIR_NETDEV_VPORT_PROFILE_OP_MIGRATE_IN_START, standalone=false,
enableFips=false, nodeset=0x0, nnicindexes=0x7fcc95087498, nicindexes=0x7fcc950874a0,
domainLibDir=0x7fcc700047c0 "/var/lib/libvirt/qemu/domain-1-fedora") at
qemu/qemu_command.c:9547
>
> I think you can trim the backtrace here.
Okay,
>> ---
>> src/qemu/qemu_command.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
>> index 038c38f..0bafc3f 100644
>> --- a/src/qemu/qemu_command.c
>> +++ b/src/qemu/qemu_command.c
>> @@ -3294,6 +3294,12 @@ qemuBuildMemoryBackendStr(unsigned long long size,
>> if (!(mem_path = qemuGetHugepagePath(&cfg->hugetlbfs[i])))
>> goto cleanup;
>> } else {
>> + if (!cfg->nhugetlbfs) {
>> + virReportError(VIR_ERR_INTERNAL_ERROR,
>> + "%s", _("hugetlbfs filesystem is
not mounted "
>> + "or disabled by administrator
config"));
>> + goto cleanup;
>> + }
>> if (!(mem_path = qemuGetDefaultHugepath(cfg->hugetlbfs,
>> cfg->nhugetlbfs)))
>
> I think the bug is in qemuGetDefaultHugepath function rather than the
> caller:
>
> char *
> qemuGetDefaultHugepath(virHugeTLBFSPtr hugetlbfs,
> size_t nhugetlbfs)
> {
> size_t i;
>
> for (i = 0; i < nhugetlbfs; i++)
> if (hugetlbfs[i].deflt)
> break;
>
> if (i == nhugetlbfs)
> i = 0;
>
> return qemuGetHugepagePath(&hugetlbfs[i]);
>
> ^^ it dereferences the first member of the first argument even if you
> explicitly specify that the array has 0 elements.
That i = 0 assignment does not state array size, it just picks the first
element in the array if no explicitly set default huge page size is found.
It still dereferences the first element in the array despite the caller
passing that there are no elements in that array.
>
> }
>
> If you want to go with the fix as in this patch then you'll need to add
> a comment to qemuGetDefaultHugepath that the callers should make sure
> that it's called with at least one hugepage entry. Or fix
> qemuGetDefaultHugepath rathr than working it around in the caller.
Well, after my 2nd patch, there's no other caller than a fixed one.
That's great.
And I think qemuGetDefaultHugepath is really designed to be just a
simple function to fetch caller the path to huge pages. But okay, I can
That's great, but it has unusual semantics so you need to document it.
discard 2/2 and basically move out its internals to various other
functions (like qemuGetDefaultHugepath).
Why?