Daniel P. Berrangé <berrange(a)redhat.com> writes:
On Thu, Sep 10, 2020 at 04:26:40PM +0200, Milan Zamazal wrote:
> Daniel P. Berrangé <berrange(a)redhat.com> writes:
>
> > On Thu, Jul 02, 2020 at 01:21:15PM +0200, Milan Zamazal wrote:
> >> Hi,
> >>
> >
> >> I've met two situations with NVDIMM support in libvirt where I'm
not
> >> sure all the parties (libvirt & I) do the things correctly.
> >>
> >> The first problem is with memory alignment and size changes. In
> >> addition to the size changes applied to NVDIMMs by QEMU, libvirt also
> >> makes some NVDIMM size changes for better alignments, in
> >> qemuDomainMemoryDeviceAlignSize. This can lead to the size being
> >> rounded up, exceeding the size of the backing device and QEMU failing to
> >> start the VM for that reason (I've experienced that actually). I work
> >> with emulated NVDIMM devices, not a bare metal hardware, so one might
> >> argue that in practice the device sizes should already be aligned, but
> >> I'm not sure it must be always the case considering labels or whatever
> >> else the user decides to set up. And I still don't feel very
> >> comfortable that I have to count with two internal size adjustments
> >> (libvirt & QEMU) to the `size' value I specify, with the ultimate
goal
> >> of getting the VM started and having the NVDIMM aligned properly to make
> >> (non-NVDIMM) memory hot plug working. Is the size alignment performed
> >> by libvirt, especially rounding up, completely correct for NVDIMMs?
> >
> > The comment on the function says QEMU aligns to "page size", which
> > is something that can vary depending not only on architecture, and
> > also the build config options for the kernel on that architecture.
> > eg aarch64 has different page size in RHEL than other distros because
> > of different choice of page size in kernel config.
> >
> > Libvirt rounds up to 1 MB, essentially so that the size works no matter
> > what architecture or build options were used. I think this is quite
> > compelling as I don't think mgmt apps are likely to care enough about
> > non-x86 architectures to pick the right rounded sizes.
> >
> > If we're enforcing this 1 MB rounding though, we really should be
> > documenting it clearly, so that apps can pick the right backing file
> > size. I think we dropped the ball on docs.
>
> I still can't see it in the documentation, would it be possible to be
> clear about it in the docs, please? For first, it's not very intuitive
> to figure out that (if I've figured out it correctly) on POWER one
> *must* specify the NVDIMM size S as
>
> S == aligned_size + label_size
>
> and that size is used for the QEMU device; while on x86_64 one can
> specify any size S and
>
> align_up(S)
>
> will be used for the QEMU device (and label size doesn't influence the
> value). And additional alignment may be required for having any memory
> hot plug working.
>
> For second, and more importantly, I'm afraid that without documenting
> it, future changes may break the current behavior without warning. For
> example, the recent changes regarding POWER alignment in 6.7.0 are for
> good IMHO and one can use the same size with both 6.7 and 6.6 versions,
> but they could still cause pre-6.7 sizes stop working.
I don't know what changes you are referring to here, but if they were
in libvirt I'd consider that a bug - we shouldn't break a previously
working configuration by increasing required alignment.
I mean disabling the auto alignment in
https://gitlab.com/libvirt/libvirt/-/commit/07de813924caf37e535855541c0c1...
and replacing it with validation in
https://gitlab.com/libvirt/libvirt/-/commit/0ccceaa57c50e5ee528f7073fa872...
That change can cause a VM fail to start but after (manually) adjusting
the device size, all should work all right. Changes that would actually
change sizes would be more dangerous.
Regards,
Milan