Hi
On Fri, Sep 14, 2018 at 11:44 AM, Michal Prívozník <mprivozn(a)redhat.com> wrote:
On 09/13/2018 11:51 PM, John Ferlan wrote:
>
>
> On 09/13/2018 10:09 AM, John Ferlan wrote:
>>
>>
>> On 09/13/2018 03:39 AM, Marc-André Lureau wrote:
>>> Hi
>>>
>>> On Thu, Sep 13, 2018 at 2:25 AM, John Ferlan <jferlan(a)redhat.com> wrote:
>>>>
>>>> [...]
>>>>
>>>>>>
>>>>>> So all that's "left":
>>>>>>
>>>>>> 1. "Add" a check in qemuDomainABIStabilityCheck to ensure we're not
>>>>>> changing from memory-backend-ram to memory-backend-memfd. We already
>>>>>> check that "(src->mem.source != dst->mem.source)" - so we know we're
>>>>>> already anonymous or not.
>>>>>>
>>>>>> Any suggestions? If source is anonymous, then what? I think we can use
>>>>>> the qemuDomainObjPrivatePtr in some way to determine that we were
>>>>>> started with -memfd (or not started that way).
>>>>>
>>>>> No idea how we could save that information across various restarts /
>>>>> version changes.
>>>>
>>>> I think it'd be ugly... I think migration cookies would have to be
>>>> used... I considered other mechanisms, but each wouldn't quite work.
>>>> Without writing the code, if we cared to do this, then we'd have:
>>>>
>>>> 1. Add a field to qemuDomainObjPrivatePtr that indicates what got
>>>> started (none, memfd, file, or ram). Add a typedef enum that has
>>>> unknown, none, memfd, file, and ram. Add the Parse/Format code to handle
>>>> the field.
>>>>
>>>> 2. Modify the qemu_command code to set the field in priv based on what
>>>> got started, if something got started. The value would be > 0...
>>>>
>>>> 3. Mess with the migration cookie logic to add checks for what the
>>>> source started. On the destination side of that cookie if we had the
>>>> "right capabilities", then check the source cookie to see what it has.
>>>> If it didn't have that field, then I think one could assume the source
>>>> with anonymous memory backing would be using -ram. We'd already fail the
>>>> src/dst mem.source check if one used -file. I'm not all that versed in
>>>> the cookies, but I think that'd work "logically thinking" at least. The
>>>> devil would be in the details.
>>>>
>>>> Assuming your 3.1 patches do something to handle the condition, I guess
>>>> it comes down to how much of a problem it's believed this could be in
>>>> 2.12 and 3.0 if someone is running -ram and migrates to a host that
>>>> would default to -memfd.
>>>
>>> I am afraid we will need to do it to handle transparent -memfd usage.
>>> I'll look at it with your help.
>>>
>>
>> Let's see what I can cobble together. I'll repost the series a bit later
>> today hopefully.
>>
>
> After spending a few hours on this, the cookies just don't help enough
> or I don't know/understand enough about their usage.
>
> I keep coming back to the problem of how do we disallow a migration from
> a host that has/knows about and uses anonymous memfd to one that doesn't
> know about it. Similarly, if a domain source w/ "file" or "ram" (whether
> at startup time or via hotplug) is migrated to a target host that would
> generate memfd - we have no mechanism to stop the migration because we
> have no way to tell what it was running, especially since what gets
> started isn't just based off the source type - hugepages have a
> tangential role. Lots of logic stuffed into qemu_command that probably
> should have been in some qemuDomainPrepareMemtune API.
>
> So unfortunately, I think the only safe way is to create a new source
> type ("anonmem", "anonfile", "anonmemfd", ??) and describe it as lightly
> as the other entries are described (ironically the documented default of
> "anonymous" could be "file" or it could be "ram" based on 3 other factors
> not described in the docs). At least with a new type name/value we can
> guarantee that someone selects it by name rather than the multipurpose
> "anonymous" type. I think it would mean moving the caps checks to a bit
> later in the code, search for "otherwise check the required capability".
>
> Unless someone still brave enough to keep reading this stream has an
> idea to try. I'm tapped out!
We can have an element/attribute in status XML/migration XML saying
which backend we've used. This is slightly tricky because we have more
places than one where users can tune configuration such that we use
different backends. My personal favorite is:
<memoryBacking>
  <hugepages>
    <page size='2048' unit='KiB' nodeset='1'/>
  </hugepages>
</memoryBacking>
<cpu>
  <numa>
    <cell id='0' cpus='0' memory='1048576' unit='KiB'/>
    <cell id='1' cpus='1' memory='1048576' unit='KiB' memAccess='shared'/>
    <cell id='2' cpus='2' memory='1048576' unit='KiB' memAccess='private'/>
    <cell id='3' cpus='3' memory='1048576' unit='KiB'/>
  </numa>
</cpu>
<devices>
  <memory model='dimm'>
    <target>
      <size unit='KiB'>524288</size>
      <node>1</node>
    </target>
    <address type='dimm' slot='0' base='0x100000000'/>
  </memory>
</devices>
So what we can have is:
<hugepages>
  <page size=.... backend='memory-backend-file'/>
</hugepages>
<cell id='0' cpus='0' memory='1048576' unit='KiB' backend='memory-backend-ram'/>
<cell id='1' cpus='1' memory='1048576' unit='KiB' memAccess='shared' backend='memory-backend-file'/>
<cell id='2' cpus='2' memory='1048576' unit='KiB' memAccess='private' backend='memory-backend-file'/>
<cell id='3' cpus='3' memory='1048576' unit='KiB' backend='memory-backend-ram'/>
<devices>
  <memory model='dimm' backend='memory-backend-ram'/>

That's a bit overkill to me, since we don't have (yet) the capacity
for a user to select the memory backend, and the value is a
qemu-specific detail.

  ..
</devices>
This way we know what backend was used on the source (in saved state)
and the only thing we need to know on dst (on restore) is to check if
given backend is available.
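
A minimal sketch of that destination-side check, in Python rather than
libvirt's C, purely to illustrate the idea. The backend='...' attribute is
the one proposed above and is not part of any existing libvirt schema; the
helper names and the sample XML are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical saved/migration XML carrying the proposed 'backend' attribute.
SAVED_XML = """
<domain>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='1' backend='memory-backend-file'/>
    </hugepages>
  </memoryBacking>
  <cpu>
    <numa>
      <cell id='0' cpus='0' memory='1048576' unit='KiB' backend='memory-backend-ram'/>
      <cell id='1' cpus='1' memory='1048576' unit='KiB' memAccess='shared' backend='memory-backend-file'/>
    </numa>
  </cpu>
  <devices>
    <memory model='dimm' backend='memory-backend-ram'/>
  </devices>
</domain>
"""


def recorded_backends(xml_text):
    """Collect every backend name recorded anywhere in the XML."""
    root = ET.fromstring(xml_text)
    return {el.get("backend") for el in root.iter() if el.get("backend") is not None}


def missing_backends(xml_text, available):
    """Return the backends the source used that the destination lacks."""
    return recorded_backends(xml_text) - set(available)
```

An empty result from missing_backends() would mean the restore or migration
can proceed; anything else names the backends the destination would have to
refuse on.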
I don't think putting anything in migration cookies is going to help.
It might help migration if anything but it will definitely keep
save/restore broken as there are no migration cookies.
Ah, too bad. I am not familiar enough with migration and save/restore
in libvirt. But I started to imagine how the migration cookie could
have been used.
Is the domain XML the only place where we can save this information?
If yes, then either we go with your proposal (although I wonder if it
should be qemu: namespaced) or can we introduce libvirt capabilities?
(something as simple as
<capabilities><qemu-memorybackend-memfd/></capabilities>) ?
thanks!
Michal