On 05/07/2023 16.35, Daniel P. Berrangé wrote:
On Wed, Jul 05, 2023 at 03:29:39PM +0200, Claudio Imbrenda wrote:
> On Wed, 5 Jul 2023 14:08:27 +0100
> Daniel P. Berrangé <berrange(a)redhat.com> wrote:
>
>> On Wed, Jul 05, 2023 at 02:46:03PM +0200, Claudio Imbrenda wrote:
>>> On Wed, 5 Jul 2023 13:26:32 +0100
>>> Daniel P. Berrangé <berrange(a)redhat.com> wrote:
>>>
>>> [...]
>>>
>>>>>> I rather think mgmt apps need to explicitly opt-in to async teardown,
>>>>>> so they're aware that they need to take account of delayed RAM
>>>>>> availability in their accounting / guest placement logic.
>>>>>
>>>>> what would you think about enabling it by default only for guests that
>>>>> are capable of running in Secure Execution mode?
>>>>
>>>> IIUC, that's basically /all/ guests if running on new enough hardware
>>>> with prot_virt=1 enabled on the host OS, so will still present challenges
>>>> to mgmt apps needing to be aware of this behaviour AFAICS.
>>>
>>> I think there is some fencing still? I don't think it's automatic
>>
>> IIUC, the following sequence is possible
>>
>> 1. Start QEMU with -m 500G
>> -> QEMU spawns async teardown helper process
>> 2. Stop QEMU
>> -> Async teardown helper process remains running while
>
> not running, the process terminates immediately as soon as QEMU
> terminates. the termination takes some time, because of the memory
> cleanup.
>
>> kernel releases RAM
>> 3. Start QEMU with -m 500G
>> -> Fails with ENOMEM
>
> why though? the new VM will not manage to instantly use all of the
> memory
>
>> ...time passes...
>> 4. Async teardown helper finally terminates
>> -> The full original 500G is only now released for use
>
> memory starts to get freed as soon as the helper process terminates
> (which is as soon as possible after QEMU terminates).
>
> so unless you have a guest that will allocate and use all of its memory
> immediately as fast as possible at boot, this won't be a concern.

When using huge pages, QEMU should be fully allocating memory
immediately, regardless of whether the guest OS touches all RAM.

IIRC huge pages cannot be used with protected guests yet (Claudio, Janosch,
please confirm), so this should not be a problem here.
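
To make the scenario above concrete, here is a toy model of the placement
question. It is only a sketch, not QEMU or libvirt code: the function name,
the 512 GiB host size and the idea of a single "still draining" counter are
all assumptions made up for illustration.

```python
def can_start(total_gib: int, draining_gib: int, guest_gib: int,
              preallocate: bool) -> bool:
    """Toy check: can a new guest start while old guest RAM is still draining?

    total_gib    -- host RAM (hypothetical figure)
    draining_gib -- RAM of a terminated guest that the kernel has not freed yet
    guest_gib    -- RAM size of the new guest (its -m value)
    preallocate  -- True if all guest RAM is needed up front (e.g. huge pages)
    """
    available_now = total_gib - draining_gib
    # A guest that touches its memory lazily needs (almost) nothing at start.
    needed_now = guest_gib if preallocate else 0
    return needed_now <= available_now


# Step 3 of the sequence: old 500G guest just stopped, nothing freed yet.
print(can_start(512, 500, 500, preallocate=True))   # full allocation up front
print(can_start(512, 500, 500, preallocate=False))  # lazily touched memory
# Step 4: teardown has finished, all RAM is available again.
print(can_start(512, 0, 500, preallocate=True))
```

In this model only the fully preallocating guest fails during the teardown
window, which is the case raised for huge pages above.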
Thomas