On Tue, Mar 14, 2023 at 11:20:49AM +0100, Michal Privoznik wrote:
We have this crazy backwards compatibility when it comes to
serial and console devices.
This truly is the gift that keeps on giving :/
Basically, in same cases the very
first <console/> is just an alias to the very first <serial/>
device. This is to be seen at various places:
1) virDomainDefFormatInternalSetRootName() - when generating
domain XML, the <console/> configuration is basically ignored
and corresponding <serial/> config is formatted,
Wow, I wasn't aware of that part of the insanity. That got me
confused for a bit: dumpxml on a running domain was showing the
expected alias for the <console>. Now I understand why! It was
just displaying the information for the <serial> instead.
2) virDomainDefAddConsoleCompat() - which adds a copy of
<serial/> or <console/> into virDomainDef in post parse.
Since we add the missing elements in this function, shouldn't we be
able to remove the hacky code from the format part?
I've done a quick test and it didn't work, obviously :) but maybe
we're just failing to update some parts of the definition somewhere
else, and fixing those instances would allow to restore some sanity
to the formatting routine?
And when talking to QEMU we need a special handling too, because
while <serial/> is generated on the cmd line, the <console/> is
not. And in a lot of place we get it right. Except for generating
device aliases. On domain startup the 'expected' happens and
devices get "serial0" and "console0" aliases, correspondingly.
This ends up in the status XML too. But due to aforementioned
trick when formatting domain XML, "serial0" ends up in both
'virsh dumpxml' and the status XML. But internally, both devices
have different alias. Therefore, detaching the device using
<console/> fails as qemuDomainDetachDeviceChr() tries to detach
"console0".
After the daemon is restarted and status XML is parsed, then
everything works suddenly. This is because in the status XML both
devices have the same alias.
So is the fact that two devices end up with the same alias something
that we consider okay? When it comes to user aliases, such a
configuration is (understandably) rejected.
Clearly nothing explodes too badly in this situation, as evidenced by
the fact that things actually start working more reasonably once the
two devices get matching aliases, but I wonder if it wouldn't be more
sensible to assign different aliases for the <console> and the
<serial>, then handle compatibility further down, closer to where we
generate command line arguments and monitor commands?
Let's generate correct alias from the beginning.
Resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=2156300
After your patches, the "unknown cause" error is no longer reported,
and instead we get
unable to execute QEMU command 'device_del': Bus 'isa.0' does not
support hotplugging
which is clearly a better error message.
Doesn't really help with the fact that it's fundamentally impossible
to hot-unplug platform devices, unfortunately :)
+ /* Some crazy backcompat for consoles. Look into
+ * virDomainDefAddConsoleCompat() for more explanation. */
+ if (chr->deviceType == VIR_DOMAIN_CHR_DEVICE_TYPE_CONSOLE &&
+ chr->targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL &&
+ def->os.type == VIR_DOMAIN_OSTYPE_HVM &&
+ def->consoles[0] == chr &&
+ def->nserials &&
+ def->serials[0]->info.alias) {
+ chr->info.alias = g_strdup(def->serials[0]->info.alias);
+ return 0;
+ }
This code is... Fine. It does the trick.
The only reason why I'm not ACKing it right away is that I'm
concerned we're piling hacks on top of hacks, and in particular that
allowing multiple devices to have the same alias will cause grief
further down the line.
If you have already considered this and convinced yourself that we're
okay, then say so and I'll gladly ACK the patch.
--
Andrea Bolognani / Red Hat / Virtualization