
On 06/22/2016 12:45 PM, Jim Fehlig wrote:
On 06/22/2016 06:07 AM, Cole Robinson wrote:
On 06/21/2016 10:32 PM, Jim Fehlig wrote:
On 06/21/2016 04:20 AM, Joao Martins wrote:
On 06/21/2016 01:38 AM, Jim Fehlig wrote:
Joao Martins wrote:
Guests use a <console /> (and sometimes <serial /> pair) to represent the console. On the callback that is called when console is brought up (NB: before domain starts), we fill the path of the console element with the appropriate "/dev/pts/X". For PV guests it all works fine, although for HVM guests it doesn't. Consequently we end up seeing erronous behaviour on HVM guests where console path is not displayed on the XML but still can be accessed with virDomainOpenConsole after booting guest. Consumers of this XML (e.g. Openstack) also won't be able to use the console path. Can you provide example input domXML config that causes this problem?
Ah yes - I probably should include that in the commit description? Yes, I think it helps clarify the problem being solved by the commit.
So, for PV guests for an input console XML element:
<console type='pty'> <target type='xen' port='0'/> </console>
Which then on domain start gets filled like:
<console type='pty' tty='/dev/pts/3'> <source path='/dev/pts/3'/> <target type='xen' port='0'/> </console>
Although for HVM guests we have:
<serial type='pty'> <target port='0'/> </serial> <console type='pty'> <target type='serial' port='0'/> </console>
And no changes happen i.e. source path isn't written there _even though_ it's set on the console element - being the reason why we can access the console in the first place. IIUC The expected output should be:
<serial type='pty'> <source path='/dev/pts/4'/> <target port='0'/> </serial> <console type='pty' tty='/dev/pts/4'> <source path='/dev/pts/4'/> <target type='serial' port='0'/> </console> I asked for an example after having problems testing your patch, but it turned out to be an issue which I think was introduced long ago by commit 482e5f15. I was testing with transient domains and noticed 'virsh create; 'virsh console' always failed with
error: internal error: character device <null> is not using a PTY
My config only contained
<console type='pty'> <target port='0'/> </console>
so the "really crazy backcompat stuff for consoles" in virDomainDefAddConsoleCompat() moved the config over to def->serials[0] and created a new def->consoles[0] without def->consoles[0].source.type set to VIR_DOMAIN_CHR_TYPE_PTY. Without source.type set, libxlConsoleCallback() would ignore the console and not copy the PTY path to source.data.file.path. 'virsh console' would then fail with the error noted above.
I spent too much time debugging this, partly because I was confused by odd behavior with persistent domains. E.g. 'virsh define; virsh start; virsh console' would fail with the same error, but all subsequent 'virsh shutdown; virsh start; virsh console' sequences would work :-). That behavior was due to vm->newDef being populated with correct console config when calling virDomainObjSetDefTransient() in libxlDomainStart(). After the first shutdown of the domain, vm->newDef would be copied over to vm->def, allowing subsequent 'virsh start; virsh console' to work correctly.
Hmm, that sounds weird. Can you explain more how exactly the XML changes?
It only changes when defining the vm.
What is the diff between the initial 'virsh define; virsh dumpxml'
Initial XML has
<console type='pty'> <target port='0'/> </console>
After define
<serial type='pty'> <target port='0'/> </serial> <console type='pty'> <target type='serial' port='0'/> </console>
The config doesn't change after start; shutdown. But the contents of virDomainDef does change, specifically def->consoles[0]->source.type changes from VIR_DOMAIN_CHR_TYPE_PTY to VIR_DOMAIN_CHR_TYPE_NULL. And as mentioned below, libxlConsoleCallback() only checks for source.type == VIR_DOMAIN_CHR_TYPE_PTY before copying the PTY path to def->consoles[0]->source.data.file.path. Another potential fix I also mentioned below is to loosen that condition in libxlConsoleCallback(). I.e. copy the path if source.type equals VIR_DOMAIN_CHR_TYPE_PTY or VIR_DOMAIN_CHR_TYPE_NULL (and perhaps VIR_DOMAIN_CHR_TYPE_FILE too?).
Okay I think I follow. The thing I was missing is that virDomainDefAddConsoleCompat will alter ->def, but that's _not_ what is formated/printed in the XML, which has some special handling to fully duplicate serials[0] XML to consoles[0], if console target type=serial I _think_ what the proper thing to do here is something like: diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 221af87..6bcb4d9 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -964,13 +964,17 @@ libxlConsoleCallback(libxl_ctx *ctx, libxl_event *ev, void *for_callback) virObjectLock(vm); for (i = 0; i < vm->def->nconsoles; i++) { virDomainChrDefPtr chr = vm->def->consoles[i]; - if (chr && chr->source.type == VIR_DOMAIN_CHR_TYPE_PTY) { + if (i == 0 && + chr->targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL) + chr = vm->def->serials[0]; + + if (chr->source.type == VIR_DOMAIN_CHR_TYPE_PTY) { libxl_console_type console_type; char *console = NULL; int ret; console_type = - (chr->targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL ? + (chr->deviceType == VIR_DOMAIN_CHR_DEVICE_TYPE_SERIAL ? LIBXL_CONSOLE_TYPE_SERIAL : LIBXL_CONSOLE_TYPE_PV); ret = libxl_console_get_tty(ctx, ev->domid, chr->target.port, console_type, Untested, but basically, libxl driver needs to learn that if consoles[0].target.type == SERIAL, then you should use serials[0] instead. The qemu driver has some magic like this sprinkled around... The problem with your patch below is that it's not a whole fix, there's many other bits besides source.type that would technically need to be copied over. Rather than doing that, which has its own problems trying to keep the console[0] and serial[0] data in sync, the convention seems to be that console gets only target.type=serial and the drivers need to map that to serials[0]. That's my reading of things anyways Thanks, Cole
Regards, Jim
and the dumpxml after the VM has started+shutdown once? Whatever that diff is, and why it's not in the XML to begin with, sounds like the root issue to me.
Long story short, I found the problem but not sure how to fix it :-). One approach would be the below patch. Another would be to loosen the restriction in libxlConsoleCallback, filing in source.data.file.path when the console source type is also set to VIR_DOMAIN_CHR_TYPE_NULL. Adding Cole and Peter (both of which have toiled in and likely loathe this code) to see if they have opinions on a proper fix.
Regards, Jim
From a231636e8eec08152220e32af5f4026a50954f41 Mon Sep 17 00:00:00 2001 From: Jim Fehlig <jfehlig@suse.com> Date: Tue, 21 Jun 2016 20:25:23 -0600 Subject: [PATCH] domain conf: copy source type from stolen console
When domXML contains only <console type='pty'> and no corresponding <serial>, the console is "stolen" [1] and used as the first <serial> device. A new console is created as an alias to the first <serial> device, but missed copying some configuration such as the source 'type' attribute.
[1] See comments associated with virDomainDefAddConsoleCompat() in $LIBVIRT-SRC/src/conf/domain_conf.c:
Signed-off-by: Jim Fehlig <jfehlig@suse.com> --- src/conf/domain_conf.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 75ad03f..2fda4fe 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -3845,6 +3845,7 @@ virDomainDefAddConsoleCompat(virDomainDefPtr def) /* Create an console alias for the serial port */ def->consoles[0]->deviceType = VIR_DOMAIN_CHR_DEVICE_TYPE_CONSOLE; def->consoles[0]->targetType = VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL; + def->consoles[0]->source.type = def->serials[0]->source.type; } } else if (def->os.type == VIR_DOMAIN_OSTYPE_HVM && def->nserials > 0 && def->serials[0]->deviceType == VIR_DOMAIN_CHR_DEVICE_TYPE_SERIAL &&
Then again this also seems okay to me, but it's concerning that I applied this but it didn't cause the test suite to change at all.
- Cole