On 06/22/2016 12:45 PM, Jim Fehlig wrote:
On 06/22/2016 06:07 AM, Cole Robinson wrote:
> On 06/21/2016 10:32 PM, Jim Fehlig wrote:
>> On 06/21/2016 04:20 AM, Joao Martins wrote:
>>> On 06/21/2016 01:38 AM, Jim Fehlig wrote:
>>>> Joao Martins wrote:
>>>>> Guests use a <console /> (and sometimes <serial /> pair)
to represent
>>>>> the console. On the callback that is called when console is brought
up
>>>>> (NB: before domain starts), we fill the path of the console element
with
>>>>> the appropriate "/dev/pts/X". For PV guests it all works
fine, although
>>>>> for HVM guests it doesn't. Consequently we end up seeing
erronous
>>>>> behaviour on HVM guests where console path is not displayed on the
XML
>>>>> but still can be accessed with virDomainOpenConsole after booting
guest.
>>>>> Consumers of this XML (e.g. Openstack) also won't be able to use
the
>>>>> console path.
>>>> Can you provide example input domXML config that causes this problem?
>>>>
>>> Ah yes - I probably should include that in the commit description?
>> Yes, I think it helps clarify the problem being solved by the commit.
>>
>>> So, for PV guests for an input console XML element:
>>>
>>> <console type='pty'>
>>> <target type='xen' port='0'/>
>>> </console>
>>>
>>> Which then on domain start gets filled like:
>>>
>>> <console type='pty' tty='/dev/pts/3'>
>>> <source path='/dev/pts/3'/>
>>> <target type='xen' port='0'/>
>>> </console>
>>>
>>> Although for HVM guests we have:
>>>
>>> <serial type='pty'>
>>> <target port='0'/>
>>> </serial>
>>> <console type='pty'>
>>> <target type='serial' port='0'/>
>>> </console>
>>>
>>> And no changes happen i.e. source path isn't written there _even though_
it's
>>> set on the console element - being the reason why we can access the console
in the
>>> first place. IIUC The expected output should be:
>>>
>>> <serial type='pty'>
>>> <source path='/dev/pts/4'/>
>>> <target port='0'/>
>>> </serial>
>>> <console type='pty' tty='/dev/pts/4'>
>>> <source path='/dev/pts/4'/>
>>> <target type='serial' port='0'/>
>>> </console>
>> I asked for an example after having problems testing your patch, but it turned
>> out to be an issue which I think was introduced long ago by commit 482e5f15. I
>> was testing with transient domains and noticed 'virsh create; 'virsh
console'
>> always failed with
>>
>> error: internal error: character device <null> is not using a PTY
>>
>> My config only contained
>>
>> <console type='pty'>
>> <target port='0'/>
>> </console>
>>
>> so the "really crazy backcompat stuff for consoles" in
>> virDomainDefAddConsoleCompat() moved the config over to def->serials[0] and
>> created a new def->consoles[0] without def->consoles[0].source.type set to
>> VIR_DOMAIN_CHR_TYPE_PTY. Without source.type set, libxlConsoleCallback() would
>> ignore the console and not copy the PTY path to source.data.file.path.
'virsh
>> console' would then fail with the error noted above.
>>
>> I spent too much time debugging this, partly because I was confused by odd
>> behavior with persistent domains. E.g. 'virsh define; virsh start; virsh
>> console' would fail with the same error, but all subsequent 'virsh
shutdown;
>> virsh start; virsh console' sequences would work :-). That behavior was due
to
>> vm->newDef being populated with correct console config when calling
>> virDomainObjSetDefTransient() in libxlDomainStart(). After the first shutdown of
>> the domain, vm->newDef would be copied over to vm->def, allowing
subsequent
>> 'virsh start; virsh console' to work correctly.
>>
> Hmm, that sounds weird. Can you explain more how exactly the XML changes?
It only changes when defining the vm.
> What
> is the diff between the initial 'virsh define; virsh dumpxml'
Initial XML has
<console type='pty'>
<target port='0'/>
</console>
After define
<serial type='pty'>
<target port='0'/>
</serial>
<console type='pty'>
<target type='serial' port='0'/>
</console>
The config doesn't change after start; shutdown. But the contents of
virDomainDef does change, specifically def->consoles[0]->source.type changes
from VIR_DOMAIN_CHR_TYPE_PTY to VIR_DOMAIN_CHR_TYPE_NULL. And as mentioned
below, libxlConsoleCallback() only checks for source.type ==
VIR_DOMAIN_CHR_TYPE_PTY before copying the PTY path to
def->consoles[0]->source.data.file.path. Another potential fix I also mentioned
below is to loosen that condition in libxlConsoleCallback(). I.e. copy the path
if source.type equals VIR_DOMAIN_CHR_TYPE_PTY or VIR_DOMAIN_CHR_TYPE_NULL (and
perhaps VIR_DOMAIN_CHR_TYPE_FILE too?).
Okay I think I follow. The thing I was missing is that
virDomainDefAddConsoleCompat will alter ->def, but that's _not_ what is
formated/printed in the XML, which has some special handling to fully
duplicate serials[0] XML to consoles[0], if console target type=serial
I _think_ what the proper thing to do here is something like:
diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c
index 221af87..6bcb4d9 100644
--- a/src/libxl/libxl_domain.c
+++ b/src/libxl/libxl_domain.c
@@ -964,13 +964,17 @@ libxlConsoleCallback(libxl_ctx *ctx, libxl_event *ev,
void *for_callback)
virObjectLock(vm);
for (i = 0; i < vm->def->nconsoles; i++) {
virDomainChrDefPtr chr = vm->def->consoles[i];
- if (chr && chr->source.type == VIR_DOMAIN_CHR_TYPE_PTY) {
+ if (i == 0 &&
+ chr->targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL)
+ chr = vm->def->serials[0];
+
+ if (chr->source.type == VIR_DOMAIN_CHR_TYPE_PTY) {
libxl_console_type console_type;
char *console = NULL;
int ret;
console_type =
- (chr->targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL ?
+ (chr->deviceType == VIR_DOMAIN_CHR_DEVICE_TYPE_SERIAL ?
LIBXL_CONSOLE_TYPE_SERIAL : LIBXL_CONSOLE_TYPE_PV);
ret = libxl_console_get_tty(ctx, ev->domid,
chr->target.port, console_type,
Untested, but basically, libxl driver needs to learn that if
consoles[0].target.type == SERIAL, then you should use serials[0] instead. The
qemu driver has some magic like this sprinkled around...
The problem with your patch below is that it's not a whole fix, there's many
other bits besides source.type that would technically need to be copied over.
Rather than doing that, which has its own problems trying to keep the
console[0] and serial[0] data in sync, the convention seems to be that console
gets only target.type=serial and the drivers need to map that to serials[0].
That's my reading of things anyways
Thanks,
Cole
Regards,
Jim
> and the dumpxml
> after the VM has started+shutdown once? Whatever that diff is, and why it's
> not in the XML to begin with, sounds like the root issue to me.
>
>> Long story short, I found the problem but not sure how to fix it :-). One
>> approach would be the below patch. Another would be to loosen the restriction in
>> libxlConsoleCallback, filing in source.data.file.path when the console source
>> type is also set to VIR_DOMAIN_CHR_TYPE_NULL. Adding Cole and Peter (both of
>> which have toiled in and likely loathe this code) to see if they have opinions
>> on a proper fix.
>>
>> Regards,
>> Jim
>>
>>
>> From a231636e8eec08152220e32af5f4026a50954f41 Mon Sep 17 00:00:00 2001
>> From: Jim Fehlig <jfehlig(a)suse.com>
>> Date: Tue, 21 Jun 2016 20:25:23 -0600
>> Subject: [PATCH] domain conf: copy source type from stolen console
>>
>> When domXML contains only <console type='pty'> and no
corresponding
>> <serial>, the console is "stolen" [1] and used as the first
<serial>
>> device. A new console is created as an alias to the first <serial>
>> device, but missed copying some configuration such as the source
>> 'type' attribute.
>>
>> [1] See comments associated with virDomainDefAddConsoleCompat() in
>> $LIBVIRT-SRC/src/conf/domain_conf.c:
>>
>> Signed-off-by: Jim Fehlig <jfehlig(a)suse.com>
>> ---
>> src/conf/domain_conf.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
>> index 75ad03f..2fda4fe 100644
>> --- a/src/conf/domain_conf.c
>> +++ b/src/conf/domain_conf.c
>> @@ -3845,6 +3845,7 @@ virDomainDefAddConsoleCompat(virDomainDefPtr def)
>> /* Create an console alias for the serial port */
>> def->consoles[0]->deviceType =
VIR_DOMAIN_CHR_DEVICE_TYPE_CONSOLE;
>> def->consoles[0]->targetType =
>> VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL;
>> + def->consoles[0]->source.type =
def->serials[0]->source.type;
>> }
>> } else if (def->os.type == VIR_DOMAIN_OSTYPE_HVM &&
def->nserials > 0 &&
>> def->serials[0]->deviceType ==
VIR_DOMAIN_CHR_DEVICE_TYPE_SERIAL &&
>>
> Then again this also seems okay to me, but it's concerning that I applied this
> but it didn't cause the test suite to change at all.
>
> - Cole
>