[libvirt] [PATCH RFC] libxl: set serial source path for console type=serial

Guests use a <console /> (and sometimes <serial /> pair) to represent the console. On the callback that is called when console is brought up (NB: before domain starts), we fill the path of the console element with the appropriate "/dev/pts/X". For PV guests it all works fine, although for HVM guests it doesn't. Consequently we end up seeing erronous behaviour on HVM guests where console path is not displayed on the XML but still can be accessed with virDomainOpenConsole after booting guest. Consumers of this XML (e.g. Openstack) also won't be able to use the console path. Finally, the path set in consoles array won't persist across daemon reboots, thus rendering admins/users with no access to console with a message such as: "error: operation failed: PTY device is not yet assigned" This is because: for HVM guests, DefFormatInternal will ignore the console element and instead write it with the content of the serial element for target type = serial which isn't touched in our libxlConsoleCallback. To address that we introduce a new helper routine libxlConsoleSetSourcePath() that sets the source path and thus also handling this case. That is by setting the source path on with serial element akin to the one indexed by console "port". This way we keep similar behaviour for PV and HVM guests. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> --- src/libxl/libxl_domain.c | 35 +++++++++++++++++++++++++++++++---- 1 file changed, 31 insertions(+), 4 deletions(-) diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 221af87..4a46143 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -955,6 +955,35 @@ libxlNetworkPrepareDevices(virDomainDefPtr def) return 0; } +static int +libxlConsoleSetSourcePath(virDomainDefPtr def, virDomainChrDefPtr console, + char *path) +{ + int ret = -1; + size_t i; + + if (!path || path[0] == '\0') + return ret; + + if (VIR_STRDUP(console->source.data.file.path, path) < 0) + return ret; + + if (console->targetType != VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL) + return 0; + + for (i = 0; i < def->nserials; i++) { + virDomainChrDefPtr serial = def->serials[i]; + + if (serial->target.port == console->target.port && + VIR_STRDUP(serial->source.data.file.path, path) >= 0) { + ret = 0; + break; + } + } + + return ret; +} + static void libxlConsoleCallback(libxl_ctx *ctx, libxl_event *ev, void *for_callback) { @@ -977,10 +1006,8 @@ libxlConsoleCallback(libxl_ctx *ctx, libxl_event *ev, void *for_callback) &console); if (!ret) { VIR_FREE(chr->source.data.file.path); - if (console && console[0] != '\0') { - ignore_value(VIR_STRDUP(chr->source.data.file.path, - console)); - } + if (libxlConsoleSetSourcePath(vm->def, chr, console) < 0) + VIR_WARN("Failed to set console source path"); } VIR_FREE(console); } -- 2.1.4

Joao Martins wrote:
Guests use a <console /> (and sometimes <serial /> pair) to represent the console. On the callback that is called when console is brought up (NB: before domain starts), we fill the path of the console element with the appropriate "/dev/pts/X". For PV guests it all works fine, although for HVM guests it doesn't. Consequently we end up seeing erronous behaviour on HVM guests where console path is not displayed on the XML but still can be accessed with virDomainOpenConsole after booting guest. Consumers of this XML (e.g. Openstack) also won't be able to use the console path.
Can you provide example input domXML config that causes this problem? Regards, Jim
Finally, the path set in consoles array won't persist across daemon reboots, thus rendering admins/users with no access to console with a message such as:
"error: operation failed: PTY device is not yet assigned"
This is because: for HVM guests, DefFormatInternal will ignore the console element and instead write it with the content of the serial element for target type = serial which isn't touched in our libxlConsoleCallback. To address that we introduce a new helper routine libxlConsoleSetSourcePath() that sets the source path and thus also handling this case. That is by setting the source path on with serial element akin to the one indexed by console "port". This way we keep similar behaviour for PV and HVM guests.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com> --- src/libxl/libxl_domain.c | 35 +++++++++++++++++++++++++++++++---- 1 file changed, 31 insertions(+), 4 deletions(-)
diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 221af87..4a46143 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -955,6 +955,35 @@ libxlNetworkPrepareDevices(virDomainDefPtr def) return 0; }
+static int +libxlConsoleSetSourcePath(virDomainDefPtr def, virDomainChrDefPtr console, + char *path) +{ + int ret = -1; + size_t i; + + if (!path || path[0] == '\0') + return ret; + + if (VIR_STRDUP(console->source.data.file.path, path) < 0) + return ret; + + if (console->targetType != VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL) + return 0; + + for (i = 0; i < def->nserials; i++) { + virDomainChrDefPtr serial = def->serials[i]; + + if (serial->target.port == console->target.port && + VIR_STRDUP(serial->source.data.file.path, path) >= 0) { + ret = 0; + break; + } + } + + return ret; +} + static void libxlConsoleCallback(libxl_ctx *ctx, libxl_event *ev, void *for_callback) { @@ -977,10 +1006,8 @@ libxlConsoleCallback(libxl_ctx *ctx, libxl_event *ev, void *for_callback) &console); if (!ret) { VIR_FREE(chr->source.data.file.path); - if (console && console[0] != '\0') { - ignore_value(VIR_STRDUP(chr->source.data.file.path, - console)); - } + if (libxlConsoleSetSourcePath(vm->def, chr, console) < 0) + VIR_WARN("Failed to set console source path"); } VIR_FREE(console); }

On 06/21/2016 01:38 AM, Jim Fehlig wrote:
Joao Martins wrote:
Guests use a <console /> (and sometimes <serial /> pair) to represent the console. On the callback that is called when console is brought up (NB: before domain starts), we fill the path of the console element with the appropriate "/dev/pts/X". For PV guests it all works fine, although for HVM guests it doesn't. Consequently we end up seeing erronous behaviour on HVM guests where console path is not displayed on the XML but still can be accessed with virDomainOpenConsole after booting guest. Consumers of this XML (e.g. Openstack) also won't be able to use the console path.
Can you provide example input domXML config that causes this problem?
Ah yes - I probably should include that in the commit description? So, for PV guests for an input console XML element: <console type='pty'> <target type='xen' port='0'/> </console> Which then on domain start gets filled like: <console type='pty' tty='/dev/pts/3'> <source path='/dev/pts/3'/> <target type='xen' port='0'/> </console> Although for HVM guests we have: <serial type='pty'> <target port='0'/> </serial> <console type='pty'> <target type='serial' port='0'/> </console> And no changes happen i.e. source path isn't written there _even though_ it's set on the console element - being the reason why we can access the console in the first place. IIUC The expected output should be: <serial type='pty'> <source path='/dev/pts/4'/> <target port='0'/> </serial> <console type='pty' tty='/dev/pts/4'> <source path='/dev/pts/4'/> <target type='serial' port='0'/> </console> Joao
Regards, Jim
Finally, the path set in consoles array won't persist across daemon reboots, thus rendering admins/users with no access to console with a message such as:
"error: operation failed: PTY device is not yet assigned"
This is because: for HVM guests, DefFormatInternal will ignore the console element and instead write it with the content of the serial element for target type = serial which isn't touched in our libxlConsoleCallback. To address that we introduce a new helper routine libxlConsoleSetSourcePath() that sets the source path and thus also handling this case. That is by setting the source path on with serial element akin to the one indexed by console "port". This way we keep similar behaviour for PV and HVM guests.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com> --- src/libxl/libxl_domain.c | 35 +++++++++++++++++++++++++++++++---- 1 file changed, 31 insertions(+), 4 deletions(-)
diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 221af87..4a46143 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -955,6 +955,35 @@ libxlNetworkPrepareDevices(virDomainDefPtr def) return 0; }
+static int +libxlConsoleSetSourcePath(virDomainDefPtr def, virDomainChrDefPtr console, + char *path) +{ + int ret = -1; + size_t i; + + if (!path || path[0] == '\0') + return ret; + + if (VIR_STRDUP(console->source.data.file.path, path) < 0) + return ret; + + if (console->targetType != VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL) + return 0; + + for (i = 0; i < def->nserials; i++) { + virDomainChrDefPtr serial = def->serials[i]; + + if (serial->target.port == console->target.port && + VIR_STRDUP(serial->source.data.file.path, path) >= 0) { + ret = 0; + break; + } + } + + return ret; +} + static void libxlConsoleCallback(libxl_ctx *ctx, libxl_event *ev, void *for_callback) { @@ -977,10 +1006,8 @@ libxlConsoleCallback(libxl_ctx *ctx, libxl_event *ev, void *for_callback) &console); if (!ret) { VIR_FREE(chr->source.data.file.path); - if (console && console[0] != '\0') { - ignore_value(VIR_STRDUP(chr->source.data.file.path, - console)); - } + if (libxlConsoleSetSourcePath(vm->def, chr, console) < 0) + VIR_WARN("Failed to set console source path"); } VIR_FREE(console); }

On 06/21/2016 04:20 AM, Joao Martins wrote:
On 06/21/2016 01:38 AM, Jim Fehlig wrote:
Joao Martins wrote:
Guests use a <console /> (and sometimes <serial /> pair) to represent the console. On the callback that is called when console is brought up (NB: before domain starts), we fill the path of the console element with the appropriate "/dev/pts/X". For PV guests it all works fine, although for HVM guests it doesn't. Consequently we end up seeing erronous behaviour on HVM guests where console path is not displayed on the XML but still can be accessed with virDomainOpenConsole after booting guest. Consumers of this XML (e.g. Openstack) also won't be able to use the console path. Can you provide example input domXML config that causes this problem?
Ah yes - I probably should include that in the commit description?
Yes, I think it helps clarify the problem being solved by the commit.
So, for PV guests for an input console XML element:
<console type='pty'> <target type='xen' port='0'/> </console>
Which then on domain start gets filled like:
<console type='pty' tty='/dev/pts/3'> <source path='/dev/pts/3'/> <target type='xen' port='0'/> </console>
Although for HVM guests we have:
<serial type='pty'> <target port='0'/> </serial> <console type='pty'> <target type='serial' port='0'/> </console>
And no changes happen i.e. source path isn't written there _even though_ it's set on the console element - being the reason why we can access the console in the first place. IIUC The expected output should be:
<serial type='pty'> <source path='/dev/pts/4'/> <target port='0'/> </serial> <console type='pty' tty='/dev/pts/4'> <source path='/dev/pts/4'/> <target type='serial' port='0'/> </console>
I asked for an example after having problems testing your patch, but it turned out to be an issue which I think was introduced long ago by commit 482e5f15. I was testing with transient domains and noticed 'virsh create; 'virsh console' always failed with error: internal error: character device <null> is not using a PTY My config only contained <console type='pty'> <target port='0'/> </console> so the "really crazy backcompat stuff for consoles" in virDomainDefAddConsoleCompat() moved the config over to def->serials[0] and created a new def->consoles[0] without def->consoles[0].source.type set to VIR_DOMAIN_CHR_TYPE_PTY. Without source.type set, libxlConsoleCallback() would ignore the console and not copy the PTY path to source.data.file.path. 'virsh console' would then fail with the error noted above. I spent too much time debugging this, partly because I was confused by odd behavior with persistent domains. E.g. 'virsh define; virsh start; virsh console' would fail with the same error, but all subsequent 'virsh shutdown; virsh start; virsh console' sequences would work :-). That behavior was due to vm->newDef being populated with correct console config when calling virDomainObjSetDefTransient() in libxlDomainStart(). After the first shutdown of the domain, vm->newDef would be copied over to vm->def, allowing subsequent 'virsh start; virsh console' to work correctly. Long story short, I found the problem but not sure how to fix it :-). One approach would be the below patch. Another would be to loosen the restriction in libxlConsoleCallback, filing in source.data.file.path when the console source type is also set to VIR_DOMAIN_CHR_TYPE_NULL. Adding Cole and Peter (both of which have toiled in and likely loathe this code) to see if they have opinions on a proper fix. Regards, Jim
From a231636e8eec08152220e32af5f4026a50954f41 Mon Sep 17 00:00:00 2001 From: Jim Fehlig <jfehlig@suse.com> Date: Tue, 21 Jun 2016 20:25:23 -0600 Subject: [PATCH] domain conf: copy source type from stolen console
When domXML contains only <console type='pty'> and no corresponding <serial>, the console is "stolen" [1] and used as the first <serial> device. A new console is created as an alias to the first <serial> device, but missed copying some configuration such as the source 'type' attribute. [1] See comments associated with virDomainDefAddConsoleCompat() in $LIBVIRT-SRC/src/conf/domain_conf.c: Signed-off-by: Jim Fehlig <jfehlig@suse.com> --- src/conf/domain_conf.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 75ad03f..2fda4fe 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -3845,6 +3845,7 @@ virDomainDefAddConsoleCompat(virDomainDefPtr def) /* Create an console alias for the serial port */ def->consoles[0]->deviceType = VIR_DOMAIN_CHR_DEVICE_TYPE_CONSOLE; def->consoles[0]->targetType = VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL; + def->consoles[0]->source.type = def->serials[0]->source.type; } } else if (def->os.type == VIR_DOMAIN_OSTYPE_HVM && def->nserials > 0 && def->serials[0]->deviceType == VIR_DOMAIN_CHR_DEVICE_TYPE_SERIAL && -- 2.1.4

On 06/21/2016 10:32 PM, Jim Fehlig wrote:
On 06/21/2016 04:20 AM, Joao Martins wrote:
On 06/21/2016 01:38 AM, Jim Fehlig wrote:
Joao Martins wrote:
Guests use a <console /> (and sometimes <serial /> pair) to represent the console. On the callback that is called when console is brought up (NB: before domain starts), we fill the path of the console element with the appropriate "/dev/pts/X". For PV guests it all works fine, although for HVM guests it doesn't. Consequently we end up seeing erronous behaviour on HVM guests where console path is not displayed on the XML but still can be accessed with virDomainOpenConsole after booting guest. Consumers of this XML (e.g. Openstack) also won't be able to use the console path. Can you provide example input domXML config that causes this problem?
Ah yes - I probably should include that in the commit description?
Yes, I think it helps clarify the problem being solved by the commit.
So, for PV guests for an input console XML element:
<console type='pty'> <target type='xen' port='0'/> </console>
Which then on domain start gets filled like:
<console type='pty' tty='/dev/pts/3'> <source path='/dev/pts/3'/> <target type='xen' port='0'/> </console>
Although for HVM guests we have:
<serial type='pty'> <target port='0'/> </serial> <console type='pty'> <target type='serial' port='0'/> </console>
And no changes happen i.e. source path isn't written there _even though_ it's set on the console element - being the reason why we can access the console in the first place. IIUC The expected output should be:
<serial type='pty'> <source path='/dev/pts/4'/> <target port='0'/> </serial> <console type='pty' tty='/dev/pts/4'> <source path='/dev/pts/4'/> <target type='serial' port='0'/> </console>
I asked for an example after having problems testing your patch, but it turned out to be an issue which I think was introduced long ago by commit 482e5f15. I was testing with transient domains and noticed 'virsh create; 'virsh console' always failed with
error: internal error: character device <null> is not using a PTY
My config only contained
<console type='pty'> <target port='0'/> </console>
so the "really crazy backcompat stuff for consoles" in virDomainDefAddConsoleCompat() moved the config over to def->serials[0] and created a new def->consoles[0] without def->consoles[0].source.type set to VIR_DOMAIN_CHR_TYPE_PTY. Without source.type set, libxlConsoleCallback() would ignore the console and not copy the PTY path to source.data.file.path. 'virsh console' would then fail with the error noted above.
I spent too much time debugging this, partly because I was confused by odd behavior with persistent domains. E.g. 'virsh define; virsh start; virsh console' would fail with the same error, but all subsequent 'virsh shutdown; virsh start; virsh console' sequences would work :-). That behavior was due to vm->newDef being populated with correct console config when calling virDomainObjSetDefTransient() in libxlDomainStart(). After the first shutdown of the domain, vm->newDef would be copied over to vm->def, allowing subsequent 'virsh start; virsh console' to work correctly.
Hmm, that sounds weird. Can you explain more how exactly the XML changes? What is the diff between the initial 'virsh define; virsh dumpxml' and the dumpxml after the VM has started+shutdown once? Whatever that diff is, and why it's not in the XML to begin with, sounds like the root issue to me.
Long story short, I found the problem but not sure how to fix it :-). One approach would be the below patch. Another would be to loosen the restriction in libxlConsoleCallback, filing in source.data.file.path when the console source type is also set to VIR_DOMAIN_CHR_TYPE_NULL. Adding Cole and Peter (both of which have toiled in and likely loathe this code) to see if they have opinions on a proper fix.
Regards, Jim
From a231636e8eec08152220e32af5f4026a50954f41 Mon Sep 17 00:00:00 2001 From: Jim Fehlig <jfehlig@suse.com> Date: Tue, 21 Jun 2016 20:25:23 -0600 Subject: [PATCH] domain conf: copy source type from stolen console
When domXML contains only <console type='pty'> and no corresponding <serial>, the console is "stolen" [1] and used as the first <serial> device. A new console is created as an alias to the first <serial> device, but missed copying some configuration such as the source 'type' attribute.
[1] See comments associated with virDomainDefAddConsoleCompat() in $LIBVIRT-SRC/src/conf/domain_conf.c:
Signed-off-by: Jim Fehlig <jfehlig@suse.com> --- src/conf/domain_conf.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 75ad03f..2fda4fe 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -3845,6 +3845,7 @@ virDomainDefAddConsoleCompat(virDomainDefPtr def) /* Create an console alias for the serial port */ def->consoles[0]->deviceType = VIR_DOMAIN_CHR_DEVICE_TYPE_CONSOLE; def->consoles[0]->targetType = VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL; + def->consoles[0]->source.type = def->serials[0]->source.type; } } else if (def->os.type == VIR_DOMAIN_OSTYPE_HVM && def->nserials > 0 && def->serials[0]->deviceType == VIR_DOMAIN_CHR_DEVICE_TYPE_SERIAL &&
Then again this also seems okay to me, but it's concerning that I applied this but it didn't cause the test suite to change at all. - Cole

On 06/22/2016 06:07 AM, Cole Robinson wrote:
On 06/21/2016 10:32 PM, Jim Fehlig wrote:
On 06/21/2016 04:20 AM, Joao Martins wrote:
On 06/21/2016 01:38 AM, Jim Fehlig wrote:
Joao Martins wrote:
Guests use a <console /> (and sometimes <serial /> pair) to represent the console. On the callback that is called when console is brought up (NB: before domain starts), we fill the path of the console element with the appropriate "/dev/pts/X". For PV guests it all works fine, although for HVM guests it doesn't. Consequently we end up seeing erronous behaviour on HVM guests where console path is not displayed on the XML but still can be accessed with virDomainOpenConsole after booting guest. Consumers of this XML (e.g. Openstack) also won't be able to use the console path. Can you provide example input domXML config that causes this problem?
Ah yes - I probably should include that in the commit description? Yes, I think it helps clarify the problem being solved by the commit.
So, for PV guests for an input console XML element:
<console type='pty'> <target type='xen' port='0'/> </console>
Which then on domain start gets filled like:
<console type='pty' tty='/dev/pts/3'> <source path='/dev/pts/3'/> <target type='xen' port='0'/> </console>
Although for HVM guests we have:
<serial type='pty'> <target port='0'/> </serial> <console type='pty'> <target type='serial' port='0'/> </console>
And no changes happen i.e. source path isn't written there _even though_ it's set on the console element - being the reason why we can access the console in the first place. IIUC The expected output should be:
<serial type='pty'> <source path='/dev/pts/4'/> <target port='0'/> </serial> <console type='pty' tty='/dev/pts/4'> <source path='/dev/pts/4'/> <target type='serial' port='0'/> </console> I asked for an example after having problems testing your patch, but it turned out to be an issue which I think was introduced long ago by commit 482e5f15. I was testing with transient domains and noticed 'virsh create; 'virsh console' always failed with
error: internal error: character device <null> is not using a PTY
My config only contained
<console type='pty'> <target port='0'/> </console>
so the "really crazy backcompat stuff for consoles" in virDomainDefAddConsoleCompat() moved the config over to def->serials[0] and created a new def->consoles[0] without def->consoles[0].source.type set to VIR_DOMAIN_CHR_TYPE_PTY. Without source.type set, libxlConsoleCallback() would ignore the console and not copy the PTY path to source.data.file.path. 'virsh console' would then fail with the error noted above.
I spent too much time debugging this, partly because I was confused by odd behavior with persistent domains. E.g. 'virsh define; virsh start; virsh console' would fail with the same error, but all subsequent 'virsh shutdown; virsh start; virsh console' sequences would work :-). That behavior was due to vm->newDef being populated with correct console config when calling virDomainObjSetDefTransient() in libxlDomainStart(). After the first shutdown of the domain, vm->newDef would be copied over to vm->def, allowing subsequent 'virsh start; virsh console' to work correctly.
Hmm, that sounds weird. Can you explain more how exactly the XML changes?
It only changes when defining the vm.
What is the diff between the initial 'virsh define; virsh dumpxml'
Initial XML has <console type='pty'> <target port='0'/> </console> After define <serial type='pty'> <target port='0'/> </serial> <console type='pty'> <target type='serial' port='0'/> </console> The config doesn't change after start; shutdown. But the contents of virDomainDef does change, specifically def->consoles[0]->source.type changes from VIR_DOMAIN_CHR_TYPE_PTY to VIR_DOMAIN_CHR_TYPE_NULL. And as mentioned below, libxlConsoleCallback() only checks for source.type == VIR_DOMAIN_CHR_TYPE_PTY before copying the PTY path to def->consoles[0]->source.data.file.path. Another potential fix I also mentioned below is to loosen that condition in libxlConsoleCallback(). I.e. copy the path if source.type equals VIR_DOMAIN_CHR_TYPE_PTY or VIR_DOMAIN_CHR_TYPE_NULL (and perhaps VIR_DOMAIN_CHR_TYPE_FILE too?). Regards, Jim
and the dumpxml after the VM has started+shutdown once? Whatever that diff is, and why it's not in the XML to begin with, sounds like the root issue to me.
Long story short, I found the problem but not sure how to fix it :-). One approach would be the below patch. Another would be to loosen the restriction in libxlConsoleCallback, filing in source.data.file.path when the console source type is also set to VIR_DOMAIN_CHR_TYPE_NULL. Adding Cole and Peter (both of which have toiled in and likely loathe this code) to see if they have opinions on a proper fix.
Regards, Jim
From a231636e8eec08152220e32af5f4026a50954f41 Mon Sep 17 00:00:00 2001 From: Jim Fehlig <jfehlig@suse.com> Date: Tue, 21 Jun 2016 20:25:23 -0600 Subject: [PATCH] domain conf: copy source type from stolen console
When domXML contains only <console type='pty'> and no corresponding <serial>, the console is "stolen" [1] and used as the first <serial> device. A new console is created as an alias to the first <serial> device, but missed copying some configuration such as the source 'type' attribute.
[1] See comments associated with virDomainDefAddConsoleCompat() in $LIBVIRT-SRC/src/conf/domain_conf.c:
Signed-off-by: Jim Fehlig <jfehlig@suse.com> --- src/conf/domain_conf.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 75ad03f..2fda4fe 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -3845,6 +3845,7 @@ virDomainDefAddConsoleCompat(virDomainDefPtr def) /* Create an console alias for the serial port */ def->consoles[0]->deviceType = VIR_DOMAIN_CHR_DEVICE_TYPE_CONSOLE; def->consoles[0]->targetType = VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL; + def->consoles[0]->source.type = def->serials[0]->source.type; } } else if (def->os.type == VIR_DOMAIN_OSTYPE_HVM && def->nserials > 0 && def->serials[0]->deviceType == VIR_DOMAIN_CHR_DEVICE_TYPE_SERIAL &&
Then again this also seems okay to me, but it's concerning that I applied this but it didn't cause the test suite to change at all.
- Cole

On 06/22/2016 12:45 PM, Jim Fehlig wrote:
On 06/22/2016 06:07 AM, Cole Robinson wrote:
On 06/21/2016 10:32 PM, Jim Fehlig wrote:
On 06/21/2016 04:20 AM, Joao Martins wrote:
On 06/21/2016 01:38 AM, Jim Fehlig wrote:
Joao Martins wrote:
Guests use a <console /> (and sometimes <serial /> pair) to represent the console. On the callback that is called when console is brought up (NB: before domain starts), we fill the path of the console element with the appropriate "/dev/pts/X". For PV guests it all works fine, although for HVM guests it doesn't. Consequently we end up seeing erronous behaviour on HVM guests where console path is not displayed on the XML but still can be accessed with virDomainOpenConsole after booting guest. Consumers of this XML (e.g. Openstack) also won't be able to use the console path. Can you provide example input domXML config that causes this problem?
Ah yes - I probably should include that in the commit description? Yes, I think it helps clarify the problem being solved by the commit.
So, for PV guests for an input console XML element:
<console type='pty'> <target type='xen' port='0'/> </console>
Which then on domain start gets filled like:
<console type='pty' tty='/dev/pts/3'> <source path='/dev/pts/3'/> <target type='xen' port='0'/> </console>
Although for HVM guests we have:
<serial type='pty'> <target port='0'/> </serial> <console type='pty'> <target type='serial' port='0'/> </console>
And no changes happen i.e. source path isn't written there _even though_ it's set on the console element - being the reason why we can access the console in the first place. IIUC The expected output should be:
<serial type='pty'> <source path='/dev/pts/4'/> <target port='0'/> </serial> <console type='pty' tty='/dev/pts/4'> <source path='/dev/pts/4'/> <target type='serial' port='0'/> </console> I asked for an example after having problems testing your patch, but it turned out to be an issue which I think was introduced long ago by commit 482e5f15. I was testing with transient domains and noticed 'virsh create; 'virsh console' always failed with
error: internal error: character device <null> is not using a PTY
My config only contained
<console type='pty'> <target port='0'/> </console>
so the "really crazy backcompat stuff for consoles" in virDomainDefAddConsoleCompat() moved the config over to def->serials[0] and created a new def->consoles[0] without def->consoles[0].source.type set to VIR_DOMAIN_CHR_TYPE_PTY. Without source.type set, libxlConsoleCallback() would ignore the console and not copy the PTY path to source.data.file.path. 'virsh console' would then fail with the error noted above.
I spent too much time debugging this, partly because I was confused by odd behavior with persistent domains. E.g. 'virsh define; virsh start; virsh console' would fail with the same error, but all subsequent 'virsh shutdown; virsh start; virsh console' sequences would work :-). That behavior was due to vm->newDef being populated with correct console config when calling virDomainObjSetDefTransient() in libxlDomainStart(). After the first shutdown of the domain, vm->newDef would be copied over to vm->def, allowing subsequent 'virsh start; virsh console' to work correctly.
Hmm, that sounds weird. Can you explain more how exactly the XML changes?
It only changes when defining the vm.
What is the diff between the initial 'virsh define; virsh dumpxml'
Initial XML has
<console type='pty'> <target port='0'/> </console>
After define
<serial type='pty'> <target port='0'/> </serial> <console type='pty'> <target type='serial' port='0'/> </console>
The config doesn't change after start; shutdown. But the contents of virDomainDef does change, specifically def->consoles[0]->source.type changes from VIR_DOMAIN_CHR_TYPE_PTY to VIR_DOMAIN_CHR_TYPE_NULL. And as mentioned below, libxlConsoleCallback() only checks for source.type == VIR_DOMAIN_CHR_TYPE_PTY before copying the PTY path to def->consoles[0]->source.data.file.path. Another potential fix I also mentioned below is to loosen that condition in libxlConsoleCallback(). I.e. copy the path if source.type equals VIR_DOMAIN_CHR_TYPE_PTY or VIR_DOMAIN_CHR_TYPE_NULL (and perhaps VIR_DOMAIN_CHR_TYPE_FILE too?).
Okay I think I follow. The thing I was missing is that virDomainDefAddConsoleCompat will alter ->def, but that's _not_ what is formated/printed in the XML, which has some special handling to fully duplicate serials[0] XML to consoles[0], if console target type=serial I _think_ what the proper thing to do here is something like: diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 221af87..6bcb4d9 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -964,13 +964,17 @@ libxlConsoleCallback(libxl_ctx *ctx, libxl_event *ev, void *for_callback) virObjectLock(vm); for (i = 0; i < vm->def->nconsoles; i++) { virDomainChrDefPtr chr = vm->def->consoles[i]; - if (chr && chr->source.type == VIR_DOMAIN_CHR_TYPE_PTY) { + if (i == 0 && + chr->targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL) + chr = vm->def->serials[0]; + + if (chr->source.type == VIR_DOMAIN_CHR_TYPE_PTY) { libxl_console_type console_type; char *console = NULL; int ret; console_type = - (chr->targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL ? + (chr->deviceType == VIR_DOMAIN_CHR_DEVICE_TYPE_SERIAL ? LIBXL_CONSOLE_TYPE_SERIAL : LIBXL_CONSOLE_TYPE_PV); ret = libxl_console_get_tty(ctx, ev->domid, chr->target.port, console_type, Untested, but basically, libxl driver needs to learn that if consoles[0].target.type == SERIAL, then you should use serials[0] instead. The qemu driver has some magic like this sprinkled around... The problem with your patch below is that it's not a whole fix, there's many other bits besides source.type that would technically need to be copied over. Rather than doing that, which has its own problems trying to keep the console[0] and serial[0] data in sync, the convention seems to be that console gets only target.type=serial and the drivers need to map that to serials[0]. That's my reading of things anyways Thanks, Cole
Regards, Jim
and the dumpxml after the VM has started+shutdown once? Whatever that diff is, and why it's not in the XML to begin with, sounds like the root issue to me.
Long story short, I found the problem but not sure how to fix it :-). One approach would be the below patch. Another would be to loosen the restriction in libxlConsoleCallback, filing in source.data.file.path when the console source type is also set to VIR_DOMAIN_CHR_TYPE_NULL. Adding Cole and Peter (both of which have toiled in and likely loathe this code) to see if they have opinions on a proper fix.
Regards, Jim
From a231636e8eec08152220e32af5f4026a50954f41 Mon Sep 17 00:00:00 2001 From: Jim Fehlig <jfehlig@suse.com> Date: Tue, 21 Jun 2016 20:25:23 -0600 Subject: [PATCH] domain conf: copy source type from stolen console
When domXML contains only <console type='pty'> and no corresponding <serial>, the console is "stolen" [1] and used as the first <serial> device. A new console is created as an alias to the first <serial> device, but missed copying some configuration such as the source 'type' attribute.
[1] See comments associated with virDomainDefAddConsoleCompat() in $LIBVIRT-SRC/src/conf/domain_conf.c:
Signed-off-by: Jim Fehlig <jfehlig@suse.com> --- src/conf/domain_conf.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 75ad03f..2fda4fe 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -3845,6 +3845,7 @@ virDomainDefAddConsoleCompat(virDomainDefPtr def) /* Create an console alias for the serial port */ def->consoles[0]->deviceType = VIR_DOMAIN_CHR_DEVICE_TYPE_CONSOLE; def->consoles[0]->targetType = VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL; + def->consoles[0]->source.type = def->serials[0]->source.type; } } else if (def->os.type == VIR_DOMAIN_OSTYPE_HVM && def->nserials > 0 && def->serials[0]->deviceType == VIR_DOMAIN_CHR_DEVICE_TYPE_SERIAL &&
Then again this also seems okay to me, but it's concerning that I applied this but it didn't cause the test suite to change at all.
- Cole

On 06/22/2016 12:45 PM, Cole Robinson wrote:
On 06/22/2016 12:45 PM, Jim Fehlig wrote:
On 06/22/2016 06:07 AM, Cole Robinson wrote:
On 06/21/2016 10:32 PM, Jim Fehlig wrote:
On 06/21/2016 04:20 AM, Joao Martins wrote:
On 06/21/2016 01:38 AM, Jim Fehlig wrote:
Joao Martins wrote: > Guests use a <console /> (and sometimes <serial /> pair) to represent > the console. On the callback that is called when console is brought up > (NB: before domain starts), we fill the path of the console element with > the appropriate "/dev/pts/X". For PV guests it all works fine, although > for HVM guests it doesn't. Consequently we end up seeing erronous > behaviour on HVM guests where console path is not displayed on the XML > but still can be accessed with virDomainOpenConsole after booting guest. > Consumers of this XML (e.g. Openstack) also won't be able to use the > console path. Can you provide example input domXML config that causes this problem?
Ah yes - I probably should include that in the commit description? Yes, I think it helps clarify the problem being solved by the commit.
So, for PV guests for an input console XML element:
<console type='pty'> <target type='xen' port='0'/> </console>
Which then on domain start gets filled like:
<console type='pty' tty='/dev/pts/3'> <source path='/dev/pts/3'/> <target type='xen' port='0'/> </console>
Although for HVM guests we have:
<serial type='pty'> <target port='0'/> </serial> <console type='pty'> <target type='serial' port='0'/> </console>
And no changes happen i.e. source path isn't written there _even though_ it's set on the console element - being the reason why we can access the console in the first place. IIUC The expected output should be:
<serial type='pty'> <source path='/dev/pts/4'/> <target port='0'/> </serial> <console type='pty' tty='/dev/pts/4'> <source path='/dev/pts/4'/> <target type='serial' port='0'/> </console> I asked for an example after having problems testing your patch, but it turned out to be an issue which I think was introduced long ago by commit 482e5f15. I was testing with transient domains and noticed 'virsh create; 'virsh console' always failed with
error: internal error: character device <null> is not using a PTY
My config only contained
<console type='pty'> <target port='0'/> </console>
so the "really crazy backcompat stuff for consoles" in virDomainDefAddConsoleCompat() moved the config over to def->serials[0] and created a new def->consoles[0] without def->consoles[0].source.type set to VIR_DOMAIN_CHR_TYPE_PTY. Without source.type set, libxlConsoleCallback() would ignore the console and not copy the PTY path to source.data.file.path. 'virsh console' would then fail with the error noted above.
I spent too much time debugging this, partly because I was confused by odd behavior with persistent domains. E.g. 'virsh define; virsh start; virsh console' would fail with the same error, but all subsequent 'virsh shutdown; virsh start; virsh console' sequences would work :-). That behavior was due to vm->newDef being populated with correct console config when calling virDomainObjSetDefTransient() in libxlDomainStart(). After the first shutdown of the domain, vm->newDef would be copied over to vm->def, allowing subsequent 'virsh start; virsh console' to work correctly.
Hmm, that sounds weird. Can you explain more how exactly the XML changes? It only changes when defining the vm.
What is the diff between the initial 'virsh define; virsh dumpxml' Initial XML has
<console type='pty'> <target port='0'/> </console>
After define
<serial type='pty'> <target port='0'/> </serial> <console type='pty'> <target type='serial' port='0'/> </console>
The config doesn't change after start; shutdown. But the contents of virDomainDef does change, specifically def->consoles[0]->source.type changes from VIR_DOMAIN_CHR_TYPE_PTY to VIR_DOMAIN_CHR_TYPE_NULL. And as mentioned below, libxlConsoleCallback() only checks for source.type == VIR_DOMAIN_CHR_TYPE_PTY before copying the PTY path to def->consoles[0]->source.data.file.path. Another potential fix I also mentioned below is to loosen that condition in libxlConsoleCallback(). I.e. copy the path if source.type equals VIR_DOMAIN_CHR_TYPE_PTY or VIR_DOMAIN_CHR_TYPE_NULL (and perhaps VIR_DOMAIN_CHR_TYPE_FILE too?).
Okay I think I follow. The thing I was missing is that virDomainDefAddConsoleCompat will alter ->def, but that's _not_ what is formated/printed in the XML, which has some special handling to fully duplicate serials[0] XML to consoles[0], if console target type=serial
Yes, correct.
I _think_ what the proper thing to do here is something like:
diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 221af87..6bcb4d9 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -964,13 +964,17 @@ libxlConsoleCallback(libxl_ctx *ctx, libxl_event *ev, void *for_callback) virObjectLock(vm); for (i = 0; i < vm->def->nconsoles; i++) { virDomainChrDefPtr chr = vm->def->consoles[i]; - if (chr && chr->source.type == VIR_DOMAIN_CHR_TYPE_PTY) { + if (i == 0 && + chr->targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL) + chr = vm->def->serials[0]; + + if (chr->source.type == VIR_DOMAIN_CHR_TYPE_PTY) { libxl_console_type console_type; char *console = NULL; int ret;
console_type = - (chr->targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL ? + (chr->deviceType == VIR_DOMAIN_CHR_DEVICE_TYPE_SERIAL ? LIBXL_CONSOLE_TYPE_SERIAL : LIBXL_CONSOLE_TYPE_PV); ret = libxl_console_get_tty(ctx, ev->domid, chr->target.port, console_type,
Untested, but basically, libxl driver needs to learn that if consoles[0].target.type == SERIAL, then you should use serials[0] instead. The qemu driver has some magic like this sprinkled around...
This works, with one other similar sprinkle in libxlDomainOpenConsole(). I've sent a patch which includes both changes https://www.redhat.com/archives/libvir-list/2016-June/msg01580.html
The problem with your patch below is that it's not a whole fix, there's many other bits besides source.type that would technically need to be copied over.
Right. I could have expanded the patch to do that if this approach was the correct fix.
Rather than doing that, which has its own problems trying to keep the console[0] and serial[0] data in sync, the convention seems to be that console gets only target.type=serial and the drivers need to map that to serials[0]. That's my reading of things anyways
The referenced patch changes the libxl driver to follow the convention :-). Regards, Jim
participants (3)
-
Cole Robinson
-
Jim Fehlig
-
Joao Martins