[libvirt] [PATCH 0/2] add qemu machine type q35 support

qemu-1.3 added machine type q35, changelog said: Added Intel Q35 chipset as a new machine type, '--machine q35'. Adds PCIe support. Requires an updated SeaBIOS (bios.bin), and '-acpitable file=/seabios-path/q35-acpi-dsdt.aml' to run. so add q35 support for libvirt. src/conf/device_conf.c | 8 +++++++- src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 1 + src/conf/domain_conf.h | 1 + src/qemu/qemu_capabilities.c | 11 +++++++++++ src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 8 +++- 7 files changed, 29 insertions(+), 2 deletions(-)

Signed-off-by: liguang <lig.fnst@cn.fujitsu.com> --- src/conf/device_conf.c | 8 +++++++- src/conf/device_conf.h | 1 + src/qemu/qemu_capabilities.c | 11 +++++++++++ src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 4 +++- 5 files changed, 23 insertions(+), 2 deletions(-) diff --git a/src/conf/device_conf.c b/src/conf/device_conf.c index 7b97f45..3fca853 100644 --- a/src/conf/device_conf.c +++ b/src/conf/device_conf.c @@ -51,7 +51,7 @@ int virDevicePCIAddressParseXML(xmlNodePtr node, virDevicePCIAddressPtr addr) { - char *domain, *slot, *bus, *function, *multi; + char *domain, *slot, *bus, *function, *multi, *expr; int ret = -1; memset(addr, 0, sizeof(*addr)); @@ -61,6 +61,7 @@ virDevicePCIAddressParseXML(xmlNodePtr node, slot = virXMLPropString(node, "slot"); function = virXMLPropString(node, "function"); multi = virXMLPropString(node, "multifunction"); + expr = virXMLPropString(node, "express"); if (domain && virStrToLong_ui(domain, NULL, 0, &addr->domain) < 0) { @@ -98,6 +99,10 @@ virDevicePCIAddressParseXML(xmlNodePtr node, goto cleanup; } + + if (expr) + addr->express = true; + if (!virDevicePCIAddressIsValid(addr)) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Insufficient specification for PCI address")); @@ -112,6 +117,7 @@ cleanup: VIR_FREE(slot); VIR_FREE(function); VIR_FREE(multi); + VIR_FREE(expr); return ret; } diff --git a/src/conf/device_conf.h b/src/conf/device_conf.h index 5318738..5c87f48 100644 --- a/src/conf/device_conf.h +++ b/src/conf/device_conf.h @@ -48,6 +48,7 @@ struct _virDevicePCIAddress { unsigned int slot; unsigned int function; int multi; /* enum virDomainDeviceAddressPciMulti */ + bool express; /* if it is a pci-e device */ }; int virDevicePCIAddressIsValid(virDevicePCIAddressPtr addr); diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 104a3f8..c7fbb35 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -298,6 +298,17 @@ qemuCapsProbeCommand(const char *qemu, return cmd; } +int qemuMachineTypeIdx(const char *machine, qemuCapsPtr caps) +{ + size_t i; + + for (i = 0; i < caps->nmachineTypes; i++) { + if (STRPREFIX(caps->machineTypes[i], machine)) + return i; + } + + return -1; +} static void qemuSetDefaultMachine(qemuCapsPtr caps, diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index bf4eef8..715d625 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -238,6 +238,7 @@ int qemuCapsParseHelpStr(const char *qemu, bool check_yajl); /* Only for use by test suite */ int qemuCapsParseDeviceStr(qemuCapsPtr caps, const char *str); +int qemuMachineTypeIdx(const char *machine, qemuCapsPtr caps); VIR_ENUM_DECL(qemuCaps); diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 04a9512..f40e1a5 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1801,7 +1801,9 @@ qemuBuildDeviceAddressStr(virBufferPtr buf, * When QEMU grows support for > 1 PCI domain, then pci.0 change * to pciNN.0 where NN is the domain number */ - if (qemuCapsGet(caps, QEMU_CAPS_PCI_MULTIBUS)) + if (qemuMachineTypeIdx("q35", caps) >= 0 && info->addr.pci.express) + virBufferAsprintf(buf, ",bus=pcie.0"); + else if (qemuCapsGet(caps, QEMU_CAPS_PCI_MULTIBUS)) virBufferAsprintf(buf, ",bus=pci.0"); else virBufferAsprintf(buf, ",bus=pci"); -- 1.7.2.5

On Wed, Jan 09, 2013 at 10:35:32AM +0800, liguang wrote:
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com> --- src/conf/device_conf.c | 8 +++++++- src/conf/device_conf.h | 1 + src/qemu/qemu_capabilities.c | 11 +++++++++++ src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 4 +++- 5 files changed, 23 insertions(+), 2 deletions(-)
diff --git a/src/conf/device_conf.c b/src/conf/device_conf.c index 7b97f45..3fca853 100644 --- a/src/conf/device_conf.c +++ b/src/conf/device_conf.c @@ -51,7 +51,7 @@ int virDevicePCIAddressParseXML(xmlNodePtr node, virDevicePCIAddressPtr addr) { - char *domain, *slot, *bus, *function, *multi; + char *domain, *slot, *bus, *function, *multi, *expr; int ret = -1;
memset(addr, 0, sizeof(*addr)); @@ -61,6 +61,7 @@ virDevicePCIAddressParseXML(xmlNodePtr node, slot = virXMLPropString(node, "slot"); function = virXMLPropString(node, "function"); multi = virXMLPropString(node, "multifunction"); + expr = virXMLPropString(node, "express");
NACK, this is a gross hack. The q35 machine type provides multiple PCI buses, and it is already possible to express connection to the alternative buses using the 'bus' parameter. We don't need a new 'express' parameter. We need to make sure that the XML includes a <controller> element for each PCI bus provided by a machine type Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

在 2013-01-09三的 10:32 +0000,Daniel P. Berrange写道:
On Wed, Jan 09, 2013 at 10:35:32AM +0800, liguang wrote:
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com> --- src/conf/device_conf.c | 8 +++++++- src/conf/device_conf.h | 1 + src/qemu/qemu_capabilities.c | 11 +++++++++++ src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 4 +++- 5 files changed, 23 insertions(+), 2 deletions(-)
diff --git a/src/conf/device_conf.c b/src/conf/device_conf.c index 7b97f45..3fca853 100644 --- a/src/conf/device_conf.c +++ b/src/conf/device_conf.c @@ -51,7 +51,7 @@ int virDevicePCIAddressParseXML(xmlNodePtr node, virDevicePCIAddressPtr addr) { - char *domain, *slot, *bus, *function, *multi; + char *domain, *slot, *bus, *function, *multi, *expr; int ret = -1;
memset(addr, 0, sizeof(*addr)); @@ -61,6 +61,7 @@ virDevicePCIAddressParseXML(xmlNodePtr node, slot = virXMLPropString(node, "slot"); function = virXMLPropString(node, "function"); multi = virXMLPropString(node, "multifunction"); + expr = virXMLPropString(node, "express");
NACK, this is a gross hack.
The q35 machine type provides multiple PCI buses, and it is already possible to express connection to the alternative buses using the 'bus' parameter. We don't need a new 'express' parameter. We need to make sure that the XML includes a <controller> element for each PCI bus provided by a machine type
Daniel
OK, will drop this unnecessary patch. -- regards! li guang

Signed-off-by: liguang <lig.fnst@cn.fujitsu.com> --- src/conf/domain_conf.c | 1 + src/conf/domain_conf.h | 1 + src/qemu/qemu_command.c | 4 ++++ 3 files changed, 6 insertions(+), 0 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 6a7646e..54ba77f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -9548,6 +9548,7 @@ static virDomainDefPtr virDomainDefParseXML(virCapsPtr caps, def->os.bootloader = virXPathString("string(./bootloader)", ctxt); def->os.bootloaderArgs = virXPathString("string(./bootloader_args)", ctxt); + def->os.acpitable = virXPathString("string(./bootloader_args)", ctxt); def->os.type = virXPathString("string(./os/type[1])", ctxt); if (!def->os.type) { diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 5062e07..4f1dd10 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -1514,6 +1514,7 @@ struct _virDomainOSDef { char *loader; char *bootloader; char *bootloaderArgs; + char *acpitable; int smbios_mode; virDomainBIOSDef bios; diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index f40e1a5..ec56706 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -5606,6 +5606,10 @@ qemuBuildCommandLine(virConnectPtr conn, virCommandAddArgList(cmd, "-bootloader", def->os.bootloader, NULL); } + if (def->os.acpitable) + virCommandAddArgFormat(cmd, "-acpitable file=%s", + def->os.acpitable); + for (i = 0 ; i < def->ndisks ; i++) { virDomainDiskDefPtr disk = def->disks[i]; -- 1.7.2.5

On Wed, Jan 09, 2013 at 10:35:33AM +0800, liguang wrote:
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com> --- src/conf/domain_conf.c | 1 + src/conf/domain_conf.h | 1 + src/qemu/qemu_command.c | 4 ++++ 3 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 6a7646e..54ba77f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -9548,6 +9548,7 @@ static virDomainDefPtr virDomainDefParseXML(virCapsPtr caps,
def->os.bootloader = virXPathString("string(./bootloader)", ctxt); def->os.bootloaderArgs = virXPathString("string(./bootloader_args)", ctxt); + def->os.acpitable = virXPathString("string(./bootloader_args)", ctxt);
This is clearly bogus - you can't just grab an existing XML field and repurpose it. Second we shouldn't be requireing the user to specify custom ACPI tables just to use the machine type. libvirt should do the right thing with q35. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 01/09/2013 05:31 AM, Daniel P. Berrange wrote:
On Wed, Jan 09, 2013 at 10:35:33AM +0800, liguang wrote:
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com> --- src/conf/domain_conf.c | 1 + src/conf/domain_conf.h | 1 + src/qemu/qemu_command.c | 4 ++++ 3 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 6a7646e..54ba77f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -9548,6 +9548,7 @@ static virDomainDefPtr virDomainDefParseXML(virCapsPtr caps,
def->os.bootloader = virXPathString("string(./bootloader)", ctxt); def->os.bootloaderArgs = virXPathString("string(./bootloader_args)", ctxt); + def->os.acpitable = virXPathString("string(./bootloader_args)", ctxt);
This is clearly bogus - you can't just grab an existing XML field and repurpose it. Second we shouldn't be requireing the user to specify custom ACPI tables just to use the machine type. libvirt should do the right thing with q35.
Daniel
Also, there are patches on qemu-devel which claim to remove the -acpitable requirement. https://lists.gnu.org/archive/html/qemu-devel/2013-01/msg00356.html - Cole

在 2013-01-09三的 10:23 -0500,Cole Robinson写道:
On 01/09/2013 05:31 AM, Daniel P. Berrange wrote:
On Wed, Jan 09, 2013 at 10:35:33AM +0800, liguang wrote:
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com> --- src/conf/domain_conf.c | 1 + src/conf/domain_conf.h | 1 + src/qemu/qemu_command.c | 4 ++++ 3 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 6a7646e..54ba77f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -9548,6 +9548,7 @@ static virDomainDefPtr virDomainDefParseXML(virCapsPtr caps,
def->os.bootloader = virXPathString("string(./bootloader)", ctxt); def->os.bootloaderArgs = virXPathString("string(./bootloader_args)", ctxt); + def->os.acpitable = virXPathString("string(./bootloader_args)", ctxt);
This is clearly bogus - you can't just grab an existing XML field and repurpose it. Second we shouldn't be requireing the user to specify custom ACPI tables just to use the machine type. libvirt should do the right thing with q35.
Daniel
Also, there are patches on qemu-devel which claim to remove the -acpitable requirement.
https://lists.gnu.org/archive/html/qemu-devel/2013-01/msg00356.html
- Cole
Thanks! -- regards! li guang

On Tue, Jan 8, 2013 at 8:35 PM, liguang <lig.fnst@cn.fujitsu.com> wrote:
qemu-1.3 added machine type q35, changelog said: Added Intel Q35 chipset as a new machine type, '--machine q35'. Adds PCIe support. Requires an updated SeaBIOS (bios.bin), and '-acpitable file=/seabios-path/q35-acpi-dsdt.aml' to run. so add q35 support for libvirt.
src/conf/device_conf.c | 8 +++++++- src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 1 + src/conf/domain_conf.h | 1 + src/qemu/qemu_capabilities.c | 11 +++++++++++ src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 8 +++- 7 files changed, 29 insertions(+), 2 deletions(-)
I'd personally NACK this series for the time being. Per the qemu maintainers, q35 isn't really fully ready until 1.4. They're actively in the process of hashing out the machine type which will be exposed on the command line and via QMP so I think we really need to wait until that lands in upstream's repo before we add code for it in libvirt. Just my 2c. -- Doug Goldstein

On 01/08/2013 10:15 PM, Doug Goldstein wrote:
On Tue, Jan 8, 2013 at 8:35 PM, liguang <lig.fnst@cn.fujitsu.com> wrote:
qemu-1.3 added machine type q35, changelog said: Added Intel Q35 chipset as a new machine type, '--machine q35'. Adds PCIe support. Requires an updated SeaBIOS (bios.bin), and '-acpitable file=/seabios-path/q35-acpi-dsdt.aml' to run. so add q35 support for libvirt.
src/conf/device_conf.c | 8 +++++++- src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 1 + src/conf/domain_conf.h | 1 + src/qemu/qemu_capabilities.c | 11 +++++++++++ src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 8 +++- 7 files changed, 29 insertions(+), 2 deletions(-)
I'd personally NACK this series for the time being. Per the qemu maintainers, q35 isn't really fully ready until 1.4. They're actively in the process of hashing out the machine type which will be exposed on the command line and via QMP so I think we really need to wait until that lands in upstream's repo before we add code for it in libvirt.
In addition there is the issue that libvirt currently makes assumptions about the bus topology of guests, and assigns pci addresses to guest devices accordingly. Beyond the basic layout of buses present on the machine, certain addresses are reserved/used for certain default devices (e.g. the first video device). The q35 machine type will have a different topology, so the default addresses of these basic devices may (will? haven't looked at the details yet) change, and different addresses will be available for assigning to additional guest pci devices. (This assumption was actually already a problem, because libvirt currently incorrectly assumes the default addresses and bus layout even for non-x86 machinetypes.) To properly solve this problem, libvirt will need a method of learning the topology and default device address assignment for each machine type during capabilities discovery, and honor that information (rather than just making decisions based on assumptions) during guest device assignment. I'm already working with someone from qemu on a design and patches to implement this functionality, so to avoid duplication of effort, you should wait to see what that looks like before you do any more work in this area. (A colleague has also been working on PCI bridge support, which has the potential to conflict with the PCI-e/q35 work if not done in a coordinated fashion, so you may want to take a wait-and-see approach there as well.)

在 2013-01-09三的 21:24 -0500,Laine Stump写道:
On 01/08/2013 10:15 PM, Doug Goldstein wrote:
On Tue, Jan 8, 2013 at 8:35 PM, liguang <lig.fnst@cn.fujitsu.com> wrote:
qemu-1.3 added machine type q35, changelog said: Added Intel Q35 chipset as a new machine type, '--machine q35'. Adds PCIe support. Requires an updated SeaBIOS (bios.bin), and '-acpitable file=/seabios-path/q35-acpi-dsdt.aml' to run. so add q35 support for libvirt.
src/conf/device_conf.c | 8 +++++++- src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 1 + src/conf/domain_conf.h | 1 + src/qemu/qemu_capabilities.c | 11 +++++++++++ src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 8 +++- 7 files changed, 29 insertions(+), 2 deletions(-)
I'd personally NACK this series for the time being. Per the qemu maintainers, q35 isn't really fully ready until 1.4. They're actively in the process of hashing out the machine type which will be exposed on the command line and via QMP so I think we really need to wait until that lands in upstream's repo before we add code for it in libvirt.
In addition there is the issue that libvirt currently makes assumptions about the bus topology of guests, and assigns pci addresses to guest devices accordingly. Beyond the basic layout of buses present on the machine, certain addresses are reserved/used for certain default devices (e.g. the first video device). The q35 machine type will have a different topology, so the default addresses of these basic devices may (will? haven't looked at the details yet) change, and different addresses will be available for assigning to additional guest pci devices. (This assumption was actually already a problem, because libvirt currently incorrectly assumes the default addresses and bus layout even for non-x86 machinetypes.)
To properly solve this problem, libvirt will need a method of learning the topology and default device address assignment for each machine type during capabilities discovery, and honor that information (rather than just making decisions based on assumptions) during guest device assignment.
I'm already working with someone from qemu on a design and patches to implement this functionality, so to avoid duplication of effort, you should wait to see what that looks like before you do any more work in this area. (A colleague has also been working on PCI bridge support, which has the potential to conflict with the PCI-e/q35 work if not done in a coordinated fashion, so you may want to take a wait-and-see approach there as well.)
glad to hear that, I've committed some patches to support pci-bridge, https://www.redhat.com/archives/libvir-list/2013-January/msg00577.html
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
-- regards! li guang

On 01/10/13 03:24, Laine Stump wrote:
In addition there is the issue that libvirt currently makes assumptions about the bus topology of guests, and assigns pci addresses to guest devices accordingly. Beyond the basic layout of buses present on the machine, certain addresses are reserved/used for certain default devices (e.g. the first video device). The q35 machine type will have a different topology, so the default addresses of these basic devices may (will? haven't looked at the details yet) change, and different addresses will be available for assigning to additional guest pci devices. (This assumption was actually already a problem, because libvirt currently incorrectly assumes the default addresses and bus layout even for non-x86 machinetypes.)
To properly solve this problem, libvirt will need a method of learning the topology and default device address assignment for each machine type during capabilities discovery, and honor that information (rather than just making decisions based on assumptions) during guest device assignment.
I'm already working with someone from qemu on a design and patches to implement this functionality, so to avoid duplication of effort, you should wait to see what that looks like before you do any more work in this area. (A colleague has also been working on PCI bridge support, which has the potential to conflict with the PCI-e/q35 work if not done in a coordinated fashion, so you may want to take a wait-and-see approach there as well.)
Both require support for multiple PCI buses and we assume bus = 0 in quite a few places. I was thinking about putting the number of available buses in qemuDomainPCIAddressSet and adding the set to qemuPCIAddressAsString arguments so we can check if the bus is available and convert the functions with int parameters (slot, function) to use virDomainDeviceInfo or virDevicePCIAddress. Any of these changes will affect qemuAssignDevicePCISlots, which is where the default PCI addresses are reserved. I could do it some time this week unless anyone thinks it's a bad idea. Jan

On 01/14/2013 12:32 PM, Ján Tomko wrote:
On 01/10/13 03:24, Laine Stump wrote:
In addition there is the issue that libvirt currently makes assumptions about the bus topology of guests, and assigns pci addresses to guest devices accordingly. Beyond the basic layout of buses present on the machine, certain addresses are reserved/used for certain default devices (e.g. the first video device). The q35 machine type will have a different topology, so the default addresses of these basic devices may (will? haven't looked at the details yet) change, and different addresses will be available for assigning to additional guest pci devices. (This assumption was actually already a problem, because libvirt currently incorrectly assumes the default addresses and bus layout even for non-x86 machinetypes.)
To properly solve this problem, libvirt will need a method of learning the topology and default device address assignment for each machine type during capabilities discovery, and honor that information (rather than just making decisions based on assumptions) during guest device assignment.
I'm already working with someone from qemu on a design and patches to implement this functionality, so to avoid duplication of effort, you should wait to see what that looks like before you do any more work in this area. (A colleague has also been working on PCI bridge support, which has the potential to conflict with the PCI-e/q35 work if not done in a coordinated fashion, so you may want to take a wait-and-see approach there as well.) Both require support for multiple PCI buses and we assume bus = 0 in quite a few places.
I was thinking about putting the number of available buses in qemuDomainPCIAddressSet and adding the set to qemuPCIAddressAsString arguments so we can check if the bus is available and convert the functions with int parameters (slot, function) to use virDomainDeviceInfo or virDevicePCIAddress.
I think much of this functionality needs to go into util/virpci.c so that it can be used by multiple drivers. Anything in util/virpci.c can't use datatypes from the conf directory. Perhaps we need a simple PCI address type that's defined in virpci.h and used by everything else. (I remember looking at this briefly quite a while back, but don't remember the results).
participants (7)
-
Cole Robinson
-
Daniel P. Berrange
-
Doug Goldstein
-
Ján Tomko
-
Laine Stump
-
li guang
-
liguang