
Hi,
	Dan and I have been discussing how to "fix networking" - not just Xen's networking, but also getting something sane wrt. QEMU/KVM etc. Comments very welcome on the writeup below. The libvirt stuff is towards the end, but I think all of it is probably useful to this list.

Cheers,
Mark.

Virtual Networking
==================

The ability to manage virtual machines is something which is receiving a lot of focus right now. Xen, KVM, QEMU and others provide the infrastructure required to run a virtual machine, and each can provide guests with a virtual network interface. This proposal addresses the problem of how guests are networked together. We aim:

  * To make virtual networking "just work". Guests should be able to communicate with each other, their host and the Internet without any fuss or configuration. This should be the case even with laptops and offline machines.

  * To allow greater flexibility in how guests are networked. It should be possible to isolate groups of guests in different networks, allow guests on different physical machines to communicate, firewall guests' networks off from physical networks, or allow guests to appear just like physical machines on physical networks.

  * To make networking virtual machines analogous to networking physical machines.

  * To support inter-networking between virtualisation technologies.

User Visible Concepts
=====================

It's important to consider the manner in which we expose the functionality of virtual networking. What concepts will we be exposing through the UI? Are those concepts well defined and consistent? Are they more complex than necessary? Or are they too simple to support the functionality we want?

Real world, or "physical", concepts[1]:

  * Network - a number of interconnected machines.

  * Network Interface - hardware which enables a machine to connect to a network.

  * Bridge - hardware which enables the interconnection of machines to form a network.
Bridges can also be connected to other bridges to form a larger network.

  * Router - hardware which connects two or more distinct networks, allowing machines on different networks to communicate with one another. Sometimes a router and a bridge are available as a combined piece of hardware - the bridge forms a network and the router connects that network to another distinct network.

  * Firewall - software on a router which can be used to control how machines on an "external" network (e.g. the Internet) can communicate with machines on an "internal" network. For a given type of connection, you can choose to disallow connections of that type or forward them to a specific internal machine. Can also be used to control how internal machines can communicate with external machines.

With virtual networking, we will be exposing the following "virtual" concepts:

  * Virtual Network - a number of interconnected virtual machines.

  * Virtual Network Interface - a network interface in a virtual machine.

  * Virtual Bridge - allows the interconnection of virtual machines to form a virtual network. A virtual bridge may be configured to also act as a virtual router and firewall. A virtual bridge may also be connected to another virtual bridge (perhaps on another physical machine) to create a larger virtual network.

(Note, unprivileged users may create any of the above)

Finally, where the physical world meets the virtual world:

  * Shared Physical Interface - if a physical interface is configured to be "shared", then any number of virtual interfaces may be connected to it, allowing virtual machines to be connected to the same physical network as the physical interface. Only privileged users may configure a physical interface to be shared and/or connect guests to it.

There are a few problems with all of the above:

  1. The distinction between a bridge and a router requires a lot of technical knowledge to fully understand. However, the model of e.g. a LinkSys router is familiar to a lot of people - a box which allows you to network your machines together and connect that network to (and firewall it off from) the Internet.

  2. This "shared physical interface" notion is very "makey upey". We could perhaps talk about the idea in terms of connecting a physical interface to a virtual bridge, but that exposes the bridge vs. router distinction more than we'd like.

  3. Guests are connected to a specific physical interface, whereas perhaps users wish guests to be connected to "the network" - i.e. if NetworkManager switched from wireless to wired while remaining on the same subnet, perhaps we'd like to automatically switch the bridge to the new network. In reality, though, bridged networking is only really sane for machines on a fairly static network connection.

[1] - Yes, these definitions aren't entirely accurate, but they describe the kind of understanding a moderately technical user might have of the concepts.

Example Networks
================

Below are some example networks users may configure and an explanation of how each network would be implemented in practice.

1. A privileged user creates two (Xen) guests, each with a Virtual Network Interface. Without any special networking configuration, these two guests are connected to a default Virtual Network which contains a combined Virtual Bridge/Router/Firewall.

   [ASCII diagram: Guests A and B each have a NIC connected, via vif1.0 and vif2.0, to the bridge "vnbr0" in Dom0, which provides NAT, DNS and DHCP to the virtual network]

   Notes:

     * "vnbr0" is a bridge device with its own IP address on the same subnet as the guests.

     * IP forwarding is enabled in Dom0. Masquerading and DNAT is implemented using iptables.

     * We run a DHCP server and a DNS proxy in Dom0 (e.g. dnsmasq)

2. A privileged user does exactly the same thing as (1), but with QEMU guests.
   [ASCII diagram: Guests A and B each have a QEMU VLAN connected to a VDE switch; the VDE switch connects to the bridge "vnbr0" via the TAP device "vtap0", and vnbr0 provides NAT, DNS and DHCP]

   Notes:

     * VDE is a userspace ethernet bridge implemented using vde_switch

     * "vtap0" is a TAP device created by vde_switch

     * Everything else is the same as (1)

     * This could be done without vde_switch by having Guest A create vtap0 and having Guest B connect directly to Guest A's VLAN. However, if Guest A is shut down, Guest B's network would go down.

3. An unprivileged user does exactly the same thing as (2).

   [ASCII diagram: Guests A and B each have a QEMU VLAN connected to a VDE switch; instead of a TAP device and bridge, a userspace network stack is connected to the VDE switch]

   Notes:

     * Similar to (2), except there can be no TAP device or bridge

     * The userspace network stack is implemented using slirpvde, which provides a DHCP server and DNS proxy to the network, but is also effectively a SNAT and DNAT router.

     * slirpvde implements ethernet, ip, tcp, udp, icmp, dhcp, tftp (etc.) in userspace. Completely crazy, but since the kernel apparently has no secure way to allow unprivileged users to leverage the kernel's network stack for this, it must be done in userspace.

4. Same as (2), except the user also creates two Xen guests.
   [ASCII diagram: Xen guests A and B are connected via vif1.0 and vif2.0 to the bridge "vnbr0"; QEMU guests C and D have VLANs connected to a VDE switch, which connects to vnbr0 via the TAP device "vtap0"]

   Notes:

     * In this case we could do away with VDE and have each QEMU guest use its own TAP device.

5. Same as (3) except Guests A and C are connected to a Shared Physical Interface.

   [ASCII diagram: Xen guest A (via vif1.0) and QEMU guest C (via VDE and the TAP device "vtap1") are connected to the bridge "ebr0", which contains the shared physical interface "eth0"; Xen guest B and QEMU guest D are connected to "vnbr0" with NAT, DNS and DHCP as in the earlier examples]

   Notes:

     * The idea here is that when the admin configures eth0 to be shareable, eth0 is configured as an addressless NIC enslaved to a bridge which has the MAC address and IP address that eth0 should have

     * Again, VDE is redundant here.

6. Same as (2), except the QEMU guests are on a Virtual Network on another physical machine which is, in turn, connected to the Virtual Network on the first physical machine.

   [ASCII diagram: on the first physical machine, Xen guests A and B are connected to "vnbr0", which also has a VDE switch attached via "vtap0"; that VDE switch is connected over the network to a second VDE switch on the second physical machine, to which QEMU guests C and D are connected via their VLANs]

   Notes:

     * What's going on here is that the two VDEs are connected over the network, either via a plain socket or perhaps encapsulated in another protocol like SSH or TLS

One interesting thing to note from all of these examples is that although QEMU's networking options are very interesting, it doesn't actually make sense for a network to be implemented inside a guest. The network needs to be external to any guests, and so we use VDE to offer networking options similar to the ones QEMU provides. All QEMU needs to be able to do is connect to VDE.

User Interface
==============

This isn't meant as a UI specification, just some notes on how this stuff might be exposed in virt-manager.
  * Networks List:
      * Name
      * Virtual/Physical
      * Status
      * Activity/traffic

  * Virtual Network Configuration:
      * Name
      * List of connected guests
      * Allow other Virtual Networks to connect to this (defaults to no)
      * Connect to other Virtual Network (defaults to none)
      * DHCP enabled - DHCP configuration:
          * IP range (optional)
          * Router IP address (optional)
          * Guest IP address/hostname assignment (optional)
      * Forwarding enabled - firewall configuration:
          * Incoming ports list and destination guest+port for each (defaults to empty)
          * Blocked outgoing ports list (defaults to empty)

  * Virtual NICs list:
      * Guest interface name
      * Virtual Network/Shared Physical Interface
      * Hostname (defaults to guest name)
      * IP address (if assigned)
      * MAC address (if assigned)

  * Virtual NIC Configuration:
      * Random MAC address, or user-supplied MAC address.
      * Virtual Network or Shared Physical Interface to connect to.

Implementation
==============

Parity with the current state of networking with Xen will be achieved by:

  * Implementing "shared physical interface" support in Fedora's initscripts and network configuration tool. It boils down to configuring the interface (e.g. eth0) something like:

      ifcfg-peth0:

        DEVICE=peth0
        ONBOOT=yes
        BRIDGE=eth0
        HWADDR=00:30:48:30:73:19

      ifcfg-eth0:

        DEVICE=eth0
        TYPE=Bridge
        ONBOOT=yes
        BOOTPROTO=dhcp

  * Fixing Xen so that netloop is no longer required. Upstream have ideas about how to make Xen automatically copy any frames destined for Dom0, so that the netback driver doesn't run out of shared pages if Dom0 doesn't process the frames quickly enough.

  * Creating new network/vif scripts for Xen which will connect guests to a shared physical interface's bridge.

Virtual Networks will be implemented in libvirt.
First, there will be an XML description of Virtual Networks, e.g.:

  <network id="0">
    <name>Foo</name>
    <uuid>596a5d2171f48fb2e068e2386a5c413e</uuid>
    <listen address="172.31.0.5" port="1234" />
    <connections>
      <connection address="172.31.0.6" port="4321" />
    </connections>
    <dhcp enabled="true">
      <ip address="10.0.0.1" netmask="255.255.255.0" start="10.0.0.128" end="10.0.0.254" />
    </dhcp>
    <forwarding enabled="true">
      <incoming default="deny">
        <allow port="123" domain="foobar" destport="321" />
      </incoming>
      <outgoing default="allow">
        <deny port="25" />
      </outgoing>
    </forwarding>
  </network>

In a manner similar to libvirt's QEMU support, there will be a daemon to manage Virtual Networks. The daemon will have access to a store of network definitions and will be responsible for managing the bridge devices, the vde_switch/dhcp/dns processes and the iptables rules needed for SNAT/DNAT etc.

The virsh command line interface would look like:

  $> virsh network-create foo.xml
  $> virsh network-dumpxml Foo > foo.xml
  $> virsh network-define foo.xml
  $> virsh network-list
  $> virsh network-start Foo
  $> virsh network-stop Foo
  $> virsh network-restart Foo

The libvirt API for virtual networks would be modelled on the API for virtual machines:

  /*
   * Virtual Networks API
   */

  /**
   * virNetwork:
   *
   * a virNetwork is a private structure representing a virtual network.
   */
  typedef struct _virNetwork virNetwork;

  /**
   * virNetworkPtr:
   *
   * a virNetworkPtr is a pointer to a virNetwork private structure; this is
   * the type used to reference a virtual network in the API.
   */
  typedef virNetwork *virNetworkPtr;

  /**
   * virNetworkCreateFlags:
   *
   * Flags OR'ed together to provide specific behaviour when creating a
   * Network.
   */
  typedef enum {
      VIR_NETWORK_NONE = 0
  } virNetworkCreateFlags;

  /*
   * List active networks
   */
  int            virConnectNumOfNetworks        (virConnectPtr conn);
  int            virConnectListNetworks         (virConnectPtr conn,
                                                 int *ids,
                                                 int maxids);

  /*
   * List inactive networks
   */
  int            virConnectNumOfDefinedNetworks (virConnectPtr conn);
  int            virConnectListDefinedNetworks  (virConnectPtr conn,
                                                 const char **names,
                                                 int maxnames);

  /*
   * Lookup network by name, id or uuid
   */
  virNetworkPtr  virNetworkLookupByName         (virConnectPtr conn,
                                                 const char *name);
  virNetworkPtr  virNetworkLookupByID           (virConnectPtr conn,
                                                 int id);
  virNetworkPtr  virNetworkLookupByUUID         (virConnectPtr conn,
                                                 const unsigned char *uuid);
  virNetworkPtr  virNetworkLookupByUUIDString   (virConnectPtr conn,
                                                 const char *uuid);

  /*
   * Create active transient network
   */
  virNetworkPtr  virNetworkCreateXML            (virConnectPtr conn,
                                                 const char *xmlDesc,
                                                 unsigned int flags);

  /*
   * Define inactive persistent network
   */
  virNetworkPtr  virNetworkDefineXML            (virConnectPtr conn,
                                                 const char *xmlDesc);

  /*
   * Delete persistent network
   */
  int            virNetworkUndefine             (virNetworkPtr network);

  /*
   * Activate persistent network
   */
  int            virNetworkCreate               (virNetworkPtr network);

  /*
   * Network destroy/free
   */
  int            virNetworkDestroy              (virNetworkPtr network);
  int            virNetworkFree                 (virNetworkPtr network);

  /*
   * Network information
   */
  const char *   virNetworkGetName              (virNetworkPtr network);
  unsigned int   virNetworkGetID                (virNetworkPtr network);
  int            virNetworkGetUUID              (virNetworkPtr network,
                                                 unsigned char *uuid);
  int            virNetworkGetUUIDString        (virNetworkPtr network,
                                                 char *buf);
  char *         virNetworkGetXMLDesc           (virNetworkPtr network,
                                                 int flags);

Discussion points on the XML format and API:

  * The XML format isn't thought out at all, but briefly:

      * The <listen> and <connections> elements describe networks connected across physical machine boundaries.

      * The <dhcp> element describes the configuration of the DHCP server on the network.
  * The <forwarding> element describes how incoming and outgoing connections are forwarded.

  * Since virConnect is supposed to be a connection to a specific hypervisor, does it make sense to create networks (which should be hypervisor agnostic) through virConnect?

  * Are we needlessly replicating any mistakes from the domains API here? e.g. is the transient vs. persistent distinction useful for networks?

  * Is a UUID useful for networks? Yes, because it distinguishes between networks of the same name on different hosts?

  * Where is the connection between domains and networks in either the API or the XML format? How is a domain associated with a network? Do you put a bridge name in the <network> definition and use that in the domain's <interface> definition? Or do you put the network name in the interface definition and have libvirt look up the bridge name when creating the guest?

  * Should it be possible to stop/start/restart a network? What for? If something breaks, the user restarts it to see if that will fix it?

Hi,
	One thing which is relevant to Dan's authentication stuff ...

On Mon, 2007-01-15 at 20:06 +0000, Mark McLoughlin wrote:
* Since virConnect is supposed to be a connection to a specific hypervisor, does it make sense to create networks (which should be hypervisor agnostic) through virConnect?
Personally, I think virConnect should be little more than a library context through which you access all hypervisors at once. In practical terms, the XML describing a domain is what chooses which hypervisor to connect to - e.g. all apps should pass NULL to virConnectOpen() and all drivers should handle NULL.

The one exception to that is for remote connections. In that case apps should pass a URI for a remote libvirt daemon which, in turn, would be equivalent to calling virConnectOpen(NULL) on the remote host.

So, remotely connecting directly to a hypervisor should be deprecated.

Cheers,
Mark.

On Mon, Jan 15, 2007 at 08:53:43PM +0000, Mark McLoughlin wrote:
Hi, One thing which is relevant to Dan's authentication stuff ...
On Mon, 2007-01-15 at 20:06 +0000, Mark McLoughlin wrote:
* Since virConnect is supposed to be a connection to a specific hypervisor, does it make sense to create networks (which should be hypervisor agnostic) through virConnect?
Personally, I think virConnect should be little more than a library context through which you access all hypervisors at once. In practical terms, the XML describing a domain is what chooses which hypervisor to connect to - e.g. all apps should pass NULL to virConnectOpen() and all drivers should handle NULL.
Having a single virConnectOpen which initializes all backends is not going to fly, because it'll create a huge namespace clash. eg, the names passed to virConnectLookupByName are only unique per-hypervisor connection - it's perfectly valid to have a Xen domain called 'foo' and a QEMU domain called 'foo' on the same machine. Similarly the integer IDs are scoped per hypervisor, and the UUIDs are unique only per-hypervisor, etc, etc. The entire API is modelled on the idea of one virConnectPtr object representing the context of a single hypervisor.

Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=|

On Mon, Jan 15, 2007 at 08:53:43PM +0000, Mark McLoughlin wrote:
Hi, One thing which is relevant to Dan's authentication stuff ...
On Mon, 2007-01-15 at 20:06 +0000, Mark McLoughlin wrote:
* Since virConnect is supposed to be a connection to a specific hypervisor, does it make sense to create networks (which should be hypervisor agnostic) through virConnect?
Personally, I think virConnect should be little more than a library context through which you access all hypervisors at once. In practical terms, the XML describing a domain is what chooses which hypervisor to connect to - e.g. all apps should pass NULL to virConnectOpen() and all drivers should handle NULL.
The one exception to that is for remote connections. In that case apps should pass a URI for a remote libvirt daemon which, in turn, would be equivalent to calling virConnectOpen(NULL) on the remote host.
So, remotely connecting directly to a hypervisor should be deprecated.
Having been kept awake last night thinking about the implications of this, I think your description above could actually work, with a fairly small modification. But first, some pretty pictures:

1. The simple (current) usage of libvirt to connect to a local hypervisor, showing two examples - first how the current Xen backend works, and second how my prototype QEMU backend works:

   http://people.redhat.com/berrange/libvirt/libvirt-arch-local.png

2. The way I was always anticipating remote use of libvirt to work. The app uses libvirt locally, which opens a connection to the remote machine using whatever remote management protocol is relevant for the hypervisor in question. eg, HTTP/XML-RPC for Xen, or the TLS-secured binary format for the prototype QEMU backend.

   http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-1.png

3. The way I think you're suggesting - a libvirt server on every remote host which calls into the regular libvirt internal driver model to proxy remote calls. So even if the hypervisor in question provides a remote network management API, we will always use the local API and do *all* remote networking via the libvirt server:

   http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-2.png

NB, the local case 1 is basically unchanged regardless of which of the two remote architectures we consider.

Option 3 has some interesting properties:

 - For QEMU & UML we essentially already have to write a 'libvirt server', since those two don't have any existing remote management service.

 - The same network transport & authentication system would be used across all hypervisor technologies we support, giving a consistent model.
 - Remote Xen access would be able to bypass XenD in the common case, just like we do for local Xen access.

On the flip-side:

 - We would be using a different remote management API for Xen compared to other apps which might talk Xen-API directly - if people had a mix of apps, some using libvirt & some native Xen-API, they'd have to manage remote access for two services.

So, going back to how this would work...

 - We'd supply URIs describing the hypervisor connection to open to the virConnectOpen() method as usual.

 - If the URI does not contain a hostname, then one (or more) of the regular libvirt drivers would be activated to open a local connection to the HV.

 - If the URI does contain a hostname, then the special 'remote' driver would be activated. This opens a connection to the remote libvirt server on that host, strips the hostname out of the URI, and sends this stripped URI to the libvirt server. This then opens the local hypervisor connection & does pass-through of all remote calls to this connection.

Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=|

Daniel P. Berrange wrote:
On Mon, Jan 15, 2007 at 08:53:43PM +0000, Mark McLoughlin wrote:
Hi, One thing which is relevant to Dan's authentication stuff ...
On Mon, 2007-01-15 at 20:06 +0000, Mark McLoughlin wrote:
* Since virConnect is supposed to be a connection to a specific hypervisor, does it make sense to create networks (which should be hypervisor agnostic) through virConnect?
Personally, I think virConnect should be little more than a library context through which you access all hypervisors at once. In practical terms, the XML describing a domain is what chooses which hypervisor to connect to - e.g. all apps should pass NULL to virConnectOpen() and all drivers should handle NULL.
The one exception to that is for remote connections. In that case apps should pass a URI for a remote libvirt daemon which, in turn, would be equivalent to calling virConnectOpen(NULL) on the remote host.
So, remotely connecting directly to a hypervisor should be deprecated.
Having been kept awake last night thinking about the implications of this, I think your description above could actually work, with a fairly small modification. But first, some pretty pictures:
1. The simple (current) usage of libvirt to connect to a local hypervisor, showing two examples - first how the current Xen backend works, and second how my prototype QEMU backend works:
http://people.redhat.com/berrange/libvirt/libvirt-arch-local.png
2. The way I was always anticipating remote use of libvirt to work. The app uses libvirt locally which opens a connection to the remote machine using whatever remote management protocol is relevant for the hypervisor in question. eg, HTTP/XML-RPC for Xen, or the TLS secured binary format for the prototype QEMU backend.
http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-1.png
3. The way I think you're suggesting - a libvirt server on every remote host which calls into the regular libvirt internal driver model to proxy remote calls. So even if the hypervisor in question provides a remote network management API, we will always use the local API and do *all* remote networking via the libvirt server
http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-2.png
NB, the local case 1, is basically unchanged regardless of which of the two remote architectures we consider.
Option 3 has some interesting properties:
- For QEMU & UML we essentially already have to write a 'libvirt server' since those two don't have any existing remote management service.
- The same network transport & authentication system would be used across all hypervisor technologies we support, giving a consistent model.
- Remote Xen access would be able to bypass XenD in the common case just like we do for the local Xen access
On the flip-side:
- We would be using a different remote management API for Xen compared to other apps which might talk Xen-API directly - if people had a mix of apps some using libvirt & some native Xen-API they'd have to manage remote access for two services.
So, going back to how this would work...
- We'd supply URIs describing the hypervisor connection to open to the virConnectOpen() method as usual
- If the URI does not contain a hostname, then one (or more) of the regular libvirt drivers would be activated to open a local connection to the HV.
- If the URI does contain a hostname, then the special 'remote' driver would be activated. This opens a connection to the remote libvirt server on that host, strips the hostname out of the URI, and sends this stripped URI to the libvirt server. This then opens the local hypervisor connection & does pass-through of all remote calls to this connection.
Dan.
This strikes me as *much* easier to manage, and the most consistent thus far with the idea that libvirt should remain as hypervisor-neutral as possible. --H

Daniel P. Berrange wrote:
http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-2.png
Thought provoking. It makes me wonder - should there be (or is there) a generic way to remote C shared library calls? This sort of thing exists in other languages (eg. Java RMI). Rich. -- Red Hat UK Ltd. 64 Baker Street, London, W1U 7DF Mobile: +44 7866 314 421 (will change soon)

On Tue, Jan 16, 2007 at 04:26:38PM +0000, Richard W.M. Jones wrote:
Daniel P. Berrange wrote:
http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-2.png
Thought provoking.
It makes me wonder - should there be (or is there) a generic way to remote C shared library calls? This sort of thing exists in other languages (eg. Java RMI).
That will be ad-hoc; most likely in this context it would be some kind of XML-RPC, since 1/ we already link to libxml2, 2/ we will need this at some point to use the new Xen API.

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/

On Tue, Jan 16, 2007 at 04:26:38PM +0000, Richard W.M. Jones wrote:
Daniel P. Berrange wrote:
http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-2.png
Thought provoking.
It makes me wonder - should there be (or is there) a generic way to remote C shared library calls? This sort of thing exists in other languages (eg. Java RMI).
The trouble with C is that there's soo many to choose from :-) There's CORBA, XML-RPC, DBus, SunRPC to list but 4. The latter is interesting because you can auto-generate the C stubs for client & server...if only we can find a way to then layer it over SSL ? Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

Daniel P. Berrange wrote:
On Tue, Jan 16, 2007 at 04:26:38PM +0000, Richard W.M. Jones wrote:
Daniel P. Berrange wrote:
http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-2.png
Thought provoking.
It makes me wonder - should there be (or is there) a generic way to remote C shared library calls? This sort of thing exists in other languages (eg. Java RMI).
The trouble with C is that there's soo many to choose from :-) There's CORBA, XML-RPC, DBus, SunRPC to list but 4. The latter is interesting because you can auto-generate the C stubs for client & server...if only we can find a way to then layer it over SSL ?
On the subject of SunRPC, the implementation in glibc in fact does allow you to replace the basic send/recv operations, so it would* be possible to write a back end which talked over (say) gnutls.

On the other hand, XML-RPC has mindshare, has a nice enough C library (xmlrpc-c: http://xmlrpc-c.sourceforge.net/) and supports SSL**.

Rich.

* Or so it seems from looking at the code, but I'd need to write a test client & server to be sure.

** Only checked this on the client side; haven't verified that the server side can support SSL yet.

-- 
Red Hat UK Ltd.
64 Baker Street, London, W1U 7DF
Mobile: +44 7866 314 421 (will change soon)


Hi Dan,
	So, what you describe is similar to what I was suggesting, but the difference from what I was suggesting means that it does nothing for the actual problem :-)

On Tue, 2007-01-16 at 15:57 +0000, Daniel P. Berrange wrote:
On Mon, Jan 15, 2007 at 08:53:43PM +0000, Mark McLoughlin wrote:
On Mon, 2007-01-15 at 20:06 +0000, Mark McLoughlin wrote:
* Since virConnect is supposed to be a connection to a specific hypervisor, does it make sense to create networks (which should be hypervisor agnostic) through virConnect?
Personally, I think virConnect should be little more than a library context through which you access all hypervisors at once. In practical terms, the XML describing a domain is what chooses which hypervisor to connect to - e.g. all apps should pass NULL to virConnectOpen() and all drivers should handle NULL.
The one exception to that is for remote connections. In that case apps should pass a URI for a remote libvirt daemon which, in turn, would be equivalent to calling virConnectOpen(NULL) on the remote host.
So, remotely connecting directly to a hypervisor should be deprecated.
Having been kept awake last night thinking about the implications of this, I think your description above could actually work, with a fairly small modification. But first, some pretty pictures:
1. The simple (current) case of using libvirt to connect to a local hypervisor, showing two examples - first how the current Xen backend works, and second how my prototype QEMU backend works:
http://people.redhat.com/berrange/libvirt/libvirt-arch-local.png
This is actually what I'd like to see change. Here's my train of thought:

- As a user running multiple types of guests, you want to just decide at creation time whether the guest should be e.g. Xen or QEMU. Apart from that, you don't really want to have to think about what type a guest is.

- That implies that users don't want to have different apps for each type of virt, nor different windows, nor different tabs, nor different lists of guests ... if the app doesn't aggregate the guests, then the user will mentally have to aggregate them.

- So, should each app do all the heavy lifting to aggregate virt types or should libvirt? I'd argue that while having a consistent API to access different virt types is useful, it's less useful if the app developer needs to access each hypervisor individually.

- You're rightly concerned about the namespace clash. It's a problem. I really do sympathise. However, should we just punt the problem to the app developers, or worse ... to the users?

- As an example, do you want a situation where someone creates a Xen guest named "Foo", a QEMU guest named "Foo" and when wanting to shutdown the QEMU guest does:

  $> virsh destroy Foo

  rather than:

  $> virsh --connect qemud:///system destroy Foo

  Oops :-)

- Namespace clash #1 is the guest name. I don't think libvirt should allow users to create multiple guests of the same name. It may be technically possible to do that, but if users aggregate the namespace anyway, then it will just cause them confusion if they do.

- Probably the only serious problem with that is that libvirt currently will manage Xen guests not created using libvirt. Does it make sense to do that? Will we allow the same with non-Xen?

- Namespace clash #2 is the ID. These IDs are assigned by libvirt (except for Xen) and should be opaque to the user, so we could split this namespace now. Start QEMU IDs at 1000? Or prefix the integer with "qemu:"?

- Namespace clash #3 is the UUID. This one's kind of funny - one would think we wouldn't need to worry about namespace clashes with "universally unique" IDs :-) We should definitely be trying to prevent the re-use of UUIDs.

- So ... virConnectOpen(NULL) should be the way of obtaining a context for managing all local guests. The argument to virConnectOpen() would only ever be used to specify a remote context.

- The choice between hypervisors is made once and only once, via the domain type in the XML format.

- Your "arch-local" diagram would have a single arrow going into libvirt and multiplexing out to all drivers.

- Or perhaps, libvirt would *always* talk to a daemon ... whether local or remote. That way you don't have the race condition where multiple apps can create a guest of the same name or uuid at once.

Cheers, Mark.
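A toy sketch of the aggregation idea above (all names here are invented for illustration; this is not the libvirt API): a single connection context multiplexes across per-hypervisor drivers and enforces one guest-name namespace across all of them:

```python
# Illustrative only: invented names, not the real libvirt API.

class Driver:
    """One per-hypervisor backend, e.g. Xen or QEMU."""
    def __init__(self, hv_type):
        self.hv_type = hv_type
        self.guests = {}                     # name -> guest record

    def create(self, name):
        self.guests[name] = {"name": name, "hv": self.hv_type}
        return self.guests[name]

class Connection:
    """A virConnectOpen(NULL)-style context that sees every driver at once."""
    def __init__(self, drivers):
        self.drivers = drivers

    def create_guest(self, name, hv_type):
        # Namespace clash #1: refuse duplicate names across *all* drivers,
        # not just within the driver being asked to create the guest.
        for drv in self.drivers:
            if name in drv.guests:
                raise ValueError("guest '%s' already exists on %s"
                                 % (name, drv.hv_type))
        drv = next(d for d in self.drivers if d.hv_type == hv_type)
        return drv.create(name)

    def lookup(self, name):
        # With one namespace, a bare name is unambiguous.
        for drv in self.drivers:
            if name in drv.guests:
                return drv.guests[name]
        return None
```

With this shape, a plain "virsh destroy Foo" needs no --connect URI, because "Foo" can only ever name one guest.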

On Tue, Jan 16, 2007 at 05:21:15PM +0000, Mark McLoughlin wrote:
On Tue, 2007-01-16 at 15:57 +0000, Daniel P. Berrange wrote:
On Mon, Jan 15, 2007 at 08:53:43PM +0000, Mark McLoughlin wrote:
On Mon, 2007-01-15 at 20:06 +0000, Mark McLoughlin wrote:
* Since virConnect is supposed to be a connection to a specific hypervisor, does it make sense to create networks (which should be hypervisor agnostic) through virConnect?
Personally, I think virConnect should be little more than a library context through which you access all hypervisors at once. In practical terms, the XML describing a domain is what chooses which hypervisor to connect to - e.g. all apps should pass NULL to virConnectOpen() and all drivers should handle NULL.
The one exception to that is for remote connections. In that case apps should pass a URI for a remote libvirt daemon which, in turn, would be equivalent to calling virConnectOpen(NULL) on the remote host.
So, remotely connecting directly to a hypervisor should be deprecated.
Having been kept awake last night thinking about the implications of this, I think your description above could actually work, with a fairly small modification. But first, some pretty pictures:
1. The simple (current) case of using libvirt to connect to a local hypervisor, showing two examples - first how the current Xen backend works, and second how my prototype QEMU backend works:
http://people.redhat.com/berrange/libvirt/libvirt-arch-local.png
This is actually what I'd like to see change.
Here's my train of thought:
- As a user running multiple types of guests, you want to just decide at creation time whether the guest should be e.g. Xen or QEMU. Apart from that, you don't really want to have to think about what type a guest is.
This reminds me of something I've not explicitly said elsewhere. While the libvirt API may support multiple different hypervisors, I'm rather expecting that the common case usage will be that any single host will only ever use one particular hypervisor. ie, a host will be providing Xen, or QEMU+KVM or VMWare or XXX - I think it's reasonable to expect that people won't run both Xen and QEMU+KVM on the same host. So, one does not necessarily have to expose the type of guest to the end user - one could say 'give me the hypervisor connection for this host' and it would auto-detect what hypervisor is available for that host. Of course some people might be perverse enough to run many HVs on a host, but I suspect that's rather the niche case. Probably the main case I'd see is that a host is running Xen as its primary HV. For some one-off task, the user may fire up an unprivileged QEMU session for a few hours.
- That implies that users don't want to have different apps for each type of virt, nor different windows, nor different tabs, nor different lists of guests ... if the app doesn't aggregate the guests, then the user will mentally have to aggregate them.
I think we'd probably end up grouping the guests based on the host they are running on. In the common case of only one HV per host, there would be no need for the user to worry about the different types of HV, unless they opted-in to accessing a non-default HV from the host.
- So, should each app do all the heavy lifting to aggregate virt types or should libvirt? I'd argue that while having a consistent API to access different virt types is useful, it's less useful if the app developer needs to access each hypervisor individually.
- You're rightly concerned about the namespace clash. It's a problem. I really do sympathise. However, should we just punt the problem to the app developers, or worse ... to the users?
Well, a combination of all? We have a hierarchy of namespaces currently, from narrowest scope to broadest:

- ID: unique to an active guest on a single HV
- Name: unique to a guest for its lifetime on a single host+HV
- UUID: unique to a guest for its lifetime, across a datacenter
- As an example, do you want a situation where someone creates a Xen guest named "Foo", a QEMU guest named "Foo" and when wanting to shutdown the QEMU guest does:
$> virsh destroy Foo
rather than:
$> virsh --connect qemud:///system destroy Foo
Oops :-)
Indeed - but not particularly different to the case of managing two remote hosts with virsh:

$ virsh --connect xen://web1/ destroy Foo

vs

$ virsh --connect xen://db1/ destroy Foo

And I don't think it's viable to enforce unique naming of guests across the entire data center. There's always a big risk when using command line tools like this - for graphical tools we can do better, because UI cues can help distinguish guests better - eg grouping by host.
- Namespace clash #1 is the guest name. I don't think libvirt should allow users to create multiple guests of the same name. It may be technically possible to do that, but if users aggregate the namespace anyway, then it will just cause them confusion if they do.
- Probably the only serious problem with that is that libvirt currently will manage Xen guests not created using libvirt. Does it make sense to do that? Will we allow the same with non-Xen?
In the ideal world I'd ignore anything not managed by libvirt, but in reality I don't think that's practical. We need to be able to interoperate as cleanly as possible with other tools, either provided by the HV itself (eg xm) or by other 3rd party vendors. While my prototype QEMU backend ignores VMs not created by libvirt, work going on upstream will make it practical to manage them too.
- Namespace clash #2 is the ID. These IDs are assigned by libvirt (except for Xen) and should be opaque to the user, so we could split this namespace now. Start QEMU IDs at 1000? Or prefix the integer with "qemu:"?
It has to be an integer because of existing ABI constraints. I'm not sure we really want to maintain an internal mapping of libvirt IDs to the actual HV's view of IDs. We already provide a globally unique ID - eg the UUID, so having a second one doesn't seem like much use.
- Namespace clash #3 is the UUID. This one's kind of funny - one would think we wouldn't need to worry about namespace clashes with "universally unique" IDs :-) We should definitely be trying to prevent the re-use of UUIDs.
Well, assuming we use a sane generation technique, statistically these are supposed to be globally unique. Of course users make mistakes, for example, when cloning VMs, so XenD will make a reasonable effort to check uniqueness. I'm of the opinion though that libvirt should avoid trying to implement policy itself, and rather delegate that policy to the underlying HV. eg If libvirt had a policy for VM names saying a-Z, 0-9, but XenD instead requires a-Z, 0-9, _, - then we can get into a crazy situation where the user is trying to manage an existing VM, but libvirt incorrectly rejects it.
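The uniqueness check being discussed can be sketched in miniature (purely illustrative; this is not XenD's actual code): generate a fresh UUID when none is supplied, and reject a definition that reuses one already in use on the host:

```python
import uuid

defined = {}   # uuid string -> guest name: the host's view of defined guests

def define_guest(name, guest_uuid=None):
    """Define a guest, generating a UUID if none is given, rejecting reuse."""
    if guest_uuid is None:
        # Version-4 UUIDs are random; collisions are statistically negligible.
        guest_uuid = str(uuid.uuid4())
    if guest_uuid in defined:
        raise ValueError("UUID %s already in use by '%s'"
                         % (guest_uuid, defined[guest_uuid]))
    defined[guest_uuid] = name
    return guest_uuid

# The cloning mistake: copying a guest's UUID along with its disk image.
# A clone must get a fresh UUID, not the original's.
original_uuid = define_guest("web1")
clone_uuid = define_guest("web1-clone")      # fresh uuid4, so no clash
```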
- Or perhaps, libvirt would *always* talk to a daemon ... whether local or remote. That way you don't have the race condition where multiple apps can create a guest of the same name or uuid at once.
Possibly :-) I think I'll draw another diagram... Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Tue, Jan 16, 2007 at 07:09:30PM +0000, Daniel P. Berrange wrote:
On Tue, Jan 16, 2007 at 05:21:15PM +0000, Mark McLoughlin wrote:
- Or perhaps, libvirt would *always* talk to a daemon ... whether local or remote. That way you don't have the race condition where multiple apps can create a guest of the same name or uuid at once.
Possibly :-) I think I'll draw another diagram...
One way is to move the entire driver model out of libvirt and into a daemon, so that libvirt itself is just a very thin layer which marshals API calls onto the wire. So whether local or remote the diagram looks the same: http://people.redhat.com/berrange/libvirt/libvirt-arch-local-1.png http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-3.png Now you might say this will make the Xen stack inefficient, because there will be yet another daemon in the stack. This could certainly be true if the libvirt daemon only ever talked to XenD, but all our performance critical calls go straight to the HV. So when talking to a remote daemon I think libvirt -> libvirt daemon -> HV ought to be faster than libvirt -> XenD -> HV, simply by virtue of not involving python. It would also make it practical to run virt-manager as an unprivileged app even when managing the local Xen instance. So we could remove the need to su to root for the local instance. Regards, Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
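A minimal sketch of the "thin library over a daemon" split (the wire format and function names here are invented, not libvirt's protocol): the library side marshals a call onto a socket, the daemon side dispatches it to a driver, and the same code path serves the local and remote diagrams alike:

```python
import json
import socket
import threading

def serve_one(sock, drivers):
    """Daemon side: receive one marshalled call, dispatch it, reply."""
    call = json.loads(sock.recv(4096).decode())
    result = drivers[call["method"]](*call["args"])
    sock.sendall(json.dumps({"result": result}).encode())

def remote_call(sock, method, *args):
    """Library side: marshal the API call onto the wire, wait for the reply."""
    sock.sendall(json.dumps({"method": method, "args": list(args)}).encode())
    return json.loads(sock.recv(4096).decode())["result"]

# A socketpair stands in for the unix domain socket to a local daemon.
client, server = socket.socketpair()
drivers = {"list_domains": lambda: ["Domain-0", "web1"]}

t = threading.Thread(target=serve_one, args=(server, drivers))
t.start()
domains = remote_call(client, "list_domains")
t.join()
```

The point of the shape is that nothing in remote_call knows or cares whether the socket leads to a local daemon or a remote one.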

Daniel P. Berrange wrote:
On Tue, Jan 16, 2007 at 07:09:30PM +0000, Daniel P. Berrange wrote:
On Tue, Jan 16, 2007 at 05:21:15PM +0000, Mark McLoughlin wrote:
- Or perhaps, libvirt would *always* talk to a daemon ... whether local or remote. That way you don't have the race condition where multiple apps can create a guest of the same name or uuid at once.
Possibly :-) I think I'll draw another diagram...
One way is to move the entire driver model out of libvirt and into a daemon, so that libvirt itself is just a very thin layer which marshals API calls onto the wire. So whether local or remote the diagram looks the same:
http://people.redhat.com/berrange/libvirt/libvirt-arch-local-1.png http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-3.png
This adds an extra daemon in the simplest case (everything running on one machine), so it makes that case harder to manage than it needs to be. The extra daemon might be required to manage all VM instances or perhaps ensure serialisation of requests when there are multiple libvirts, but is that really a requirement? With upstream patches it should be possible for an independent libvirt to enumerate both Xen & QEMU instances. Rich. -- Red Hat UK Ltd. 64 Baker Street, London, W1U 7DF Mobile: +44 7866 314 421 (will change soon)

On Tue, Jan 16, 2007 at 07:35:37PM +0000, Daniel P. Berrange wrote:
On Tue, Jan 16, 2007 at 07:09:30PM +0000, Daniel P. Berrange wrote:
On Tue, Jan 16, 2007 at 05:21:15PM +0000, Mark McLoughlin wrote:
- Or perhaps, libvirt would *always* talk to a daemon ... whether local or remote. That way you don't have the race condition where multiple apps can create a guest of the same name or uuid at once.
Possibly :-) I think I'll draw another diagram...
One way is to move the entire driver model out of libvirt and into a daemon, so that libvirt itself is just a very thin layer which marshals API calls onto the wire. So whether local or remote the diagram looks the same:
http://people.redhat.com/berrange/libvirt/libvirt-arch-local-1.png http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-3.png
Now you might say this will make the Xen stack inefficient, because there will be yet another daemon in the stack. This could certainly be true if the libvirt daemon only ever talked to XenD, but all our performance critical calls go straight to the HV. So when talking to a remote daemon I think libvirt -> libvirt daemon -> HV ought to be faster than libvirt -> XenD -> HV, simply by virtue of not involving python. It would also make it practical to run virt-manager as an unprivileged app even when managing the local Xen instance. So we could remove the need to su to root for the local instance.
Hum, honestly I would *really* prefer to avoid systematically going through an RPC. No, I don't like this idea, I prefer to keep the driver in libvirt, linked in the user's space. Things which were dirt cheap become way more expensive when they don't need to; this is a severe regression from a library user standpoint. Daniel -- Red Hat Virtualization group http://redhat.com/virtualization/ Daniel Veillard | virtualization library http://libvirt.org/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

On Wed, Jan 17, 2007 at 06:50:47AM -0500, Daniel Veillard wrote:
On Tue, Jan 16, 2007 at 07:35:37PM +0000, Daniel P. Berrange wrote:
On Tue, Jan 16, 2007 at 07:09:30PM +0000, Daniel P. Berrange wrote:
On Tue, Jan 16, 2007 at 05:21:15PM +0000, Mark McLoughlin wrote:
- Or perhaps, libvirt would *always* talk to a daemon ... whether local or remote. That way you don't have the race condition where multiple apps can create a guest of the same name or uuid at once.
Possibly :-) I think I'll draw another diagram...
One way is to move the entire driver model out of libvirt and into a daemon, so that libvirt itself is just a very thin layer which marshals API calls onto the wire. So whether local or remote the diagram looks the same:
http://people.redhat.com/berrange/libvirt/libvirt-arch-local-1.png http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-3.png
Now you might say this will make the Xen stack inefficient, because there will be yet another daemon in the stack. This could certainly be true if the libvirt daemon only ever talked to XenD, but all our performance critical calls go straight to the HV. So when talking to a remote daemon I think libvirt -> libvirt daemon -> HV ought to be faster than libvirt -> XenD -> HV, simply by virtue of not involving python. It would also make it practical to run virt-manager as an unprivileged app even when managing
Are you talking about a multi-user OS? :-) It's practical for desktop workstations only.
the local Xen instance. So we could remove the need to su to root for the local instance.
Hum, honestly I would *really* prefer to avoid systematically going through an RPC. No, I don't like this idea, I prefer to keep the driver in libvirt, linked in the user's space. Things which were dirt cheap become way more expensive when they don't need to; this is a severe regression from a library user standpoint.
I'm not sure if the idea is completely wrong. I think a possible advantage is that libvirt will be a pretty simple library and almost all development (on the drivers) will happen in libvirtd. Karel -- Karel Zak <kzak@redhat.com>

On Wed, Jan 17, 2007 at 01:39:13PM +0100, Karel Zak wrote:
On Wed, Jan 17, 2007 at 06:50:47AM -0500, Daniel Veillard wrote:
Hum, honestly I would *really* prefer to avoid systematically going through an RPC. No, I don't like this idea, I prefer to keep the driver in libvirt, linked in the user's space. Things which were dirt cheap become way more expensive when they don't need to; this is a severe regression from a library user standpoint.
I'm not sure if the idea is completely wrong. I think a possible advantage is that libvirt will be a pretty simple library and almost all development (on the drivers) will happen in libvirtd.
An advantage maybe for the developer, but a definite regression for the user, and sorry, the user has priority IMHO. Daniel -- Red Hat Virtualization group http://redhat.com/virtualization/ Daniel Veillard | virtualization library http://libvirt.org/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

On Wed, 2007-01-17 at 06:50 -0500, Daniel Veillard wrote:
Things which were dirt cheap become way more expensive when they don't need to; this is a severe regression from a library user standpoint.
Just a small point on this ... Are you sure that's optimising for the right thing? What libvirt API is so performance sensitive that a roundtrip on a unix domain socket would be a problem? For example, even iterating the list of domain names is going to have a negligible cost compared with loading libvirt from disk :-) However, the number of roundtrips to a management daemon *will* be an issue where the daemon is remote. And we're going to have that whether or not we use a daemon in the local case. i.e. even *if* daemon roundtrips turn out to be an issue for local apps, we're going to have to fix that for the remote case anyway. Cheers, Mark.
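Mark's point is easy to check in miniature. This sketch (illustrative, not a libvirt benchmark) times a lock-step request/reply loop over a local socket pair; the per-roundtrip cost it reports is typically on the order of microseconds:

```python
import socket
import threading
import time

def echo_server(sock, n):
    """Bounce n small messages straight back, like a no-op RPC."""
    for _ in range(n):
        sock.sendall(sock.recv(64))

N = 1000
client, server = socket.socketpair()
t = threading.Thread(target=echo_server, args=(server, N))
t.start()

start = time.time()
for _ in range(N):
    client.sendall(b"list_domains")
    reply = client.recv(64)      # small message, arrives in one recv
elapsed = time.time() - start
t.join()

per_call = elapsed / N           # cost of one local socket roundtrip
```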

Mark McLoughlin wrote:
On Wed, 2007-01-17 at 06:50 -0500, Daniel Veillard wrote:
Things which were dirt cheap become way more expensive when they don't need to; this is a severe regression from a library user standpoint.
Just a small point on this ...
Are you sure that's optimising for the right thing? What libvirt API is so performance sensitive that a roundtrip on a unix domain socket would be a problem?
For example, even iterating the list of domain names is going to have a negligible cost compared with loading libvirt from disk :-)
However, the number of roundtrips to a management daemon *will* be an issue where the daemon is remote. And we're going to have that whether we use a daemon in the local case.
i.e. even *if* daemon roundtrips turn out to be an issue for local apps, we're going to have to fix that for the remote case anyway.
I am just speculating here, but it seems to me that the remote case is going to be the more common one for most users who also care about performance. So agreeing with Mark, that is where we really need to be focusing our attention... --Hugh

On Wed, Jan 17, 2007 at 04:37:56PM +0000, Mark McLoughlin wrote:
On Wed, 2007-01-17 at 06:50 -0500, Daniel Veillard wrote:
Things which were dirt cheap become way more expensive when they don't need to; this is a severe regression from a library user standpoint.
Just a small point on this ...
Are you sure that's optimising for the right thing? What libvirt API is so performance sensitive that a roundtrip on a unix domain socket would be a problem?
We have 3 cases currently:

- Local privileged user - we use hypercalls for performance critical ops
- Local unprivileged user readonly - we use libvirt_proxy over unix socket
- Local unprivileged user readwrite - talk insecurely to XenD

Now we know from bitter experience that talking to XenD is incredibly slow / high overhead. IIRC something like 50x slower. What I've never benchmarked is how well the libvirt_proxy performs. So currently the only time we can use the very fast pure hypercall path is when the app in question is running as root on the local machine. I'd really like to stop running virt-manager as root, to be honest. If we can get a local daemon providing full read-write operation, without the horrific overhead of XenD, then really the direct root+hypercall path is fairly uninteresting.
For example, even iterating the list of domain names is going to have a negligible cost compared with loading libvirt from disk :-)
Well, loading libvirt from disk is irrelevant because it'll be cached after the very first load.
However, the number of roundtrips to a management daemon *will* be an issue where the daemon is remote. And we're going to have that whether we use a daemon in the local case.
Yes, for remote management we will always have overhead. The key is whether using the same RPC daemon for local management will also have unacceptable overhead.
i.e. even *if* daemon roundtrips turn out to be an issue for local apps, we're going to have to fix that for the remote case anyway.
I'm going to run some tests/benchmarks to see what kind of performance difference there is between root+direct hypercalls, and talking via the libvirt_proxy, and via XenD. This should give us a good basis for deciding whether the root+direct hypercall case has a compelling enough performance advantage to worry about. Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Wed, Jan 17, 2007 at 04:53:38PM +0000, Daniel P. Berrange wrote:
On Wed, Jan 17, 2007 at 04:37:56PM +0000, Mark McLoughlin wrote:
On Wed, 2007-01-17 at 06:50 -0500, Daniel Veillard wrote:
Things which were dirt cheap become way more expensive when they don't need to; this is a severe regression from a library user standpoint.
Just a small point on this ...
Are you sure that's optimising for the right thing? What libvirt API is so performance sensitive that a roundtrip on a unix domain socket would be a problem?
We have 3 cases currently:
- Local privileged user - we use hypercalls for performance critical ops
- Local unprivileged user readonly - we use libvirt_proxy over unix socket
- Local unprivileged user readwrite - talk insecurely to XenD
Now we know from bitter experience that talking to XenD is incredibly slow / high overhead. IIRC something like 50x slower. What I've never benchmarked is how well the libvirt_proxy performs.
So currently the only time we can use the very fast pure hypercall path is when the app in question is running as root on local machine. I'd really like to stop running virt-manager as root to be honest. If we can get a local daemon providing full read-write operation, without the horrific overhead of XenD, then really the direct root+hypercall path is fairly uninteresting.
Okay, that's true, we lack data for the most critical paths, *but* I expect people will build load-acquisition daemons using libvirt, and if we make this way slower with no way to get back to the previous behaviour, they will be disappointed.
For example, even iterating the list of domain names is going to have a negligible cost compared with loading libvirt from disk :-)
Well, loading libvirt from disk is irrelevant because it'll be cached after the very first load.
However, the number of roundtrips to a management daemon *will* be an issue where the daemon is remote. And we're going to have that whether we use a daemon in the local case.
Yes, for remote management we will always have overhead. The key is whether using the same RPC daemon for local management will also have unacceptable overhead.
Agreed, that's a good point - the local daemon will have to keep the list, which it can do cheaply. But I'm not sure that what we have in the current driver API is really the right set of operations for the RPCs.
i.e. even *if* daemon roundtrips turn out to be an issue for local apps, we're going to have to fix that for the remote case anyway.
I'm going to run some tests/benchmarks to see what kind of performance difference there is between root+direct hypercalls, and talking via the libvirt_proxy, and via XenD. This should give us a good basis for deciding whether the root+direct hypercall case has a compelling enough performance advantage to worry about.
Timing data will definitely help, but I hope we will also get feedback from other people who have built on top of libvirt. Daniel -- Red Hat Virtualization group http://redhat.com/virtualization/ Daniel Veillard | virtualization library http://libvirt.org/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

On Tue, 2007-01-16 at 19:09 +0000, Daniel P. Berrange wrote:
This reminds me of something I've not explicitly said elsewhere. While the libvirt API may support multiple different hypervisors, I'm rather expecting that the common case usage will be that any single host will only ever use one particular hypervisor. ie, a host will be providing Xen, or QEMU+KVM or VMWare or XXX - I think it's reasonable to expect that people won't run both Xen and QEMU+KVM on the same host.
So, one does not necessarily have to expose the type of guest to the end user - one could say 'give me the hypervisor connection for this host' and it would auto-detect what hypervisor is available for that host.
e.g. a user might have the following options:

1) Run virt-manager, enter the root password, manage Xen guests
2) Run virt-manager, choose "run unprivileged", manage QEMU guests
3) Run virt-manager on the command line as root with --qemu

If that *is* the only way we'd expose QEMU support, then libvirt's API is just fine in this respect. I had assumed we'd have "run under Xen" and "run under QEMU" in the "create new guest" dialog. But that's a stupid question to ask someone ... one should not need to care what type of VM it is. So, hurrah! virt-manager will be sane in this respect and only the not-so-sane app authors will get pissed with libvirt about this :-) Cheers, Mark.

On Wed, Jan 17, 2007 at 05:20:17PM +0000, Mark McLoughlin wrote:
On Tue, 2007-01-16 at 19:09 +0000, Daniel P. Berrange wrote:
This reminds me of something I've not explicitly said elsewhere. While the libvirt API may support multiple different hypervisors, I'm rather expecting that the common case usage will be that any single host will only ever use one particular hypervisor. ie, a host will be providing Xen, or QEMU+KVM or VMWare or XXX - I think it's reasonable to expect that people won't run both Xen and QEMU+KVM on the same host.
So, one does not necessarily have to expose the type of guest to the end user - one could say 'give me the hypervisor connection for this host' and it would auto-detect what hypervisor is available for that host.
e.g. a user might have the following options:
1) Run virt-manager, enter the root password, manage Xen guests
BTW this option is evil and I intend to kill it off as soon as possible. We only have it currently because there is no secure way to manage Xen from an unprivileged context. Soon it will be possible to manage Xen as an unprivileged user whether local or remote using virt-manager.
2) Run virt-manager, choose "run unprivileged", manage QEMU guests
3) Run virt-manager on the command line as root with --qemu
If that *is* the only way we'd expose QEMU support, then libvirt's API is just fine in this respect.
Not entirely sure how I'll integrate it in virt-manager. For dev/testing purposes I just did a lame hack to the initial connect dialog to choose Xen vs QEMU, but I don't want that long term, because really users shouldn't have to make that kind of decision themselves.
I had assumed we'd have "run under Xen" and "run under QEMU" in the "create new guest" dialog. But that's a stupid question to ask someone ... one should not need to care what type of VM it is.
The only change to the 'create new guest' wizard thus far is the ability to select CPU architecture (since QEMU allows sparc, mips, x86, ppc etc).
So, hurrah! virt-manager will be sane in this respect and only the not-so-sane app authors will get pissed with libvirt about this :-)
I certainly hope it will be sane :-) Regards, Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Tue, 2007-01-16 at 19:09 +0000, Daniel P. Berrange wrote:
- Probably the only serious problem with that is that libvirt currently will manage Xen guests not created using libvirt. Does it make sense to do that? Will we allow the same with non-Xen?
In the ideal world I'd ignore anything not managed by libvirt, but in reality I don't think that's practical. We need to be able to interoperate as cleanly as possible with other tools, either provided by the HV itself (eg xm) or by other 3rd party vendors.
Personally, I'd just be a hard-ass about it :-)
While my prototype QEMU backend ignores VMs not created by libvirt, work going on upstream will make it practical to manage them too.
I like the way the QEMU backend works right now ...

I guess one way of looking at it is to ask whether libvirt is:

a) A common API for accessing various virtualisation management infrastructures[1]

or

b) A common virtualisation management infrastructure

The Xen support suggests (a), the QEMU support suggests (b). But it's pretty clear the consensus is that libvirt should avoid being (b) where it can.

(a), compared to (b), makes me rather queasy because you have to not only map from libvirt's model to each hypervisor's model, you also have to map each hypervisor's model back to libvirt's model. Which, I guess, suggests that libvirt's model needs to be a superset of all hypervisors' models, rather than a subset.

But fair enough, I just got excited ... libvirt is (a), not (b), and I have to deal with that :-)

Cheers, Mark.

[1] - By "virtualisation management infrastructure", I mean e.g. whatever knows what guests exist (active or not) and can start and stop them.

On Wed, Jan 17, 2007 at 05:35:37PM +0000, Mark McLoughlin wrote:
On Tue, 2007-01-16 at 19:09 +0000, Daniel P. Berrange wrote:
- Probably the only serious problem with that is that libvirt currently will manage Xen guests not created using libvirt. Does it make sense to do that? Will we allow the same with non-Xen?
In the ideal world I'd ignore anything not managed by libvirt, but in reality I don't think that's practical. We need to be able to interoperate as cleanly as possible with other tools, either provided by the HV itself (eg xm) or by other 3rd party vendors.
Personally, I'd just be a hard-ass about it :-)
While my prototype QEMU backend ignores VMs not created by libvirt, work going on upstream will make it practical to manage them too.
I like the way the QEMU backend works right now ...
I guess one way of looking at it is to ask whether libvirt is:
a) A common API for accessing various virtualisation management infrastructures[1]
or
b) A common virtualisation management infrastructure
The Xen support suggests (a), the QEMU support suggests (b). But it's pretty clear the consensus is that libvirt should avoid being (b) where it can.
(a), compared to (b), makes me rather queasy because you have to not only map from libvirt's model to each hypervisor's model, you also have to map each hypervisor's model back to libvirt's model. Which, I guess, suggests that libvirt's model needs to be a superset of all hypervisors' models, rather than a subset.
Turning libvirt into a superset of all HVs is a rather scary concept because it might entail the API getting absolutely enormous. Not only that, but you'd lose some of the isolation that libvirt gives you from the backend. ie if we added a bunch of APIs that only make sense for Xen, then it becomes much harder to get apps working with the non-Xen backends. So we have actually been trying to keep libvirt closer to a subset, rather than a superset.

We are applying some flexibility to that rule though - we're fine adding new APIs that only work on Xen for now - provided they're at least conceptually relevant for other hypervisors. As an example, we have a setMemory API to change a guest's memory on the fly, even though Xen is the only HV which currently lets you do that. It's reasonable to assume that others will support that in the future, possibly through memory hotplug.

When doing this we're also making sure we don't expose Xen specific formats - eg for the block device hotplug add / remove, we don't use the Xen 'block device ID number' to add / remove devices, we use the generic device description XML blob.

Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

Daniel P. Berrange wrote: [Tue Jan 16 2007, 10:57:03AM EST]
2. The way I was always anticipating remote use of libvirt to work. The app uses libvirt locally which opens a connection to the remote machine using whatever remote management protocol is relevant for the hypervisor in question. eg, HTTP/XML-RPC for Xen, or the TLS secured binary format for the prototype QEMU backend.
http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-1.png
So this works to manage a remote host that might not have libvirt installed...
3. The way I think you're suggesting - a libvirt server on every remote host which calls into the regular libvirt internal driver model to proxy remote calls. So even if the hypervisor in question provides a remote network management API, we will always use the local API and do *all* remote networking via the libvirt server
http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-2.png
...and this requires each managed host to have libvirt(d).

This is considered a reasonable requirement?

Aron

On Tue, Jan 16, 2007 at 04:19:37PM -0500, Aron Griffis wrote:
Daniel P. Berrange wrote: [Tue Jan 16 2007, 10:57:03AM EST]
2. The way I was always anticipating remote use of libvirt to work. The app uses libvirt locally which opens a connection to the remote machine using whatever remote management protocol is relevant for the hypervisor in question. eg, HTTP/XML-RPC for Xen, or the TLS secured binary format for the prototype QEMU backend.
http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-1.png
So this works to manage a remote host that might not have libvirt installed...
Provided the host in question provides a secure remote management system. Until Xen-API is supported, even Xen doesn't have a useful remote management system since its current APIs have zero auth. Other non-Xen virt products don't have any remote management.
3. The way I think you're suggesting - a libvirt server on every remote host which calls into the regular libvirt internal driver model to proxy remote calls. So even if the hypervisor in question provides a remote network management API, we will always use the local API and do *all* remote networking via the libvirt server
http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-2.png
...and this requires each managed host to have libvirt(d).
This is considered a reasonable requirement?
Personally it doesn't worry me too much - by all means though, I'm open to arguments against it too.....

The way I currently look at the problem, needing to deploy a small C based management daemon (merely linked to an SSL library for secure comms) isn't very onerous in comparison to the enormous pile of python code Xen already requires. For non-Xen backends we'll definitely need a daemon of some form, since QEMU / KVM / UML / etc don't have any management daemon at all. For administrators there's a certain benefit to only having to worry about opening up one daemon to the public network regardless of which virt system is in use.

But then maybe we actually need to support both remote management models? Would a requirement for a libvirtd be a problem for your use cases?

Regards,
Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

Daniel P. Berrange wrote: [Tue Jan 16 2007, 04:54:49PM EST]
On Tue, Jan 16, 2007 at 04:19:37PM -0500, Aron Griffis wrote:
Daniel P. Berrange wrote: [Tue Jan 16 2007, 10:57:03AM EST]
2. The way I was always anticipating remote use of libvirt to work. The app uses libvirt locally which opens a connection to the remote machine using whatever remote management protocol is relevant for the hypervisor in question. eg, HTTP/XML-RPC for Xen, or the TLS secured binary format for the prototype QEMU backend.
http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-1.png
So this works to manage a remote host that might not have libvirt installed...
Provided the host in question provides a secure remote management system. Until Xen-API is supported, even Xen doesn't have a useful remote management system since its current APIs have zero auth. Other non-Xen virt products don't have any remote management.
*nod*
3. The way I think you're suggesting - a libvirt server on every remote host which calls into the regular libvirt internal driver model to proxy remote calls. So even if the hypervisor in question provides a remote network management API, we will always use the local API and do *all* remote networking via the libvirt server
http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-2.png
...and this requires each managed host to have libvirt(d).
This is considered a reasonable requirement?
Personally it doesn't worry me too much - by all means though, I'm open to arguments against it too.....
The way I currently look at the problem, needing to deploy a small C based management daemon (merely linked to an SSL library for secure comms) isn't very onerous in comparison to the enormous pile of python code Xen already requires. For non-Xen backends we'll definitely need a daemon of some form, since QEMU / KVM / UML / etc don't have any management daemon at all. For administrators there's a certain benefit to only having to worry about opening up one daemon to the public network regardless of which virt system is in use.
What's the gap (if any) between libvirtd and xend capabilities? i.e. could libvirtd eventually allow dom0 to omit the python-based xen mgmt stack to shrink dom0 to a significantly thinner OS instance?
But then maybe we actually need to support both remote management models ?
IMHO unnecessary complexity.
Would a requirement for a libvirtd be a problem for your use cases ?
I don't think so. Mostly I was verifying my understanding, and that the requirement was considered. Thanks, Aron

On Tue, Jan 16, 2007 at 05:16:54PM -0500, Aron Griffis wrote:
Daniel P. Berrange wrote: [Tue Jan 16 2007, 04:54:49PM EST]
3. The way I think you're suggesting - a libvirt server on every remote host which calls into the regular libvirt internal driver model to proxy remote calls. So even if the hypervisor in question provides a remote network management API, we will always use the local API and do *all* remote networking via the libvirt server
http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-2.png
...and this requires each managed host to have libvirt(d).
This is considered a reasonable requirement?
Personally it doesn't worry me too much - by all means though, I'm open to arguments against it too.....
The way I currently look at the problem, needing to deploy a small C based management daemon (merely linked to an SSL library for secure comms) isn't very onerous in comparison to the enormous pile of python code Xen already requires. For non-Xen backends we'll definitely need a daemon of some form, since QEMU / KVM / UML / etc don't have any management daemon at all. For administrators there's a certain benefit to only having to worry about opening up one daemon to the public network regardless of which virt system is in use.
What's the gap (if any) between libvirtd and xend capabilities? i.e. could libvirtd eventually allow dom0 to omit the python-based xen mgmt stack to shrink dom0 to a significantly thinner OS instance?
By far the most significant thing XenD does for us is the initial guest creation work. Constructing the page tables, populating xenstore, setting up the virtual device backends, etc. There's no reason this could not be replicated in a libvirtd - the real low level bits are isolated in libxc - but I think it'd be really quite a lot of work. Then there's a bunch of other bits like save/restore & migration to deal with. So possible, but not anywhere on the short-to-medium term development radar.

I agree though in principle it would be nice to slim down the dom0 management stack, preferably being able to eliminate the python runtime altogether.

Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Tue, Jan 16, 2007 at 11:24:52PM +0000, Daniel P. Berrange wrote:
On Tue, Jan 16, 2007 at 05:16:54PM -0500, Aron Griffis wrote:
Daniel P. Berrange wrote: [Tue Jan 16 2007, 04:54:49PM EST]
3. The way I think you're suggesting - a libvirt server on every remote host which calls into the regular libvirt internal driver model to proxy remote calls. So even if the hypervisor in question provides a remote network management API, we will always use the local API and do *all* remote networking via the libvirt server
http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-2.png
...and this requires each managed host to have libvirt(d).
This is considered a reasonable requirement?
Personally it doesn't worry me too much - by all means though, I'm open to arguments against it too.....
The way I currently look at the problem, needing to deploy a small C based management daemon (merely linked to an SSL library for secure comms) isn't very onerous in comparison to the enormous pile of python code Xen already requires. For non-Xen backends we'll definitely need a daemon of some form, since QEMU / KVM / UML / etc don't have any management daemon at all. For administrators there's a certain benefit to only having to worry about opening up one daemon to the public network regardless of which virt system is in use.
What's the gap (if any) between libvirtd and xend capabilities? i.e. could libvirtd eventually allow dom0 to omit the python-based xen mgmt stack to shrink dom0 to a significantly thinner OS instance?
By far the most significant thing XenD does for us is the initial guest creation work. Constructing the page tables, populating xenstore, setting up the virtual device backends, etc. There's no reason this could not be replicated in a libvirtd - the real low level bits are isolated in libxc - but I think it'd be really quite a lot of work. Then there's a bunch of other bits like save/restore & migration to deal with. So possible, but not anywhere on the short-to-medium term development radar.

I agree though in principle it would be nice to slim down the dom0 management stack, preferably being able to eliminate the python runtime altogether.
The isolation is at the licence level too: libxc is GPL'ed, not LGPL'ed, and it seems trying to change the licence now would be very hard. By exporting an RPC API, the xend daemon allows access to the low level. Now if libvirt were always to use a daemon linking to libxc then we would be in a similar situation without needing xend for those. Not that I'm suggesting we push for it from a technical point of view, but this is another aspect of the relationship between the different pieces of code.

Daniel
-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard | virtualization library http://libvirt.org/
veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

Daniel P. Berrange wrote:
For non-Xen backends we'll definitely need a daemon of some form, since QEMU / KVM / UML / etc don't have any management daemon at all.
Is this true? Just examining assumptions ...

Rich.
-- 
Red Hat UK Ltd.
64 Baker Street, London, W1U 7DF
Mobile: +44 7866 314 421 (will change soon)

On Mon, Jan 15, 2007 at 08:06:18PM +0000, Mark McLoughlin wrote:
Hi, Dan and I have been discussing how to "fix networking", not just Xen's networking but also getting something sane wrt. QEMU/KVM etc.
Comments very welcome on the writeup below. The libvirt stuff is towards the end, but I think all of it is probably useful to this list.
Since we've disappeared down a rat-hole with the other part of the thread, here's an attempt to get back on-topic :-)
1. A privileged user creates two (Xen) guests, each with a Virtual Network Interface. Without any special networking configuration, these two guests are connected to a default Virtual Network which contains a combined Virtual Bridge/Router/Firewall.
+-----------+ D +-----------+ | Guest | N D H | Guest | | A | A N C | B | | +---+ | T S P | +---+ | | |NIC| | ^ ^ ^ | |NIC| | +---+-+-+---+ +---+---+ +---+-+-+---+ ^ | ^ | +--------+ +---+---+ +--------+ | +-->+ vif1.0 +----+ vnbr0 +----+ vif2.0 +<--+ +--------+ +-------+ +--------+
Notes:
* "vnbr0" is a bridge device with it's own IP address on the same subnet as the guests. * IP forwarding is enabled in Dom0. Masquerading and DNAT is implemented using iptables. * We run a DHCP server and a DNS proxy in Dom0 (e.g. dnsmasq) 2. A privileged user does exactly the same thing as (1), but with QEMU guests.
D N D H A N C T S P ^ ^ ^ +---+---+ | +---+---+ +-----------+ | vnbr0 | +-----------+ | Guest | +---+---+ | Guest | | A | | | B | | +---+ | +---+---+ | +---+ | | |NIC| | | vtap0 | | |NIC| | +---+-+-+---+ +---+---+ +---+-+-+---+ ^ +-------+ | +-------+ ^ | | | +---+---+ | | | +------>+ VLAN0 +-+ VDE +-+ VLAN0 +<------+ | | +-------+ | | +-------+ +-------+
Notes:
* VDE is a userspace ethernet bridge implemented using vde_switch
* "vtap0" is a TAP device created by vde_switch
* Everything else is the same as (1)
* This could be done without vde_switch by having Guest A create vtap0 and have Guest B connect directly to Guest A's VLAN. However, if Guest A is shut down, Guest B's network would go down.
Since the user is privileged, another way to do without VDE is to mirror the Xen case almost exactly, creating one tap device per guest, instead of Xen's netback vif devices: +-----------+ D +-----------+ | Guest | N D H | Guest | | A | A N C | B | | +---+ | T S P | +---+ | | |NIC| | ^ ^ ^ | |NIC| | +---+-+-+---+ +---+---+ +---+-+-+---+ ^ | ^ | +--------+ +---+---+ +--------+ | +-->+ vtap0 +----+ vnbr0 +----+ vtap1 +<--+ +--------+ +-------+ +--------+
3. An unprivileged user does exactly the same thing as (2).
+-----------+ +-----------+ | Guest | +----+----+ | Guest | | A | |userspace| | B | | +---+ | | network | | +---+ | | |NIC| | | stack | | |NIC| | +---+-+-+---+ +----+----+ +---+-+-+---+ ^ +-------+ | +-------+ ^ | | | +---+---+ | | | +------>+ VLAN0 +-+ VDE +-+ VLAN0 +<------+ | | +-------+ | | +-------+ +-------+
Notes:
* Similar to (2) except there can be no TAP device or bridge
* The userspace network stack is implemented using slirpvde to provide a DHCP server and DNS proxy to the network, but also effectively a SNAT and DNAT router.
* slirpvde implements ethernet, ip, tcp, udp, icmp, dhcp, tftp (etc.) in userspace. Completely crazy, but since the kernel apparently has no secure way to allow unprivileged users to leverage the kernel's network stack for this, it must be done in userspace.
Is it practical to just have some kind of privileged proxy that would merely create & configure the tap devices on behalf of the unprivileged guests? If we just create tap devices for any unprivileged guest, but kept them disconnected from any real network device, would that still be a big hole?

Or can we leverage QEMU's builtin SLIRP or other non-TAP networking modes to construct something reasonable in userspace, without using VDE?
4. Same as (2), except the user also creates two Xen guests.
+-----------+ D +-----------+ | Guest | N D H | Guest | | A | A N C | B | | +---+ | T S P | +---+ | | |NIC| | ^ ^ ^ | |NIC| | +---+-+-+---+ +---+---+ +---+-+-+---+ ^ | ^ | +--------+ +---+---+ +--------+ | +-->+ vif1.0 +----+ vnbr0 +----+ vif2.0 +<--+ +--------+ +---+---+ +--------+ | +---+---+ | vtap0 | +---+---+ | +-------+ +--+--+ +-------+ +---->+ VLAN0 +----+ VDE +---+ VLAN0 +<-----+ | +-------+ +-----+ +-------+ | V V +---+-+-+---+ +---+-+-+---+ | |NIC| | | |NIC| | | +---+ | | +---+ | | Guest | | Guest | | C | | D | +-----------+ +-----------+
Notes:
* In this case we could do away with VDE and have each QEMU guest use its own TAP device.
Yep, that would make sense if the guests were privileged - best to stay close to kernel networking devices if at all possible.
5. Same as (3) except Guests A and C are connected to a Shared Physical Interface.
+-----------+ | D +-----------+ | Guest | ^ | N D H | Guest | | A | | | A N C | B | | +---+ | +---+---+ | T S P | +---+ | | |NIC| | | eth0 | | ^ ^ ^ | |NIC| | +---+-+-+---+ +---+---+ | +---+---+ +---+-+-+---+ ^ | | | ^ | +--------+ +---+---+ | +---+---+ +--------+ | +>+ vif1.0 +-+ ebr0 + | + vnbr0 +-+ vif2.0 +<-+ +--------+ +---+---+ | +---+---+ +--------+ | | | +---+---+ | +---+---+ | vtap1 | | | vtap0 | +---+---+ | +---+---+ | | | +-------+ +--+--+ | +--+--+ +-------+ +->+ VLAN0 +--+ VDE + | + VDE +--+ VLAN0 +<-+ | +-------+ +-----+ | +-----+ +-------+ | V | V +---+-+-+---+ | +---+-+-+---+ | |NIC| | | | |NIC| | | +---+ | | | +---+ | | Guest | | | Guest | | C | | | D | +-----------+ | +-----------+
Notes:
* The idea here is that when the admin configures eth0 to be shareable, eth0 is configured as an addressless NIC enslaved to a bridge which has the MAC address and IP address that eth0 should have
* Again, VDE is redundant here.
This diagram just scares me, but I guess its merely showing two isolated networks with a different set of guests on each. Probably be much less scary if not ascii-art..
6. Same as (2) except the QEMU guests are on a Virtual Network on another physical machine which is, in turn, connected to the Virtual Network on the first physical machine
+-----------+ D +-----------+ | Guest | N D H | Guest | | A | A N C | B | | +---+ | T S P | +---+ | | |NIC| | ^ ^ ^ | |NIC| | +---+-+-+---+ +---+---+ +---+-+-+---+ ^ | ^ | +--------+ +---+---+ +--------+ | +-->+ vif1.0 +----+ vnbr0 +----+ vif2.0 +<--+ +--------+ +---+---+ +--------+ | +---+---+ | vtap0 | +---+---+ | +--+--+ | VDE | +--+--+ | First Physical Machine V ------------------------------------------------------------- Second Physical Machine ^ | +-------+ +--+--+ +-------+ +---->+ VLAN0 +----+ VDE +---+ VLAN0 +<-----+ | +-------+ +-----+ +-------+ | V V +---+-+-+---+ +---+-+-+---+ | |NIC| | | |NIC| | | +---+ | | +---+ | | Guest | | Guest | | C | | D | +-----------+ +-----------+
Notes:
* What's going on here is that the two VDEs are connected over the network, either via a plain socket or perhaps encapsulated in another protocol like SSH or TLS
This is the case where I always thought VDE did get interesting - being able to create pure userspace virtual networks across machines, without any root privileges. Gives joe-user a nice lot of power
Virtual Networks will be implemented in libvirt. First, there will be an XML description of Virtual Networks e.g.:
<network id="0"> <name>Foo</name> <uuid>596a5d2171f48fb2e068e2386a5c413e</uuid> <listen address="172.31.0.5" port="1234" /> <connections> <connection address="172.31.0.6" port="4321" /> </conections> <dhcp enabled="true"> <ip address="10.0.0.1" netmask="255.255.255.0" start="10.0.0.128" end="10.0.0.254" /> </dhcp> <forwarding enabled="true"> <incoming default="deny"> <allow port="123" domain="foobar" destport="321" /> </incoming> <outgoing default="allow"> <deny port="25" /> </outgoing> </forwarding> <network>
Got to also think how we connect guest domains to the virtual network. Currently we just have something really simple like

<interface type="bridge">
  <source bridge='xenbr0'/>
  <mac address='00:11:22:33:44:55'/>
</interface>

I guess we'd probably want to refer to the UUID of the network to map it into the guest.

Oh, do we want to define a 'network 0' to be the physical network of the host machine - what if there are multiple host NICs - any conventions we need to let us distinguish? Maybe it's best to just refer to the host network by using IP addresses - so we can deal better with the case where a machine switches from eth0 -> eth1 (wired to wireless) but keeps the same IP address, or some such.
* The XML format isn't thought out at all, but briefly:
  * The <listen> and <connections> elements describe networks connected across physical machine boundaries.
  * The <dhcp> element describes the configuration of the DHCP server on the network.
  * The <forwarding> element describes how incoming and outgoing connections are forwarded.
Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Tue, 2007-01-16 at 22:28 +0000, Daniel P. Berrange wrote:
On Mon, Jan 15, 2007 at 08:06:18PM +0000, Mark McLoughlin wrote:
Since we've disappeared down a rat-hole with the other part of the thread, here's an attempt to get back on-topic :-)
Indeed :-)
Since the user is privileged, another way to do without VDE is to mirror the Xen case almost exactly, creating one tap device per guest, instead of Xen's netback vif devices:
Sure. There is the argument that always using VDE is nicer because it's consistent with the non-privileged and remotely connected network versions. As you say, though, this way is consistent with the Xen version.
3. An unprivileged user does exactly the same thing as (2).
+-----------+ +-----------+ | Guest | +----+----+ | Guest | | A | |userspace| | B | | +---+ | | network | | +---+ | | |NIC| | | stack | | |NIC| | +---+-+-+---+ +----+----+ +---+-+-+---+ ^ +-------+ | +-------+ ^ | | | +---+---+ | | | +------>+ VLAN0 +-+ VDE +-+ VLAN0 +<------+ | | +-------+ | | +-------+ +-------+
Notes:
* Similar to (2) except there can be no TAP device or bridge
* The userspace network stack is implemented using slirpvde to provide a DHCP server and DNS proxy to the network, but also effectively a SNAT and DNAT router.
* slirpvde implements ethernet, ip, tcp, udp, icmp, dhcp, tftp (etc.) in userspace. Completely crazy, but since the kernel apparently has no secure way to allow unprivileged users to leverage the kernel's network stack for this, it must be done in userspace.
Is it practical to just have some kind of privileged proxy that would merely create & configure the tap devices on behalf of the unprivileged guests? If we just create tap devices for any unprivileged guest, but kept them disconnected from any real network device, would that still be a big hole?
Okay, to avoid a userspace network stack, you need a way to securely allow guests running as unprivileged users to use the kernel's network stack. That implies:

 1) The packets/frames have to arrive on a network interface created by the user (e.g. a TAP or SLIP iface)
 2) It should not be possible to spoof as another host or adversely affect the host's connectivity, or any other machine on the same network as the host
 3) slirp prevents spoofing by effectively translating the source address of any packet which leaves the virtual network, just like a router using SNAT
 4) We can do the same thing by enabling IP forwarding and having all packets forwarded by the host go through SNAT
 5) The problem with that is what to do about packets not being forwarded by the host, but which are destined for the host itself? SNAT in PREROUTING might do it, but that's not allowed, it seems.
 6) We also have to worry about whether people could e.g. screw up the host's ARP cache
 7) We also have to worry about a DoS whereby someone creates lots of network interfaces

And note, this isn't just about worrying about nasty guests. You have to worry about what nasty users on the host could do with a setuid helper like this.

It's certainly got to be "possible" ... but I don't yet feel I know what all the bases are that need to be covered, never mind how we'd cover them.
Or can we leverage QEMU's builtin SLIRP or other non-TAP networking modes to construct something reasonable in userspace, without using VDE.
The general problem with any SLIRP derivative or similar is that it's another network stack implementation. That makes me nervous for security, performance, stability and portability reasons. And as I found out, the case in point is that SLIRP currently has buffer overflow vulnerabilities and isn't 64 bit clean.
Virtual Networks will be implemented in libvirt. First, there will be an XML description of Virtual Networks e.g.:
<network id="0"> <name>Foo</name> <uuid>596a5d2171f48fb2e068e2386a5c413e</uuid> <listen address="172.31.0.5" port="1234" /> <connections> <connection address="172.31.0.6" port="4321" /> </conections> <dhcp enabled="true"> <ip address="10.0.0.1" netmask="255.255.255.0" start="10.0.0.128" end="10.0.0.254" /> </dhcp> <forwarding enabled="true"> <incoming default="deny"> <allow port="123" domain="foobar" destport="321" /> </incoming> <outgoing default="allow"> <deny port="25" /> </outgoing> </forwarding> <network>
Got to also think how we connect guest domains to the virtual network.
Right, further on in the mail I said:

 * Where is the connection between domains and networks in either the API or the XML format? How is a domain associated with a network? You put a bridge name in the <network> definition and use that in the domain's <interface> definition? Or you put the network name in the interface definition and have libvirt look up the bridge name when creating the guest?
Currently we just have something really simple like
<interface type="bridge"> <source bridge='xenbr0'/> <mac address='00:11:22:33:44:55'/> </interface>
I guess we'd probably want to refer to the UUID of the network to map it into the guest.
Well, the UUID isn't much good if you can't map it. So, it would probably be the name and libvirt URI, right?
Oh, do we want to define a 'network 0' to be the physical network of the host machine - what if there are multiple host NICs - any conventions we need to let us distinguish? Maybe it's best to just refer to the host network by using IP addresses - so we can deal better with the case where a machine switches from eth0 -> eth1 (wired to wireless) but keeps the same IP address, or some such.
Well, I think there should be a default virtual network defined somehow. You shouldn't need to create one unless you want a second one.

But remember that under the model I'm suggesting, guests connect *either* to a virtual network or a physical network via a "shared physical interface". The shared physical interface just winds up being a bridge you enslave the guest's interface to, so the easiest answer for that is that we stick with the way it is right now for Xen and have QEMU create a TAP device and enslave that to the bridge in this mode.

Dunno, it does need more thought/discussion ... I find the current <interface> stuff quite strange now - e.g. "bridge" vs. "ethernet" types and the bridge name is in <source> ?

Cheers,
Mark.

On Wed, 2007-01-17 at 18:38 +0000, Mark McLoughlin wrote:
On Tue, 2007-01-16 at 22:28 +0000, Daniel P. Berrange wrote:
On Mon, Jan 15, 2007 at 08:06:18PM +0000, Mark McLoughlin wrote:
Virtual Networks will be implemented in libvirt. First, there will be an XML description of Virtual Networks e.g.:
<network id="0"> <name>Foo</name> <uuid>596a5d2171f48fb2e068e2386a5c413e</uuid> <listen address="172.31.0.5" port="1234" /> <connections> <connection address="172.31.0.6" port="4321" /> </conections> <dhcp enabled="true"> <ip address="10.0.0.1" netmask="255.255.255.0" start="10.0.0.128" end="10.0.0.254" /> </dhcp> <forwarding enabled="true"> <incoming default="deny"> <allow port="123" domain="foobar" destport="321" /> </incoming> <outgoing default="allow"> <deny port="25" /> </outgoing> </forwarding> <network>
Got to also think how we connect guest domains to the virtual network.
Right, further on in the mail I said:
* Where is the connection between domains and networks in either the API or the XML format? How is a domain associated with a network? You put a bridge name in the <network> definition and use that in the domain's <interface> definition? Or you put the network name in the interface definition and have libvirt look up the bridge name when creating the guest?
Currently we just have something really simple like
<interface type="bridge"> <source bridge='xenbr0'/> <mac address='00:11:22:33:44:55'/> </interface>
I guess we'd probably want to refer to the UUID of the network to map it into the guest.
Well, the UUID isn't much good if you can't map it. So, it would probably be the name and libvirt URI, right?
Related to the last patch, how about we just put the network name in the interface definition?

Attached are patches implementing this for both QEMU and Xen guests.

Cheers,
Mark.
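For illustration, a guest interface under that scheme might end up looking something like this - hypothetical: the network name, tapifname and MAC are made up, and the exact element names are whatever the patches settle on:

```xml
<interface type="network">
  <network name="default" tapifname="vtap0"/>
  <mac address="00:16:3e:00:00:01"/>
</interface>
```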

Hey,

So, the latest patches are at:

  http://www.gnome.org/~markmc/code/libvirt-networking/

I'm pretty happy with how things are at the moment. I've more or less cleared out my todo list on this[1], but I'm left with a big fat elephant sitting in the corner looking quite guilty ... iptables :-)

Basically, once you create a virtual network, you need the following iptables rules:

 - Allow bridging across the vnet's bridge - e.g. just allow all bridging:

     $> iptables -D FORWARD 1
     $> iptables -A FORWARD -m physdev ! --physdev-is-bridged -j REJECT --reject-with icmp-host-prohibited

 - Allow DHCP and DNS requests from guests:

     $> iptables -I INPUT -p tcp -m tcp --dport 53 -j ACCEPT
     $> iptables -I INPUT -p udp -m udp --dport 53 -j ACCEPT
     $> iptables -I INPUT -p udp -m udp --dport 67 -j ACCEPT

 - Enable forwarding and SNAT:

     $> echo 1 > /proc/sys/net/ipv4/ip_forward
     $> iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

 - And any DNAT rules to e.g. re-direct port 8080 on the host to port 80 on a specific guest.

Figuring out what the rules should be and adding them isn't a problem ... the problem is how to interact with the underlying distribution's iptables infrastructure. e.g. in Fedora, if you just go ahead and add these rules, they'd be wiped out by "service iptables restart", or overwritten by the firewall config tool, or saved and applied at boot if you used "service iptables save".

Bit of a mess :/

Cheers,
Mark.

[1] - Well, networks for unprivileged users is another big todo item

pedant's quick glance: ;-) On Thu, Jan 25, 2007 at 03:52:03PM +0000, Mark McLoughlin wrote:
@@ -761,6 +813,61 @@ static int qemudParseXML(struct qemud_se }

+static char *
+qemudNetworkIfaceConnect(struct qemud_server *server,
+                         struct qemud_vm *vm,
+                         struct qemud_vm_net_def *net)
+{
.....
+
+    snprintf(tapfdstr, sizeof(tapfdstr), "tap,fd=%d,script=", tapfd);
+
+    return strdup(tapfdstr);
       ^^^^^^^^^^^^^^^^^^^ where is allocation check?

    if ((p = strdup(tapfdstr))) return p;

+
+ no_memory:
+    qemudReportError(server, VIR_ERR_NO_MEMORY, "tapfds");
+ error:
+    if (tapfd != -1)
+        close(tapfd);
+    return NULL;
+}

@@ -1653,6 +1774,18 @@ char *qemudGenerateXML(struct qemud_serv
                        net->mac[3], net->mac[4], net->mac[5]) < 0)
         goto no_memory;
+    if (net->type == QEMUD_NET_NETWORK) {
+        if (qemudBufferPrintf(&buf, " <network name='%s", net->dst.network.name) < 0)
+            goto no_memory;
+
+        if (net->dst.network.tapifname[0] != '\0' &&
+            qemudBufferPrintf(&buf, " tapifname='%s'", net->dst.network.tapifname) < 0)
+            goto no_memory;
+
+        if (qemudBufferPrintf(&buf, "/>\n") < 0)
+            goto no_memory;
+    }
+    if (qemudBufferPrintf(&buf, " </interface>\n") < 0)
       ^^^^^^

There is also BufferAdd() which is cheaper than Printf if you needn't any string formatting.

         goto no_memory;
Karel -- Karel Zak <kzak@redhat.com>

Karel Zak wrote:
+ return strdup(tapfdstr); ^^^^^^^^^^^^^^^^^^^ where is allocation check?
There's a strong argument that you shouldn't check for out-of-memory errors on small heap allocations. After all, in a typical C program there's a ratio of somewhere around 10 : 1 between stack-allocated objects and heap-allocated objects (malloc, strdup). Yet stack allocation is almost never checked for failure. So you're making your code considerably longer and harder to understand in order to catch failures in only 1 in 10 memory allocations. Moreover, on most Linux distributions the stack is limited to something uselessly small like 8 MB, which makes recursive algorithms fail when there's plenty of free memory around.

Rich.

-- Emerging Technologies, Red Hat http://et.redhat.com/~rjones/ 64 Baker Street, London, W1U 7DF Mobile: +44 7866 314 421 "[Negative numbers] darken the very whole doctrines of the equations and make dark of the things which are in their nature excessively obvious and simple" (Francis Maseres FRS, mathematician, 1759)

On Mon, Jan 29, 2007 at 09:19:29AM +0000, Richard W.M. Jones wrote:
Karel Zak wrote:
+ return strdup(tapfdstr); ^^^^^^^^^^^^^^^^^^^ where is allocation check?
There's a strong argument that you shouldn't check for out of memory errors on small heap allocations. After all, in a typical C program there's a ratio somewhere around 10 : 1 of stack objects allocated : objects allocated on the heap (malloc, strdup). Yet stack object allocation is almost never checked for failures. So you're making your code considerably longer and harder to understand in order to catch failures in only 1 in 10 memory allocations.
What's more important is consistency of coding style. We check strdup() results in the library. Karel -- Karel Zak <kzak@redhat.com>

Hi Karel, On Mon, 2007-01-29 at 09:27 +0100, Karel Zak wrote:
pedant's quick glance: ;-)
On Thu, Jan 25, 2007 at 03:52:03PM +0000, Mark McLoughlin wrote:
@@ -761,6 +813,61 @@ static int qemudParseXML(struct qemud_se }

+static char *
+qemudNetworkIfaceConnect(struct qemud_server *server,
+                         struct qemud_vm *vm,
+                         struct qemud_vm_net_def *net)
+{
.....
+
+    snprintf(tapfdstr, sizeof(tapfdstr), "tap,fd=%d,script=", tapfd);
+
+    return strdup(tapfdstr);
       ^^^^^^^^^^^^^^^^^^^ where is allocation check?

    if ((p = strdup(tapfdstr))) return p;
Very well spotted ... I've also moved the strdup() to before the tapfds realloc() so as to not leave that in a weird state if the strdup() fails.
+    if (net->type == QEMUD_NET_NETWORK) {
+        if (qemudBufferPrintf(&buf, " <network name='%s", net->dst.network.name) < 0)
+            goto no_memory;
+
+        if (net->dst.network.tapifname[0] != '\0' &&
+            qemudBufferPrintf(&buf, " tapifname='%s'", net->dst.network.tapifname) < 0)
+            goto no_memory;
+
+        if (qemudBufferPrintf(&buf, "/>\n") < 0)
+            goto no_memory;
+    }
+    if (qemudBufferPrintf(&buf, " </interface>\n") < 0)
       ^^^^^^
There is also BufferAdd() which is cheaper than Printf if you needn't any string formatting.
That's one for Dan ... notice the way it's not actually in my patch. I did actually think the same thing myself at the time ... :-) Thanks, Mark.

Hi,

I just wanted to share my progress on this. See here for a patch set which can be applied to current CVS using quilt:

  http://www.gnome.org/~markmc/code/libvirt-networking/

I've appended the series file with URLs to each of the patches. Comments very welcome.

Thanks, Mark.

#
# Dan's patches
#
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-daemon.pat...
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-driver.pat...
#
# Various fixes to Dan's patches
#
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-no-c99.pat...
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-no-kqemu.p...
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-transient....
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-error-over...
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-free-xpath...
#
# Some re-factoring for later
#
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemud-refactor-...
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-config-ref...
#
# Misc libvirt fixes cleanups
#
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-unused-driver-m...
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-rename-handle-t...
#
# Add the basic networking API and
# driver methods to support it
#
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-network-api.pat...
#
# Add network support to virError
#
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-network-error.p...
#
# Add net-* commands to virsh
#
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-network-virsh.p...
#
# Hook up to qemud
#
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-network-qemu-st...
#
# Implement config parsing etc.
#
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-network-config....
#
# Add support for creating a bridge
#
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-network-bridge....
#
# Add support for starting dnsmasq
#
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-network-dnsmasq...

On Mon, Jan 22, 2007 at 02:46:11PM +0000, Mark McLoughlin wrote:
# Dan's patches http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-daemon.pat... http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-driver.pat...
I've been very lame in not sending an update to these patches. I've updated them to support TLS, protocol versioning, and fixed-size types & network byte ordering on the wire. Shouldn't be too difficult to resolve though, since I think it'll only really impact your libvirt-network-qemu-stubs.patch file.
# # Various fixes to Dan's patches # http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-no-c99.pat...
Based on IRC discussions I think we want to avoid both -std=gnu99 & -std=c99 in the compiler flags, and just use appropriate feature macros like -D_XOPEN_SOURCE, -D_SVID_SOURCE=1 as necessary. In particular I'd like to avoid GNU-specific bits so we don't make life hard for the Solaris / BSD guys.
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-no-kqemu.p...
Hmm, yeah, I imagine the build patched that flag out because of its license issues. I guess I'll have to make a 'configure' check to see if -no-kqemu is available on a particular host or not.
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-transient....
That should not be necessary in my latest patches - I fixed up the transient domain cleanup stuff in a slightly different way.
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-error-over...
Looks good.
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-free-xpath...
Already fixed in latest code.
# Some re-factoring for later http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemud-refactor-...
Looks good, will merge that in my next QEMU patches.
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-config-ref...
Likewise, looks good.
# # Misc libvirt fixes cleanups # http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-unused-driver-m...
Yep, we've lived with that baggage for too long
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-rename-handle-t...
Seems reasonable. On the note of cleanup - there's a bucketload of code in xml.c and xend_internal.c which is never called by anything, which we should remove - it constantly confuses me when I work on these two files to see all this code which turns out to be unused.
# Add the basic networking API and # driver methods to support it http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-network-api.pat...
Looks sane in principle. Not reviewed the code in detail yet.
# Add network support to virError http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-network-error.p...
This one is troublesome because of the ABI issue. It'll cause issues with the virResetError, virCopyLastError, virConnCopyLastError functions, if the caller passes in an object they allocated themselves. The only way it would not be a problem is if we can ensure that virNetworkErr never gets set unless the caller has called one of the virNetworkXXX functions, because by calling those we can know for sure they've been compiled against a recent set of headers. It'd be a nasty hack though.
# Add net-* commands to virsh http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-network-virsh.p...
Looks sane in principle. Not reviewed the code in detail yet.
# # Hook up to qemud # http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-network-qemu-st...
Code looks sane, but will need a fixup to use fixed size types & network byte order.
# Implement config parsing etc. http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-network-config.... # Add support for creating a bridge http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-network-bridge.... # Add support for starting dnsmasq http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-network-dnsmasq...
Looks sane in principle. Not reviewed the code in detail yet. Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Mon, 2007-01-22 at 16:02 +0000, Daniel P. Berrange wrote:
On Mon, Jan 22, 2007 at 02:46:11PM +0000, Mark McLoughlin wrote:
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-error-over...
Looks good.
# Some re-factoring for later http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemud-refactor-...
Looks good, will merge that in my next QEMU patches.
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-config-ref...
Likewise, looks good.
So, you're merging these three?
# # Misc libvirt fixes cleanups # http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-unused-driver-m...
Yep, we've lived with that baggage for too long
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-rename-handle-t...
Seems reasonable.
I've committed both of these. You should probably merge in the qemud_internal.c parts into your patches. Cheers, Mark.

On Mon, Jan 22, 2007 at 04:27:52PM +0000, Mark McLoughlin wrote:
On Mon, 2007-01-22 at 16:02 +0000, Daniel P. Berrange wrote:
On Mon, Jan 22, 2007 at 02:46:11PM +0000, Mark McLoughlin wrote:
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-error-over...
Looks good.
# Some re-factoring for later http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemud-refactor-...
Looks good, will merge that in my next QEMU patches.
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-config-ref...
Likewise, looks good.
So, you're merging these three?
Yes, will do.
# # Misc libvirt fixes cleanups # http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-unused-driver-m...
Yep, we've lived with that baggage for too long
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-rename-handle-t...
Seems reasonable.
I've committed both of these. You should probably merge in the qemud_internal.c parts into your patches.
Ok, will do. Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Mon, 2007-01-22 at 16:02 +0000, Daniel P. Berrange wrote:
On Mon, Jan 22, 2007 at 02:46:11PM +0000, Mark McLoughlin wrote:
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-transient....
That should not be necessary in my latest patches - I fixed up the transient domain cleanup stuff in a slightly different way.
AFAICS, this is still needed ... i.e.

  $> virsh create foo.xml
  $> virsh destroy Foo
  $> virsh create foo.xml
  libvir: QEMUD error : domain Foo exists already
  error: Failed to create domain from foo.xml

Cheers, Mark.

On Tue, Jan 23, 2007 at 10:59:31AM +0000, Mark McLoughlin wrote:
On Mon, 2007-01-22 at 16:02 +0000, Daniel P. Berrange wrote:
On Mon, Jan 22, 2007 at 02:46:11PM +0000, Mark McLoughlin wrote:
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-transient....
That should not be necessary in my latest patches - I fixed up the transient domain cleanup stuff in a slightly different way.
AFAICS, this is still needed ... i.e.
  $> virsh create foo.xml
  $> virsh destroy Foo
  $> virsh create foo.xml
  libvir: QEMUD error : domain Foo exists already
  error: Failed to create domain from foo.xml
I think one of your other patches must be breaking this, because the plain QEMU patches I posted definitely work:

  virsh > create /home/berrange/q.xml
  Domain demo created from /home/berrange/q.xml
  virsh > list --all
   Id Name State
  ----------------------------------
    3 demo running
  virsh > destroy demo
  Domain demo destroyed
  virsh > list --all
   Id Name State
  ----------------------------------
  virsh > create /home/berrange/q.xml
  Domain demo created from /home/berrange/q.xml
  virsh > list --all
   Id Name State
  ----------------------------------
    4 demo running

The bit of code which cleans up transient domains is at the very end of the method:

  static int qemudDispatchPoll(struct qemud_server *server, struct pollfd *fds)

in qemud/qemud.c. It basically iterates over every guest in the inactive domains list, and any which do not have a config file listed (ie, vm->configFile[0] == NULL) are purged from the list. The backend impl for the 'create' command ensures this is the case by passing 0 as the last arg to qemudLoadConfigXML(), which tells it not to write a persistent config file to disk.

Regards, Dan.

-- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Mon, Jan 22, 2007 at 02:46:11PM +0000, Mark McLoughlin wrote:
# Dan's patches http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-daemon.pat... http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-driver.pat...
Now updated at:

  http://people.redhat.com/berrange/libvirt/libvirt-qemu-daemon-2.patch
  http://people.redhat.com/berrange/libvirt/libvirt-qemu-driver-2.patch

The major changes in these two patches since the previous time are:

- Client and server now use TLS on TCP sockets (UNIX sockets are plain)
- Client must have 4 files in current working dir:
  - ca-cert.pem - CA certificate
  - ca-crl.pem - CA revocation list
  - cert.pem - client's certificate
  - key.pem - client's secret key
  This should change in future once we decide on how to handle these.
- Server can enable TLS support via command line args:

    libvirt_qemud -l local --tls --tls-cert cert.pem --tls-key key.pem \
        --tls-ca-cert ca-cert.pem --tls-ca-crl ca-crl.pem

- The wire protocol uses fixed-size types & requires network byte order on the wire.
- Added a 'hello' message. When first connecting, the client sends the max version number it supports & whether it supports clear mode & TLS mode. The server rejects clients with an incompatible major version, or picks the maximum minor version supported by both client & server. If the server requires TLS it will reject a client not advertising support of TLS mode. Upon completion of the 'hello' request+reply, they do the TLS handshake. If successful, the server enables the rest of the protocol messages; otherwise it drops the client.

NB, there is a bucketload of printf() debugging in these patches since I was still experimenting with the TLS stuff.
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-no-c99.pat...
I simply removed -std=c99 and fixed up places I'd used C99 constructs, so should no longer be needed
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-no-kqemu.p...
Not merged yet
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-transient....
Now unnecessary
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-error-over... http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-free-xpath...
Merged these two.
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemud-refactor-... http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-config-ref...
Merged these two.
# Hook up to qemud http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-network-qemu-st...
When updating this you need two core changes:

- Change all 'int' to one of int32_t, uint32_t, int64_t, uint64_t
- Use 'qemud_wire_32' or 'qemud_wire_64' when reading or writing data to the qemud_packet members.

Regards, Dan.

-- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Mon, 2007-01-22 at 21:20 +0000, Daniel P. Berrange wrote:
On Mon, Jan 22, 2007 at 02:46:11PM +0000, Mark McLoughlin wrote:
# Dan's patches http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-daemon.pat... http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-driver.pat...
Now updated at:
http://people.redhat.com/berrange/libvirt/libvirt-qemu-daemon-2.patch http://people.redhat.com/berrange/libvirt/libvirt-qemu-driver-2.patch
Okay, I've updated my patch set to use them: http://www.gnome.org/~markmc/code/libvirt-networking/ One bug I found was that libvirt fails to connect if you don't have TLS certs, even if the server doesn't request TLS. See attached patch.
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-no-c99.pat...
I simply removed -std=c99 and fixed up places I'd used C99 constructs, so should no longer be needed
_XOPEN_SOURCE and _POSIX_C_SOURCE disable _SVID_SOURCE, so I still need to enable that for struct ifreq and friends. Thanks, Mark.

On Tue, Jan 23, 2007 at 11:08:36AM +0000, Mark McLoughlin wrote:
On Mon, 2007-01-22 at 21:20 +0000, Daniel P. Berrange wrote:
On Mon, Jan 22, 2007 at 02:46:11PM +0000, Mark McLoughlin wrote:
# Dan's patches http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-daemon.pat... http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-driver.pat...
Now updated at:
http://people.redhat.com/berrange/libvirt/libvirt-qemu-daemon-2.patch http://people.redhat.com/berrange/libvirt/libvirt-qemu-driver-2.patch
Okay, I've updated my patch set to use them:
http://www.gnome.org/~markmc/code/libvirt-networking/
One bug I found was that libvirt fails to connect if you don't have TLS certs, even if the server doesn't request TLS. See attached patch.
Ahhhh, yes, that makes sense - I'd missed it because I always had the certs lying in my working dir, even when testing without TLS.
http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-qemu-no-c99.pat...
I simply removed -std=c99 and fixed up places I'd used C99 constructs, so should no longer be needed
_XOPEN_SOURCE and _POSIX_C_SOURCE disable _SVID_SOURCE, so I still need to enable that for struct ifreq and friends.
Adding -D_SVID_SOURCE=1 to the compiler args is no problem - it's at least a fairly portable extension in comparison to turning on all GNU extensions. Regards, Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Mon, Jan 22, 2007 at 02:46:11PM +0000, Mark McLoughlin wrote:
# Add net-* commands to virsh # http://www.gnome.org/~markmc/code/libvirt-networking/libvirt-network-virsh.p...
(IMHO it's better to send a patch to the mailing list than a URL ... :-)

+ {"file", VSH_OT_DATA, VSH_OFLAG_REQ, gettext_noop("file containing an XML network description")},

Cannot we use something like _N() rather than gettext_noop() ?

+ char buffer[4096];

Don't use magic numbers. Use BUFSIZ.

+static vshCmdInfo info_network_list[] = {
+    {"syntax", "net-list"},

  {"syntax", "net-list [--inactive | --all]"}

+    names = vshMalloc(ctl, sizeof(char *) * maxname);
+
+    if ((maxname = virConnectListDefinedNetworks(ctl->conn, names, maxname)) < 0) {

I'm not sure if I read libvirt correctly, but where in virsh do we deallocate the name strings? It's not specific to your network patches. I see everywhere for virConnectList* that we deallocate "names", which is the array of pointers only. See virsh.c, cmdList():

  names = vshMalloc(ctl, sizeof(char *) * maxname);
  if ((maxname = virConnectListDefinedDomains(ctl->conn, names, maxname)) < 0) {
  ....
  if (names)
      free(names);
  return TRUE;

But in xend_internal.c, xenDaemonListDefinedDomains():

  names[ret++] = strdup(node->value);
                 ^^^^^^^ where is free() for this string?

It seems like nice leak(s). Right?

+    {"syntax", "start a network "},

  {"syntax", "net-start <name>"}

+ char uuid[37];

Magic number? :-)

  #define UUID_STRLEN 36
  char uuid[UUID_STRLEN+1];

Karel -- Karel Zak <kzak@redhat.com>

Hi Karel, Thanks for the review ... Note, a lot of the code in the networking patches is just copied and pasted from elsewhere in libvirt, so I'll fix up the original code first. On Tue, 2007-01-23 at 11:20 +0100, Karel Zak wrote:
+ {"file", VSH_OT_DATA, VSH_OFLAG_REQ, gettext_noop("file containing an XML network description")},
Cannot we use something like _N() rather than gettext_noop() ?
That's one for Dan ... I suspect it's just because gettext_noop() worked with xgettext, whereas _N() didn't. We'd need to pass --keyword=_N to xgettext. (Also, it's always been N_() anywhere I've seen it)
names[ret++] = strdup(node->value); ^^^^^^^ where is free() for this string?
It seems like nice leak(s). Right?
Yep, well spotted. I've appended the patch I committed. Thanks, Mark. Index: ChangeLog =================================================================== RCS file: /data/cvs/libvirt/ChangeLog,v retrieving revision 1.319 diff -u -p -r1.319 ChangeLog --- ChangeLog 22 Jan 2007 20:43:02 -0000 1.319 +++ ChangeLog 23 Jan 2007 12:28:16 -0000 @@ -0,0 +1,7 @@ +Mon Jan 23 12:28:42 IST 2007 Mark McLoughlin <markmc@redhat.com> + + Issues pointed out by Karel Zak <kzak@redhat.com> + + * src/virsh.c: fix up some syntax strings, use BUFSIZ + and free names returned from virConnectListDefinedDomains() + Index: src/virsh.c =================================================================== RCS file: /data/cvs/libvirt/src/virsh.c,v retrieving revision 1.42 diff -u -p -r1.42 virsh.c --- src/virsh.c 22 Jan 2007 20:43:02 -0000 1.42 +++ src/virsh.c 23 Jan 2007 12:28:16 -0000 @@ -309,7 +309,7 @@ cmdConnect(vshControl * ctl, vshCmd * cm * "list" command */ static vshCmdInfo info_list[] = { - {"syntax", "list"}, + {"syntax", "list [--inactive | --all]"}, {"help", gettext_noop("list domains")}, {"desc", gettext_noop("Returns list of domains.")}, {NULL, NULL} @@ -419,8 +419,10 @@ cmdList(vshControl * ctl, vshCmd * cmd A virDomainPtr dom = virDomainLookupByName(ctl->conn, names[i]); /* this kind of work with domains is not atomic operation */ - if (!dom) + if (!dom) { + free(names[i]); continue; + } ret = virDomainGetInfo(dom, &info); id = virDomainGetID(dom); @@ -439,6 +441,7 @@ cmdList(vshControl * ctl, vshCmd * cmd A } virDomainFree(dom); + free(names[i]); } if (ids) free(ids); @@ -546,7 +549,7 @@ cmdCreate(vshControl * ctl, vshCmd * cmd char *from; int found; int ret = TRUE; - char buffer[4096]; + char buffer[BUFSIZ]; int fd, l; if (!vshConnectionUsability(ctl, ctl->conn, TRUE)) @@ -601,7 +604,7 @@ cmdDefine(vshControl * ctl, vshCmd * cmd char *from; int found; int ret = TRUE; - char buffer[4096]; + char buffer[BUFSIZ]; int fd, l; if (!vshConnectionUsability(ctl, ctl->conn, TRUE)) @@ 
-677,7 +680,7 @@ cmdUndefine(vshControl * ctl, vshCmd * c * "start" command */ static vshCmdInfo info_start[] = { - {"syntax", "start a domain "}, + {"syntax", "start <domain>"}, {"help", gettext_noop("start a (previously defined) inactive domain")}, {"desc", gettext_noop("Start a domain.")}, {NULL, NULL}

On Tue, Jan 23, 2007 at 12:35:09PM +0000, Mark McLoughlin wrote:
Note, a lot of the code in the networking patches is just copied and pasted from elsewhere in libvirt, so I'll fix up the original code first.
We do have test cases for some of the virsh commands, so I'm going to check to see why these leaks were not picked up by valgrind - we have a special 'make valgrind' target in the tests directory which runs all the test suites under the valgrind leak checker.
On Tue, 2007-01-23 at 11:20 +0100, Karel Zak wrote:
+ {"file", VSH_OT_DATA, VSH_OFLAG_REQ, gettext_noop("file containing an XML network description")},
Cannot we use something like _N() rather than gettext_noop() ?
That's one for Dan ... I suspect it's just because gettext_noop() worked with xgettext, whereas _N() didn't. We'd need to pass --keyword=_N to xgettext.
Yes, that's exactly the reason - gettext_noop() is automatically understood by xgettext, and since there aren't really many places we use it I didn't feel the need to #define a shorter variant. Regards, Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Tue, 2007-01-23 at 11:20 +0100, Karel Zak wrote:
+ char uuid[37];
Magic number? :-)
#define UUID_STRLEN 36
char uuid[UUID_STRLEN+1];
Good point. Here's a proposed API addition to put the buffer lengths as macros in libvirt.h. Anyone got objections to that? Cheers, Mark. Index: libvirt/include/libvirt/libvirt.h.in =================================================================== --- libvirt.orig/include/libvirt/libvirt.h.in +++ libvirt/include/libvirt/libvirt.h.in @@ -187,6 +187,24 @@ struct _virNodeInfo { typedef virNodeInfo *virNodeInfoPtr; +/** + * VIR_UUID_STRING_BUFLEN: + * + * This macro provides the length of the buffer required + * for virDomainGetUUID() + */ + +#define VIR_UUID_BUFLEN (16) + +/** + * VIR_UUID_STRING_BUFLEN: + * + * This macro provides the length of the buffer required + * for virDomainGetUUIDString() + */ + +#define VIR_UUID_STRING_BUFLEN (36+1) + /* library versionning */ /** Index: libvirt/proxy/libvirt_proxy.c =================================================================== --- libvirt.orig/proxy/libvirt_proxy.c +++ libvirt/proxy/libvirt_proxy.c @@ -462,7 +462,7 @@ retry2: break; case VIR_PROXY_LOOKUP_ID: { char *name = NULL; - unsigned char uuid[16]; + unsigned char uuid[VIR_UUID_BUFLEN]; int len; if (req->len != sizeof(virProxyPacket)) @@ -476,9 +476,9 @@ retry2: len = 1000; name[1000] = 0; } - req->len += 16 + len + 1; - memcpy(&request.extra.str[0], uuid, 16); - strcpy(&request.extra.str[16], name); + req->len += VIR_UUID_BUFLEN + len + 1; + memcpy(&request.extra.str[0], uuid, VIR_UUID_BUFLEN); + strcpy(&request.extra.str[VIR_UUID_BUFLEN], name); } if (name) free(name); @@ -489,9 +489,9 @@ retry2: char **tmp; int ident, len; char *name = NULL; - unsigned char uuid[16]; + unsigned char uuid[VIR_UUID_BUFLEN]; - if (req->len != sizeof(virProxyPacket) + 16) + if (req->len != sizeof(virProxyPacket) + VIR_UUID_BUFLEN) goto comm_error; /* @@ -504,7 +504,7 @@ retry2: if (names != NULL) { while (*tmp != NULL) { ident = xenDaemonDomainLookupByName_ids(conn, *tmp, &uuid[0]); - if (!memcmp(uuid, &request.extra.str[0], 16)) { + if (!memcmp(uuid, &request.extra.str[0], 
   VIR_UUID_BUFLEN)) {
                name = *tmp;
                break;
            }
@@ -530,7 +530,7 @@ retry2:
         }
         case VIR_PROXY_LOOKUP_NAME: {
             int ident;
-            unsigned char uuid[16];
+            unsigned char uuid[VIR_UUID_BUFLEN];

             if (req->len > sizeof(virProxyPacket) + 1000)
                 goto comm_error;
@@ -542,8 +542,8 @@ retry2:
                 req->data.arg = -1;
                 req->len = sizeof(virProxyPacket);
             } else {
-                req->len = sizeof(virProxyPacket) + 16;
-                memcpy(&request.extra.str[0], uuid, 16);
+                req->len = sizeof(virProxyPacket) + VIR_UUID_BUFLEN;
+                memcpy(&request.extra.str[0], uuid, VIR_UUID_BUFLEN);
                 req->data.arg = ident;
             }
             break;
Index: libvirt/src/hash.c
===================================================================
--- libvirt.orig/src/hash.c
+++ libvirt/src/hash.c
@@ -759,7 +759,7 @@ virGetDomain(virConnectPtr conn, const c
     ret->conn = conn;
     ret->id = -1;
     if (uuid != NULL)
-        memcpy(&(ret->uuid[0]), uuid, 16);
+        memcpy(&(ret->uuid[0]), uuid, VIR_UUID_BUFLEN);

     if (virHashAddEntry(conn->domains, name, ret) < 0) {
         virHashError(conn, VIR_ERR_INTERNAL_ERROR,
Index: libvirt/src/internal.h
===================================================================
--- libvirt.orig/src/internal.h
+++ libvirt/src/internal.h
@@ -145,15 +145,15 @@ enum {
  * Internal structure associated to a domain
  */
 struct _virDomain {
-    unsigned int magic;     /* specific value to check */
-    int uses;               /* reference count */
-    virConnectPtr conn;     /* pointer back to the connection */
-    char *name;             /* the domain external name */
-    char *path;             /* the domain internal path */
-    int id;                 /* the domain ID */
-    int flags;              /* extra flags */
-    unsigned char uuid[16]; /* the domain unique identifier */
-    char *xml;              /* the XML description for defined domains */
+    unsigned int magic;     /* specific value to check */
+    int uses;               /* reference count */
+    virConnectPtr conn;     /* pointer back to the connection */
+    char *name;             /* the domain external name */
+    char *path;             /* the domain internal path */
+    int id;                 /* the domain ID */
+    int flags;              /* extra flags */
+    unsigned char uuid[VIR_UUID_BUFLEN]; /* the domain unique identifier */
+    char *xml;              /* the XML description for defined domains */
 };

 /*
Index: libvirt/src/libvirt.c
===================================================================
--- libvirt.orig/src/libvirt.c
+++ libvirt/src/libvirt.c
@@ -645,8 +645,8 @@ virDomainLookupByUUID(virConnectPtr conn
 virDomainPtr
 virDomainLookupByUUIDString(virConnectPtr conn, const char *uuidstr)
 {
-    int raw[16], i;
-    unsigned char uuid[16];
+    int raw[VIR_UUID_BUFLEN], i;
+    unsigned char uuid[VIR_UUID_BUFLEN];
     int ret;

     if (!VIR_IS_CONNECT(conn)) {
@@ -672,11 +672,11 @@ virDomainLookupByUUIDString(virConnectPt
                  raw + 8, raw + 9, raw + 10, raw + 11,
                  raw + 12, raw + 13, raw + 14, raw + 15);

-    if (ret!=16) {
+    if (ret!=VIR_UUID_BUFLEN) {
         virLibConnError(conn, VIR_ERR_INVALID_ARG, __FUNCTION__);
         return (NULL);
     }
-    for (i = 0; i < 16; i++)
+    for (i = 0; i < VIR_UUID_BUFLEN; i++)
         uuid[i] = raw[i] & 0xFF;

     return virDomainLookupByUUID(conn, &uuid[0]);
@@ -1205,7 +1205,7 @@ virDomainGetName(virDomainPtr domain)
 /**
  * virDomainGetUUID:
  * @domain: a domain object
- * @uuid: pointer to a 16 bytes array
+ * @uuid: pointer to a VIR_UUID_BUFLEN bytes array
  *
  * Get the UUID for a domain
  *
@@ -1224,7 +1224,7 @@ virDomainGetUUID(virDomainPtr domain, un
     }

     if (domain->id == 0) {
-        memset(uuid, 0, 16);
+        memset(uuid, 0, VIR_UUID_BUFLEN);
     } else {
         if ((domain->uuid[0] == 0) && (domain->uuid[1] == 0) &&
             (domain->uuid[2] == 0) && (domain->uuid[3] == 0) &&
@@ -1236,7 +1236,7 @@ virDomainGetUUID(virDomainPtr domain, un
             (domain->uuid[14] == 0) && (domain->uuid[15] == 0))
             xenDaemonDomainLookupByName_ids(domain->conn, domain->name,
                                             &domain->uuid[0]);
-        memcpy(uuid, &domain->uuid[0], 16);
+        memcpy(uuid, &domain->uuid[0], VIR_UUID_BUFLEN);
     }
     return (0);
 }
@@ -1244,7 +1244,7 @@ virDomainGetUUID(virDomainPtr domain, un
 /**
  * virDomainGetUUIDString:
  * @domain: a domain object
- * @buf: pointer to a 37 bytes array
+ * @buf: pointer to a VIR_UUID_STRING_BUFLEN bytes array
  *
  * Get the UUID for a domain as string. For more information about
  * UUID see RFC4122.
@@ -1254,7 +1254,7 @@ virDomainGetUUID(virDomainPtr domain, un
 int
 virDomainGetUUIDString(virDomainPtr domain, char *buf)
 {
-    unsigned char uuid[16];
+    unsigned char uuid[VIR_UUID_BUFLEN];

     if (!VIR_IS_DOMAIN(domain)) {
         virLibDomainError(domain, VIR_ERR_INVALID_DOMAIN, __FUNCTION__);
@@ -1268,7 +1268,7 @@ virDomainGetUUIDString(virDomainPtr doma
     if (virDomainGetUUID(domain, &uuid[0]))
         return (-1);

-    snprintf(buf, 37,
+    snprintf(buf, VIR_UUID_STRING_BUFLEN,
              "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-%02x%02x%02x%02x%02x%02x",
              uuid[0], uuid[1], uuid[2], uuid[3],
              uuid[4], uuid[5], uuid[6], uuid[7],
Index: libvirt/src/proxy_internal.c
===================================================================
--- libvirt.orig/src/proxy_internal.c
+++ libvirt/src/proxy_internal.c
@@ -761,7 +761,7 @@ xenProxyLookupByID(virConnectPtr conn, i
 {
     virProxyPacket req;
     virProxyFullPacket ans;
-    unsigned char uuid[16];
+    unsigned char uuid[VIR_UUID_BUFLEN];
     const char *name;
     int ret;
     virDomainPtr res;
@@ -786,8 +786,8 @@ xenProxyLookupByID(virConnectPtr conn, i
     if (ans.data.arg == -1) {
         return(NULL);
     }
-    memcpy(uuid, &ans.extra.str[0], 16);
-    name = &ans.extra.str[16];
+    memcpy(uuid, &ans.extra.str[0], VIR_UUID_BUFLEN);
+    name = &ans.extra.str[VIR_UUID_BUFLEN];

     res = virGetDomain(conn, name, uuid);
     if (res == NULL)
@@ -825,7 +825,7 @@ xenProxyLookupByUUID(virConnectPtr conn,
     }
     memset(&req, 0, sizeof(virProxyPacket));
     req.command = VIR_PROXY_LOOKUP_UUID;
-    req.len = sizeof(virProxyPacket) + 16;
+    req.len = sizeof(virProxyPacket) + VIR_UUID_BUFLEN;
     ret = xenProxyCommand(conn, (virProxyPacketPtr) &req, &req, 0);
     if (ret < 0) {
         xenProxyClose(conn);
Index: libvirt/src/test.c
===================================================================
--- libvirt.orig/src/test.c
+++ libvirt/src/test.c
@@ -139,7 +139,7 @@ typedef struct _testDom {
     int active;
     int id;
     char name[20];
-    unsigned char uuid[16];
+    unsigned char uuid[VIR_UUID_BUFLEN];
     virDomainKernel kernel;
     virDomainInfo info;
     unsigned int maxVCPUs;
@@ -247,7 +247,7 @@ static int testLoadDomain(virConnectPtr
     xmlXPathContextPtr ctxt = NULL;
     xmlXPathObjectPtr obj = NULL;
     char *name = NULL;
-    unsigned char rawuuid[16];
+    unsigned char rawuuid[VIR_UUID_BUFLEN];
     char *dst_uuid;
     testCon *con;
     struct timeval tv;
@@ -397,7 +397,7 @@ static int testLoadDomain(virConnectPtr
     if (memory > maxMem)
         memory = maxMem;

-    memmove(con->domains[handle].uuid, rawuuid, 16);
+    memmove(con->domains[handle].uuid, rawuuid, VIR_UUID_BUFLEN);
     con->domains[handle].info.maxMem = maxMem;
     con->domains[handle].info.memory = memory;
     con->domains[handle].info.state = domid < 0 ? VIR_DOMAIN_SHUTOFF : VIR_DOMAIN_RUNNING;
@@ -487,7 +487,7 @@ static int testOpenDefault(virConnectPtr
     node->connections[connid].domains[0].onCrash = VIR_DOMAIN_RESTART;
     node->connections[connid].domains[0].onPoweroff = VIR_DOMAIN_DESTROY;
     strcpy(node->connections[connid].domains[0].name, "test");
-    for (u = 0 ; u < 16 ; u++) {
+    for (u = 0 ; u < VIR_UUID_BUFLEN ; u++) {
         node->connections[connid].domains[0].uuid[u] = (u * 75)%255;
     }
     node->connections[connid].domains[0].info.maxMem = 8192 * 1024;
@@ -901,7 +901,7 @@ virDomainPtr testLookupDomainByUUID(virC
     int i, idx = -1;
     for (i = 0 ; i < MAX_DOMAINS ; i++) {
         if (con->domains[i].active &&
-            memcmp(uuid, con->domains[i].uuid, 16) == 0) {
+            memcmp(uuid, con->domains[i].uuid, VIR_UUID_BUFLEN) == 0) {
             idx = i;
             break;
         }
Index: libvirt/src/virsh.c
===================================================================
--- libvirt.orig/src/virsh.c
+++ libvirt/src/virsh.c
@@ -1030,7 +1030,7 @@ cmdDominfo(vshControl * ctl, vshCmd * cm
     virDomainPtr dom;
     int ret = TRUE;
     unsigned int id;
-    char *str, uuid[37];
+    char *str, uuid[VIR_UUID_STRING_BUFLEN];

     if (!vshConnectionUsability(ctl, ctl->conn, TRUE))
         return FALSE;
@@ -1535,7 +1535,7 @@ static int
 cmdDomuuid(vshControl * ctl, vshCmd * cmd)
 {
     virDomainPtr dom;
-    char uuid[37];
+    char uuid[VIR_UUID_STRING_BUFLEN];

     if (!vshConnectionUsability(ctl, ctl->conn, TRUE))
         return FALSE;
Index: libvirt/src/xend_internal.c
===================================================================
--- libvirt.orig/src/xend_internal.c
+++ libvirt/src/xend_internal.c
@@ -1100,7 +1100,7 @@ xenDaemonDomainLookupByName_ids(virConne
     int ret = -1;

     if (uuid != NULL)
-        memset(uuid, 0, 16);
+        memset(uuid, 0, VIR_UUID_BUFLEN);
     root = sexpr_get(xend, "/xend/domain/%s?detail=1", domname);
     if (root == NULL)
         goto error;
@@ -1152,7 +1152,7 @@ xenDaemonDomainLookupByID(virConnectPtr
     char *dst_uuid;
     struct sexpr *root;

-    memset(uuid, 0, 16);
+    memset(uuid, 0, VIR_UUID_BUFLEN);

     root = sexpr_get(xend, "/xend/domain/%d?detail=1", id);
     if (root == NULL)
@@ -1939,7 +1939,7 @@ sexpr_to_domain(virConnectPtr conn, stru
 {
     virDomainPtr ret = NULL;
     char *dst_uuid = NULL;
-    char uuid[16];
+    char uuid[VIR_UUID_BUFLEN];
     const char *name;
     const char *tmp;

@@ -2728,7 +2728,7 @@ error:
 static virDomainPtr
 xenDaemonLookupByID(virConnectPtr conn, int id) {
     char *name = NULL;
-    unsigned char uuid[16];
+    unsigned char uuid[VIR_UUID_BUFLEN];
     virDomainPtr ret;

     if (xenDaemonDomainLookupByID(conn, id, &name, uuid) < 0) {
@@ -2762,7 +2762,7 @@ xenDaemonLookupByID(virConnectPtr conn,
 int
 xenDaemonDomainSetVcpus(virDomainPtr domain, unsigned int vcpus)
 {
-    char buf[16];
+    char buf[VIR_UUID_BUFLEN];

     if ((domain == NULL) || (domain->conn == NULL) || (domain->name == NULL)
      || (vcpus < 1)) {
@@ -2793,7 +2793,7 @@ int
 xenDaemonDomainPinVcpu(virDomainPtr domain, unsigned int vcpu,
                        unsigned char *cpumap, int maplen)
 {
-    char buf[16], mapstr[sizeof(cpumap_t) * 64] = "[";
+    char buf[VIR_UUID_BUFLEN], mapstr[sizeof(cpumap_t) * 64] = "[";
     int i, j;

     if ((domain == NULL) || (domain->conn == NULL) || (domain->name == NULL)
@@ -2929,7 +2929,7 @@ xenDaemonLookupByUUID(virConnectPtr conn
     char *name = NULL;
     char **names;
     char **tmp;
-    unsigned char ident[16];
+    unsigned char ident[VIR_UUID_BUFLEN];
     int id = -1;

     names = xenDaemonListDomainsOld(conn);
@@ -2942,7 +2942,7 @@ xenDaemonLookupByUUID(virConnectPtr conn
     while (*tmp != NULL) {
         id = xenDaemonDomainLookupByName_ids(conn, *tmp, &ident[0]);
         if (id >= 0) {
-            if (!memcmp(uuid, ident, 16)) {
+            if (!memcmp(uuid, ident, VIR_UUID_BUFLEN)) {
                 name = strdup(*tmp);
                 break;
             }
Index: libvirt/src/xm_internal.c
===================================================================
--- libvirt.orig/src/xm_internal.c
+++ libvirt/src/xm_internal.c
@@ -209,7 +209,7 @@ static int xenXMConfigGetUUID(virConfPtr
    have one in its config */
 static void xenXMConfigGenerateUUID(unsigned char *uuid) {
     int i;
-    for (i = 0 ; i < 16 ; i++) {
+    for (i = 0 ; i < VIR_UUID_BUFLEN ; i++) {
         uuid[i] = (unsigned char)(1 + (int) (256.0 * (rand() / (RAND_MAX + 1.0))));
     }
 }
@@ -217,7 +217,7 @@ static void xenXMConfigGenerateUUID(unsi
 /* Ensure that a config object has a valid UUID in it,
    if it doesn't then (re-)generate one */
 static int xenXMConfigEnsureIdentity(virConfPtr conf, const char *filename) {
-    unsigned char uuid[16];
+    unsigned char uuid[VIR_UUID_BUFLEN];
     const char *name;

     /* Had better have a name...*/
@@ -242,7 +242,7 @@ static int xenXMConfigEnsureIdentity(vir
     /* If there is no uuid...*/
     if (xenXMConfigGetUUID(conf, "uuid", uuid) < 0) {
         virConfValuePtr value;
-        char uuidstr[37];
+        char uuidstr[VIR_UUID_STRING_BUFLEN];

         value = malloc(sizeof(virConfValue));
         if (!value) {
@@ -251,7 +251,7 @@ static int xenXMConfigEnsureIdentity(vir
         /* ... then generate one */
         xenXMConfigGenerateUUID(uuid);
-        snprintf(uuidstr, 37,
+        snprintf(uuidstr, VIR_UUID_STRING_BUFLEN,
                  "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-%02x%02x%02x%02x%02x%02x</uuid>\n",
                  uuid[0], uuid[1], uuid[2], uuid[3],
                  uuid[4], uuid[5], uuid[6], uuid[7],
@@ -565,7 +565,7 @@ char *xenXMDomainFormatXML(virConnectPtr
     virBufferPtr buf;
     char *xml;
     const char *name;
-    unsigned char uuid[16];
+    unsigned char uuid[VIR_UUID_BUFLEN];
     const char *str;
     int hvm = 0;
     long val;
@@ -1168,7 +1168,7 @@ virDomainPtr xenXMDomainLookupByName(vir
     const char *filename;
     xenXMConfCachePtr entry;
     virDomainPtr ret;
-    unsigned char uuid[16];
+    unsigned char uuid[VIR_UUID_BUFLEN];
     if (!VIR_IS_CONNECT(conn)) {
         xenXMError(conn, VIR_ERR_INVALID_CONN, __FUNCTION__);
         return (NULL);
@@ -1209,7 +1209,7 @@ virDomainPtr xenXMDomainLookupByName(vir
  * Hash table iterator to search for a domain based on UUID
  */
 static int xenXMDomainSearchForUUID(const void *payload, const char *name ATTRIBUTE_UNUSED, const void *data) {
-    unsigned char uuid[16];
+    unsigned char uuid[VIR_UUID_BUFLEN];
     const unsigned char *wantuuid = (const unsigned char *)data;
     const xenXMConfCachePtr entry = (const xenXMConfCachePtr)payload;

@@ -1217,7 +1217,7 @@ static int xenXMDomainSearchForUUID(cons
         return (0);
     }

-    if (!memcmp(uuid, wantuuid, 16))
+    if (!memcmp(uuid, wantuuid, VIR_UUID_BUFLEN))
         return (1);

     return (0);
@@ -1271,7 +1271,7 @@ int xenXMDomainCreate(virDomainPtr domai
     char *xml;
     char *sexpr;
     int ret;
-    unsigned char uuid[16];
+    unsigned char uuid[VIR_UUID_BUFLEN];

     if ((domain == NULL) || (domain->conn == NULL) || (domain->name == NULL)) {
         xenXMError((domain ? domain->conn : NULL), VIR_ERR_INVALID_ARG,
@@ -2046,7 +2046,7 @@ virConfPtr xenXMParseXMLToConfig(virConn
 virDomainPtr xenXMDomainDefineXML(virConnectPtr conn, const char *xml) {
     virDomainPtr ret;
     char filename[PATH_MAX];
-    unsigned char uuid[16];
+    unsigned char uuid[VIR_UUID_BUFLEN];
     virConfPtr conf = NULL;
     xenXMConfCachePtr entry = NULL;
     virConfValuePtr value;
Index: libvirt/src/xml.c
===================================================================
--- libvirt.orig/src/xml.c
+++ libvirt/src/xml.c
@@ -525,7 +525,7 @@ char *
 virDomainGetXMLDesc(virDomainPtr domain, int flags)
 {
     char *ret = NULL;
-    unsigned char uuid[16];
+    unsigned char uuid[VIR_UUID_BUFLEN];
     virBuffer buf;
     virDomainInfo info;

@@ -1533,7 +1533,7 @@ virDomainParseXMLDesc(const char *xmldes
 unsigned char *virParseUUID(char **ptr, const char *uuid)
 {
-    int rawuuid[16];
+    int rawuuid[VIR_UUID_BUFLEN];
     const char *cur;
     unsigned char *dst_uuid = NULL;
     int i;
@@ -1546,7 +1546,7 @@ unsigned char *virParseUUID(char **ptr,
      * pairs as long as there is 32 of them in the end.
      */
     cur = uuid;
-    for (i = 0;i < 16;) {
+    for (i = 0;i < VIR_UUID_BUFLEN;) {
         rawuuid[i] = 0;
         if (*cur == 0)
             goto error;
@@ -1581,7 +1581,7 @@ unsigned char *virParseUUID(char **ptr,
     dst_uuid = (unsigned char *) *ptr;
     *ptr += 16;

-    for (i = 0; i < 16; i++)
+    for (i = 0; i < VIR_UUID_BUFLEN; i++)
         dst_uuid[i] = rawuuid[i] & 0xFF;

 error:

On Tue, Jan 23, 2007 at 12:37:50PM +0000, Mark McLoughlin wrote:
On Tue, 2007-01-23 at 11:20 +0100, Karel Zak wrote:
+ char uuid[37];
Magic number? :-)
#define UUID_STRLEN 36
char uuid[UUID_STRLEN+1];
Good point. Here's a proposed API addition to put the buffer lengths as macros in libvirt.h.
Anyone got objections to that?
go for it, even if UUID size really should not change :-)

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

On Tue, Jan 23, 2007 at 12:37:50PM +0000, Mark McLoughlin wrote:
On Tue, 2007-01-23 at 11:20 +0100, Karel Zak wrote:
+ char uuid[37];
Magic number? :-)
#define UUID_STRLEN 36
char uuid[UUID_STRLEN+1];
Good point. Here's a proposed API addition to put the buffer lengths as macros in libvirt.h.
Anyone got objections to that?
No objections, but I think if we're going to do this, we should take it one step further and provide APIs for converting between raw & printable versions of UUIDs in both directions. Currently we're just duping this conversion code all over the place - with inconsistent use of '-' in the printable versions.

/*
 * uuidstr: the printable UUID string
 * uuid: pre-allocated buffer of length VIR_UUID_BUFLEN
 */
int virUUIDParseString(const char *uuidstr, unsigned char *uuid)

/*
 * uuid: the raw UUID value, exactly VIR_UUID_BUFLEN bytes long
 * uuidstr: pre-allocated buffer of length VIR_UUID_STRING_BUFLEN
 *          to be filled in with the printable UUID
 */
int virUUIDFormatString(const unsigned char *uuid, char *uuidstr)

Oh, and a thing to generate a random UUID is also needed by the xm_internal, qemu & test backends:

int virUUIDGenerate(unsigned char *uuid);

Probably we only need any of this stuff in the internal headers though, rather than public facing.
--- libvirt.orig/include/libvirt/libvirt.h.in
+++ libvirt/include/libvirt/libvirt.h.in
@@ -187,6 +187,24 @@ struct _virNodeInfo {
 typedef virNodeInfo *virNodeInfoPtr;

+/**
+ * VIR_UUID_BUFLEN:
+ *
+ * This macro provides the length of the buffer required
+ * for virDomainGetUUID()
+ */
+
+#define VIR_UUID_BUFLEN (16)
+
+/**
+ * VIR_UUID_STRING_BUFLEN:
+ *
+ * This macro provides the length of the buffer required
+ * for virDomainGetUUIDString()
+ */
+
+#define VIR_UUID_STRING_BUFLEN (36+1)
+
 /* library versionning */
Dan.

-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
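[Editor's note: a minimal sketch of the two conversion helpers Dan proposes above. This is illustrative only, not libvirt code - the function names follow his proposed signatures, while the hexval() helper and the inlined buffer-length macros are invented here for self-containment.]

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define VIR_UUID_BUFLEN (16)
#define VIR_UUID_STRING_BUFLEN (36+1)

/* map a hex digit to its value, or -1 if it is not a hex digit */
static int hexval(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Parse a printable UUID, tolerating '-' separators anywhere,
 * into a VIR_UUID_BUFLEN byte buffer. Returns 0 on success, -1 on error. */
static int virUUIDParseString(const char *uuidstr, unsigned char *uuid) {
    const char *cur = uuidstr;
    int i;
    for (i = 0; i < VIR_UUID_BUFLEN; i++) {
        int hi, lo;
        while (*cur == '-')
            cur++;
        if ((hi = hexval(cur[0])) < 0 || (lo = hexval(cur[1])) < 0)
            return -1;
        uuid[i] = (unsigned char)((hi << 4) | lo);
        cur += 2;
    }
    return 0;
}

/* Format VIR_UUID_BUFLEN raw bytes as the canonical printable form,
 * with consistent '-' placement. Returns 0 on success. */
static int virUUIDFormatString(const unsigned char *uuid, char *uuidstr) {
    snprintf(uuidstr, VIR_UUID_STRING_BUFLEN,
             "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-%02x%02x%02x%02x%02x%02x",
             uuid[0], uuid[1], uuid[2], uuid[3], uuid[4], uuid[5],
             uuid[6], uuid[7], uuid[8], uuid[9], uuid[10], uuid[11],
             uuid[12], uuid[13], uuid[14], uuid[15]);
    return 0;
}
```

Round-tripping parse and format this way also fixes the inconsistent '-' handling Dan mentions, since only the formatter ever emits dashes.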

On Tue, Jan 23, 2007 at 12:59:49PM +0000, Daniel P. Berrange wrote:
On Tue, Jan 23, 2007 at 12:37:50PM +0000, Mark McLoughlin wrote:
On Tue, 2007-01-23 at 11:20 +0100, Karel Zak wrote:
+ char uuid[37];
Magic number? :-)
#define UUID_STRLEN 36
char uuid[UUID_STRLEN+1];
Good point. Here's a proposed API addition to put the buffer lengths as macros in libvirt.h.
Anyone got objections to that?
No objections, but I think if we're going to do this, we should take it one step further and provide APIs for converting between RAW & Printable versions of UUID in both directions. Currently we're just duping this conversion code all over the place - with inconsistent use of '-' in the printable versions.
/*
 * uuidstr: the printable UUID string
 * uuid: pre-allocated buffer of length VIR_UUID_BUFLEN
 */
int virUUIDParseString(const char *uuidstr, unsigned char *uuid)

/*
 * uuid: the raw UUID value, exactly VIR_UUID_BUFLEN bytes long
 * uuidstr: pre-allocated buffer of length VIR_UUID_STRING_BUFLEN
 *          to be filled in with the printable UUID
 */
int virUUIDFormatString(const unsigned char *uuid, char *uuidstr)
Oh and a thing to generate a random UUID too is needed by both the xm_internal and qemu & test backends
int virUUIDGenerate(unsigned char *uuid);
Probably we only need any of this stuff in the internal headers though, rather than public facing
Mark's patch extends the public API, that's fine, but I think the conversion routines should be kept internal and private. IMHO that's part of the many things we identified to go in /lib/ shared code.

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
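[Editor's note: the random-UUID generator mentioned in this exchange might look like the sketch below. Hypothetical code, not the eventual libvirt implementation - it uses rand() just as the existing xenXMConfigGenerateUUID() does, with the RFC 4122 version-4 bit twiddling added.]

```c
#include <assert.h>
#include <stdlib.h>

#define VIR_UUID_BUFLEN (16)

/* Fill uuid with VIR_UUID_BUFLEN random bytes and stamp in the
 * RFC 4122 version 4 (random) and variant bits.
 * Returns 0 on success, -1 on error. */
static int virUUIDGenerate(unsigned char *uuid) {
    int i;
    if (uuid == NULL)
        return -1;
    for (i = 0; i < VIR_UUID_BUFLEN; i++)
        uuid[i] = (unsigned char)(rand() & 0xFF);
    uuid[6] = (uuid[6] & 0x0F) | 0x40;  /* version 4 */
    uuid[8] = (uuid[8] & 0x3F) | 0x80;  /* RFC 4122 variant */
    return 0;
}
```

A real implementation would want a better entropy source than rand(), but the shape of the helper is the point here: one place that all the backends (xm_internal, qemu, test) call instead of each rolling their own.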

On Tue, 2007-01-23 at 12:59 +0000, Daniel P. Berrange wrote:
Probably we only need any of this stuff in the internal headers though, rather than public facing
The buffer lengths should be public, though. Callers of virDomainGetUUID() etc. need to pass in buffers of the correct length.

Cheers,
Mark.
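[Editor's note: a caller-side sketch of Mark's point - with the macros public, application code can size its buffers from libvirt.h instead of hard-coding 16 and 37. The getUUIDString() stub below is invented here and merely stands in for the real virDomainGetUUIDString().]

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* macro values as in the proposed libvirt.h addition */
#define VIR_UUID_BUFLEN (16)
#define VIR_UUID_STRING_BUFLEN (36+1)

/* Stub standing in for virDomainGetUUIDString(): formats 16 raw bytes
 * as the 36-character canonical form plus the trailing NUL. */
static int getUUIDString(const unsigned char *uuid, char *buf) {
    snprintf(buf, VIR_UUID_STRING_BUFLEN,
             "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-%02x%02x%02x%02x%02x%02x",
             uuid[0], uuid[1], uuid[2], uuid[3], uuid[4], uuid[5],
             uuid[6], uuid[7], uuid[8], uuid[9], uuid[10], uuid[11],
             uuid[12], uuid[13], uuid[14], uuid[15]);
    return 0;
}
```

The caller then writes `char buf[VIR_UUID_STRING_BUFLEN];` and is guaranteed the buffer matches what the library will snprintf into it - which is exactly why the lengths cannot live only in internal headers.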

On Mon, 2007-01-15 at 20:06 +0000, Mark McLoughlin wrote:
* Since virConnect is supposed to be a connection to a specific hypervisor, does it make sense to create networks (which should be hypervisor agnostic) through virConnect?
Okay, here's a suggestion - a virConnectPtr is a connection to a specific hypervisor *and* the virtual network supervisor on that same physical machine or user session.

What I like about this is that if a domain's description mentions a virtual network, that name is scoped within the network supervisor associated with the hypervisor on which the guest is being created.

Attached is a kind of a hacky patch to do something like that - if e.g. you connect to xend, you also get a "network-only" connection to qemud for managing networks.

Cheers,
Mark.

On Thu, Jan 25, 2007 at 03:49:41PM +0000, Mark McLoughlin wrote:
On Mon, 2007-01-15 at 20:06 +0000, Mark McLoughlin wrote:
* Since virConnect is supposed to be a connection to a specific hypervisor, does it make sense to create networks (which should be hypervisor agnostic) through virConnect?
Okay, here's a suggestion - a virConnectPtr is a connection to a specific hypervisor *and* the virtual network supervisor on that same physical machine or user session.
What I like about this is that if a domain's description mentions a virtual network, that name is scoped within the network supervisor associated with the hypervisor on which the guest is being created.
Attached is a kind of a hacky patch to do something like that - if e.g. you connect to xend, you also get a "network-only" connection to qemud for managing networks.
That's an interesting idea - your description..

"a virConnectPtr is a connection to a specific hypervisor *and* the virtual network supervisor"

..makes me think of a slightly alternative impl. We currently have a single internal driver API 'virDriverPtr' which is just a list of function pointers for all the HV related calls. Rather than making that struct bigger to add in networking calls, how about we define two separate internal driver APIs

virHypervisorDriver
virNetworkDriver

It strikes me that most of the different hypervisor backends will simply want to re-use the same network driver backend, so why not properly de-couple them. The virConnectOpen function would thus first look up a hypervisor driver, and then look up a networking driver. This avoids the somewhat nasty issue of having to figure out how to activate multiple non-conflicting drivers to get the correct combo of HV & network stuff.

The only small complication would be ensuring that the HV driver and network driver didn't need to open 2 separate TCP connections to the same place, but I'm sure we'd be able to figure that out.

Regards,
Dan.

-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Thu, 2007-01-25 at 16:02 +0000, Daniel P. Berrange wrote:
That's an interesting idea - your description..
"a virConnectPtr is a connection to a specific hypervisor *and* the virtual network supervisor"
..makes me think of a slightly alternative impl. We currently have a single internal driver API 'virDriverPtr' which is just a list of function pointers for all the HV related calls. Rather than making that struct bigger to add in networking calls, how about we define two separate internal driver APIs
virHypervisorDriver virNetworkDriver
It strikes me that most of the different hypervisor backends will simply want to re-use the same network driver backend, so why not properly de-couple them. The virConnectOpen function would thus first look up a hypervisor driver, and then look up a networking driver. This avoids the somewhat nasty issue of having to figure out how to activate multiple non-conflicting drivers to get the correct combo of HV & network stuff.
Yep, that'd be the best way to do it.
The only small complication would be ensuring that the HV driver and network driver didn't need to open 2 separate TCP connections to the same place, but I'm sure we'd be able to figure that out.
Yep.

(Also, if we go with this approach, we should probably also do s/qemud/libvirtd/ or something ... but we need to figure out what to do wrt. the "other libvirtd" first :-)

Cheers,
Mark.

On Thu, Jan 25, 2007 at 04:07:57PM +0000, Mark McLoughlin wrote:
On Thu, 2007-01-25 at 16:02 +0000, Daniel P. Berrange wrote:
That's an interesting idea - your description..
"a virConnectPtr is a connection to a specific hypervisor *and* the virtual network supervisor"
..makes me think of a slightly alternative impl. We currently have a single internal driver API 'virDriverPtr' which is just a list of function pointers for all the HV related calls. Rather than making that struct bigger to add in networking calls, how about we define two separate internal driver APIs
virHypervisorDriver virNetworkDriver
It strikes me that most of the different hypervisor backends will simply want to re-use the same network driver backend, so why not properly de-couple them. The virConnectOpen function would thus first look up a hypervisor driver, and then look up a networking driver. This avoids the somewhat nasty issue of having to figure out how to activate multiple non-conflicting drivers to get the correct combo of HV & network stuff.
Yep, that'd be the best way to do it.
The only small complication would be ensuring that the HV driver and network driver didn't need to open 2 separate TCP connections to the same place, but I'm sure we'd be able to figure that out.
Yep.
(Also, if we go with this approach, we should probably also do s/qemud/libvirtd/ or something ... but we need to figure out what to do wrt. the "other libvirtd" first :-)
Indeed - since Rich Jones is looking at a more general-purpose libvirtd, we can adapt the QEMU impl to play nicely with the generic daemon. I figure we separate out the QEMU bits from qemud and turn them into a regular libvirt driver, then set it up so that this driver is only ever used when invoked by libvirtd, and never directly by the client lib - that way we still ensure a single daemon managing QEMU per node, without coupling the QEMU impl to the daemon itself.

Dan.

-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
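[Editor's note: the hypervisor/network driver split discussed in this thread could be sketched roughly as below. Purely illustrative - the struct fields, dummy backends, driver tables and the connectOpen() lookup are all invented for the example, not actual libvirt internals.]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two separate driver tables instead of one fat virDriverPtr */
typedef struct _virHypervisorDriver {
    const char *name;              /* e.g. "xen", "qemu" */
    int (*open)(const char *uri);  /* returns 0 if this driver handles the URI */
} virHypervisorDriver;

typedef struct _virNetworkDriver {
    const char *name;
    int (*open)(const char *uri);
} virNetworkDriver;

/* Dummy backends: each HV driver only claims its own URI scheme,
 * while the single network backend accepts every connection. */
static int xenOpen(const char *uri)  { return strncmp(uri, "xen", 3) == 0 ? 0 : -1; }
static int qemuOpen(const char *uri) { return strncmp(uri, "qemu", 4) == 0 ? 0 : -1; }
static int qemudNetOpen(const char *uri) { (void)uri; return 0; }

static const virHypervisorDriver hvDrivers[] = {
    { "xen",  xenOpen },
    { "qemu", qemuOpen },
};
static const virNetworkDriver netDrivers[] = {
    { "qemud", qemudNetOpen },
};

/* virConnectOpen-style lookup: pick an HV driver and a network driver
 * independently, so every HV backend reuses the same network backend. */
static int connectOpen(const char *uri,
                       const virHypervisorDriver **hv,
                       const virNetworkDriver **net) {
    size_t i;
    *hv = NULL;
    *net = NULL;
    for (i = 0; i < sizeof(hvDrivers) / sizeof(hvDrivers[0]); i++)
        if (hvDrivers[i].open(uri) == 0) { *hv = &hvDrivers[i]; break; }
    for (i = 0; i < sizeof(netDrivers) / sizeof(netDrivers[0]); i++)
        if (netDrivers[i].open(uri) == 0) { *net = &netDrivers[i]; break; }
    return (*hv != NULL && *net != NULL) ? 0 : -1;
}
```

The design point is that the two lookups are independent: connecting to xend or to qemu yields a different hypervisor driver but the same shared network driver, with no combinatorial driver-activation logic.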
participants (7)
- Aron Griffis
- Daniel P. Berrange
- Daniel Veillard
- Hugh Brock
- Karel Zak
- Mark McLoughlin
- Richard W.M. Jones