[Libvir] Next features and target for development

Hi all,

now that 0.3.0 is out, it's probably time to build the next set of features we aim at developing in the next months. The list I have currently is short, but still significant:

- migration API: now that we have remote support, it should be possible to build an API for migration of domains between 2 connections. Could be as simple as int virDomainMigrate(virDomainPtr domain, virConnectPtr to, int flags); sounds like a fun and very useful part.
- USB support: we discussed that already, but the initial patch did not match the XML format suggestions; we should try to resurrect this http://libvirt.org/search.php?query=USB&scope=LISTS http://www.redhat.com/archives/libvir-list/2007-March/thread.html#00118
- support for Xen-API new entry points, at least for localhost access, since we have remote support now
- platform support: resolve the PPC64 issues
- more engine support: OpenVZ is in the works; is there interest in lguest, UML, or for example Solaris zones?

Now is a good time to suggest new potential directions, and I certainly forgot some obvious points, so what did I miss?

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library http://libvirt.org/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
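[Editor's sketch: a minimal illustration of how the proposed call might be used, with stub types standing in for libvirt's real opaque handles. The flag names and the stub's behaviour are made-up assumptions for illustration only; the message above proposes nothing beyond the bare prototype.]

```c
#include <stddef.h>

/* Stub handle types standing in for libvirt's real opaque structs. */
typedef struct _virConnect { const char *uri; } virConnect;
typedef virConnect *virConnectPtr;
typedef struct _virDomain { const char *name; virConnectPtr conn; } virDomain;
typedef virDomain *virDomainPtr;

/* Hypothetical flag bits - not part of any released API. */
enum {
    VIR_MIGRATE_LIVE   = 1 << 0,  /* keep the guest running during transfer */
    VIR_MIGRATE_SECURE = 1 << 1   /* require an encrypted transport */
};

/* Sketch of the proposed call: move `domain` to the hypervisor behind `to`.
 * A real implementation would negotiate with both ends; this stub only
 * validates its arguments and records the move. */
int
virDomainMigrate(virDomainPtr domain, virConnectPtr to, int flags)
{
    if (domain == NULL || to == NULL)
        return -1;                 /* invalid arguments */
    if (domain->conn == to)
        return -1;                 /* already on the target connection */
    (void)flags;
    domain->conn = to;             /* pretend the transfer happened */
    return 0;
}
```

The appeal of this shape is that both ends are just virConnectPtr handles, so the same call works whether the connections are local or remote.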

* Daniel Veillard <veillard@redhat.com> [2007-07-10 10:51]:
Hi all,
now that 0.3.0 is out, it's probably time to build the next set of features we aim at developing in the next months; the list I have currently is short, but still significant:
- migration API: now that we have remote support it should be possible to build an API for migration of domains between 2 connections. Could be as simple as int virDomainMigrate(virDomainPtr domain, virConnectPtr to, int flags); sounds like a fun and very useful part.
- USB support: we discussed that already, but the initial patch did not match the XML format suggestions; we should try to resurrect this http://libvirt.org/search.php?query=USB&scope=LISTS http://www.redhat.com/archives/libvir-list/2007-March/thread.html#00118
- Support for Xen-API new entry points at least for localhost access since we have remote support now
- platform support: resolve the PPC64 issues
- more engine support: OpenVZ is in the works; is there interest in lguest, UML, or for example Solaris zones?
Now is a good time to suggest new potential directions, and I certainly forgot some obvious points, so what did I miss?
NUMA support?

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ryanh@us.ibm.com

On Tue, Jul 10, 2007 at 10:57:24AM -0500, Ryan Harper wrote:
* Daniel Veillard <veillard@redhat.com> [2007-07-10 10:51]:
Hi all,
now that 0.3.0 is out, it's probably time to build the next set of features we aim at developing in the next months; the list I have currently is short, but still significant:
- migration API: now that we have remote support it should be possible to build an API for migration of domains between 2 connections. Could be as simple as int virDomainMigrate(virDomainPtr domain, virConnectPtr to, int flags); sounds like a fun and very useful part.
- USB support: we discussed that already, but the initial patch did not match the XML format suggestions; we should try to resurrect this http://libvirt.org/search.php?query=USB&scope=LISTS http://www.redhat.com/archives/libvir-list/2007-March/thread.html#00118
- Support for Xen-API new entry points at least for localhost access since we have remote support now
- platform support: resolve the PPC64 issues
- more engine support: OpenVZ is in the works; is there interest in lguest, UML, or for example Solaris zones?
Now is a good time to suggest new potential directions, and I certainly forgot some obvious points, so what did I miss?
NUMA support?
I knew I forgot something obvious :-) https://www.redhat.com/archives/libvir-list/2007-June/thread.html#00081

Daniel

Daniel Veillard wrote:
Hi all,
now that 0.3.0 is out, it's probably time to build the next set of features we aim at developing in the next months; the list I have currently is short, but still significant:
- migration API: now that we have remote support it should be possible to build an API for migration of domains between 2 connections. Could be as simple as int virDomainMigrate(virDomainPtr domain, virConnectPtr to, int flags); sounds like a fun and very useful part.
Some issues around migration which are up for discussion:

(1) Where does the data travel (i.e. server -> server, server -> client -> server)? If it goes server -> server, what about routing, firewalls, etc.?

(2) Are the hosts compatible? (e.g. microarchitecture, hypervisor, ... other sources of incompatibility?) Should we care at the API level or just report errors from the hypervisors?

(3) Should the data be encrypted when it travels? Xen uses a particular port & protocol for migration. Is this protocol one-way or two-way?
- USB support: we discussed that already, but the initial patch did not match the XML format suggestions; we should try to resurrect this http://libvirt.org/search.php?query=USB&scope=LISTS http://www.redhat.com/archives/libvir-list/2007-March/thread.html#00118
- Support for Xen-API new entry points at least for localhost access since we have remote support now
Is there stuff we can do with Xen-API which we can't do with the sexpr API? Are upstream going to continue offering both APIs?
- platform support: resolve the PPC64 issues
- more engine support: OpenVZ is on the work, is there interest in lguest, UML or for example Solaris zones ?
Or VirtualBox-OSE, which I've been playing around with today.

Other things to think about:
- Storage API.

Rich.

-- 
Emerging Technologies, Red Hat - http://et.redhat.com/~rjones/
Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 03798903

On Tue, Jul 10, 2007 at 05:12:38PM +0100, Richard W.M. Jones wrote:
Daniel Veillard wrote:
Hi all,
now that 0.3.0 is out, it's probably time to build the next set of features we aim at developing in the next months; the list I have currently is short, but still significant:
- migration API: now that we have remote support it should be possible to build an API for migration of domains between 2 connections. Could be as simple as int virDomainMigrate(virDomainPtr domain, virConnectPtr to, int flags); sounds like a fun and very useful part.
Some issues around migration which are up for discussion:
(1) Where does the data travel (ie. server -> server, server -> client -> server)? If it goes server -> server, what about routing, firewalls, etc.?
I think we assume server <-> server, and that the admin has the servers in a particular cluster set up to be able to talk to each other.
(2) Are the hosts compatible? (eg. microarchitecture, hypervisor, ... other sources of incompatibility?) Should we care at the API level or just report errors from the hypervisors?
The app can validate to some degree by looking at the capabilities XML to see what's supported on both ends & what the guest is using. So I'd let the app do high-level validation with that, and then just propagate errors from the underlying system - keeping policy out of the libvirt API.
(3) Should the data be encrypted when it travels?
Implementation defined, so not something for libvirt to worry about. It's a separate task to make XenD use SSL/TLS for communicating with other XenDs.
Xen uses a particular port & protocol for migration. Is this protocol one-way or two-way?
- USB support: we discussed that already, but the initial patch did not match the XML format suggestions; we should try to resurrect this http://libvirt.org/search.php?query=USB&scope=LISTS http://www.redhat.com/archives/libvir-list/2007-March/thread.html#00118
- Support for Xen-API new entry points at least for localhost access since we have remote support now
Is there stuff we can do with Xen-API which we can't do with the sexpr API? Are upstream going to continue offering both APIs?
Upstream would like to kill the SEXPR API. So far we've prevented them from doing this. SEXPR is limited insomuch as it's all or nothing, and this has horrific performance issues, because pulling out all the device info hits xenstored a lot - hence a single SEXPR request can take as much as 1 second to complete :-(

Dan.

-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Tue, Jul 10, 2007 at 05:23:01PM +0100, Daniel P. Berrange wrote:
(2) Are the hosts compatible? (eg. microarchitecture, hypervisor, ... other sources of incompatibility?) Should we care at the API level or just report errors from the hypervisors?
The app can validate to some degree by looking at the capabilities XML to see what's supported on both ends & what the guest is using. So I'd let the app do high-level validation with that, and then just propagate errors from the underlying system - keeping policy out of the libvirt API.
That was my take too. We can check a few things, but as Dan said, I would rather rely on propagating errors back.
(3) Should the data be encrypted when it travels?
Implementation defined, so not something for libvirt to worry about. It's a separate task to make XenD use SSL/TLS for communicating with other XenDs.
It's surely a concern; the API should have a flags argument, and a requirement for secure migration could be one of the bits.
Xen uses a particular port & protocol for migration. Is this protocol one-way or two-way?
Initiated by the current host for the domain, from what I could find some time ago; then once the TCP connection is open, it's used until the migration has finished, IIRC.
- Support for Xen-API new entry points at least for localhost access since we have remote support now
Is there stuff we can do with Xen-API which we can't do with the sexpr API? Are upstream going to continue offering both APIs?
Upstream would like to kill the SEXPR API. So far we've prevented them from doing this. SEXPR is limited insomuch as it's all or nothing, and this has horrific performance issues, because pulling out all the device info hits xenstored a lot - hence a single SEXPR request can take as much as 1 second to complete :-(
Yeah, and I'm afraid sexpr may get removed in the future; the earlier we allow both, the smoother it should be for the user base. The call I'm really afraid of is the creation one; there is (was) a lot of structure to encode in XML :-\

Daniel

On Tue, 2007-07-10 at 17:23 +0100, Daniel P. Berrange wrote:
On Tue, Jul 10, 2007 at 05:12:38PM +0100, Richard W.M. Jones wrote:
(2) Are the hosts compatible? (eg. microarchitecture, hypervisor, ... other sources of incompatibility?) Should we care at the API level or just report errors from the hypervisors?
The app can validate to some degree by looking at the capabilities XML to see what's supported on both ends & what the guest is using. So I'd let the app do high level validation with that, and just then propagate errors from the underlying system - keeping poliucy out of the libvirt API.
This question comes up in other contexts than migration, too, for example, when you want to start an image you just downloaded. I think it would make sense if there was a common baseline in libvirt that could tell you if a VM has any chance of running at all - otherwise that logic will be scattered across lots of apps.

David

DL> This question comes up in other contexts than migration, too, for
DL> example, when you want to start an image you just downloaded. I
DL> think it would make sense if there was a common baseline in
DL> libvirt that could tell you if a VM has any chance of running at
DL> all - otherwise that logic will be scattered across lots of apps.

The migration case is significantly more strict, though, at least for Xen. A domain image may start up on two machines that are different, but the same domain would not migrate from one to the other.

On that note, how about a libvirt function that allows you to compare the capabilities of two hypervisors for varying levels of compatibility? A given implementation such as Xen could analyze the hypervisor and machine characteristics to provide a "compatible for migration" result, as well as a "can run on" result (and perhaps others). Something like the following:

virIsCompatible(hyp1, hyp2, COMPAT_MIGRATION);
virIsCompatible(hyp1, hyp3, COMPAT_MIGRATION | COMPAT_RUN);

The former would return true (in the Xen case) only if both machines were the same bitness, processor revision, etc.

The latter could, potentially, return true for a given domain across Xen and qemu, if that domain is fully-virtualized.

-- 
Dan Smith
IBM Linux Technology Center
Open Hypervisor Team
email: danms@us.ibm.com
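[Editor's sketch: a toy version of the proposed virIsCompatible(), to make the flag-bits idea concrete. The HostCaps fields and the comparison rules are invented assumptions for illustration; they are not what Xen or libvirt actually check.]

```c
#include <string.h>

/* Toy host description; real code would derive this from capabilities XML. */
typedef struct {
    const char *hv_type;   /* e.g. "xen", "qemu" */
    int         wordsize;  /* 32 or 64 */
    int         hvm;       /* fully-virtualized guests supported? */
} HostCaps;

enum {
    COMPAT_MIGRATION = 1 << 0,
    COMPAT_RUN       = 1 << 1
};

/* Returns 1 if every requested compatibility level holds between the
 * two hosts, 0 otherwise. The rules are illustrative, not policy. */
int
virIsCompatible(const HostCaps *a, const HostCaps *b, int what)
{
    if (what & COMPAT_MIGRATION) {
        /* Migration: same hypervisor and same bitness, at minimum. */
        if (strcmp(a->hv_type, b->hv_type) != 0 || a->wordsize != b->wordsize)
            return 0;
    }
    if (what & COMPAT_RUN) {
        /* "Can run on": a fully-virt guest could cross hypervisors. */
        if (strcmp(a->hv_type, b->hv_type) != 0 && !(a->hvm && b->hvm))
            return 0;
    }
    return 1;
}
```

The bitmask argument lets a caller ask several questions in one call, which is the point of the `COMPAT_MIGRATION | COMPAT_RUN` example above.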

On Tue, Jul 10, 2007 at 03:01:27PM -0700, Dan Smith wrote:
DL> This question comes up in other contexts than migration, too, for DL> example, when you want to start an image you just downloaded. I DL> think it would make sense if there was a common baseline in DL> libvirt that could tell you if a VM has any chance of running at DL> all - otherwise that logic will be scattered across lots of apps.
The migration case is significantly more strict, though, at least for Xen. A domain image may start up on two machines that are different, but the same domain would not migrate from one to the other.
On that note, how about a libvirt function that allows you to compare the capabilities of two hypervisors for varying levels of compatibility? A given implementation such as Xen could analyze the hypervisor and machine characteristics to provide a "compatible for migration" result, as well as a "can run on" result (and perhaps others). Something like the following:
virIsCompatible(hyp1, hyp2, COMPAT_MIGRATION); virIsCompatible(hyp1, hyp3, COMPAT_MIGRATION | COMPAT_RUN);
The former would return true (in the Xen case) only if both machines were the same bitness, processor revision, etc.
The latter could, potentially, return true for a given domain across Xen and qemu, if that domain is fully-virtualized.
Hum, we would probably need some logic like this between 2 hypervisors of the same type, but I'm afraid the matrix is gonna be a bit of hell if we want to handle all cases over time. It also feels a bit low-level to me, though I think I understand why you would like this :-)

Daniel

Dan Smith wrote:
DL> This question comes up in other contexts than migration, too, for DL> example, when you want to start an image you just downloaded. I DL> think it would make sense if there was a common baseline in DL> libvirt that could tell you if a VM has any chance of running at DL> all - otherwise that logic will be scattered across lots of apps.
The migration case is significantly more strict, though, at least for Xen. A domain image may start up on two machines that are different, but the same domain would not migrate from one to the other.
On that note, how about a libvirt function that allows you to compare the capabilities of two hypervisors for varying levels of compatibility? A given implementation such as Xen could analyze the hypervisor and machine characteristics to provide a "compatible for migration" result, as well as a "can run on" result (and perhaps others). Something like the following:
virIsCompatible(hyp1, hyp2, COMPAT_MIGRATION); virIsCompatible(hyp1, hyp3, COMPAT_MIGRATION | COMPAT_RUN);
The former would return true (in the Xen case) only if both machines were the same bitness, processor revision, etc.
Processor revision is an artificial restriction. Just because you're going from an AMD rev F to a rev 10 doesn't mean that your application will stop working. In this particular case, it's actually pretty unlikely that it would stop working.

Further, there are certainly cases where an application could not just depend on a feature present in a family but in a particular model. An obvious example would be the presence of VT on core duos vs core solos (although that wouldn't be an issue for guests..).

I think it makes sense to separate hard compatibility where there *will* be an issue (Xen guests can't migrate to a KVM host) and soft compatibility where there *may* be an issue (going from AMD => Intel). It would probably make sense to make soft compatibility some sort of threshold too. For instance, it's much more risky to go from an AMD rev F to a P3 than going from an AMD rev 10 to a rev F.

Better yet, just ignore soft compatibility altogether and let higher-level tools make that decision :-)

Regards,

Anthony Liguori
The latter could, potentially, return true for a given domain across Xen and qemu, if that domain is fully-virtualized.

AL> Processor revision is an artificial restriction. Just because
AL> you're going from an AMD rev F to a rev 10 doesn't mean that your
AL> application will stop working. In this particular case, it's
AL> actually pretty unlikely that it would stop working.

What I really meant was something along the lines of the "flags" field in cpuinfo. Certainly you can say that it's not safe to migrate to a processor that doesn't support a superset of the flags from the source, right?

AL> Further, there are certainly cases where an application could not
AL> just depend on a feature present in a family but in a particular
AL> model. An obvious example would be the presence of VT on core
AL> duos vs core solos (although that wouldn't be an issue for
AL> guests..).

Are you suggesting that comparing the flags would yield a false positive?

AL> I think it makes sense to separate hard compatibility where there
AL> *will* be an issue (Xen guests can't migrate to a KVM host)

Right, which could be enforced categorically, without increasing the "matrix" of possibilities.

AL> and soft compatibility where there *may* be an issue (going from
AL> AMD => Intel). It would probably make sense to make soft
AL> compatibility some sort of threshold too. For instance, it's much
AL> more risky to go from an AMD rev F to a P3 than going from an AMD
AL> rev 10 to a rev F.

Wouldn't a comparison of the flags solve this, though? Especially if each driver can implement its own check... I expect native qemu guests (i.e. not using kvm) could survive a migration across processor families, but Xen paravirt guests certainly could not.

AL> Better yet, just ignore soft compatibility altogether and let
AL> higher level tools make that decision :-)

I think the goal is to eliminate the need for every libvirt consumer to implement the same type of checks and have each implementation be slightly different.
While it certainly won't cover all cases, it seems like a reasonable thing to do, as long as it's not required to perform a migration :)

Dan Smith wrote:
AL> Processor revision is an artificial restriction. Just because AL> you're going from an AMD rev F to a rev 10 doesn't mean that your AL> application will stop working. In this particular case, it's AL> actually pretty unlikely that it would stop working.
What I really meant was something along the lines of the "flags" field in cpuinfo.
Okay, flags is a subset of the common cpuid features. This is not by any means exhaustive of the features supported by a particular CPU but it is a reasonable starting point.
Certainly you can say that it's not safe to migrate to a processor that doesn't support a superset of the flags from the source, right?
No, it may be safe. Consider the case where you're migrating from a core duo to a core solo. This is the same chip with a few things disabled, including VT, which will manifest as a lack of the "vmx" flag. In this case, it's 100% safe to do the migration because a PV guest cannot use that CPU feature.

Something a bit more tricky is the sse2 flag. Do most VMs really use sse2? If no application in your VM is using sse2, then you should be able to migrate across CPUs that do not support sse2, right?

IMHO, this is not a decision you want to take away from an administrator. Guiding them and helping them make an informed decision is not a bad idea, but just having a blind boolean isn't going to be all that helpful.
AL> Further, there are certainly cases where an application could not AL> just depend on a feature present in a family but in a particular AL> model. An obvious example would be the presence of VT on core AL> duos vs core solos (although that wouldn't be an issue for AL> guests..).
Are you suggesting that comparing the flags would yield a false positive?
Yes.
AL> I think it makes sense to separate hard compatibility where there AL> *will* be an issue (Xen guests can't migrate to a KVM host)
Right, which could be enforced categorically, without increasing the "matrix" of possibilities.
AL> and soft compatibility where there *may* be an issue (going from AL> AMD => Intel). It would probably make sense to make soft AL> compatibility some sort of threshold too. For instance, it's much AL> more risky to go from an AMD rev F to a P3 than going from an AMD AL> rev 10 to a rev F.
Wouldn't a comparison of the flags solve this, though? Especially if each driver can implement its own check... I expect native qemu guests (i.e. not using kvm) could survive a migration across processor families, but Xen paravirt guests certainly could not.
KVM guests all (in theory) expose a least-common subset of processor features unless explicitly told to expose something that's specific to a processor. Because of this, you can migrate not just across processor families but from Intel to AMD and vice versa. In certain circumstances, I do think a Xen PV guest could survive migrations even across architectures.
AL> Better yet, just ignore soft compatibility altogether and let AL> higher level tools make that decision :-)
I think the goal is to eliminate the need for every libvirt consumer to implement the same type of checks and have each implementation be slightly different. While it certainly won't cover all cases, it seems like a reasonable thing to do, as long as it's not required to perform a migration :)
Something that compared systems, and provided specific information about potential incompatibilities, would be useful. Then a higher-level piece of software could pick and choose what it cared about. A boolean interface is only going to have one happy consumer, and that's going to be whoever decides what the policy should be :-)

Regards,

Anthony Liguori
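[Editor's sketch: the "report specific differences, let higher-level software decide" idea could look roughly like this. The Host fields and the message strings are invented for the example.]

```c
#include <stdio.h>
#include <string.h>

/* Toy host record; a real version would come from capabilities XML. */
typedef struct {
    const char *hv_type;   /* e.g. "xen", "kvm" */
    int         wordsize;  /* 32 or 64 */
} Host;

/* Append human-readable incompatibility notes to `out` and return how
 * many were found; 0 means no known problems. A caller can then apply
 * its own policy to the list instead of getting a bare boolean. */
int
compare_hosts(const Host *a, const Host *b, char *out, size_t outlen)
{
    int n = 0;
    out[0] = '\0';
    if (strcmp(a->hv_type, b->hv_type) != 0) {
        snprintf(out + strlen(out), outlen - strlen(out),
                 "hypervisor mismatch: %s vs %s; ", a->hv_type, b->hv_type);
        n++;
    }
    if (a->wordsize != b->wordsize) {
        snprintf(out + strlen(out), outlen - strlen(out),
                 "wordsize mismatch: %d vs %d; ", a->wordsize, b->wordsize);
        n++;
    }
    return n;
}
```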

AL> Okay, flags is a subset of the common cpuid features. This is not by AL> any means exhaustive of the features supported by a particular CPU but AL> it is a reasonable starting point. Right, ok. AL> No, it may be safe. Consider the case where you're migrating from a AL> core duo to a core solo. This is the same chip with a few things AL> disabled including VT which will manifest as a lack of the "vmx" AL> flag. In this case, it's 100% safe to do the migration because a PV AL> guest cannot use that CPU feature. Sure. So the Xen implementation of this check would know which flags it could safely ignore, such as "vmx". AL> Something a bit more tricky is the sse2 flag. Do most VMs really use AL> sse2? If no application in your VM is using sse2, then you should be AL> able to migrate from across CPUs that do not support sse2 right? But, since we don't know that, we would return "false", indicating that it's not definitely safe. Whether it would work or not, or if it's worth trying or not is something that may be specific to a higher-level tool or system. AL> IMHO, this is not a decision you want to take away from an AL> administrator. Guiding them and helping them make an informed AL> decision is not a bad idea but just having a blind boolean isn't AL> going to be all that helpful. All I'm suggesting here is a function that returns a boolean based on completely safe tests. I'm not saying that a migration function should refuse to run if this test is false. In this case, it seems sane for something like virt-manager to pop up a box to the user that says something like: "This is not likely to work based on some processor compatibility checks. Do you want to continue anyway?"
Are you suggesting that comparing the flags would yield a false positive?
AL> Yes.

Meaning that a flags check would indicate compatibility where there is none? Do you have an example?

AL> KVM guests all (in theory) expose a least-common subset of processor
AL> features unless explicitly told to expose something that's specific to
AL> a processor. Because of this, you can migrate not just across
AL> processor families but from Intel to AMD and vice versa.

Which is something that the qemu implementation of this check could test for and act intelligently, right?

AL> Something that compared systems, and provided specific information
AL> about potential incompatibilities would be useful. Then a higher
AL> level piece of software could pick and choose what it cared about. A
AL> boolean interface is only going to have one happy consumer and that's
AL> going to be whoever decides what the policy should be :-)

That's probably true. If the test can return "definitely compatible" or "maybe not compatible", I think it's worth having. Perhaps the test could return more granular information, and the yes/no (or several levels of such) could be macros?
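[Editor's sketch: the "each driver knows which flags it can safely ignore" idea reduces to a conservative subset check over cpuinfo-style flags strings. The helper names and the ignore-list mechanism are assumptions for illustration.]

```c
#include <string.h>

/* Return 1 if `flag` occurs as a whole word in the space-separated list. */
static int
has_flag(const char *list, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = list;
    while ((p = strstr(p, flag)) != NULL) {
        int start_ok = (p == list || p[-1] == ' ');
        int end_ok   = (p[n] == '\0' || p[n] == ' ');
        if (start_ok && end_ok)
            return 1;
        p += 1;
    }
    return 0;
}

/* Conservative check: every source flag must be present on the destination,
 * except flags the caller knows a guest of this type cannot use (e.g. "vmx"
 * for a Xen PV guest). `src`/`dst` look like the cpuinfo "flags" field. */
int
flags_safe_to_migrate(const char *src, const char *dst,
                      const char **ignorable, size_t n_ignorable)
{
    char buf[4096];
    strncpy(buf, src, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    for (char *tok = strtok(buf, " "); tok != NULL; tok = strtok(NULL, " ")) {
        int skip = 0;
        for (size_t i = 0; i < n_ignorable; i++) {
            if (strcmp(tok, ignorable[i]) == 0) { skip = 1; break; }
        }
        if (!skip && !has_flag(dst, tok))
            return 0;   /* destination lacks a flag the guest may rely on */
    }
    return 1;
}
```

A "false" result here means "not definitely safe" rather than "will not work", which matches the boolean-plus-warning use suggested above.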

Dan Smith wrote:
DL> This question comes up in other contexts than migration, too, for DL> example, when you want to start an image you just downloaded. I DL> think it would make sense if there was a common baseline in DL> libvirt that could tell you if a VM has any chance of running at DL> all - otherwise that logic will be scattered across lots of apps.
The migration case is significantly more strict, though, at least for Xen. A domain image may start up on two machines that are different, but the same domain would not migrate from one to the other.
On that note, how about a libvirt function that allows you to compare the capabilities of two hypervisors for varying levels of compatibility? A given implementation such as Xen could analyze the hypervisor and machine characteristics to provide a "compatible for migration" result, as well as a "can run on" result (and perhaps others). Something like the following:
virIsCompatible(hyp1, hyp2, COMPAT_MIGRATION); virIsCompatible(hyp1, hyp3, COMPAT_MIGRATION | COMPAT_RUN);
The former would return true (in the Xen case) only if both machines were the same bitness, processor revision, etc.
The latter could, potentially, return true for a given domain across Xen and qemu, if that domain is fully-virtualized.
Couple of observations:

(1) Does this depend on the application mix? For example, if you're running applications which at start-up have detected some processor feature (eg. SSE), they presumably won't be too happy if migrated to an architecture which doesn't have this feature. On the other hand, if all your applications and your operating system are compiled for baseline i386, then presumably they'll run everywhere. [I don't know the answer to this - please educate me if I'm completely wrong]

(2) This doesn't need to be in libvirt, but could be in another library which is also shared between the applications that need to use it. The only thing that libvirt needs is sufficiently detailed information from virGetCapabilities.

Rich.

RJ> Some issues around migration which are up for discussion:

Something else to consider is whether or not we "undefine" hosts leaving one machine during a migration. Last time I checked, Xen left a domain in "powered-off" state on the source. It seems to make more sense to me for a migration to remove the shell domain from the source machine.

What will be the expected behavior here?

On Tue, Jul 10, 2007 at 11:49:34AM -0700, Dan Smith wrote:
RJ> Some issues around migration which are up for discussion:
Something else to consider is whether or not we "undefine" hosts leaving one machine during a migration. Last time I checked, Xen left a domain in "powered-off" state on the source. It seems to make more sense to me for a migration to remove the shell domain from the source machine.
What will be the expected behavior here?
That's a good question really. There's definitely an argument to be made that the guest should be undefined on the source to prevent its accidental restart.

If we wanted to make undefining after migrate compulsory, then doing it as part of the virDomainMigrate call would make sense. If it was an optional thing though, one could make use of a flag to virDomainMigrate, or simply call virDomainUndefine explicitly.

Then again, Xen is starting to get support for checkpointing of VMs - where the original VM is left running after it has been saved (assume the disk is snapshotted at time of save too). If you apply the concept of checkpoints to migrate (which is using all the same code as save/restore in XenD), then you could have this idea of migrating the VM & leaving it on the original host too.

Dan.

DB> That's a good question really. There's definitely an argument to
DB> be made that the guest should be undefined on the source to prevent
DB> its accidental restart.

Right, given that migration implies shared storage, starting a domain in two places would be catastrophic.

DB> If we wanted to make undefining after migrate compulsory, then
DB> doing it as part of the virDomainMigrate call would make sense. If
DB> it was an optional thing though, one could make use of a flag to
DB> virDomainMigrate, or simply call virDomainUndefine explicitly.

In the case of a flag, or the case of an explicit undefine, how do you handle a new virtualization technology that enforces this behavior? I think assuming this level of knowledge of the underlying platform (which could change in Xen at some point, too) would be a bad idea.

DB> Then again Xen is starting to get support for checkpointing of VMs
DB> - where the original VM is left running after it has been saved
DB> (assume the disk is snapshotted at time of save too). If you apply
DB> the concept of checkpoints to migrate (which is using all the same
DB> code as save/restore in XenD), then you could have this idea of
DB> migrating the VM & leaving it on the original host too.

Sure, but wouldn't it make sense to have a separate API for checkpoint-like behavior? Even if this means a function like virDomainMigratePreserve(), you could easily have this return "not implemented" or "not supported" for a given platform in a sensible way.

You could do this in the flag case as well, but I think it would be cleaner to define this as a separate action. Thoughts?

On Tue, Jul 10, 2007 at 08:01:32PM +0100, Daniel P. Berrange wrote:
On Tue, Jul 10, 2007 at 11:49:34AM -0700, Dan Smith wrote:
RJ> Some issues around migration which are up for discussion:
Something else to consider is whether or not we "undefine" guests leaving one machine during a migration. Last time I checked, Xen left a domain in "powered-off" state on the source. It seems to make more sense to me for a migration to remove the shell domain from the source machine.
What will be the expected behavior here?
That's a good question really. There's definitely an argument to be made that the guest should be undefined on the source to prevent its accidental restart.
yup, I agree
If we wanted to make undefining after migrate compulsory, then doing it as part of the virDomainMigrate call would make sense. If it was an optional thing though, one could make use of a flag to virDomainMigrate, or simply call virDomainUndefine explicitly.
I would make it the default, to provide a default behaviour we can guarantee on most hypervisors, and possibly provide an extra flag to not undefine if the user has a good reason (and it's supported by the underlying hypervisor).

Daniel -- Red Hat Virtualization group http://redhat.com/virtualization/ Daniel Veillard | virtualization library http://libvirt.org/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

Dan Smith wrote:
RJ> Some issues around migration which are up for discussion:
Something else to consider is whether or not we "undefine" guests leaving one machine during a migration. Last time I checked, Xen left a domain in "powered-off" state on the source. It seems to make more sense to me for a migration to remove the shell domain from the source machine.
What will be the expected behavior here?
For KVM, the guest isn't destroyed explicitly after a migration is successful. Instead, the source guest is left in a paused state. The main reason for not destroying the guest was so that a management tool could still interact with the guest's monitor to obtain statistics on the migration. It's expected that the management tool will destroy the domain on the source machine whenever it is done working with it. The KVM source guest is still resumable too so this doubles as a mechanism for forking VMs. I think these are useful semantics that ought to be exposed. With KVM, live migration is more generic. You can use it to do light-weight checkpointing. Regards, Anthony Liguori

AL> For KVM, the guest isn't destroyed explicitly after a migration is AL> successful. Instead, the source guest is left in a paused state. AL> The main reason for not destroying the guest was so that a AL> management tool could still interact with the guest's monitor to AL> obtain statistics on the migration. It's expected that the AL> management tool will destroy the domain on the source machine AL> whenever it is done working with it.

That seems entirely different from the Xen case, and entirely more useful :)

AL> The KVM source guest is still resumable too so this doubles as a AL> mechanism for forking VMs. I think these are useful semantics that AL> ought to be exposed. With KVM, live migration is more generic. You AL> can use it to do light-weight checkpointing.

I agree that it should be exposed, as should any future Xen checkpointing capability. However, "migration" means moving a domain to me, which is (at least at a higher level) different from checkpointing or forking. I think that most checkpoint implementations will be largely similar to the migration for that platform, but it seems like it should be exposed out of libvirt as a different operation, no matter how it is implemented.

On Tue, Jul 10, 2007 at 11:49:18AM -0400, Daniel Veillard wrote:
Hi all,
now that 0.3.0 is out, it's probably time to build the next set of features we aim at developing in the next months, the list I have currently is short, but still significant:
- migration API: now that we have remote support it should be possible to build an API for migration of domains between 2 connections. Could be as simple as

    int virDomainMigrate(virDomainPtr domain, virConnectPtr to, int flags);

sounds like a fun and very useful part.
Yep, that's something interesting to look at.
- USB support: we discussed that already, but the initial patch did not match the XML format suggestions; we should try to resurrect this: http://libvirt.org/search.php?query=USB&scope=LISTS http://www.redhat.com/archives/libvir-list/2007-March/thread.html#00118
I took a look at this a few weeks back, but before I got anywhere near doing the libvirt coding, I blocked on the fact that USB in QEMU is horribly unreliable. The revised XML format I suggested http://www.redhat.com/archives/libvir-list/2007-March/msg00205.html is more or less OK. But we will need to add some more attributes to make it possible to do hot-plug add/remove. I got stuck trying to figure this out and have not had time to revisit it yet.
- Support for Xen-API new entry points at least for localhost access since we have remote support now
Yeah, talking Xen-API over the UNIX domain socket should be something worth looking at. In theory it could be faster than the SEXPR based protocol since we'd only be asking for data we actually need. In practice I'm not at all certain whether it will be faster - since I fear we may need far more round-trip requests. So this all needs a proof of concept done - implementing the listDomainIDs, getDomainInfo and a DumpXML method would be the 3 APIs I'd start with. With those we'd get a good idea of the complexity / performance.
- platform support: resolve the PPC64 issues
- more engine support: OpenVZ is in the works, is there interest in lguest, UML or for example Solaris zones ?
VirtualBox, VMWare, too....
Now is a good time to suggest new potential directions, and I certainly forgot some obvious points, so what did I miss ?
- Mandatory access control for APIs - use the SELinux engine to enforce the acls, in a similar way to DBus. NB, using SELinux in an application is totally independent of whether you have SELinux enabled for the kernel or not.

- Storage APIs - previously discussed for allocating/enumerating volumes on a host.

- Device listing - enumeration of devices on a host - virt-manager wants to know about host ethernet devices (and whether they're bridged) so it can display options when creating guests, and host USB devices (so we can hot-plug a host USB device straight into the guest OS), and host disks / partitions so we can hand them off to a guest (unclear whether this should be part of a storage API or not - TBD)

Dan.

- more engine support: OpenVZ is in the works, is there interest in lguest, UML or for example Solaris zones ?
Hmm, well, at least some of them can be used at the same time, i.e. running a WinXP guest in qemu/kvm and a Linux guest using lguest or uml at the same time works just fine. Having only one instance of virt-manager running to manage this would be great. Although that maybe isn't a libvirt issue, virt-manager could just support multiple connections, something which is useful for virt-manager anyway now we have remote support ;).

A related but slightly trickier issue is that you can run the very same disk image using different hypervisors. A raw disk image working fine in qemu should boot equally well in xen (as hvm). qemu can handle vmware disk images. paravirt_ops makes switching hypervisors easy even for paravirtualized guests. Do we want to support such things in libvirt? It should make it easy to migrate existing VMs from vmware to kvm or from xen to lguest. Oh, we'll need a vmware engine for that to work, not sure whether that is possible.

just some random ideas, Gerd

On Tue, Jul 10, 2007 at 06:26:36PM +0200, Gerd Hoffmann wrote:
- more engine support: OpenVZ is in the works, is there interest in lguest, UML or for example Solaris zones ?
Hmm, well, at least some of them can be used at the same time, i.e. running a WinXP guest in qemu/kvm and a Linux guest using lguest or uml at the same time works just fine. Having only one instance of virt-manager running to manage this would be great. Although that maybe isn't a libvirt issue, virt-manager could just support multiple connections, something which is useful for virt-manager anyway now we have remote support ;).
You can already open many connections - 'File -> Open Connection' from the main window. The UI for this isn't great though, so we're already planning to adapt it to give better view of multiple connections.
A related but slightly trickier issue is that you can run the very same disk image using different hypervisors. A raw disk image working fine in qemu should boot equally well in xen (as hvm). qemu can handle vmware disk images. paravirt_ops makes switching hypervisors easy even for paravirtualized guests.
Yes, that is an interesting thing to look at. The next python-virtinst release will have support for cloning VMs - this just does a 'deep' copy of all the disks & creates a new VM config for the copy. It would be interesting to be able to clone to a different HV target, as well as being able to simply 'move' a disk to a different HV target without actually copying it. Not sure if we'd need more libvirt stuff to be able to do this, or if you can manage it all from apps on top.

Dan.

Yes, that is an interesting thing to look at. The next python-virtinst release will have support for cloning VMs - this just does a 'deep' copy of all the disks & creates a new VM config for the copy. It would be interesting to be able to clone to a different HV target, as well as being able to simply 'move' a disk to a different HV target without actually copying it.
As for cloning in the next release, virtinst only does the "deep copy" (there is --preserve-data, but that is it). The Storage API is prepared, so cloning could call into it and be enhanced, I think. https://www.redhat.com/archives/libvir-list/2007-April/msg00159.html

On Tue, Jul 10, 2007 at 06:26:36PM +0200, Gerd Hoffmann wrote:
Do we want to support such things in libvirt? It should make it easy to migrate existing VMs from vmware to kvm or from xen to lguest. Oh, we'll need a vmware engine for that to work, not sure whether that is possible.
I have asked at least a few times but never got any echo back :-\

Daniel

On Di Juli 10 2007, Daniel Veillard wrote:
Now is a good time to suggest new potential directions, and I certainly forgot some obvious points, so what did I miss ?
What I am missing in open source virtualization solutions is the possibility to make snapshots in a tree structure and to freely jump from one state to another. VMWare workstation provides this feature. And in addition it would be nice to clone a virtual machine from a snapshot, i.e. select a snapshot and start a new instance starting from that state. Regards, Till

On Tue, Jul 10, 2007 at 09:06:29PM +0200, Till Maas wrote:
On Di Juli 10 2007, Daniel Veillard wrote:
Now is a good time to suggest new potential directions, and I certainly forgot some obvious points, so what did I miss ?
What I am missing in open source virtualization solutions is the possibility to make snapshots in a tree structure and to freely jump from one state to another. VMWare workstation provides this feature. And in addition it would be nice to clone a virtual machine from a snapshot, i.e. select a snapshot and start a new instance starting from that state.
Well the problem is snapshotting the storage devices and doing this as efficiently as possible. I admit that I do the

    1/ stop the domain
    2/ rsync -vP /xen/domain.img backup:/xen/domain.img

a bit too often to my taste and I would love something nicer, but at this point we need better support at the hypervisor and integration with system specific support. Not a piece of cake, and not something which can be attacked just at the libvirt level. At the moment it's a bit out of reach unfortunately,

Daniel

On Mi Juli 11 2007, Daniel Veillard wrote:
a bit too often to my taste and I would love something nicer, but at this point we need better support at the hypervisor and integration with system specific support. Not a piece of cake, and not something which can be attacked just at the libvirt level.
Qemu already provides the methods to support this, at least for powered off machines, but maybe for suspended machines, too. A few more details are in my other mail. Regards, Till

Till Maas wrote:
On Di Juli 10 2007, Daniel Veillard wrote:
Now is a good time to suggest new potential directions, and I certainly forgot some obvious points, so what did I miss ?
What I am missing in open source virtualization solutions is the possibility to make snapshots in a tree structure and to freely jump from one state to another. VMWare workstation provides this feature. And in addition it would be nice to clone a virtual machine from a snapshot, i.e. select a snapshot and start a new instance starting from that state.
I'm interested to know how VirtualBox / VMWare deal with disk storage. Do they provide their own storage subsystems which support this or do they interact with things like LVM? Rich. -- Emerging Technologies, Red Hat - http://et.redhat.com/~rjones/ Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 03798903

On Mi Juli 11 2007, Richard W.M. Jones wrote:
I'm interested to know how VirtualBox / VMWare deal with disk storage. Do they provide their own storage subsystems which support this or do they interact with things like LVM?
They use their own subsystems. VMWare uses .vmdk files to store the harddisk contents. When a snapshot is created, it creates a new one that depends on the old one and stores every change in the new one. When the machine is running during the snapshot, the complete state of the machine is stored, too. I guess VirtualBox does it similarly.

Qemu also provides this feature, except that afaik it is only possible to safely create snapshots of powered off machines with the qcow2 image type. With qemu-img one can create a new disk image that depends on an old one and all changes are written to the new one. But maybe it is also possible to do this with running machines, when one suspends them, creates the new qcow2 image and starts the machine again, but I have never tried this.

Regards, Till

Till Maas wrote:
On Mi Juli 11 2007, Richard W.M. Jones wrote:
I'm interested to know how VirtualBox / VMWare deal with disk storage. Do they provide their own storage subsystems which support this or do they interact with things like LVM?
They use their own subsystems. VMWare uses .vmdk files to store the harddisk contents. When a snapshot is created, it creates a new one that depends on the old one and stores every change in the new one. When the machine is running during the snapshot, the complete state of the machine is stored, too. I guess VirtualBox does it similarly.
Qemu also provides this feature, except that afaik it is only possible to safely create snapshots of powered off machines with the qcow2 image type.
This is not correct. QEMU has supported (for a very long time) the ability to save/restore snapshots of running machines. In QEMU 0.9.0, instead of saving snapshots to an external file, snapshots are saved along with disk snapshots to the actual disk file. This of course requires that the disk format support this and currently qcow2 is the only format that does. Regards, Anthony Liguori
With qemu-img one can create a new disk image that depends on an old one and all changes are written to the new one. But maybe it is also possible to do this with running machines, when one suspends them, creates the new qcow2 image and starts the machine again, but I have never tried this.
Regards, Till

On Wed, Jul 11, 2007 at 08:59:08AM -0500, Anthony Liguori wrote:
Till Maas wrote:
On Mi Juli 11 2007, Richard W.M. Jones wrote:
I'm interested to know how VirtualBox / VMWare deal with disk storage. Do they provide their own storage subsystems which support this or do they interact with things like LVM?
They use their own subsystems. VMWare uses .vmdk files to store the harddisk contents. When a snapshot is created, it creates a new one that depends on the old one and stores every change in the new one. When the machine is running during the snapshot, the complete state of the machine is stored, too. I guess VirtualBox does it similarly.
Qemu also provides this feature, except that afaik it is only possible to safely create snapshots of powered off machines with the qcow2 image type.
This is not correct. QEMU has supported (for a very long time) the ability to save/restore snapshots of running machines. In QEMU 0.9.0, instead of saving snapshots to an external file, snapshots are saved along with disk snapshots to the actual disk file. This of course requires that the disk format support this and currently qcow2 is the only format that does.
Which makes it rather useless - pretty much all my guests are either LVM or partitions, and sometimes raw files. I understand why this was done because it lets you do incremental checkpointing & restore. I think it'd be useful to also add back support for saving to external files. I was looking at the code & think it would be really very easy to do, without impacting current code.

Dan.

Daniel P. Berrange wrote:
On Wed, Jul 11, 2007 at 08:59:08AM -0500, Anthony Liguori wrote:
Till Maas wrote:
On Mi Juli 11 2007, Richard W.M. Jones wrote:
I'm interested to know how VirtualBox / VMWare deal with disk storage. Do they provide their own storage subsystems which support this or do they interact with things like LVM?
They use their own subsystems. VMWare uses .vmdk files to store the harddisk contents. When a snapshot is created, it creates a new one that depends on the old one and stores every change in the new one. When the machine is running during the snapshot, the complete state of the machine is stored, too. I guess VirtualBox does it similarly.
Qemu also provides this feature, except that afaik it is only possible to safely create snapshots of powered off machines with the qcow2 image type.
This is not correct. QEMU has supported (for a very long time) the ability to save/restore snapshots of running machines. In QEMU 0.9.0, instead of saving snapshots to an external file, snapshots are saved along with disk snapshots to the actual disk file. This of course requires that the disk format support this and currently qcow2 is the only format that does.
Which makes it rather useless - pretty much all my guests are either LVM or partitions, and sometimes raw files. I understand why this was done because it lets you do incremental checkpointing & restore. I think it'd be useful to also add back support for saving to external files. I was looking at the code & think it would be really very easy to do, without impacting current code.
My plan was to make the migrate URI flexible such that an unknown URI called out to an external program. Somehow, an fd to a monitor would be passed so that the program could decide whether to "pause" before doing the save or to do the save live.

This would allow you to write a "lvm" script that knew how to checkpoint LVM and could redirect the saved state to an external file using some sort of common naming convention. That way, "savevm lvm://foo" would result in the lvm volumes being checkpointed and the state being saved to /var/run/qemu/foo.state or something like that.

The goal is to eliminate the distinction between savevm/migrate since they are really the same thing (savevm just pauses the VM first).

Regards, Anthony Liguori

On Wed, Jul 11, 2007 at 09:26:38AM -0500, Anthony Liguori wrote:
Daniel P. Berrange wrote:
On Wed, Jul 11, 2007 at 08:59:08AM -0500, Anthony Liguori wrote:
Till Maas wrote:
On Mi Juli 11 2007, Richard W.M. Jones wrote:
I'm interested to know how VirtualBox / VMWare deal with disk storage. Do they provide their own storage subsystems which support this or do they interact with things like LVM?
They use their own subsystems. VMWare uses .vmdk files to store the harddisk contents. When a snapshot is created, it creates a new one that depends on the old one and stores every change in the new one. When the machine is running during the snapshot, the complete state of the machine is stored, too. I guess VirtualBox does it similarly.
Qemu also provides this feature, except that afaik it is only possible to safely create snapshots of powered off machines with the qcow2 image type.

This is not correct. QEMU has supported (for a very long time) the ability to save/restore snapshots of running machines. In QEMU 0.9.0, instead of saving snapshots to an external file, snapshots are saved along with disk snapshots to the actual disk file. This of course requires that the disk format support this and currently qcow2 is the only format that does.
Which makes it rather useless - pretty much all my guests are either LVM or partitions, and sometimes raw files. I understand why this was done because it lets you do incremental checkpointing & restore. I think it'd be useful to also add back support for saving to external files. I was looking at the code & think it would be really very easy to do, without impacting current code.
My plan was to make the migrate URI flexible such that an unknown URI called out to an external program. Somehow, an fd to a monitor would be passed so that the program could decide whether to "pause" before doing the save or to do the save live.
Or you could just let the management tool enter 'stop' in the monitor if they required that the VM be paused for save, rather than kept running for migrate.
This would allow you to write a "lvm" script that knew how to checkpoint LVM and could redirect the saved state to an external file using some sort of common naming convention. That way, "savevm lvm://foo" would result in the lvm volumes being checkpointed and the state being saved to /var/run/qemu/foo.state or something like that.
Ok, that sounds like it would be sufficient to let us implement the current save/restore API in libvirt which allows ad-hoc files. I do also want to think about an alternate API for 'managed' save/restore - where the HV manages save/restore files internally & the tools just say 'save a snapshot'.
The goal is to eliminate the distinction between savevm/migrate since they are really the same thing (savevm just pauses the VM first).
Yep, that makes a lot of sense for the QEMU impl

Dan.

AL> The goal is to eliminate the distinction between savevm/migrate since AL> they are really the same thing (savevm just pauses the VM first).

But from a high level, there are (at least) two distinct management operations in my mind: relocation and checkpointing. Relocation implies that a guest leaves the source machine and appears on the destination. Checkpointing implies that the domain doesn't move. If we take these two actions, can we not still provide for all the cases? For example:

    /* Migrate explicitly undefines the host */
    virDomainMigrate(dom, "host");          /* Xen case */
    virDomainMigrate(dom, "tcp://host");    /* qemu case */
    virDomainMigrate(dom, "lvm://foo");     /* qemu error case */

    /* Checkpoint does not undefine the host */
    virDomainCheckpoint(dom, "foo");        /* Xen unimplemented case */
    virDomainCheckpoint(dom, "lvm://foo");  /* qemu case */

Is that not sane?

On Wed, Jul 11, 2007 at 08:07:48AM -0700, Dan Smith wrote:
AL> The goal is to eliminate the distinction between savevm/migrate since AL> they are really the same thing (savevm just pauses the VM first).
But from a high level, there are (at least) two distinct management operations in my mind: relocation and checkpointing. Relocation implies that a guest leaves the source machine and appears on the destination. Checkpointing implies that the domain doesn't move. If we take these two actions, can we not still provide for all the cases? For example:
    /* Migrate explicitly undefines the host */
    virDomainMigrate(dom, "host");          /* Xen case */
    virDomainMigrate(dom, "tcp://host");    /* qemu case */
    virDomainMigrate(dom, "lvm://foo");     /* qemu error case */
    /* Checkpoint does not undefine the host */
    virDomainCheckpoint(dom, "foo");        /* Xen unimplemented case */
    virDomainCheckpoint(dom, "lvm://foo");  /* qemu case */
Is that not sane?
I really would not mix the two at the API level. W.r.t. the virDomainMigrate please recheck what I suggested initially, you really want a pointer to an existing connection, not an URI and hostname. Sure you could get the virConnectPtr based on the URI, but it's better to rely on the user to do that step independently.

Daniel

DV> you really want a pointer to an existing connection, not an URI DV> and hostname. Sure you could get the virConnectPtr based on the DV> URI, but it's better to rely on the user to do that step DV> independently.

So that implies that in order to perform a migration with libvirt, the user will need the libvirt daemon running and accessible on the remote machine, is that right? If so, why should this be necessary?

On Wed, Jul 11, 2007 at 09:05:11AM -0700, Dan Smith wrote:
DV> you really want a pointer to an existing connection, not an URI DV> and hostname. Sure you could get the virConnectPtr based on the DV> URI, but it's better to rely on the user to do that step DV> independently.
So that implies that in order to perform a migration with libvirt, the user will need the libvirt daemon running and accessible on the remote machine, is that right?
yes
If so, why should this be necessary?
Because from an API point of view if you do a migration to some hypervisor, you're likely to need a connection to that hypervisor in the application anyway, so it's better to have the registration and the migration separate. Simplifies coding, error handling etc ...

Daniel

Dan Smith wrote:
DV> you really want a pointer to an existing connection, not an URI DV> and hostname. Sure you could get the virConnectPtr based on the DV> URI, but it's better to rely on the user to do that step DV> independently.
So that implies that in order to perform a migration with libvirt, the user will need the libvirt daemon running and accessible on the remote machine, is that right? If so, why should this be necessary?
1) It's required for tcp:// migration in QEMU/KVM 2) Xen is insane for allowing arbitrary incoming migrations from anywhere. When the Xen migration model is made more sane, you'll probably be forced to tell a destination node that it should accept an incoming migration for a particular domain. Regards, Anthony Liguori

On Wed, Jul 11, 2007 at 11:57:28AM -0500, Anthony Liguori wrote:
Dan Smith wrote:
DV> you really want a pointer to an existing connection, not an URI DV> and hostname. Sure you could get the virConnectPtr based on the DV> URI, but it's better to rely on the user to do that step DV> independently.
So that implies that in order to perform a migration with libvirt, the user will need the libvirt daemon running and accessible on the remote machine, is that right? If so, why should this be necessary?
1) It's required for tcp:// migration in QEMU/KVM
2) Xen is insane for allowing arbitrary incoming migrations from anywhere. When the Xen migration model is made more sane, you'll probably be forced to tell a destination node that it should accept an incoming migration for a particular domain.
Yep, that's one of the things on my hitlist for making Xen secure. Right now if you enable migration in Xen you're basically opening up your Dom0 to anyone.

One thing that has been discussed before is that you ask the remote end for a 'one time token' of some form. When starting the migration the source would send this token to the remote end, which would check it for validity before accepting the migration. So in this model you'd need libvirt connections to both source & dest. So libvirt would get the token from one connection, and use it when starting migration on the other.

So ultimately, if you want secure migration, you need to have a libvirt connection to both ends.

Dan.

Daniel Veillard wrote:
On Wed, Jul 11, 2007 at 08:07:48AM -0700, Dan Smith wrote:
AL> The goal is to eliminate the distinction between savevm/migrate since AL> they are really the same thing (savevm just pauses the VM first).
But from a high level, there are (at least) two distinct management operations in my mind: relocation and checkpointing. Relocation implies that a guest leaves the source machine and appears on the destination. Checkpointing implies that the domain doesn't move. If we take these two actions, can we not still provide for all the cases? For example:
    /* Migrate explicitly undefines the host */
    virDomainMigrate(dom, "host");          /* Xen case */
    virDomainMigrate(dom, "tcp://host");    /* qemu case */
    virDomainMigrate(dom, "lvm://foo");     /* qemu error case */
This is not sufficient for KVM. The Xen notion of migration is fundamentally broken. The architecture is such that on any arbitrary machine you can just issue a command, with no other information, and it can migrate to any other Xen machine. It completely ignores issues like authentication and authorization. For a saner migration implementation, at least the following is needed:

1) a way to tell a destination machine that it can receive a particular migration
2) a way to tell the source machine to migrate to that destination
3) a way to pass credential requests through to the caller of libvirt

#1 is not needed for Xen because Xen's migration is broken. For KVM, this may be needed to actually make sure a QEMU instance is listening for an incoming migration. This step is not strictly needed for the ssh:// protocol but is strictly needed for the tcp:// protocol. #3 is needed by the ssh:// migration protocol. One can imagine libvirt making wrappers to tunnel Xen migration traffic too, in which case both #1 and #3 would also be needed.
/* Checkpoint does not undefine the host */
virDomainCheckpoint(dom, "foo");       /* Xen unimplemented case */
virDomainCheckpoint(dom, "lvm://foo"); /* qemu case */
Is that not sane?
I really would not mix the two at the API level. W.r.t. virDomainMigrate, please recheck what I suggested initially: you really want a pointer to an existing connection, not a URI and hostname. Sure, you could get the virConnectPtr based on the URI, but it's better to rely on the user to do that step independently.
You need a URI. It's not always just a hostname and a port. Consider the two current ways of KVM migration: ssh://[user@]host/ and tcp://host:port.

The best interface for QEMU/KVM would be a:

virDomainSend(dom, "url://....");

and a:

virDomainReceive(conn, "url://...");

For tcp://, recv would launch a qemu instance listening on the right port. For ssh://, recv would be a nop. Such an interface could easily support checkpointing, save, resume, etc.

If you wanted to introduce a higher level interface for Migrate and Checkpoint, that would be fine too, provided it just used these lower level functions and constructed appropriate URIs.

Regards,

Anthony Liguori
Daniel

AL> It completely ignores issues like authentication and
AL> authorization.

Excellent point. For that reason, a libvirt connection to the remote makes sense to me.

AL> virDomainSend(dom, "url://....");

Don't you want a connection to the remote machine here?

virDomainSend(dom, rconn, "url://...");

AL> Such an interface could easily support checkpointing, save,
AL> resume, etc.

AL> If you wanted to introduce a higher level interface for Migrate and
AL> Checkpoint, that would be fine too provided it just used these lower
AL> level functions and constructed appropriate URIs.

I'm quite happy with that... :)

--
Dan Smith
IBM Linux Technology Center
Open Hypervisor Team
email: danms@us.ibm.com

Dan Smith wrote:
AL> It completely ignores issues like authentication and
AL> authorization.
Excellent point. For that reason, a libvirt connection to the remote makes sense to me.
AL> virDomainSend(dom, "url://....");
Don't you want a connection to the remote machine here?
Sure.

Regards,

Anthony Liguori
virDomainSend(dom, rconn, "url://...");
AL> Such an interface could easily support checkpointing, save,
AL> resume, etc.
AL> If you wanted to introduce a higher level interface for Migrate and
AL> Checkpoint, that would be fine too provided it just used these lower
AL> level functions and constructed appropriate URIs.
I'm quite happy with that... :)
participants (11)

- Anthony Liguori
- Dan Smith
- Daniel P. Berrange
- Daniel Veillard
- David Lutterkort
- Gerd Hoffmann
- Kazuki Mizushima
- Richard W.M. Jones
- Ryan Harper
- Till Maas
- Till Maas