[libvirt] [re-send][PATCH 0/3] non-shared migration without target disk images

by now, if you want a non-shared migration, you have to create same files at target as source side, which seems intorlerable! this issue was reported by Reinier Schoof http://www.redhat.com/archives/libvir-list/2011-December/msg00451.html These patches try to let migrate --copy-storage-* work without pre-exist disk image files at target side. we may also hope qemu to aware of this issue hooks/qemu | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ src/Makefile.am | 3 ++ src/util/hooks.c | 2 +- 3 files changed, 59 insertions(+), 1 deletions(-)

this hook aimed at migration only, it supposed to be used as a helper of migrate --copy-storage-* features to remove the unreasonable limitation of pre-exist disk images at migration target. someone can add more functions to hook files. Signed-off-by: liguang <lig.fnst@cn.fujitsu.com> --- hooks/qemu | 55 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 55 insertions(+), 0 deletions(-) create mode 100755 hooks/qemu diff --git a/hooks/qemu b/hooks/qemu new file mode 100755 index 0000000..8b93668 --- /dev/null +++ b/hooks/qemu @@ -0,0 +1,55 @@ +#!/usr/bin/python -u + +import sys +import string +import os +import subprocess +from xml.etree import ElementTree as ET + +id, opstr, subopstr, extra = range(4) + +if __name__ == "__main__": + args = sys.argv[1:] + hook_log = "/var/log/libvirt/hooks.log" + redir_fp = open(hook_log, 'a') + sys.stdout = redir_fp + if args: + if args[opstr] == "migrate": + file_sz = "0xffffffff" + c = 0 + dom_xml = sys.stdin.read() + ntree = ET.fromstring(dom_xml) + disks = ntree.findall('.//disk/source') + if disks == None: + sys.exit(1) + file_fmt = ntree.findall('.//disk/driver') + if file_fmt == None: + sys.exit(1) + for i in disks: + image = i.get('file') + if image == None: + print 'No disk images' + sys.exit(1) + if os.path.isfile(image): + continue + if os.path.isdir(os.path.dirname(image)) != True: + try: + os.makedirs(os.path.dirname(image)) + except: + print 'Fail to create dir of disk images' + ff = file_fmt[c].get('type') + if ff == None: + ff = "raw" + try: + subprocess.Popen(['qemu-img', "create", "-f", ff, image, file_sz], + stdout = redir_fp) + except: + print 'Fail to create disk images' + sys.exit(1) + c = c+1 + print 'Created disk images for migration' + else: + print 'qemu <id> <ops> <subops> <extra>' + + sys.stdout.close() + -- 1.7.2.5

On 10/08/2012 07:51 PM, liguang wrote:
this hook aimed at migration only, it supposed to be used as a helper of migrate --copy-storage-* features to remove the unreasonable limitation of pre-exist disk images at migration target. someone can add more functions to hook files.
NACK. Instead of writing a hook that runs outside of libvirt and has to be installed to be of use, we should instead fix libvirt to do the file creation itself. That is, you should be patching src/qemu/qemu_migration.c to detect when copy-storage-* has been passed, and to then pre-create the proper empty files and give them correct SELinux labels. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

在 2012-10-08一的 20:05 -0600,Eric Blake写道:
On 10/08/2012 07:51 PM, liguang wrote:
this hook aimed at migration only, it supposed to be used as a helper of migrate --copy-storage-* features to remove the unreasonable limitation of pre-exist disk images at migration target. someone can add more functions to hook files.
NACK. Instead of writing a hook that runs outside of libvirt and has to be installed to be of use, we should instead fix libvirt to do the file creation itself. That is, you should be patching src/qemu/qemu_migration.c to detect when copy-storage-* has been passed,
really? sorry, I don't know any implicit libvirt development rules, but, as I see, there are none hooks used by libvirt until now, why don't we use them instead of just design and obsolete? can't hooks be installed just like other libvirt files? then, create files by python or c have difference?
and to then pre-create the proper empty files and give them correct SELinux labels.
-- liguang lig.fnst@cn.fujitsu.com FNST linux kernel team

On 10/08/2012 08:33 PM, li guang wrote:
在 2012-10-08一的 20:05 -0600,Eric Blake写道:
On 10/08/2012 07:51 PM, liguang wrote:
this hook aimed at migration only, it supposed to be used as a helper of migrate --copy-storage-* features to remove the unreasonable limitation of pre-exist disk images at migration target. someone can add more functions to hook files.
NACK. Instead of writing a hook that runs outside of libvirt and has to be installed to be of use, we should instead fix libvirt to do the file creation itself. That is, you should be patching src/qemu/qemu_migration.c to detect when copy-storage-* has been passed,
really? sorry, I don't know any implicit libvirt development rules, but, as I see, there are none hooks used by libvirt until now, why don't we use them instead of just design and obsolete?
The problem with doing this as a python hook instead of directly in the libvirt code is that you still got the sVirt SELinux labeling wrong. Really, the file MUST be created by the libvirt C code, as that is the code that knows what SELinux label to apply to the new file. Furthermore, hooks are reserved for the user. We should not be making libvirt rely on the installation of hooks for correct operation of the normal case.
can't hooks be installed just like other libvirt files? then, create files by python or c have difference?
Hooks are limited in what they can do - they are documented as being unable to call back in to other libvirt calls. They are also less efficient - why spawn a python process, when we already have all the information to create the file in C code? Ultimately, I'd like to see this wart of --copy-storage fixed, but fixed properly in the C code and not by adding a hook file. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

在 2012-10-08一的 21:54 -0600,Eric Blake写道:
On 10/08/2012 08:33 PM, li guang wrote:
在 2012-10-08一的 20:05 -0600,Eric Blake写道:
On 10/08/2012 07:51 PM, liguang wrote:
this hook aimed at migration only, it supposed to be used as a helper of migrate --copy-storage-* features to remove the unreasonable limitation of pre-exist disk images at migration target. someone can add more functions to hook files.
NACK. Instead of writing a hook that runs outside of libvirt and has to be installed to be of use, we should instead fix libvirt to do the file creation itself. That is, you should be patching src/qemu/qemu_migration.c to detect when copy-storage-* has been passed,
really? sorry, I don't know any implicit libvirt development rules, but, as I see, there are none hooks used by libvirt until now, why don't we use them instead of just design and obsolete?
The problem with doing this as a python hook instead of directly in the libvirt code is that you still got the sVirt SELinux labeling wrong. Really, the file MUST be created by the libvirt C code, as that is the code that knows what SELinux label to apply to the new file. Furthermore, hooks are reserved for the user. We should not be making libvirt rely on the installation of hooks for correct operation of the normal case.
can't hooks be installed just like other libvirt files? then, create files by python or c have difference?
Hooks are limited in what they can do - they are documented as being unable to call back in to other libvirt calls. They are also less efficient - why spawn a python process, when we already have all the information to create the file in C code?
Ultimately, I'd like to see this wart of --copy-storage fixed, but fixed properly in the C code and not by adding a hook file.
OK, I see, thanks a lot! -- liguang lig.fnst@cn.fujitsu.com FNST linux kernel team

On Mon, Oct 08, 2012 at 08:05:21PM -0600, Eric Blake wrote:
On 10/08/2012 07:51 PM, liguang wrote:
this hook aimed at migration only, it supposed to be used as a helper of migrate --copy-storage-* features to remove the unreasonable limitation of pre-exist disk images at migration target. someone can add more functions to hook files.
NACK. Instead of writing a hook that runs outside of libvirt and has to be installed to be of use, we should instead fix libvirt to do the file creation itself. That is, you should be patching src/qemu/qemu_migration.c to detect when copy-storage-* has been passed, and to then pre-create the proper empty files and give them correct SELinux labels.
Correctly doing auto-creation of disk images on the target host is alot more complicated that you might assume, and the hook script here misses alot of the hard bits - If the migration fails, you need to clean up these disk images you just created, because the data in them may be incomplete or corrupt. - If the source image is using encryption, then the target image should use encryption too - If the source image uses a backing file, then when incremental copy is requested, we might need to maintain the backing store. - if the migration API did not have either of the --copy-* flags set we should not be doing any creation of disk images - We need to handling of non-file based disk images, because even block devices may be local-only. - If the user passed a custom XML for the target that differs from the source, this may affect decisions we need to make above wrt encryption or backing files Finally, we should *not* do auto-creation by default - we need to add a new migrate flag to allow apps to turn it on, only if they want it. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

在 2012-10-10三的 15:00 +0100,Daniel P. Berrange写道:
On Mon, Oct 08, 2012 at 08:05:21PM -0600, Eric Blake wrote:
On 10/08/2012 07:51 PM, liguang wrote:
this hook aimed at migration only, it supposed to be used as a helper of migrate --copy-storage-* features to remove the unreasonable limitation of pre-exist disk images at migration target. someone can add more functions to hook files.
NACK. Instead of writing a hook that runs outside of libvirt and has to be installed to be of use, we should instead fix libvirt to do the file creation itself. That is, you should be patching src/qemu/qemu_migration.c to detect when copy-storage-* has been passed, and to then pre-create the proper empty files and give them correct SELinux labels.
Correctly doing auto-creation of disk images on the target host is alot more complicated that you might assume, and the hook script here misses alot of the hard bits
- If the migration fails, you need to clean up these disk images you just created, because the data in them may be incomplete or corrupt.
- If the source image is using encryption, then the target image should use encryption too
- If the source image uses a backing file, then when incremental copy is requested, we might need to maintain the backing store.
- if the migration API did not have either of the --copy-* flags set we should not be doing any creation of disk images
- We need to handling of non-file based disk images, because even block devices may be local-only.
- If the user passed a custom XML for the target that differs from the source, this may affect decisions we need to make above wrt encryption or backing files
Finally, we should *not* do auto-creation by default - we need to add a new migrate flag to allow apps to turn it on, only if they want it.
Daniel
perfect consideration! suppose this situation, I create disk images qemu needed at target by qemu-img create -f 'format' 'image' 'size', then virsh migrate --copy-storage-* everything runs smoothly, so, as i stated before, this patch limited to create-image-by-hand alleviation. -- liguang lig.fnst@cn.fujitsu.com FNST linux kernel team

this usage was suggested by man-page of waitpid, returns true if the child terminated normally Signed-off-by: liguang <lig.fnst@cn.fujitsu.com> --- src/util/hooks.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/util/hooks.c b/src/util/hooks.c index f5890d2..55b98ca 100644 --- a/src/util/hooks.c +++ b/src/util/hooks.c @@ -280,7 +280,7 @@ virHookCall(int driver, virCommandSetOutputBuffer(cmd, output); ret = virCommandRun(cmd, &exitstatus); - if (ret == 0 && exitstatus != 0) { + if (ret == 0 && WIFEXITED(exitstatus) == 0) { virReportError(VIR_ERR_HOOK_SCRIPT_FAILED, _("Hook script %s %s failed with error code %d"), path, drvstr, exitstatus); -- 1.7.2.5

On 10/08/2012 07:51 PM, liguang wrote:
this usage was suggested by man-page of waitpid, returns true if the child terminated normally
NACK. virCommandRun already did this for you.
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com> --- src/util/hooks.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/src/util/hooks.c b/src/util/hooks.c index f5890d2..55b98ca 100644 --- a/src/util/hooks.c +++ b/src/util/hooks.c @@ -280,7 +280,7 @@ virHookCall(int driver, virCommandSetOutputBuffer(cmd, output);
ret = virCommandRun(cmd, &exitstatus); - if (ret == 0 && exitstatus != 0) { + if (ret == 0 && WIFEXITED(exitstatus) == 0) { virReportError(VIR_ERR_HOOK_SCRIPT_FAILED, _("Hook script %s %s failed with error code %d"), path, drvstr, exitstatus);
-- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

在 2012-10-08一的 20:05 -0600,Eric Blake写道:
On 10/08/2012 07:51 PM, liguang wrote:
this usage was suggested by man-page of waitpid, returns true if the child terminated normally
NACK. virCommandRun already did this for you.
right, but not exactly, virCommandRun will leave raw exit-status out of there, so if the waited-command exit with a code '1', then the caller of virCommandRun will see exit-status value 0x100, and report an odd '256' exit-status. obviously, the low byte has no meaning. originally I think I should not handle this exit-status, so I use WIFEXITED, if I have to take care of it, at least the low byte should be ignored and report a correct exit-status value '1'
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com> --- src/util/hooks.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/src/util/hooks.c b/src/util/hooks.c index f5890d2..55b98ca 100644 --- a/src/util/hooks.c +++ b/src/util/hooks.c @@ -280,7 +280,7 @@ virHookCall(int driver, virCommandSetOutputBuffer(cmd, output);
ret = virCommandRun(cmd, &exitstatus); - if (ret == 0 && exitstatus != 0) { + if (ret == 0 && WIFEXITED(exitstatus) == 0) { virReportError(VIR_ERR_HOOK_SCRIPT_FAILED, _("Hook script %s %s failed with error code %d"), path, drvstr, exitstatus);
-- liguang lig.fnst@cn.fujitsu.com FNST linux kernel team

On 10/08/2012 08:51 PM, li guang wrote:
在 2012-10-08一的 20:05 -0600,Eric Blake写道:
On 10/08/2012 07:51 PM, liguang wrote:
this usage was suggested by man-page of waitpid, returns true if the child terminated normally
NACK. virCommandRun already did this for you.
right, but not exactly, virCommandRun will leave raw exit-status out of there,
Ah, after re-reading the code, so it does (I had thought we changed it to guarantee a return of -1 on any !WIFEXITED() exit, and a sanitized WEXITSTATUS() value, rather than making all the callers do that, but I guess not).
so if the waited-command exit with a code '1', then the caller of virCommandRun will see exit-status value 0x100, and report an odd '256' exit-status.
That depends on your OS. There are some systems out there where normal exit is in the low-order rather than high-order byte. But that's why we have virProcessTranslateStatus(), to do the work correctly.
ret = virCommandRun(cmd, &exitstatus); - if (ret == 0 && exitstatus != 0) { + if (ret == 0 && WIFEXITED(exitstatus) == 0) {
You have a logic bug - the old code was checking for non-zero status, and you switched it to check for zero status.
virReportError(VIR_ERR_HOOK_SCRIPT_FAILED, _("Hook script %s %s failed with error code %d"), path, drvstr, exitstatus);
This is not a correct error message - if you ever use WIFEXITED, you must also use WEXITSTATUS, otherwise you have a mismatch. And for this particular error message, I think we are losing useful information; I argue that we'd probably get a much better error message if we changed this to let virCommandRun do ALL the work: ret = virCommandRun(cmd, NULL); -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

在 2012-10-08一的 21:05 -0600,Eric Blake写道:
On 10/08/2012 08:51 PM, li guang wrote:
在 2012-10-08一的 20:05 -0600,Eric Blake写道:
On 10/08/2012 07:51 PM, liguang wrote:
this usage was suggested by man-page of waitpid, returns true if the child terminated normally
NACK. virCommandRun already did this for you.
right, but not exactly, virCommandRun will leave raw exit-status out of there,
Ah, after re-reading the code, so it does (I had thought we changed it to guarantee a return of -1 on any !WIFEXITED() exit, and a sanitized WEXITSTATUS() value, rather than making all the callers do that, but I guess not).
so if the waited-command exit with a code '1', then the caller of virCommandRun will see exit-status value 0x100, and report an odd '256' exit-status.
That depends on your OS. There are some systems out there where normal exit is in the low-order rather than high-order byte. But that's why we have virProcessTranslateStatus(), to do the work correctly.
virProcessTranslateStatus seems just a int to string translation
ret = virCommandRun(cmd, &exitstatus); - if (ret == 0 && exitstatus != 0) { + if (ret == 0 && WIFEXITED(exitstatus) == 0) {
You have a logic bug - the old code was checking for non-zero status, and you switched it to check for zero status.
'WIFEXITED(exitstatus) == 0' means waited-command exit unexpectedly, so, report error, my thought is try to ignore the normal 'exit(x)'s x value' returned by waited-command.
virReportError(VIR_ERR_HOOK_SCRIPT_FAILED, _("Hook script %s %s failed with error code %d"), path, drvstr, exitstatus);
This is not a correct error message - if you ever use WIFEXITED, you must also use WEXITSTATUS, otherwise you have a mismatch. And for this particular error message, I think we are losing useful information; I argue that we'd probably get a much better error message if we changed this to let virCommandRun do ALL the work:
ret = virCommandRun(cmd, NULL);
-- liguang lig.fnst@cn.fujitsu.com FNST linux kernel team

Signed-off-by: liguang <lig.fnst@cn.fujitsu.com> --- src/Makefile.am | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/src/Makefile.am b/src/Makefile.am index 4ae741b..3c84bca 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -49,6 +49,9 @@ augeas_DATA = augeastestdir = $(datadir)/augeas/lenses/tests augeastest_DATA = +driverhook_DATA = $(top_srcdir)/hooks/qemu +driverhookdir = $(sysconfdir)/libvirt/hooks + # These files are not related to driver APIs. Simply generic # helper APIs for various purposes UTIL_SOURCES = \ -- 1.7.2.5

On 10/08/2012 07:51 PM, liguang wrote:
by now, if you want a non-shared migration, you have to create same files at target as source side, which seems intorlerable!
this issue was reported by Reinier Schoof http://www.redhat.com/archives/libvir-list/2011-December/msg00451.html
That's only one instance; the problem itself has been known for much longer, but few enough people use --copy-storage-* flags of migration that no one has bothered to fix it. By the way, the qemu developers have stated that the current implementation of the --copy-storage flags is poorly implemented and may even risk data loss in guests in some cases; they are working on better more efficient ways to do this (such as NBD exports that libvirt would hook up for the duration of the migration), but it will still be a while before all that is ready. Libvirt, of course, will expose the same interface as before, but mapped to the new underlying qemu support.
These patches try to let migrate --copy-storage-* work without pre-exist disk image files at target side.
we may also hope qemu to aware of this issue
It's not qemu that has to be made aware, it is libvirt. Remember, with libvirt, you are using sVirt to constrain the set of files that qemu can open. Qemu already knows how to open(O_CREAT) the missing files, but if the files don't already exist, the SELinux prevents that creation. That's why libvirt has to be the one to pre-create the files. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
participants (4)
-
Daniel P. Berrange
-
Eric Blake
-
li guang
-
liguang