[libvirt-users] Using virsh blockcopy -- what's it supposed to accomplish?

I am experimenting with the blockcopy command, and after figuring out how to integrate qemu-nbd, nbd-client and dumpxml/undefine/blockcopy/define/et al., I have one remaining question: What's the point?

The "replication" disk file is not, from what I can ascertain, bootable. I expect this operation to create a pristine copy of my source qcow2 file (at a given point in time), which implies that I can swap that copy in and use it just like the original. Neither using --finish nor --pivot (both appear successful) gives me a mirror that seems to serve any purpose. It seems especially pointless if I use --pivot, because anything that happens after the pivot ends up lost if I don't actually have a usable qcow2 file.

I find lots of discussion online about getting the steps to work, but as yet find nothing about using the resulting file. What am I missing here?

libvirt (1.2.2) and qemu (2.2.0) as distributed with Ubuntu Trusty.

--
Gary R Hook
Senior Kernel Engineer
NIMBOXX, Inc

On 12/22/2014 03:27 PM, Gary R Hook wrote:
I am experimenting with the blockcopy command, and after figuring out how to integrate qemu-nbd, nbd-client and dumpxml/undefine/blockcopy/define/et al., I have one remaining question:
What's the point?
Among other uses, live storage migration. Let's say you are running on a cluster, where your VM is running locally but was booted from network-accessed storage. You don't want any guest downtime, but you want to have the faster performance made possible by accessing local storage instead of the network-accessed storage. virsh blockcopy can be used to change qemu's notion of where the active layer of the disk lives without any guest downtime, by copying then pivoting to a local file.
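[Editor's sketch] The copy-then-pivot flow described above can be written out as a sequence of virsh invocations. This is only a rough sketch under stated assumptions, not a tested procedure: the domain name, disk target, destination path and XML file names are hypothetical, and the undefine/define steps reflect the workaround mentioned in the original post for libvirt versions that restrict blockcopy to transient domains. Dumping the XML again after the pivot, so that the persistent definition names the new file, is likewise an assumption about a sensible ordering. The function only builds command lines; it does not run them.

```python
import shlex


def blockcopy_pivot_commands(domain, disk, dest, transient_workaround=True):
    """Build (but do not run) virsh commands for a copy-then-pivot.

    domain, disk and dest are illustrative placeholders, for example
    "vm1", "vda" and "/var/tmp/vm1-copy.qcow2".
    """
    q = shlex.quote
    # Copy the disk, wait until the job is mirroring, then pivot to dest.
    copy = f"virsh blockcopy {q(domain)} {q(disk)} {q(dest)} --wait --verbose --pivot"
    if not transient_workaround:
        return [copy]

    backup = f"{domain}-before.xml"   # hypothetical file name
    current = f"{domain}-after.xml"   # hypothetical file name
    return [
        # Keep a copy of the persistent definition as a safety net.
        f"virsh dumpxml --inactive {q(domain)} > {q(backup)}",
        # Make the running domain transient (assumed to be required here).
        f"virsh undefine {q(domain)}",
        copy,
        # After the pivot the live XML should name the new file; save it ...
        f"virsh dumpxml {q(domain)} > {q(current)}",
        # ... and make the domain persistent again from that XML.
        f"virsh define {q(current)}",
    ]


if __name__ == "__main__":
    for line in blockcopy_pivot_commands("vm1", "vda", "/var/tmp/vm1-copy.qcow2"):
        print(line)
```

Printing the commands rather than executing them makes it easy to review the sequence before trying it against a real guest.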
The "replication" disk file is not, from what I can ascertain, bootable.
Correct in the current implementation, if you don't manually freeze guest I/O prior to the point where you abort the copy (whether you do a straight abort, leaving the copy as the point in time, or whether you do a pivot, leaving the original as the point in time). But I would like to add a --quiesce option to blockcopy, similar to what is already available for snapshot-create --quiesce. The idea is that just before breaking sync, you tell the guest to freeze all I/O, so that when you do break sync, the disk you are no longer using _is_ a consistent image (and depending on how well your guest is able to freeze I/O, it may well be bootable). But until that is implemented, you can use 'virsh domfsfreeze' as the manual access to freezing guest I/O, if you have new enough qemu and also have qemu-guest-agent wired up in your guest.
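[Editor's sketch] The manual freeze, break sync, thaw sequence suggested above might look roughly like the following. It assumes a blockcopy job was started earlier without --finish or --pivot and has already reached the mirroring phase, that the guest runs qemu-guest-agent, and that the installed libvirt is new enough to provide domfsfreeze and domfsthaw. The function name and the injectable `run` parameter are hypothetical, added so the sequence can be exercised without a hypervisor.

```python
import subprocess


def break_sync_quiesced(domain, disk, pivot=False, run=subprocess.check_call):
    """Freeze guest filesystems, end the copy job, then always thaw.

    pivot=False aborts the job (the guest stays on the original image and
    the destination is left as the point-in-time copy); pivot=True switches
    the guest to the destination instead.
    """
    end_job = ["virsh", "blockjob", domain, disk, "--pivot" if pivot else "--abort"]

    # Ask the guest agent to flush and freeze filesystem I/O.
    run(["virsh", "domfsfreeze", domain])
    try:
        # Break the mirror while the guest is quiesced.
        run(end_job)
    finally:
        # Thaw even if ending the job failed, so the guest is not left frozen.
        run(["virsh", "domfsthaw", domain])
```

Wrapping the thaw in `finally` is a deliberate choice: a guest left frozen after a failed abort or pivot is worse than an inconsistent copy.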
I expect this operation to create a pristine copy of my source qcow2 file (at a given point in time) which implies that I can swap that copy in and use it just like the original.
Neither using --finish nor --pivot (both appear successful) gives me a mirror that seems to serve any purpose. It seems especially pointless if I use --pivot, because anything that happens after the pivot ends up lost if I don't actually have a usable qcow2 file.
How is it not usable? When you break sync at the conclusion of a blockcopy, the image that you no longer use is an accurate snapshot of the state of the disk at the time you broke sync; but whether or not that is useful to a guest depends on how much influence unflushed I/O that was still in guest memory at the time you broke sync will have on your data.
I find lots of discussion online about getting the steps to work, but as yet find nothing about using the resulting file.
What am I missing here?
libvirt (1.2.2) and qemu (2.2.0) as distributed with Ubuntu Trusty.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On Mon, Dec 22, 2014 at 03:50:58PM -0700, Eric Blake wrote:
On 12/22/2014 03:27 PM, Gary R Hook wrote:
I am experimenting with the blockcopy command, and after figuring out how to integrate qemu-nbd, nbd-client and dumpxml/undefine/blockcopy/define/et al., I have one remaining question:
What's the point?
Among other uses, live storage migration.
Let's say you are running on a cluster, where your VM is running locally but was booted from network-accessed storage. You don't want any guest downtime, but you want to have the faster performance made possible by accessing local storage instead of the network-accessed storage. virsh blockcopy can be used to change qemu's notion of where the active layer of the disk lives without any guest downtime, by copying then pivoting to a local file.
To add to Eric's explanation, I recently wrote a small example about it here (this was tested with libvirt 1.2.6 & QEMU 2.1): http://kashyapc.com/2014/07/06/live-disk-migration-with-libvirt-blockcopy/
The "replication" disk file is not, from what I can ascertain, bootable.
Correct in the current implementation, if you don't manually freeze guest I/O prior to the point where you abort the copy (whether you do a straight abort, leaving the copy as the point in time, or whether you do a pivot, leaving the original as the point in time). But I would like to add a --quiesce option to blockcopy, similar to what is already available for snapshot-create --quiesce.
I remember a RHEL7 bug you filed for that, Eric:

https://bugzilla.redhat.com/show_bug.cgi?id=1151629 -- blockcopy --keep-overlay ought to have --quiesce option

Does something similar need to be cloned upstream?

--
/kashyap

On 12/23/14 6:17 AM, Kashyap Chamarthy wrote:
On Mon, Dec 22, 2014 at 03:50:58PM -0700, Eric Blake wrote:
On 12/22/2014 03:27 PM, Gary R Hook wrote:
I am experimenting with the blockcopy command, and after figuring out how to integrate qemu-nbd, nbd-client and dumpxml/undefine/blockcopy/define/et al., I have one remaining question:
What's the point?
Among other uses, live storage migration.
Let's say you are running on a cluster, where your VM is running locally but was booted from network-accessed storage. You don't want any guest downtime, but you want to have the faster performance made possible by accessing local storage instead of the network-accessed storage. virsh blockcopy can be used to change qemu's notion of where the active layer of the disk lives without any guest downtime, by copying then pivoting to a local file.
To add to Eric's explanation, I recently wrote a small example about it here (this was tested with libvirt 1.2.6 & QEMU 2.1):
http://kashyapc.com/2014/07/06/live-disk-migration-with-libvirt-blockcopy/
I read that article.

Now shut down the domain (post-pivot) which is using the new disk file, and start it up, without using a block device. This is the part that no one seems to write about, nor do I see that in your example.

But thank you very much for your help and your articles; very much appreciated.

--
Gary R Hook
Senior Kernel Engineer
NIMBOXX, Inc

On 12/23/2014 05:24 PM, Gary R Hook wrote:
I read that article.
Now shut down the domain (post-pivot) which is using the new disk file, and start it up, without using a block device. This is the part that no one seems to write about, nor do I see that in your example. But thank you very much for your help and your articles; very much appreciated.
What do you mean by "without using a block device"? Are you trying to revert back to the pre-copy file? Libvirt is supposed to rewrite the domain XML to reflect the end result of breaking the mirroring (whether you pivot or abort back to the original), and further starts of the domain should use the correct current file (which might not be the file that the earlier domain start used).

If you abort a blockcopy before it is complete, the destination is useless (incomplete). If you end a blockcopy after it has reached the mirroring phase, the file that you abandon (whether the original if you pivoted, or the destination if you aborted) is a point-in-time snapshot of the disk at the point you quit the mirroring; this disk snapshot is liable to need fsck and otherwise have inconsistencies unless you also ensured that guest I/O was stable before the point of breaking the mirroring (basically, using guest-agent freezing and thawing around the operation).

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
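[Editor's sketch] The outcomes described in this message can be summarized as a small lookup. It is purely illustrative and calls no libvirt API; the function name and dictionary keys are hypothetical. Rejecting a pivot before the mirroring phase is an added assumption (the message only covers aborting early).

```python
def blockcopy_outcome(reached_mirroring, action):
    """Describe which image stays active and what the abandoned one is.

    action is "pivot" or "abort". The returned dict is just a summary of
    the prose above, not something libvirt itself reports.
    """
    if action not in ("pivot", "abort"):
        raise ValueError("action must be 'pivot' or 'abort'")

    if not reached_mirroring:
        if action == "pivot":
            # Assumption: a job that is not yet mirroring cannot be pivoted.
            raise ValueError("cannot pivot before the mirroring phase")
        # Aborting early leaves a partial, unusable destination.
        return {"active": "original", "abandoned": "destination",
                "abandoned_state": "incomplete"}

    if action == "pivot":
        # Guest now runs on the copy; the original is frozen in time.
        return {"active": "destination", "abandoned": "original",
                "abandoned_state": "point-in-time snapshot"}

    # Guest stays on the original; the copy is frozen in time.
    return {"active": "original", "abandoned": "destination",
            "abandoned_state": "point-in-time snapshot"}
```

In both mirroring-phase cases the abandoned image may still need fsck unless guest I/O was frozen before the mirror was broken.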

On 1/8/15 2:48 PM, Eric Blake wrote:
On 12/23/2014 05:24 PM, Gary R Hook wrote:
I read that article.
Now shut down the domain (post-pivot) which is using the new disk file, and start it up, without using a block device. This is the part that no one seems to write about, nor do I see that in your example. But thank you very much for your help and your articles; very much appreciated.
What do you mean by "without using a block device"? Are you trying to revert back to the pre-copy file? Libvirt is supposed to rewrite the domain XML to reflect the end result of breaking the mirroring (whether you pivot or abort back to the original), and further starts of the domain should use the correct current file (which might not be the file that the earlier domain start used). If you abort a blockcopy before it is complete, the destination is useless (incomplete). If you end a blockcopy after it has reached the mirroring phase, the file that you abandon (whether the original if you pivoted, or the destination if you aborted) is a point-in-time snapshot of the disk at the point you quit the mirroring; this disk snapshot is liable to need fsck and otherwise have inconsistencies unless you also ensured that guest I/O was stable before the point of breaking the mirroring (basically, using guest-agent freezing and thawing around the operation).
I've responded to Kashyap about this with the solution to my problem. It was a usage error.

We want a copy that can be used in place of the original. Seems very simple. If you set up an NBD chain intuitively, based on assumed behavior, then the file you create is _not_ usable when all is said and done. See my other post.

Based on experiences and observation, writing to an NBD device is _not_ equivalent to writing to a disk file. I can only conclude this is because the far end (the NBD server) is not acting like a disk file. Which, upon reflection, makes perfect sense.

OMG it would be nice if qemu-nbd behavior and usage were documented. 8 years and still nothing of substance in the man page. There's a to-do there, I guess.

--
Gary R Hook
Senior Kernel Engineer
NIMBOXX, Inc

On Thu, Jan 08, 2015 at 06:14:05PM -0600, Gary R Hook wrote:
On 1/8/15 2:48 PM, Eric Blake wrote:
On 12/23/2014 05:24 PM, Gary R Hook wrote:
I read that article.
Now shut down the domain (post-pivot) which is using the new disk file, and start it up, without using a block device. This is the part that no one seems to write about, nor do I see that in your example. But thank you very much for your help and your articles; very much appreciated.
What do you mean by "without using a block device"? Are you trying to revert back to the pre-copy file? Libvirt is supposed to rewrite the domain XML to reflect the end result of breaking the mirroring (whether you pivot or abort back to the original), and further starts of the domain should use the correct current file (which might not be the file that the earlier domain start used). If you abort a blockcopy before it is complete, the destination is useless (incomplete). If you end a blockcopy after it has reached the mirroring phase, the file that you abandon (whether the original if you pivoted, or the destination if you aborted) is a point-in-time snapshot of the disk at the point you quit the mirroring; this disk snapshot is liable to need fsck and otherwise have inconsistencies unless you also ensured that guest I/O was stable before the point of breaking the mirroring (basically, using guest-agent freezing and thawing around the operation).
I've responded to Kashyap about this with the solution to my problem. It was a usage error.
I think you're referring to the comment you made on this post[1]. I fixed the things you pointed out. Thanks for the review, Gary.

[1] http://kashyapc.com/2014/07/06/live-disk-migration-with-libvirt-blockcopy/

--
/kashyap

On 12/22/14 4:50 PM, Eric Blake wrote:
On 12/22/2014 03:27 PM, Gary R Hook wrote:
I am experimenting with the blockcopy command, and after figuring out how to integrate qemu-nbd, nbd-client and dumpxml/undefine/blockcopy/define/et al., I have one remaining question:
What's the point?
Among other uses, live storage migration.
There is so very much to say here, but I will endeavor to be brief and to the point: And then what? Please note that I am working with libvirt 1.2.2, as stated in my OP. And up front: I'm pretty sure I'm missing data points and working from a position of ignorance and unreasonable expectation. Your bearing with me is much, much appreciated.
Let's say you are running on a cluster, where your VM is running locally but was booted from network-accessed storage. You don't want any guest downtime, but you want to have the faster performance made possible by accessing local storage instead of the network-accessed storage. virsh blockcopy can be used to change qemu's notion of where the active layer of the disk lives without any guest downtime, by copying then pivoting to a local file.
I think I totally understand that. But I don't care about the old file, I care about the new one. And also, once you've moved to a block device (post-pivot) there's no going back, is there? The old qcow2 file can be an old snapshot, but that's about it. And the new file is not, in and of itself, usable as it stands.

I tested a pivot and redefine of my domain using the new block device (/dev/nbd2, e.g.) and I was able to shut down and then start the domain successfully. Which is no help whatsoever without forever including the NBD device. Or: what am I missing here?

I would expect that the domain, no matter where its disk files are located, may at some point need to be shut down, then restarted. If I don't have an actual "mirror" (actually, a replication is what I want) then I'm still missing the point of this feature. Although I guess I could block copy and then migrate with non-shared storage to get something as usable and flexible as the original. Except that requires another host.

Let me summarize: I want to mirror a disk and at any point in time switch over and use the copy qcow2 file directly, in every way possible, as my new backing file. Including defining, shutting down and starting up domains arbitrarily. The old file is old news; forget about it. I want to move on. How does one accomplish that?
The "replication" disk file is not, from what I can ascertain, bootable.
Correct in the current implementation, if you don't manually freeze guest I/O prior to the point where you abort the copy (whether you do a straight abort, leaving the copy as the point in time, or whether you do a pivot, leaving the original as the point in time). But I would like to add a --quiesce option to blockcopy, similar to what is already available for snapshot-create --quiesce. The idea is that just before breaking sync, you tell the guest to freeze all I/O, so that when you do break sync, the disk you are no longer using _is_ a consistent image (and depending on how well your guest is able to freeze I/O, it may well be bootable). But until that is implemented, you can use 'virsh domfsfreeze' as the manual access to freezing guest I/O, if you have new enough qemu and also have qemu-guest-agent wired up in your guest.
Wow. Clearly something past 1.2.2. So my first step is to update to a current level? Dang it.
I expect this operation to create a pristine copy of my source qcow2 file (at a given point in time) which implies that I can swap that copy in and use it just like the original.
Neither using --finish nor --pivot (both appear successful) gives me a mirror that seems to serve any purpose. It seems especially pointless if I use --pivot, because anything that happens after the pivot ends up lost if I don't actually have a usable qcow2 file.
How is it not usable? When you break sync at the conclusion of a blockcopy, the image that you no longer use is an accurate snapshot of the state of the disk at the time you broke sync; but whether or not that is useful to a guest depends on how much influence unflushed I/O that was still in guest memory at the time you broke sync will have on your data.
As stated above, I don't want the old file, I want the _new_ one. If I am running a block copy to a remote system (because, you know, that's one of the cool things about NBD) and my local system dies or otherwise becomes unrecoverable, I'd like to go to the remote system and bring up the guest there with minimal downtime in the event of a catastrophic failure. Right now that appears to be impossible. And I'm the only person to think this is reasonable. For _me_, if I shut down my local guest, which causes the nbd-client process to exit, I expect the far end's server to flush blocks and create a disk file with integrity. What could it possibly be waiting for to prevent that from happening almost immediately? (He asks, without looking at the code...)
libvirt (1.2.2) and qemu (2.2.0) as distributed with Ubuntu Trusty.
--
Gary R Hook
Senior Kernel Engineer
NIMBOXX, Inc
participants (3): Eric Blake, Gary R Hook, Kashyap Chamarthy