On Mon, Nov 14, 2011 at 12:24:22PM +0200, Michael S. Tsirkin wrote:
On Mon, Nov 14, 2011 at 10:16:10AM +0000, Daniel P. Berrange wrote:
> On Sat, Nov 12, 2011 at 12:25:34PM +0200, Avi Kivity wrote:
> > On 11/11/2011 12:15 PM, Kevin Wolf wrote:
> > > Am 10.11.2011 22:30, schrieb Anthony Liguori:
> > > > Live migration with qcow2 or any other image format is just not going to work
> > > > right now even with proper clustered storage. I think doing a block level flush
> > > > cache interface and letting block devices decide how to do it is the best approach.
> > >
> > > I would really prefer reusing the existing open/close code. It means
> > > less (duplicated) code, is existing code that is well tested and doesn't
> > > make migration much of a special case.
> > >
> > > If you want to avoid reopening the file on the OS level, we can reopen
> > > only the topmost layer (i.e. the format, but not the protocol) for now
> > > and in 1.1 we can use bdrv_reopen().
> > >
> >
> > Intuitively I dislike _reopen style interfaces. If the second open
> > yields different results from the first, does it invalidate any
> > computations in between?
> >
> > What's wrong with just delaying the open?
>
> If you delay the 'open' until the mgmt app issues 'cont', then you lose
> the ability to rollback to the source host upon open failure for most
> deployed versions of libvirt. We only fairly recently switched to a five
> stage migration handshake to cope with rollback when 'cont' fails.
>
> Daniel
I guess reopen can fail as well, so this seems to me to be an important
fix but not a blocker.
If the initial open succeeds, then it is far more likely that a later
re-open will succeed too, because you have already eliminated the possibility
of configuration mistakes, and will have caught most storage runtime errors
too. So there is a very significant difference in reliability between doing
an 'open at startup + reopen at cont' vs just 'open at cont'.
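To make the difference concrete, here is a toy model in Python. It is only
a sketch, not QEMU or libvirt code: open_image(), migrate() and the outcome
strings are made-up names, and it assumes a management layer that cannot
roll back once 'cont' has been issued.

```python
class OpenError(Exception):
    """Raised when the (hypothetical) image open fails."""


def open_image(path, available):
    # Stand-in for the real open: fails if the path is not reachable.
    if path not in available:
        raise OpenError(f"cannot open {path}")
    return path


def migrate(path, available, open_at_startup):
    """Return where the guest ends up: 'destination', 'source' or 'failed_at_cont'."""
    if open_at_startup:
        try:
            # Early check while the source still owns the guest.
            open_image(path, available)
        except OpenError:
            # Failure is seen before migration starts, so rollback is trivial.
            return "source"

    # ... migration data would be transferred here ...

    try:
        # Open (or reopen) when the mgmt app issues 'cont'.
        open_image(path, available)
    except OpenError:
        # Assumed: no handshake stage covers 'cont', so no rollback here.
        return "failed_at_cont"
    return "destination"
```

With a mistyped image path, migrate(path, available, True) ends on the
source, while migrate(path, available, False) only fails at 'cont'.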
Based on the bug reports I see, we want to be very good at detecting and
gracefully handling open errors because they are pretty frequent.
Regards,
Daniel
--
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|