On Wed, Nov 25, 2020 at 04:36:39PM +0000, Daniel P. Berrangé wrote:
On Wed, Nov 25, 2020 at 04:49:14PM +0100, Christian Ehrhardt wrote:
> Thanks for the hint Daniel, it is indeed not migration specific - it
> seems that virt-ssh-helper is just very slow.
>
> rm testfile; virsh -c
> qemu+ssh://testkvm-hirsute-to/system?proxy=netcat vol-download --pool
> uvtool h-migr-test.qcow testfile & for i in $(seq 1 20); do sleep 1s;
> ll -laFh testfile; done
> [1] 42285
> -rw-r--r-- 1 root root 24M Nov 25 15:20 testfile
> -rw-r--r-- 1 root root 220M Nov 25 15:20 testfile
> -rw-r--r-- 1 root root 396M Nov 25 15:20 testfile
> -rw-r--r-- 1 root root 558M Nov 25 15:20 testfile
> -rw-r--r-- 1 root root 756M Nov 25 15:20 testfile
> -rw-r--r-- 1 root root 868M Nov 25 15:20 testfile
> [1]+ Done virsh -c
> qemu+ssh://testkvm-hirsute-to/system?proxy=netcat vol-download --pool
> uvtool h-migr-test.qcow testfile
>
> rm testfile; virsh -c
> qemu+ssh://testkvm-hirsute-to/system?proxy=native vol-download --pool
> uvtool h-migr-test.qcow testfile & for i in $(seq 1 20); do sleep 1s;
> ll -laFh testfile; done
> [1] 42307
> -rw-r--r-- 1 root root 1.8M Nov 25 15:21 testfile
> -rw-r--r-- 1 root root 6.8M Nov 25 15:21 testfile
> -rw-r--r-- 1 root root 9.8M Nov 25 15:21 testfile
> -rw-r--r-- 1 root root 13M Nov 25 15:21 testfile
> -rw-r--r-- 1 root root 15M Nov 25 15:21 testfile
> -rw-r--r-- 1 root root 16M Nov 25 15:21 testfile
> -rw-r--r-- 1 root root 18M Nov 25 15:21 testfile
> -rw-r--r-- 1 root root 19M Nov 25 15:22 testfile
> -rw-r--r-- 1 root root 21M Nov 25 15:22 testfile
> -rw-r--r-- 1 root root 22M Nov 25 15:22 testfile
> -rw-r--r-- 1 root root 23M Nov 25 15:22 testfile
> -rw-r--r-- 1 root root 25M Nov 25 15:22 testfile
> -rw-r--r-- 1 root root 26M Nov 25 15:22 testfile
> -rw-r--r-- 1 root root 27M Nov 25 15:22 testfile
> -rw-r--r-- 1 root root 28M Nov 25 15:22 testfile
> -rw-r--r-- 1 root root 29M Nov 25 15:22 testfile
> -rw-r--r-- 1 root root 30M Nov 25 15:22 testfile
> -rw-r--r-- 1 root root 31M Nov 25 15:22 testfile
> -rw-r--r-- 1 root root 32M Nov 25 15:22 testfile
> -rw-r--r-- 1 root root 32M Nov 25 15:22 testfile
>
> That is ~150-200 MB/s vs 1-5 MB/s and as seen it seems to start slow
> AND degrades further.
> I'm now at 90MB overall and down to ~150 KB/s
>
> > and we'll probably want to collect debug
> > level logs from src+dst hosts.
>
> I already had debug level logs of the migration [1] attached to the
> launchpad bug I use to take my notes on this.
> Taken with these configs:
> log_filters="1:qemu 1:libvirt 3:object 3:json 3:event 1:util"
> log_outputs="1:file:/var/log/libvirtd.log"
>
> You can fetch the logs (of a migration), but I'm happy to generate you
> logs of any other command (or with other log settings) as you'd prefer
> them.
>
> The network used in this case is a bridge between two containers, but
> we can cut out more components.
> I found that the same vol-download vs 127.0.0.1 gives the same results.
> That in turn makes it easier to gather results as we only need one system.
Yep, that's useful, I'm able to reproduce this problem myself too
now. Will do some local tests and report back...

Sigh, the problem is way too many reallocs, repeatedly growing and shrinking
the buffer we use for I/O.

I guess we never noticed this awfulness in the virsh console code it was
copied from, as the data volumes are lower.

Switching to a fixed size buffer makes it massively faster. I'll prep a
patch asap.
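
To illustrate the fixed-size buffer idea, here is a minimal sketch of a
copy loop that allocates its buffer once and never resizes it. It is not
the actual libvirt patch: the function name copy_fixed_buffer and the
64 KiB buffer size are assumptions made for this example only.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed buffer size for this sketch; not a value taken from libvirt. */
#define COPY_BUF_SIZE (64 * 1024)

/*
 * Hypothetical helper: copy everything from in_fd to out_fd through one
 * fixed-size buffer. The buffer is never grown or shrunk, so the loop
 * does no realloc work per chunk.
 *
 * Returns the total number of bytes copied, or -1 on error.
 */
static long long copy_fixed_buffer(int in_fd, int out_fd)
{
    char buf[COPY_BUF_SIZE];
    long long total = 0;

    for (;;) {
        ssize_t got = read(in_fd, buf, sizeof(buf));
        if (got == 0)
            break;              /* end of input */
        if (got < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry the read */
            return -1;
        }

        /* write() may be partial, so loop until the chunk is flushed. */
        ssize_t off = 0;
        while (off < got) {
            ssize_t done = write(out_fd, buf + off, (size_t)(got - off));
            if (done < 0) {
                if (errno == EINTR)
                    continue;   /* interrupted, retry the write */
                return -1;
            }
            off += done;
        }
        total += got;
    }
    return total;
}
```

Calling copy_fixed_buffer(STDIN_FILENO, STDOUT_FILENO) would relay stdin
to stdout. The per-chunk cost is then just the read and write calls,
with no allocator traffic in between.
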
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|