[libvirt] Unbounded client streams

Dear list,

as you might have seen, I've been playing with our streams lately. One of the things I wanted to try was how my sparse streams deal with vol-download and a slow writer (e.g. a slow disk). Note to whoever wants to try that out: blkio.throttle.write_* is no good for this. So I went the old way:

diff --git i/tools/virsh-util.c w/tools/virsh-util.c
index 4b86e29..8f5738d 100644
--- i/tools/virsh-util.c
+++ w/tools/virsh-util.c
@@ -149,6 +149,8 @@ virshStreamSink(virStreamPtr st ATTRIBUTE_UNUSED,
 {
     int *fd = opaque;

+    sleep(60);
+
     return safewrite(*fd, bytes, nbytes);
 }

And I found out something ugly: the memory footprint of virsh just keeps growing as vol-download progresses. I've tracked the problem down to:

1) The client event loop (virNetClientIOEventLoop) sees some incoming data, so it reads it (virNetClientIOHandleInput -> virNetClientCallDispatch) and queues the packet (virNetClientCallDispatchStream) regardless of the length of the already queued packets.

2) Since the server sees the client has read all the data, it sends even more data.

And the queue in step 1) just grows and grows. Ideally, packets are taken out of the queue by virStreamRecv (which boils down to virNetClientStreamRecvPacket), but that is not on the schedule for the next minute.

Frankly, I don't have any idea how to fix this. We could stop reading the data (i.e. not set POLLIN on the client <-> server socket) if the queue reaches a certain length. BUT, this would mean no other call is processed. Not even keepalive - and the connection would break.

The other option is to pretend I've never sent this e-mail and you've read nothing O:-)

Michal
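The accounting behind the problem can be sketched in a few lines of C. This is purely illustrative - the struct and function names (DemoStreamQueue, demoQueuePacket, demoConsume) are invented, not libvirt's actual API - but it shows why capping the queue forces the event loop to clear POLLIN, which is exactly what conflicts with keepalive:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cap on bytes queued per stream before the client
 * should stop reading from the socket (illustrative only). */
#define DEMO_QUEUE_MAX (1024 * 1024)

typedef struct {
    size_t queued; /* bytes waiting for virStreamRecv to drain them */
} DemoStreamQueue;

/* A stream packet arrives (the virNetClientCallDispatchStream step):
 * account for it and report whether POLLIN may stay enabled. */
bool demoQueuePacket(DemoStreamQueue *q, size_t nbytes)
{
    q->queued += nbytes;
    return q->queued < DEMO_QUEUE_MAX;
}

/* The application consumes data (the virStreamRecv step):
 * report whether it is safe to re-enable POLLIN. */
bool demoConsume(DemoStreamQueue *q, size_t nbytes)
{
    if (nbytes > q->queued)
        nbytes = q->queued;
    q->queued -= nbytes;
    return q->queued < DEMO_QUEUE_MAX;
}
```

In the current code the first function effectively always returns true, so nothing ever pushes back on the server.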

2017-04-25 10:30 GMT+03:00 Michal Privoznik <mprivozn@redhat.com>:
I probably forgot to mention: the problem happens regardless of whether the streams are sparse or the regular ones we already have. It reproduces with both.
Would it be possible to have two streams - one for upload/download, which carries many messages and can take a long time to process, and a second one side by side for other messages? Like a data channel and a control channel?

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru

On 04/25/2017 01:37 PM, Vasiliy Tolstov wrote:
2017-04-25 10:30 GMT+03:00 Michal Privoznik <mprivozn@redhat.com>:
I probably forgot to mention: the problem happens regardless of whether the streams are sparse or the regular ones we already have. It reproduces with both.
Would it be possible to have two streams - one for upload/download, which carries many messages and can take a long time to process, and a second one side by side for other messages? Like a data channel and a control channel?
No, currently that is not possible. We don't have multiplexed data streams over virConnect. While that might be relatively easy to implement for some transports (like libssh2), for others it would require changing the transport layer (e.g. TCP/UNIX sockets don't support multistreaming, so we'd need to use SCTP). Or we could implement multistreaming ourselves, which is something I'd very much like to avoid.

Michal

On 25 Apr 2017 at 15:01, "Michal Privoznik" <mprivozn@redhat.com> wrote:
On 04/25/2017 01:37 PM, Vasiliy Tolstov wrote:
2017-04-25 10:30 GMT+03:00 Michal Privoznik <mprivozn@redhat.com>:
I probably forgot to mention: the problem happens regardless of whether the streams are sparse or the regular ones we already have. It reproduces with both.
Would it be possible to have two streams - one for upload/download, which carries many messages and can take a long time to process, and a second one side by side for other messages? Like a data channel and a control channel?
No, currently that is not possible. We don't have multiplexed data streams over virConnect. While that might be relatively easy to implement for some transports (like libssh2), for others it would require changing the transport layer (e.g. TCP/UNIX sockets don't support multistreaming, so we'd need to use SCTP). Or we could implement multistreaming ourselves, which is something I'd very much like to avoid.

Michal

I haven't checked the keepalive messages, but why wouldn't it be possible to extend the connect reply to pass a connid, and have the client send pong messages carrying that connid? If someone sends a keepalive message with a different connid, extend its timeout?

On 25 Apr 2017 at 10:23, "Michal Privoznik" <mprivozn@redhat.com> wrote:

Frankly, I don't have any idea how to fix this. We could stop reading the data (i.e. not set POLLIN on the client <-> server socket) if the queue reaches a certain length. BUT, this would mean no other call is processed. Not even keepalive - and the connection would break.

I think breaking the connection is not bad - it simplifies retrying with a new offset.

On Tue, Apr 25, 2017 at 09:22:51AM +0200, Michal Privoznik wrote:
Dear list,
as you might have seen, I've been playing with our streams lately. One of the things I wanted to try was how my sparse streams deal with vol-download and a slow writer (e.g. a slow disk). Note to whoever wants to try that out: blkio.throttle.write_* is no good for this. So I went the old way:
And I found out something ugly: the memory footprint of virsh just keeps growing as vol-download progresses. I've tracked the problem down to:
1) The client event loop (virNetClientIOEventLoop) sees some incoming data, so it reads it (virNetClientIOHandleInput -> virNetClientCallDispatch) and queues the packet (virNetClientCallDispatchStream) regardless of the length of the already queued packets.
2) Since the server sees client has read all the data, it sends even more data.
Yep, from the server's POV the current protocol design relies on the client not consuming data if it is not ready to handle it. There were two reasons for this. First for efficiency, we wanted to avoid a round trip for every single block of data we wish to upload / download. Second, and more importantly when using the streams for a bi-directional interactive console, we've no idea when there will be data to download, so we just wanted the server to send it as soon as it was received from the guest.
And the queue in step 1) just grows and grows. Ideally, packets are taken out of the queue by virStreamRecv (which boils down to virNetClientStreamRecvPacket), but that is not on the schedule for the next minute.
Yep.
Frankly, I don't have any idea how to fix this. We could stop reading the data (i.e. not set POLLIN on the client <-> server socket) if the queue reaches a certain length. BUT, this would mean no other call is processed. Not even keepalive - and the connection would break.
My original intention was certainly that we would stop reading data off the socket if we had too much queued, but this predated the existence of the keepalive code.

I see three possible options (besides ignoring it):

- Turn off the keepalive somehow when we want to pause reading from the stream.

- Somehow introduce stream "chunking". E.g. assume a chunk size of 10 MB is somehow enabled. The server would send 10 MB, and then not send any more data until the client issued a "continue" message of some kind, whereupon a further 10 MB would be permitted to be sent.

- Stop multiplexing streams on the socket. Have some way to open up additional sockets to the server, one per stream. The main connection would somehow provide a one-time-only auth token, so we don't need to force apps to go through an authentication step again.

Option 2 is probably less scary to retrofit.

Regards,
Daniel

--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 04/25/2017 04:06 PM, Daniel P. Berrange wrote:
On Tue, Apr 25, 2017 at 09:22:51AM +0200, Michal Privoznik wrote:
Dear list,
I see three possible options (besides ignoring it)
- Turn off the keepalive somehow when we want to pause reading from the stream
- Somehow introduce stream "chunking". E.g. assume a chunk size of 10 MB is somehow enabled. The server would send 10 MB, and then not send any more data until the client issued a "continue" message of some kind, whereupon a further 10 MB would be permitted to be sent.
This could work. But what I am worried about is that it relies on the other side playing nicely. IOW, the attack surface is still the same.

BTW: I've also tested the other direction, where the iohelper is slow. In that case the connection dies due to keepalive.

Michal

On Wed, Apr 26, 2017 at 09:48:12 +0200, Michal Privoznik wrote:
On 04/25/2017 04:06 PM, Daniel P. Berrange wrote:
On Tue, Apr 25, 2017 at 09:22:51AM +0200, Michal Privoznik wrote:
Dear list,
I see three possible options (besides ignoring it)
- Turn off the keepalive somehow when we want to pause reading from the stream
- Somehow introduce stream "chunking". E.g. assume a chunk size of 10 MB is somehow enabled. The server would send 10 MB, and then not send any more data until the client issued a "continue" message of some kind, whereupon a further 10 MB would be permitted to be sent.
This could work. But what I am worried about is that this relies on the other side playing nicely. IOW the attack surface is still the same.
BTW: I've also tested the other direction, where the iohelper is slow. In that case the connection dies due to keepalive.
Just a note: keepalive is actually the smallest issue in the "don't read any data from the libvirt connection" approach. What about asynchronous events from the server, or multiple threads using the same connection?

Jirka

On Wed, Apr 26, 2017 at 09:48:12AM +0200, Michal Privoznik wrote:
On 04/25/2017 04:06 PM, Daniel P. Berrange wrote:
On Tue, Apr 25, 2017 at 09:22:51AM +0200, Michal Privoznik wrote:
Dear list,
I see three possible options (besides ignoring it)
- Turn off the keepalive somehow when we want to pause reading from the stream
- Somehow introduce stream "chunking". E.g. assume a chunk size of 10 MB is somehow enabled. The server would send 10 MB, and then not send any more data until the client issued a "continue" message of some kind, whereupon a further 10 MB would be permitted to be sent.
This could work. But what I am worried about is that this relies on the other side playing nicely. IOW the attack surface is still the same.
If the client receives more than 10 MB without having sent the "continue" message, then it is justified in just dropping the connection, so I think that avoids an attack from a malicious server.
BTW: I've done testing the other way when iohelper is slow. In this case the connection dies due to keepalive.
Right, so to solve it we would need the chunking in both directions.

Regards,
Daniel
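The receiver-side enforcement Daniel describes - drop the connection if the peer overruns its granted window - is what defuses the malicious-peer concern, and it works identically in both directions. A minimal sketch with invented names (DemoRecvWindow etc.), not libvirt code:

```c
#include <stddef.h>

typedef enum {
    DEMO_RECV_OK,
    DEMO_RECV_PROTOCOL_ERROR /* peer overran its window: drop the connection */
} DemoRecvVerdict;

/* Receiver-side window enforcement: count bytes received since the
 * last CONTINUE we sent; a well-behaved peer never exceeds the
 * window it was granted. */
typedef struct {
    size_t window;   /* bytes we granted, e.g. 10 MB */
    size_t received; /* bytes seen since our last CONTINUE */
} DemoRecvWindow;

DemoRecvVerdict demoOnReceive(DemoRecvWindow *w, size_t nbytes)
{
    w->received += nbytes;
    if (w->received > w->window)
        return DEMO_RECV_PROTOCOL_ERROR;
    return DEMO_RECV_OK;
}

/* Called when we send CONTINUE, granting the peer a fresh window. */
void demoGrant(DemoRecvWindow *w)
{
    w->received = 0;
}
```

Because each side only has to police what it receives, neither side needs to trust the other to play nicely, which addresses the attack-surface worry raised earlier in the thread.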
participants (4):
- Daniel P. Berrange
- Jiri Denemark
- Michal Privoznik
- Vasiliy Tolstov