[libvirt] Raising limits for our RPC messages

Dear list,

maybe you've seen us raising the limits for various parts of RPC messages (for example: d15b29be, 66bfc7cc61ca0, e914dcfd, etc.). It usually happens when we receive a report that the current limits are not enough. Well, we just did: https://bugzilla.redhat.com/show_bug.cgi?id=1440683

The thing is, virConnectGetAllDomainStats() is unable to encode its RPC message because there are so many domains & stats to send that the limit is reached.

Now, I was thinking about how to approach this problem. Yes, we can raise the limits again. But we also have another approach: split the data into multiple messages. The current limit for an RPC message is 4MB. Say you have 9MB of data you want to send to the other side. We could construct 3 messages (4 + 4 + 1) and send them to the other side, where the data would be reconstructed. At the RPC level, we could use the @status header field to mark intermediate messages. That is, for the example above, messages with the following headers would be constructed:

{.len = 4MB, prog, vers, proc, .type = VIR_NET_REPLY, .serial, .status = VIR_NET_CONTINUE}
{.len = 4MB, prog, vers, proc, .type = VIR_NET_REPLY, .serial, .status = VIR_NET_CONTINUE}
{.len = 1MB, prog, vers, proc, .type = VIR_NET_REPLY, .serial, .status = VIR_NET_OK}

(The actual @len accounts for the message header too, so the message would really be split into <nearly 4MB + nearly 4MB + slightly over 1MB> chunks, but that is not important right now.)

Just like in streams, VIR_NET_CONTINUE says that there is more data yet to come, and VIR_NET_OK would then mark the last packet in the stream.

Cool, you say, this could work. But not really. The problems with this approach are:

a) we just transform the problem into a different one (the limit on the number of messages the data can be split into),
b) increased overhead (the message header is sent multiple times),
c) I don't expect we will ever support cherry-picking of messages (the caller picking just the message that contains the data they are interested in and discarding the rest).

So after all, we would effectively have the limit on message size increased to $limit_of_split_messages * $current_limit_of_message_size. Instead of introducing complex code that splits data into messages and reconstructs it back, we might as well increase the current limit on message size.

Michal
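To make the proposed fragmentation concrete, here is a minimal sketch of the sending side. The enum values and the REMOTE_PROGRAM number come from libvirt's virnetprotocol.x and remote_protocol.x, but MsgHeader is a simplified stand-in for virNetMessageHeader, and splitReply()/sendChunk() are hypothetical helpers for illustration, not existing libvirt code:

#include <stdio.h>
#include <stdlib.h>

#define CHUNK_MAX (4UL * 1024 * 1024)   /* current 4MB per-message limit */

/* Enum values mirror virnetprotocol.x; the struct is a simplified
 * stand-in for virNetMessageHeader. */
enum { VIR_NET_REPLY = 1 };
enum { VIR_NET_OK = 0, VIR_NET_ERROR = 1, VIR_NET_CONTINUE = 2 };

typedef struct {
    unsigned prog;
    unsigned vers;
    int proc;
    int type;        /* VIR_NET_REPLY for every chunk of the reply */
    unsigned serial; /* a shared serial ties the chunks together */
    int status;      /* VIR_NET_CONTINUE, then VIR_NET_OK on the last */
} MsgHeader;

/* Hypothetical transmit step: a real implementation would XDR-encode
 * the header plus payload chunk and queue the buffer on the socket. */
static void
sendChunk(const MsgHeader *hdr, const char *buf, size_t len)
{
    (void)buf;
    printf("chunk: len=%zuMB serial=%u status=%s\n",
           len / (1024 * 1024), hdr->serial,
           hdr->status == VIR_NET_CONTINUE ? "CONTINUE" : "OK");
}

static void
splitReply(MsgHeader hdr, const char *payload, size_t payloadLen)
{
    size_t off = 0;

    while (off < payloadLen) {
        size_t chunk = payloadLen - off;
        if (chunk > CHUNK_MAX)
            chunk = CHUNK_MAX;

        /* Intermediate chunks say "more to come"; the last says "done". */
        hdr.status = off + chunk < payloadLen ? VIR_NET_CONTINUE
                                              : VIR_NET_OK;
        sendChunk(&hdr, payload + off, chunk);
        off += chunk;
    }
}

int
main(void)
{
    size_t total = 9UL * 1024 * 1024;   /* the 9MB example payload */
    char *payload = calloc(1, total);
    MsgHeader hdr = { .prog = 0x20008086, /* REMOTE_PROGRAM */
                      .vers = 1, .proc = 42, .type = VIR_NET_REPLY,
                      .serial = 7 };

    if (!payload)
        return 1;
    splitReply(hdr, payload, total);    /* prints 4MB, 4MB, 1MB chunks */
    free(payload);
    return 0;
}

The receiving side would buffer chunks sharing a @serial until VIR_NET_OK arrives and only then XDR-decode the reassembled payload, which is why, as the reply below points out, peak memory usage would not actually improve.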

On Thu, May 04, 2017 at 01:39:17PM +0200, Michal Privoznik wrote:
Dear list,
maybe you've seen us raising the limits for various parts of RPC messages (for example: d15b29be, 66bfc7cc61ca0, e914dcfd, etc.). It usually happens when we receive a report that the current limits are not enough. Well, we just did:
[snip]
So after all, we would effectively have the limit on message size increased to $limit_of_split_messages * $current_limit_of_message_size. Instead of introducing complex code that splits data into messages and reconstructs it back, we might as well increase the current limit on message size.
I guess it is helpful to consider the reason why we have a limit in the first place.

Originally the RPC message buffer was statically allocated on the stack :-) We fixed that, but the buffer was then still fully allocated inside the message struct. We fixed that too, and now the RPC message struct dynamically grows the buffer as needed.

At this point, the limit really just exists to avoid an accidental / deliberate DoS attack on the server / client by making unreasonably large requests. Whether we have 1 large message or 3 fragmented messages, we're going to end up serializing the RPC into the messages at the same time for the sake of simplicity, so our peak memory usage is going to be the same in both cases. So I agree that fragmenting messages is just adding complexity without a real win.

So I think it is reasonable to simply increase the buffer size as we have done before.

That said, we should bear this problem in mind before adding more "bulk query" APIs, as it isn't sensible to carry on increasing the RPC message size forever. At some point we have to accept that serializing 100's of MB of data into an RPC message is an inherently inefficient design, and consider alternative API designs without this.

For bulk stats query we could do something totally radical and have the facility for the client to send us a shared memory region which we asynchronously populate with stats, avoiding the RPC layer entirely. Obviously only works for local connections, but I get the impression that most mgmt apps have a node-local agent talking to libvirtd anyway.

Regards,
Daniel
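As a rough sketch of that shared-memory idea (nothing like this exists in libvirt; the region name, the dom_stats record layout, and the fd-registration step are all invented here for illustration), the client side could look like this using POSIX shared memory:

/* Compile with -lrt on older glibc. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical fixed-size per-domain record the daemon would fill in. */
struct dom_stats {
    char name[64];
    unsigned long long cpu_time_ns;
    unsigned long long rss_kb;
};

#define MAX_DOMS 4096

int
main(void)
{
    size_t len = MAX_DOMS * sizeof(struct dom_stats);

    /* Create a shared region; the fd itself (not the name) would be
     * handed to libvirtd over the UNIX socket via SCM_RIGHTS. */
    int fd = shm_open("/bulk-stats-demo", O_CREAT | O_RDWR, 0600);
    if (fd < 0 || ftruncate(fd, len) < 0) {
        perror("shm setup");
        return 1;
    }

    struct dom_stats *stats = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                   MAP_SHARED, fd, 0);
    if (stats == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* ... register 'fd' with the daemon here; it populates 'stats'
     * asynchronously and no multi-MB XDR reply ever needs encoding ... */

    munmap(stats, len);
    close(fd);
    shm_unlink("/bulk-stats-demo");
    return 0;
}

Such a design would still have to solve completion signalling and versioning of the record layout, but the RPC message size limit would stop being relevant for bulk stats.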

2017-05-04 14:49 GMT+03:00 Daniel P. Berrange <berrange@redhat.com>:
So I think it is reasonable to simply increase the buffer size as we have done before.
That said, we should bear this problem in mind before adding more "bulk query" APIs, as it isn't sensible to carry on increasing the RPC message size forever. At some point we have to accept that serializing 100's of MB of data into an RPC message is an inherently inefficient design, and consider alternative API designs without this.
For bulk stats query we could do something totally radical and have the facility for the client to send us a shared memory region which we asynchronously populate with stats, avoiding the RPC layer entirely. Obviously only works for local connections, but I get the impression that most mgmt apps have a node-local agent talking to libvirtd anyway.
Is it possible to compress/uncompress the data before/after transmission? I think that stats data can be compressed efficiently...

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru

On Thu, May 04, 2017 at 02:55:26PM +0300, Vasiliy Tolstov wrote:
2017-05-04 14:49 GMT+03:00 Daniel P. Berrange <berrange@redhat.com>:
So I think it is reasonable to simply increase the buffer size as we have done before.
That said, we should bear this problem in mind before adding more "bulk query" APIs, as it isn't sensible to carry on increasing the RPC message size forever. At some point we have to accept that serializing 100's of MB of data into an RPC message is an inherently inefficient design, and consider alternative API designs without this.
For bulk stats query we could do something totally radical and have the facility for the client to send us a shared memory region which we asynchronously populate with stats, avoiding the RPC layer entirely. Obviously only works for local connections, but I get the impression that most mgmt apps have a node-local agent talking to libvirtd anyway.
Is it possible to compress/uncompress the data before/after transmission? I think that stats data can be compressed efficiently...
That doesn't really help memory usage, as you still have to do the XDR encode step to create the data that you then feed to the compressor. It only saves on the amount of data going over the wire, at the cost of burning CPU time on compression.

Regards,
Daniel
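A minimal sketch with zlib illustrates the point: the fully XDR-encoded buffer must exist before the compressor ever sees it. The 9MB size stands in for an encoded virConnectGetAllDomainStats() reply, reusing the example figure from earlier in the thread:

#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>

int
main(void)
{
    /* Step 1: the reply must be fully XDR-encoded first -- this 9MB
     * buffer stands in for the encoded payload and is the peak-memory
     * cost that compression cannot avoid. */
    uLong srcLen = 9UL * 1024 * 1024;
    Bytef *src = calloc(1, srcLen);

    /* Step 2: compression needs a second buffer on top of that. */
    uLongf dstLen = compressBound(srcLen);
    Bytef *dst = malloc(dstLen);

    if (!src || !dst)
        return 1;

    /* Step 3: burn CPU to shrink only the on-the-wire size. */
    if (compress2(dst, &dstLen, src, srcLen, Z_BEST_SPEED) != Z_OK)
        return 1;

    printf("wire size: %lu -> %lu bytes\n", srcLen, (unsigned long)dstLen);
    free(src);
    free(dst);
    return 0;
}

Peak memory here is the 9MB encoded buffer plus the compressBound()-sized output buffer; only the bytes written to the wire shrink.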

2017-05-04 15:01 GMT+03:00 Daniel P. Berrange <berrange@redhat.com>:
That doesn't really help memory usage, as you still have to do the XDR encode step to create the data that you then feed to the compressor. It only saves on the amount of data going over the wire, at the cost of burning CPU time on compression.
Yes... what about gRPC? Or do you not plan to change XDR?

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru

On 05/04/2017 02:04 PM, Vasiliy Tolstov wrote:
2017-05-04 15:01 GMT+03:00 Daniel P. Berrange <berrange@redhat.com>:
That doesn't really help memory usage, as you still have to do the XDR encode step to create the data that you then feed to the compressor. It only saves on the amount of data going over the wire, at the cost of burning CPU time on compression.
Yes... what about gRPC? Or do you not plan to change XDR?
I don't think we are going to change anything. BTW, gRPC doesn't have C bindings, so we cannot use it (not saying we want to).

Michal

On Thu, May 04, 2017 at 03:04:33PM +0300, Vasiliy Tolstov wrote:
2017-05-04 15:01 GMT+03:00 Daniel P. Berrange <berrange@redhat.com>:
That doesn't really help memory usage, as you still have to do the XDR encode step to create the data that you then feed to the compressor. It only saves on the amount of data going over the wire, at the cost of burning CPU time on compression.
Yes... what about gRPC? Or do you not plan to change XDR?
I don't see gRPC adding any benefit over what we already use, and it doesn't solve the problem we're discussing here.

Regards,
Daniel