On Thu, Mar 15, 2012 at 10:47:25AM -0600, Eric Blake wrote:
On 03/15/2012 08:31 AM, Eric Blake wrote:
> It may also be worth considering the addition of _just_ a new ranged RPC
> call, but no new public API, by making the public API smart enough to
> automatically call the ranged RPC multiple times to build up a single
> return to the user. After all, if we exposed the ranged call to the
> user, then each client will have to add the logic to make the multiple
> calls; whereas our current limitation is _only_ in the RPC aspect (once
> the data is across the RPC and into the target process' memory space,
> there's no longer a 64k cap to worry about).
Thinking more about this idea:
Suppose we have the following in remote_protocol.x:
Existing:
struct remote_storage_pool_list_volumes_args {
    remote_nonnull_storage_pool pool;
    int maxnames;
};
struct remote_storage_pool_list_volumes_ret {
    remote_nonnull_string names<REMOTE_STORAGE_VOL_NAME_LIST_MAX>; /* insert@1 */
};
...
REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES = 92, /* autogen skipgen priority:high */
(note the change from autogen to skipgen for src/remote)
New:
struct remote_storage_pool_list_volumes_get_transaction_args {
    remote_nonnull_storage_pool pool;
    int maxnames;
};
struct remote_storage_pool_list_volumes_get_transaction_ret {
    int transaction;
};
struct remote_storage_pool_list_volumes_transaction_args {
    remote_nonnull_storage_pool pool;
    int transaction;
    int start;
};
struct remote_storage_pool_list_volumes_transaction_ret {
    remote_nonnull_string names<REMOTE_STORAGE_VOL_NAME_LIST_MAX>; /* insert@1 */
};
...
REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_GET_TRANSACTION = ..., /* skipgen skipgen priority:high */
REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_TRANSACTION = ..., /* skipgen skipgen priority:high */
Then, in pseudocode, we change the currently-generated src/remote
handler for storagePoolListVolumes to do:
    if (maxnames <= REMOTE_STORAGE_VOL_NAME_LIST_MAX) {
        use REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES /* existing code */
    } else {
        int transaction, i = 0;
        pass pool and maxnames through
          REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_GET_TRANSACTION
          to set transaction
        while (i < maxnames) {
            pass pool, transaction, and start = i through
              REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_TRANSACTION
              to get 128 more names
            i += REMOTE_STORAGE_VOL_NAME_LIST_MAX;
        }
    }
Then, in daemon/remote.c, we leave the existing handler (which calls
back into virStoragePoolListVolumes with 256 or fewer entries) untouched,
and add two new hand-written RPC handlers:
the handler for REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_GET_TRANSACTION:
  - register that a sequenced transaction has been started
  - allocate an array of maxnames entries and a position counter, call
    virStoragePoolListVolumes, and tie the result to the sequence
  - return the sequence transaction number
the handler for REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_TRANSACTION:
  - validate that the sequence number is valid, and that the start
    value matches the recorded position
  - prepare the return RPC with the next 128 names from the array tied
    to the sequence, and update the position
  - if the position reached the end, free the array and end the
    sequenced transaction
We also update virnetserver.c to support sequenced commands. The idea
here is that any time a transaction needs more than one RPC message
between client and server, a transaction number can be generated
which reserves memory tied to the transaction; as long as that
transaction number exists, further commands sent with reference to
that transaction can reuse that memory. The transaction is closed
when the server and client complete it, but also closed (freeing the
reserved memory) if the connection disappears. To prevent a
misbehaving client from DoS'ing the server, we could also require that
while a transaction is open, the next RPC call from the client must
reference that transaction, and must arrive within a reasonable time
limit. To some extent, we already have a 'continue' status for a
message; right now, a 'call' can only be answered by 'ok' or 'error',
but this would be a case where 'call' is answered by 'continue' to
state that a transaction is in effect, and where we could reuse the
'serial' for as long as the transaction is in effect.
To me this all feels rather complicated to deal with. The reason
for the RPC limits was to prevent memory usage DOS. The actual
RPC message size is only half of the story though - the actual
storage pool API still requires memory to give you back the string
list. So if the list of volume names consumes 1 MB of memory,
we're going to consume that just by invoking the API. Now copying
the strings into an RPC message will double that to 2 MB.
Using the sequence of RPC calls to iterate over this is only
addressing that second part of memory usage. So I'm not really
feeling that's a huge win, given the complexity it introduces.
I'm inclined to say that if people are creating setups with
1000's of volumes & guests, then they can probably spare the
extra memory for us to increase the main RPC message payload
limit somewhat.
Daniel
--
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|