On Fri, Jul 07, 2006 at 03:47:59PM +0100, Daniel P. Berrange wrote:
Attached is a patch to significantly increase the scalability /
performance of the xenDaemonLookupByID method. The current
implementation would get a list of all domain names from XenD, and then
iterate doing an HTTP GET on /xend/domain/[name] until the domain with
the matching ID was found. This had O(n) complexity, with the result
that when running on a system with 20 active domains, 'virsh list'
would have O(n^2) complexity, needing ~230 HTTP calls (each of the 20
lookups does one name-list GET plus, on average, ~10 per-domain GETs
before it hits the matching ID), giving a runtime of ~9 seconds.
The patch makes the code do an HTTP GET on /xend/domain/[id], which we
just discovered is a valid URL to access. This makes the method call
O(1), and 'virsh list' is now a saner O(n), completing in ~1 second.
While still not great performance, this is certainly much better. I
think it ought to be possible to optimize the code still further so
that XenD is avoided altogether for simple commands which can be
fulfilled purely with data available from the hypervisor, but that
will need further investigation.
So my previous post considered performance when doing a single
iteration over all active domains. I've now moved on to looking at what
happens when you do multiple iterations - e.g. a monitoring application
sampling state every 5 seconds.
Take code which connects to the HV and then periodically lists all
domains and gets their status, e.g.

    #include <stdio.h>
    #include <libvirt/libvirt.h>

    virConnectPtr conn;
    virDomainPtr dom;
    virDomainInfo info;
    int ids[100];
    int i, j, nid, ret;

    conn = virConnectOpen(NULL);
    for (j = 0 ; j < 5 ; j++) {
        nid = virConnectListDomains(conn, ids, sizeof(ids)/sizeof(int));
        for (i = 0 ; i < nid ; i++) {
            int id = ids[i];
            dom = virDomainLookupByID(conn, id);
            ret = virDomainGetInfo(dom, &info);
            printf("Domain %d: %d CPUs\n", id, info.nrVirtCpu);
        }
    }
Then every single call to virDomainLookupByID will trigger an HTTP GET
to resolve id -> name,uuid. Now, the virConnect data structure contains
a hash table storing all known domains; however, the xenDaemonLookupByID
method only uses the cache once it has done the HTTP GET to resolve
id -> name. This is obviously needed the first time around, but
subsequent calls to xenDaemonLookupByID have no need to do the HTTP GET,
since the cache already contains a virDomain instance with id, name &
uuid resolved.
If you measure the above code snippet on a machine with 20 domains, then
5 iterations of the outer loop will require 101 HTTP GET's and take about
5 seconds to complete.
If we modify the xenDaemonLookupByID method to check the cache
immediately, doing a cache lookup on ID rather than name, we only need
to do the HTTP GET request the very first time a domain is looked up.
Thus, only the first iteration of the outer loop will result in HTTP
calls - you can do as many iterations as you like and there will only
ever be 21 HTTP GETs performed, and the total runtime for 5 iterations
drops to 1 second. The first iteration does HTTP GETs to resolve name &
uuid, but all subsequent iterations only need to do HyperCalls, so are
lightning fast.
I'm attaching a patch which implements this speed up. The only issue
not dealt with here is cache eviction. Xen domain IDs always increase;
no ID is re-used until the IDs wrap around - I can't remember whether
the ID space is 16 or 32-bits, but even with 16 bits we'd need 32,000
domains to have been created before wrap-around becomes an issue. We'd
need to address it long term, but for now I think we can live with
this. Cache eviction could probably be done in the virConnectListDomains
method - ie evict any entries with IDs not currently running.
Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/               -=|
|=- Projects: http://freshmeat.net/~danielpb/                    -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|