
History: we currently provide two TCP sockets: one clear text with no
auth, the other TLS with x509 client certificate verification for auth.
We also provide a UNIX domain socket relying on file permissions to
restrict access. I have previously provided a PolicyKit patch for the
latter, allowing admin-defined auth policy for UNIX socket access. The
PolicyKit patch was flawed in that it did not provide a way for a client
app to determine ahead of time whether the UNIX socket needed PolicyKit
auth. This required apps to make assumptions prior to connecting, which
is not really viable.

This mail is working towards a more flexible authentication solution,
and goes straight for the big picture by integrating SASL
authentication. This gives us integration with Kerberos (GSSAPI) and PAM
and whatever the hell else SASL supports[1].

The critical decision in all this is the wire protocol & how to adapt it
to slot in authentication in a way that is compatible with existing
clients. As it stands the decision to use clear data vs TLS on the wire
is based on the socket the client connects to - we do a TLS handshake at
the very start. We need to be able to support SASL on all sockets, TCP,
TLS & UNIX, since they can all benefit from stuff like Kerberos. So I
decided we need to integrate at the RPC layer for this to work.

The general protocol picture
----------------------------

It helps to look at qemud/remote_protocol.x when reading this next ...

Every API basically has an RPC call & reply pair. The reply messages
have a status field, either 'REMOTE_OK' or 'REMOTE_ERROR'. In the former
case the normal API return values follow on the wire. In the latter case
a virErrorPtr object is serialized on the wire.

For this patch I decided to add a 3rd code, REMOTE_AUTH. If an
application tries to make an RPC call on a socket requiring
authentication, and has not yet authenticated, the server will return
the REMOTE_AUTH code.
Along with the REMOTE_AUTH code, the server also returns an int
(remote_auth_type) specifying the authentication method to use. I have
defined two:

    enum remote_auth_type {
        REMOTE_AUTH_NONE = 0,
        REMOTE_AUTH_SASL = 1
    };

with plans to add REMOTE_AUTH_POLKIT in a future patch.

A legacy client getting back the REMOTE_AUTH code will just quit the
connection attempt, since it doesn't support authentication. If the
admin so desires they can still provide the TLS socket in a
non-authenticated mode and only turn on SASL for the TCP socket. So the
decision about whether to enable legacy clients is admin controlled.
This is the best we can do.

A new client getting back the REMOTE_AUTH code will then read the
remote_auth_type off the wire. If the requested type is one that the
client supports then it can begin the authentication process, otherwise
we virRaiseError and stop connecting.

The SASL specific picture
-------------------------

For the SASL auth the process involves a multi-phase handshake looking
something like:

        client                                     server

   1.               -> ask for mechanism list ->   new ctx
   2.  new ctx      <- list of mechanisms     <-
   3.  start auth   -> initial auth data      ->   start auth
   4.  step auth    <- reply auth data        <-
   5.               -> step auth data         ->   step auth
       goto 4.      <- reply auth data        <-

Authentication can complete at step 4 in this process, or steps 4 & 5
can repeat an arbitrary number of times.

So, to implement this it is necessary to define 3 new RPC calls internal
to the remote driver/daemon (aka not exposed to public API like the rest
of the RPC calls). These are:

    REMOTE_PROC_AUTH_SASL_INIT  = 66,
    REMOTE_PROC_AUTH_SASL_START = 67,
    REMOTE_PROC_AUTH_SASL_STEP  = 68

These are basically just punting back & forth the data going in & out of
the appropriate SASL APIs:

    sasl_{server,client}_{new,start,step}

See the man pages for more info.

So on the server end, if a socket is configured to require SASL auth,
the server will reject all RPC calls with the REMOTE_AUTH code, *except*
for the three above.
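
To make the server-side rule concrete, here is a minimal sketch of the
gate. It is not the code from the patch. The procedure numbers are the
ones listed above; the numeric values of the status codes, the
client_state struct and the is_sasl_proc/gate_call helpers are
assumptions made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Procedure numbers as quoted in this mail. */
enum {
    REMOTE_PROC_AUTH_SASL_INIT  = 66,
    REMOTE_PROC_AUTH_SASL_START = 67,
    REMOTE_PROC_AUTH_SASL_STEP  = 68
};

/* Reply status codes. The names come from this mail; the numeric
 * values are assumptions for this sketch. */
enum reply_status {
    REMOTE_OK    = 0,
    REMOTE_ERROR = 1,
    REMOTE_AUTH  = 2
};

/* Hypothetical per-client state; a real daemon tracks much more. */
struct client_state {
    bool sasl_required;  /* socket is configured to require SASL */
    bool authenticated;  /* SASL handshake has completed */
};

/* True for the three internal SASL handshake procedures. */
static bool is_sasl_proc(int proc)
{
    return proc == REMOTE_PROC_AUTH_SASL_INIT ||
           proc == REMOTE_PROC_AUTH_SASL_START ||
           proc == REMOTE_PROC_AUTH_SASL_STEP;
}

/* Returns REMOTE_OK if the call may be dispatched, or REMOTE_AUTH if
 * the client must authenticate first. */
enum reply_status gate_call(const struct client_state *c, int proc)
{
    assert(c != NULL);
    if (!c->sasl_required || c->authenticated || is_sasl_proc(proc))
        return REMOTE_OK;
    return REMOTE_AUTH;
}
```

The point of keeping the check in one place is that every ordinary
procedure gets the same treatment, and only the three handshake calls
are let through before authentication.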
Once SASL auth has completed, the server allows all the normal RPC
calls. The effect is basically that the client is not able to call
virConnectOpen & friends until auth has completed.

On the client end, the fun is in the 'call' method of remote_internal.c.
This has been split in two. The original method is now 'onecall', and
there is a thin wrapper named 'call'. 'call' simply invokes onecall, and
if it gets back a REMOTE_AUTH code, will do the SASL handshake & then
re-run the original call. So again the effect is basically that the
first virConnectOpen will cause the auth handshake to be performed.

The SASL implementation details
-------------------------------

This is the bare minimum SASL integration. I have not attempted to hook
up any callbacks for gathering credentials. This basically means that
the only SASL mechanism which works is Kerberos GSSAPI - its credentials
are fetched out-of-band & so don't require callbacks. We do need to
consider callbacks later so we can do username/password auth, and all
the various other methods SASL has.

As well as authentication, some SASL mechanisms provide a way to
negotiate a data encryption layer for the subsequent session. GSSAPI is
the only commonly used mechanism which supports this. I have not
implemented this yet though. What we would do is enable this capability
on the plain TCP socket only. This would make the TCP socket truly
secure, and avoid any extra overhead on the TLS socket or UNIX domain
socket.

I have set the wire packet size for the SASL negotiation to 8192 bytes
at a time. This has been sufficient so far, but I need to validate this
before we commit, because this will be wire ABI sensitive. Or I could
change the XDR spec to be variable length. Either way, this needs
revisiting.

This only deals with authentication. I have not attempted any
authorization controls, so anyone who has a valid Kerberos principal can
connect to the server.
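
Going back to the client-side split between 'call' and 'onecall'
described above, the retry logic looks roughly like this. This is a
simplified sketch rather than the code in remote_internal.c: the
function-pointer signatures, the numeric status values and the fake_*
helpers (a toy stand-in for a server) are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Names from this mail; numeric values are assumed for the sketch. */
enum reply_status { REMOTE_OK = 0, REMOTE_ERROR = 1, REMOTE_AUTH = 2 };

/* Hypothetical hooks standing in for the real wire code. */
typedef enum reply_status (*onecall_fn)(void *conn, int proc);
typedef int (*auth_fn)(void *conn);   /* 0 on success, -1 on failure */

/* Thin wrapper: try the call once; on REMOTE_AUTH run the handshake
 * and re-run the original call exactly once. A second REMOTE_AUTH is
 * handed back to the caller, which should treat it as a failure. */
enum reply_status call(void *conn, int proc,
                       onecall_fn onecall, auth_fn do_auth)
{
    enum reply_status st;

    assert(onecall != NULL && do_auth != NULL);

    st = onecall(conn, proc);
    if (st != REMOTE_AUTH)
        return st;
    if (do_auth(conn) < 0)
        return REMOTE_ERROR;
    return onecall(conn, proc);
}

/* Toy server: answers REMOTE_AUTH until the handshake has run. */
struct fake_conn { int authed; int calls; };

static enum reply_status fake_onecall(void *conn, int proc)
{
    struct fake_conn *c = conn;
    (void)proc;                       /* procedure number is ignored */
    c->calls++;
    return c->authed ? REMOTE_OK : REMOTE_AUTH;
}

static int fake_auth(void *conn)
{
    struct fake_conn *c = conn;
    c->authed = 1;                    /* pretend the handshake succeeded */
    return 0;
}
```

With the toy server, the first call goes over the wire twice (once
rejected, once after the handshake) and later calls go through in one
round trip, which is the behaviour callers see on the first
virConnectOpen.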
We clearly need a local group list for authorization, as we do for the
x509 client certificates. Ultimately we could try LDAP lookups & other
interesting suggestions.

The SASL stuff is detected in configure & enabled/disabled as
appropriate. I need to add extra config file params though, to let the
sysadmin control what socket uses what auth mechanism.

The setup / usage details
-------------------------

As I said, I have only used GSSAPI so far. To use this, all your clients
need to be able to kinit & get a ticket. For testing I have set up my
own personal Kerberos server using Fedora 7 & FreeIPA
(http://freeipa.org/).

Each libvirt server needs to be issued with a Kerberos service
principal. I am using the word 'libvirt' as the service name. The host
part of the service principal must match the FQDN of the server host. So
on your Kerberos server you can issue a service principal by running
the following inside kadmin.local:

   addprinc libvirt/cherry.virt.boston.redhat.com@VIRT.BOSTON.REDHAT.COM
   ktadd -k /tmp/cherry-libvirt.tab libvirt/cherry.virt.boston.redhat.com@VIRT.BOSTON.REDHAT.COM
   quit

Then copy the /tmp/cherry-libvirt.tab file to /etc/libvirt/krb5.tab on
the server in question. (Change the hostnames & REALM as needed, of
course.) If you want to use a mechanism other than GSSAPI, the settings
are in the file /etc/sasl2/libvirt.conf, though again only GSSAPI works
so far.

If you edit the /etc/libvirt/libvirtd.conf file you can enable
listen_tcp=1 and run libvirtd --listen.

From the client machine it should now be possible to do:

   $ kinit berrange@VIRT.BOSTON.REDHAT.COM
   $ virsh --connect xen+tcp://cherry.virt.boston.redhat.com/

You probably screwed something up along the way though, because Kerberos
is nasty like that. If you use --verbose with libvirtd it'll show
details of any errors during authentication. On the client end you can
add a param of ?debug=stderr to the connect URI and it'll print some
info about the auth process to stderr. Check that it shows 'GSSAPI' as a
valid mechanism. If it doesn't, then your server is not configured as it
should be:

 - check the keytab file
 - check you have the cyrus-sasl-gssapi RPM

If that's ok, then check 'klist' on the client - if the first phase of
chatter between the client & the KDC worked you should see the principal
of the server cached on the client. If you don't, then check the KDC
logs in /var/log/krb5kdc.log.

Kerberos/GSSAPI error reporting sucks really bad. That said, I'm sure
I'm missing something, because the errors I get out of SASL are even
worse than normal.

BTW, you can see various 'XXX' in my patch. Basically all of them need
to be addressed before I'd consider this patch suitable to merge.

 configure.in                        |   39 +++
 include/libvirt/virterror.h         |    1
 libvirt.spec.in                     |    3
 qemud/Makefile.am                   |   21 +-
 qemud/internal.h                    |    8
 qemud/libvirtd.init.in              |    3
 qemud/libvirtd.sysconf              |    3
 qemud/qemud.c                       |   29 ++
 qemud/remote.c                      |  370 ++++++++++++++++++++++++++++++++++--
 qemud/remote_dispatch_localvars.h   |    5
 qemud/remote_dispatch_proc_switch.h |   24 ++
 qemud/remote_dispatch_prototypes.h  |    3
 qemud/remote_protocol.c             |  164 +++++++++++++++
 qemud/remote_protocol.h             |   59 +++++
 qemud/remote_protocol.x             |   53 ++++-
 src/Makefile.am                     |    3
 src/remote_internal.c               |  307 +++++++++++++++++++++++++++++
 src/virsh.c                         |    4
 src/virterror.c                     |    6
 tests/Makefile.am                   |    2
 20 files changed, 1073 insertions(+), 34 deletions(-)

Regards,
Dan.

[1] Not entirely true. Some auth methods require credentials to be
fetched from the user, so we can only support methods for which we have
callbacks.
This patch ignores the callback question, so it only supports GSSAPI,
where the credentials are fetched out-of-band. This can be addressed
later.

-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|