History: we currently provide two TCP sockets, one clear text, no auth,
the other TLS with x509 client certificate verification for auth. We
also provide a UNIX domain socket relying on file perms to restrict access.
I have previously provided a PolicyKit patch for the latter, allowing admin
defined auth policy for UNIX socket access.
The PolicyKit patch was flawed in that it did not provide a way for a client
app to determine whether the UNIX socket needed PolicyKit auth ahead of time
or not. This required apps to make assumptions prior to connecting which
is not really viable.
This mail is working towards a more flexible authentication solution, and
goes straight for the big picture by integrating SASL authentication. This
gives us integration with Kerberos (GSSAPI) and PAM and whatever the hell
else SASL supports[1].
The critical decision in all this is the wire protocol & how to adapt it
to slot in authentication in a way that is compatible with existing clients.
As it stands the decision to use clear data vs TLS on the wire is based on
the socket the client connects to - we do a TLS handshake at the very start.
We need to be able to support SASL on all sockets, TCP, TLS & UNIX since
they can all benefit from stuff like Kerberos. So I decided we need to
integrate at the RPC layer for this to work.
The general protocol picture
----------------------------
It helps to look at qemud/remote_protocol.x when reading this next ...
Every API basically has an RPC call & reply pair. The reply messages have
a status field, either 'REMOTE_OK' or 'REMOTE_ERROR'. In the former case
the normal API return values follow on the wire. In the latter case a
virErrorPtr object is serialized on the wire.
For this patch I decided to add a 3rd code, REMOTE_AUTH. If an application
tries to make an RPC call on a socket requiring authentication, and has not
yet authenticated, the server will return the REMOTE_AUTH code. It then also returns
an int (remote_auth_type) specifying the authentication method to use. I have
defined two:
  enum remote_auth_type {
      REMOTE_AUTH_NONE = 0,
      REMOTE_AUTH_SASL = 1
  };
With plans to add REMOTE_AUTH_POLKIT in a future patch.
A legacy client getting back REMOTE_AUTH code will just quit the connection
attempt since they don't support authentication. If the admin so desires
they can still provide the TLS socket in a non-authenticated mode and only
turn on SASL for the TCP socket. So the decision about whether to enable
legacy clients is admin controlled. This is the best we can do.
A new client getting back REMOTE_AUTH code will then read the remote_auth_type
off the wire. If the requested type is one that the client supports then it
can begin the authentication process, otherwise we virRaiseError and stop
connecting.
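As a rough sketch of the client-side decision described above (not the code from the patch): the numeric value of REMOTE_AUTH, the next_action enum and decide_next_action() are all assumptions made for illustration, and treating REMOTE_AUTH_NONE as unsupported in this position is likewise a guess.

```c
/* Reply status codes. REMOTE_AUTH is the new third code; the value 2 is
 * assumed here, check remote_protocol.x for the real one. */
enum remote_status { REMOTE_OK = 0, REMOTE_ERROR = 1, REMOTE_AUTH = 2 };

enum remote_auth_type { REMOTE_AUTH_NONE = 0, REMOTE_AUTH_SASL = 1 };

/* Hypothetical summary of what the client does after reading a reply header. */
enum next_action {
    ACT_READ_RESULT,   /* normal API return values follow on the wire */
    ACT_READ_ERROR,    /* a serialized error object follows */
    ACT_START_SASL,    /* begin the SASL handshake */
    ACT_ABORT          /* unsupported auth type: raise an error, stop connecting */
};

static enum next_action
decide_next_action(int status, int auth_type)
{
    switch (status) {
    case REMOTE_OK:
        return ACT_READ_RESULT;
    case REMOTE_ERROR:
        return ACT_READ_ERROR;
    case REMOTE_AUTH:
        /* Only SASL is understood by this sketch; anything else aborts. */
        return auth_type == REMOTE_AUTH_SASL ? ACT_START_SASL : ACT_ABORT;
    default:
        return ACT_ABORT;
    }
}
```

A legacy client has no REMOTE_AUTH case at all, which is why it simply gives up on the connection.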
The SASL specific picture
-------------------------
For the SASL auth the process involves a multi-phase handshake looking
something like:
         client                                    server
  1.                 -> ask for mechanism list ->  new ctx
  2.     new ctx     <- list of mechanisms     <-
  3.     start auth  -> initial auth data      ->  start auth
  4.     step auth   <- reply auth data        <-
  5.                 -> step auth data         ->  step auth
         goto 4.     <- reply auth data        <-
Authentication can complete at step 4 in this process, or steps 4 & 5
can repeat an arbitrary number of times.
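The shape of that loop can be sketched with a toy model. Everything here is hypothetical: toy_server stands in for the server's SASL context and simply declares success after a fixed number of round trips, where real code would be calling sasl_server_start() and sasl_server_step() and passing the data back over the wire.

```c
#include <stdbool.h>

/* Toy stand-in for the server side of the exchange (hypothetical). */
struct toy_server {
    int rounds_needed;  /* round trips this "mechanism" requires */
    int rounds_seen;    /* round trips handled so far */
};

/* Handles one start/step request; returns true once auth is complete. */
static bool
toy_server_exchange(struct toy_server *s)
{
    s->rounds_seen++;
    return s->rounds_seen >= s->rounds_needed;
}

/* Drives the handshake from the client side. Steps 1-2 (mechanism list)
 * are not modelled. Round trip 1 is the start exchange (steps 3-4), any
 * further round trips are step exchanges (step 5, then back to 4).
 * Returns the number of round trips used, or -1 if the exchange did not
 * complete within max_rounds. */
static int
run_handshake(struct toy_server *s, int max_rounds)
{
    int trips;

    for (trips = 1; trips <= max_rounds; trips++) {
        if (toy_server_exchange(s))
            return trips;
    }
    return -1;
}
```

Capping the number of rounds is a choice made for this sketch, so that a misbehaving peer cannot keep the loop spinning forever.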
So, to implement this it is necessary to define 3 new RPC calls internal
to the remote driver/daemon (i.e. not exposed to the public API like the rest
of the RPC calls). These are:
    REMOTE_PROC_AUTH_SASL_INIT = 66,
    REMOTE_PROC_AUTH_SASL_START = 67,
    REMOTE_PROC_AUTH_SASL_STEP = 68
These are basically just punting back & forth the data going in & out of
the appropriate SASL APIs, sasl_{server,client}_{new,start,step}. See
the man pages for more info.
So on the server end, if a socket is configured to require SASL auth, the
server will reject all RPC calls *except* for those three above with the
REMOTE_AUTH code. Once SASL auth has completed, it will allow all the
normal RPC calls. The effect is basically that the client is not able to
call virConnectOpen & friends until auth has completed.
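A minimal sketch of that server-side gate, assuming a per-client state struct. The struct and the function name proc_allowed() are made up for illustration; only the three procedure numbers come from the patch.

```c
#include <stdbool.h>

enum {
    REMOTE_PROC_AUTH_SASL_INIT = 66,
    REMOTE_PROC_AUTH_SASL_START = 67,
    REMOTE_PROC_AUTH_SASL_STEP = 68
};

/* Hypothetical per-client state kept by the daemon. */
struct client_state {
    bool auth_required;   /* socket is configured to require SASL */
    bool authenticated;   /* SASL handshake has completed */
};

/* Returns true if the RPC may be dispatched; false means the daemon
 * should answer with the REMOTE_AUTH code instead. */
static bool
proc_allowed(const struct client_state *c, int proc)
{
    if (!c->auth_required || c->authenticated)
        return true;

    switch (proc) {
    case REMOTE_PROC_AUTH_SASL_INIT:
    case REMOTE_PROC_AUTH_SASL_START:
    case REMOTE_PROC_AUTH_SASL_STEP:
        return true;    /* the handshake itself must always get through */
    default:
        return false;
    }
}
```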
On the client end, the fun is in the 'call' method of remote_internal.c.
This has been split in two. The original method is now 'onecall', and there
is a thin wrapper named 'call'. 'call' simply invokes 'onecall', and if it
gets back a REMOTE_AUTH code, will do the SASL handshake & then re-run the
original call. So again the effect is basically that the first virConnectOpen
will cause the auth handshake to be performed.
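The wrapper logic amounts to something like the following. This is a simulation rather than the code from remote_internal.c: fake_conn, do_sasl_handshake() and the simplified signatures are all assumptions, and the value of REMOTE_AUTH is assumed to be 2.

```c
enum remote_status { REMOTE_OK = 0, REMOTE_ERROR = 1, REMOTE_AUTH = 2 };

/* Hypothetical connection that counts what happens to it. */
struct fake_conn {
    int authenticated;
    int handshakes;
    int calls;
};

/* Stand-in for the original single-shot RPC: the fake server refuses
 * everything with REMOTE_AUTH until the connection is authenticated. */
static int
onecall(struct fake_conn *c, int proc)
{
    (void)proc;
    c->calls++;
    return c->authenticated ? REMOTE_OK : REMOTE_AUTH;
}

/* Stand-in for the SASL handshake; always succeeds in this sketch. */
static int
do_sasl_handshake(struct fake_conn *c)
{
    c->handshakes++;
    c->authenticated = 1;
    return 0;
}

/* Thin wrapper: try once, authenticate on demand, then retry once. */
static int
call(struct fake_conn *c, int proc)
{
    int status = onecall(c, proc);

    if (status != REMOTE_AUTH)
        return status;
    if (do_sasl_handshake(c) < 0)
        return REMOTE_ERROR;
    return onecall(c, proc);
}
```

Only the first call on a connection pays for the extra round trips; later calls go straight through 'onecall'.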
The SASL implementation details
-------------------------------
This is the bare minimum SASL integration. I have not attempted to hook
up any callbacks for gathering credentials. This basically means that the
only SASL mechanism which works is Kerberos GSSAPI - its credentials are
fetched out-of-band & so don't require callbacks. We do need to consider
callbacks later so we can do username/password auth, and all the various
other methods SASL has.
As well as authentication, some SASL mechanisms provide a way to negotiate
a data encryption layer for the subsequent session. GSSAPI is the only
commonly used mechanism which supports this. I have not implemented this
yet though. What we would do is enable this capability on the
plain TCP socket only. This would make the TCP socket truly secure,
and avoid any extra overhead on the TLS socket or UNIX domain socket.
I have set the wire packet size for the SASL negotiation to 8192 bytes at
a time. This has been sufficient so far, but I need to validate this
before we commit, because this will be wire ABI sensitive. Or I could
change the XDR spec to be variable length. Anyway, this needs revisiting.
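To make the size concern concrete, here is a sketch of the bounds check a fixed-size packet implies. The struct layout, the macro name SASL_DATA_MAX and pack_sasl_data() are hypothetical; only the 8192 figure comes from the patch.

```c
#include <stddef.h>
#include <string.h>

#define SASL_DATA_MAX 8192   /* packet size chosen in this patch */

/* Hypothetical fixed-size container for one chunk of SASL data. */
struct sasl_packet {
    size_t len;
    char data[SASL_DATA_MAX];
};

/* Copies one chunk of SASL output into a packet. Returns 0 on success,
 * or -1 if the mechanism produced more data than fits on the wire. */
static int
pack_sasl_data(struct sasl_packet *pkt, const char *buf, size_t len)
{
    if (len > SASL_DATA_MAX)
        return -1;
    memcpy(pkt->data, buf, len);
    pkt->len = len;
    return 0;
}
```

If some mechanism ever emits a token larger than this, the handshake has to fail, which is the argument for validating the limit or moving to a variable-length XDR type before the wire format is frozen.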
This only deals with authentication. I have not attempted any authorization
controls. So anyone who has a valid Kerberos principal can connect to the
server. We clearly need a local group list as we do for the x509 client
certificates. Ultimately we could try LDAP lookups & other interesting
suggestions.
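One possible shape for such a local list is an exact-match check of the authenticated principal against a configured set. This is purely illustrative: nothing like it exists in the patch, and principal_allowed() and the idea of keeping the list as an array of strings are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if the authenticated principal appears in the allow list.
 * Matching is exact and case-sensitive in this sketch. */
static bool
principal_allowed(const char *principal,
                  const char *const *allowed, size_t nallowed)
{
    size_t i;

    for (i = 0; i < nallowed; i++) {
        if (strcmp(principal, allowed[i]) == 0)
            return true;
    }
    return false;
}
```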
The SASL stuff is detected in configure & enabled/disabled as appropriate.
I need to add extra config file params though to let the sysadmin control
what socket uses what auth mechanism.
The setup / usage details
-------------------------
As I said, I only used GSSAPI so far. To use this all your clients need to
be able to kinit & get a ticket. For testing I have set up my own personal
Kerberos server using Fedora 7 & FreeIPA (http://freeipa.org/). Each libvirt
server needs to be issued with a Kerberos service principal. I am using the
word 'libvirt' as the service name. The host part of the service principal
must match the FQDN of the server host. So on your Kerberos server you can
issue a service principal with
  kadmin.local
  addprinc libvirt/cherry.virt.boston.redhat.com@VIRT.BOSTON.REDHAT.COM
  ktadd -k /tmp/cherry-libvirt.tab libvirt/cherry.virt.boston.redhat.com@VIRT.BOSTON.REDHAT.COM
  quit
Then copy the /tmp/cherry-libvirt.tab file to /etc/libvirt/krb5.tab on the
server in question. (Change the hostnames & REALM as needed of course).
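The naming rule for the service principal is service/FQDN@REALM. A small sketch of building that string, where format_service_principal() is a hypothetical helper and not part of libvirt or Kerberos:

```c
#include <stddef.h>
#include <stdio.h>

/* Writes "service/fqdn@REALM" into out. Returns 0 on success, or -1 if
 * the result would not fit in outlen bytes (including the terminator). */
static int
format_service_principal(char *out, size_t outlen, const char *service,
                         const char *fqdn, const char *realm)
{
    int n = snprintf(out, outlen, "%s/%s@%s", service, fqdn, realm);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With service "libvirt", the host above and realm VIRT.BOSTON.REDHAT.COM this yields the same principal that was passed to addprinc.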
If you want to change the GSSAPI mechanism, the settings are in the file
/etc/sasl2/libvirt.conf. Though again only GSSAPI works so far.
If you edit the /etc/libvirt/libvirtd.conf file you can enable listen_tcp=1
and run libvirtd --listen.
From the client machine it should now be possible to do
  $ kinit berrange@VIRT.BOSTON.REDHAT.COM
  $ virsh --connect xen+tcp://cherry.virt.boston.redhat.com/
You probably screwed something up though along the way because Kerberos
is nasty like that. If you use --verbose with libvirtd it'll show details
of any errors during authentication. On the client end you can add in
a param on the connect URI of ?debug=stderr and it'll print some info
about the auth process to stderr. Check that it shows 'GSSAPI' as a
valid mechanism. If it doesn't, then your server is not configured as it
should be: check the keytab file, and check you have the cyrus-sasl-gssapi RPM.
If that's ok, then check 'klist' on the client - if the first phase of
chatter between the client & the KDC worked you should see the principal
of the server cached on the client. If you don't, then check the KDC logs
in /var/log/krb5kdc.log
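The 'GSSAPI' check on the mechanism list can also be done in code. The sketch below assumes the list arrives as a single string with names separated by spaces or commas; mechlist_has() is a hypothetical helper. It compares whole tokens, so a name that merely starts with the same letters is not a match.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if mech appears as a whole token in list, where tokens
 * are separated by spaces and/or commas. */
static bool
mechlist_has(const char *list, const char *mech)
{
    size_t mlen = strlen(mech);
    const char *p = list;

    while (*p) {
        size_t tok = strcspn(p, " ,");      /* length of current token */

        if (tok == mlen && strncmp(p, mech, mlen) == 0)
            return true;
        p += tok;
        p += strspn(p, " ,");               /* skip separators */
    }
    return false;
}
```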
Kerberos/GSSAPI error reporting sucks really bad. That said, I'm sure
I'm missing something, because the errors I get out of SASL are even worse
than normal.
BTW, you can see various 'XXX' in my patch. Basically all of them need to
be addressed before I'd consider this patch suitable to merge.
configure.in | 39 +++
include/libvirt/virterror.h | 1
libvirt.spec.in | 3
qemud/Makefile.am | 21 +-
qemud/internal.h | 8
qemud/libvirtd.init.in | 3
qemud/libvirtd.sysconf | 3
qemud/qemud.c | 29 ++
qemud/remote.c | 370 ++++++++++++++++++++++++++++++++++--
qemud/remote_dispatch_localvars.h | 5
qemud/remote_dispatch_proc_switch.h | 24 ++
qemud/remote_dispatch_prototypes.h | 3
qemud/remote_protocol.c | 164 +++++++++++++++
qemud/remote_protocol.h | 59 +++++
qemud/remote_protocol.x | 53 ++++-
src/Makefile.am | 3
src/remote_internal.c | 307 +++++++++++++++++++++++++++++
src/virsh.c | 4
src/virterror.c | 6
tests/Makefile.am | 2
20 files changed, 1073 insertions(+), 34 deletions(-)
Regards,
Dan.
[1] Not entirely true. Some auth methods require credentials to be fetched
from the user, so we can only support methods for which we have callbacks.
This patch ignores the callback question, so it only supports GSSAPI, where
the credentials are fetched out-of-band. This can be addressed later.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|