Re: [libvirt] [PATCH 0/9] Add support for (qcow*) volume encryption

----- "Daniel P. Berrange" <berrange@redhat.com> wrote:
Not quite: the main case of a "dumb" client would be large-scale virtualization management software that holds the primary store of encryption information and gives each node access only to the keys currently necessary for the node to run its domains. Because each node has access to only a limited set of keys, an attacker who compromises a single node cannot read the disk images of all domains managed in the entire site, even if the disk image storage (e.g. unencrypted NFS) does not allow managing access by each node separately.
Such a client must be able to transfer the actual secrets, not just identifiers, to libvirt. (The idea of a "dumb" client that does not know the specifics of the format is an additional feature on top, but one that implies that the client does send the actual secrets.)
This implies a flow of secrets:

  Key server --\
                +-> libvirt client -> libvirt daemon -> qemu
  MGMT server --/
This does not in fact guarantee that secrets for a particular node are only used on the node for which they are intended, because the key server cannot be sure which libvirt daemons the libvirt client is connected to.

A client in this case is the central, fully trusted management system (e.g. oVirt); there is no need to protect against it. A more likely flow is:
  MGMT client (no knowledge of secrets)
     |
     v
  MGMT server + key server (integrated, or separate but cooperating)
     |
     v
  libvirt daemon
     |
     v
  qemu
What I am suggesting is that the libvirt daemon should communicate with the key server directly in all cases, and take the client out of the loop. The client should merely indicate whether it wants encryption or not, and never be able to directly access any key material itself. With a direct trust relationship between the key server and each libvirtd daemon, you do now have a guarantee that keys are only ever used on the node for which they are intended. You also have the additional guarantee that no libvirt client can ever see any key secrets or passphrases, since it has been taken completely out of the loop.

As far as I understand it, the whole point of virtual machine encryption is that the nodes are _not_ trusted, and different encryption keys protect data on different nodes.
If all nodes are trusted, what additional functionality does volume encryption with per-volume keys provide? If the nodes are trusted to read only data from the domains they currently host, they could just as well use an encrypted local hard drive to store all images, or share a single key to encrypt all images stored on an NFS/SAN.
  Key server
     |
     v
  MGMT server -> libvirt client -> libvirt daemon -> qemu
Storage of secrets in a separate keystore is more important for "local" libvirt deployments, where libvirt manages the primary, long-term store of the secrets.
Nope, I think that having a keystore for all scenarios is the desirable end goal. Passing secrets around in the XML should ideally never be done at all - we should aim to always have a keystore that can be used. Secrets in the XML would just be a fallback for rarely used niche cases, disaster recovery, or experimentation.
That means that any deployment of more than one node with migration requires a separate server providing a shared keystore, even if there is only one client managing the nodes. The N nodes x N "management consoles" case requires a centralized key store, but it's not necessary to impose one on the "1 management console" case.
2. A desktop key agent (e.g. gnome-keyring)
This would be useful for the unprivileged libvirtd instances that run in the context of the desktop login session. Users already have SSH, GPG, and website keys stored there, so having keys for their VM disks is obviously desirable.
Another option here is to let the client store the secrets in gnome-keyring, and transfer them to libvirt only when starting a domain (especially when there are no persistent domains). That doesn't affect the design in any way, though.
This is undesirable because it requires that any client which wishes to start the guest have access to the secrets. We really need to be able to have separation here, so that when we introduce fine-grained access control, you can set up separate roles for users who can access/work with secrets vs. users who can start/define guests.
3. An off-node key management server
This would be useful for a large-scale data center / cluster / cloud deployment. It allows for management scalability and better separation of administrative responsibilities.
If no keystore is in use, then clearly all keys must go in and out of libvirt using the XML, which is pretty much what you're doing in this series. I would say, though, that there is no point in clearing the secret from the virStorageVolDefPtr instance/XML after volume creation, since the secret is going to be kept in memory forever for any guest using the volume. By not clearing the secret, an app could create a volume requesting automatic key generation, then just do virStorageVolDumpXML(vol, VIR_STORAGE_VOL_XML_SECURE) to extract it and pass it on in the XML for the guest that's created.
That is unreliable with the current implementation, because a pool refresh creates all information about volumes anew by reading the volume files; therefore, after any pool refresh libvirt "forgets" the secrets directly associated with a volume (not secrets associated with a use of a volume in a domain). If client A creates a volume, client B can refresh the pool before client A is able to read the automatically generated secrets. One of the reasons the patch clears the information immediately is to make sure a {virStorageVolCreateXML; virStorageVolGetXMLDesc} sequence always fails, so that no client is written depending on this racy operation sequence.
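To make the race concrete, here is the sequence in terms of the C API (a sketch only; error handling is omitted, and VIR_STORAGE_VOL_XML_SECURE is the flag proposed in this series, not an existing libvirt constant):

  #include <libvirt/libvirt.h>

  /* Racy create-then-read-secret sequence discussed above. */
  char *fetch_generated_secret(virStoragePoolPtr pool, const char *volxml)
  {
      virStorageVolPtr vol = virStorageVolCreateXML(pool, volxml, 0);
      if (vol == NULL)
          return NULL;

      /* If any other client triggers a pool refresh between these two
       * calls, the automatically generated secret has already been
       * forgotten and the XML below no longer contains it. */
      char *xml = virStorageVolGetXMLDesc(vol, VIR_STORAGE_VOL_XML_SECURE);
      virStorageVolFree(vol);
      return xml;  /* caller would parse <encryption> out of this XML */
  }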
There is another, more important, reason why the node should never return encryption data to anyone who can connect to it. Consider the above-described situation of a large-scale virtualization deployment that uses volume image encryption to restrict nodes' access to volume data: if domain migration is supported, the nodes (all of them, or at least nodes within some groups) must be able to connect "read-write" to other nodes.
This is just another argument for taking clients out of the loop completely and having the libvirt daemon always talk directly to keystores.
With a keystore we'd likely need a handful of APIs:

 - create a secret, providing a passphrase
 - list all known secret UUIDs
 - get the passphrase associated with a secret
 - delete a secret based on UUID
 - lookup a secret UUID based on a disk path
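In libvirt's public API style, those entry points could look something like the sketch below (all names and signatures here are hypothetical; none of them existed at the time of writing):

  /* Hypothetical keystore entry points matching the list above. */
  typedef struct _virSecret *virSecretPtr;

  /* create a secret, providing a passphrase */
  virSecretPtr virSecretCreateXML(virConnectPtr conn, const char *xml,
                                  unsigned int flags);

  /* list all known secret UUIDs */
  int virConnectListSecrets(virConnectPtr conn, char **uuids, int maxuuids);

  /* get the passphrase associated with a secret */
  unsigned char *virSecretGetValue(virSecretPtr secret, size_t *value_size,
                                   unsigned int flags);

  /* delete a secret based on UUID */
  int virSecretUndefine(virSecretPtr secret);

  /* lookup a secret UUID based on a disk path */
  virSecretPtr virSecretLookupByVolumePath(virConnectPtr conn,
                                           const char *path);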
We might not need all of these. There are three main use cases for storing keys outside of XML:
1) This is a "small-scale deployment", where libvirt is the primary key store, and anyone able to connect to libvirtd is implicitly allowed to use the secrets. In this case the keystore can be managed completely automatically - creating a volume, or perhaps using it for the first time, implies storing a secret, and deleting a volume implies deleting a secret. Secrets are identified using a volume-unique object (is that a path, or a UUID?), which is completely transparent to the client. As long as the volume is "known" to libvirt, the client does not even need to specify any <encryption> element when creating a domain.
(1a, a "small-scale deployment" where there are multiple clients that should be protected against each other, with libvirt as the primary key store, does not make sense unless a client account system is added to libvirt.)
2) The secrets are primarily stored outside libvirt (to restrict node access to image data), and libvirt stores only secrets for currently defined persistent domains (to support domain autostart). In this case the keystore can be managed completely automatically - persistent domain definition implies storing a secret, deleting the domain implies deleting a secret. Secrets are identified using a (domain, volume) pair, and this identifier is never exposed to the client.
Deleting the domain does not imply deleting a secret, since secrets are really associated with disks, not domains.

It does, in this case - if the domain is not running on the node, the secret should not be stored on the node at all.

A disk may be shared by multiple domains.

If a disk is shared, there will be multiple (domain, volume) key ID pairs, and deleting one instance of the secret does not delete the others.
In addition, when you delete a domain, libvirt does not delete the disk. It is the client's responsibility to delete disks after the fact.

In this scenario the client does the long-term key storage. libvirt does not need to - and shouldn't - store the key merely because a volume exists.
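To illustrate the distinction being argued here (an invented sketch, not libvirt code): if each stored secret is keyed by a (domain, volume) pair, a shared disk simply yields several independent entries, and deleting one domain's entries leaves the other domains' copies intact:

  #include <string.h>

  /* Invented illustration: one entry per (domain, volume) pair. */
  struct secret_entry {
      char domain_uuid[37];    /* which domain this copy belongs to */
      char volume_key[256];    /* which volume it unlocks */
      unsigned char *value;    /* the key material itself */
      size_t value_len;
  };

  /* Deleting a domain drops only that domain's entries; entries for
   * the same volume owned by other domains survive. */
  size_t drop_domain_secrets(struct secret_entry *entries, size_t n,
                             const char *domain_uuid)
  {
      size_t kept = 0;
      for (size_t i = 0; i < n; i++)
          if (strcmp(entries[i].domain_uuid, domain_uuid) != 0)
              entries[kept++] = entries[i];
      return kept;
  }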
In any case, items 1 & 2 are really just two different implementations of the same concept. There is a keystore, and libvirt talks to it. Whether the keystore is local to or remote from the node is a minor implementation detail.

Whether libvirt can request any secret, even one it does not need to know, is not a minor detail.
Use of a keystore of some form should be our primary goal here, since it takes the client out of the loop. When libvirt does ACLs on clients this is even more important, because when you revoke a client's access to libvirt you can be sure they don't have a record of any of the secrets, since they never had any opportunity to see or use them during the time they were authenticated.
3) An external key store (such as a KMIP server) is used. In this case the management of access rights, listing and deleting secrets, would be performed by interacting with the external key store directly. Secrets are identified by using the external key store identifiers, and the client and libvirtd send these identifiers to each other.
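For example, the <encryption> element in a domain or volume document might then carry only an opaque key-store identifier instead of the passphrase itself (the element layout and the UUID below are illustrative guesses, not a settled format):

  /* Illustrative only: referencing a secret by key-store identifier
   * instead of embedding the passphrase in the XML. */
  static const char *encryption_xml =
      "<encryption format='qcow'>\n"
      "  <secret type='passphrase' uuid='c1f11a6d-8c5d-4a3e-ae7f-2f5df0a2bb90'/>\n"
      "</encryption>\n";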
As long as we allow for secrets to be passed in the XML it will always be possible for a client to talk to a keystore, obtain the secrets and pass them in the XML. This is really not a desirable model to aim for by default though, because it requires clients to know about all the different types of keystore. If a client does not know about a particular type of keystore, then it becomes unable to manage guests on that libvirtd. Not to mention the issue of trust of the client & revocation of access.
Yes, libvirt will grow the concept of a user in the future, as well as access controls on (user, object, operation) tuples.

Overall, it seems this all boils down to one thing - is libvirt intended to be a simple "virtualization server" that locally performs requested operations, managed by a separate "virtualization management" client that has full knowledge about the site (nodes, storage, secrets, access rights), with clients connecting to the "virtualization management" software instead of libvirtd, or a complete "virtualization management solution" that integrates all site-wide knowledge, accessed by dumb clients using the libvirt protocol?
My reading of http://www.libvirt.org/goals.html implies the former; you seem to be describing the latter model. Have I misunderstood the role of libvirt in a large-scale deployment?

Thank you,
Mirek

On Fri, Jul 24, 2009 at 07:25:54AM -0400, Miloslav Trmac wrote:
----- "Daniel P. Berrange" <berrange@redhat.com> wrote:
Not quite: the main case of a "dumb" client would be a large-scale virtualization management software that contains a primary store of encryption information, and gives each node access only to those keys that are currently necessary by the node to run its domains; the fact that each node has access to only a limited set of keys prevents an attacker that compromises a single node from reading disk images of all domains managed in the entire site, even if the disk image storage (e.g. unencrypted NFS) does not allow managing access by each node separately.
Such a client must be able to transfer the actual secrets, not only identifiers, to libvirt. (The idea of a "dumb" client, that does not know the specifics of the format, is an additional feature on top, but one that implies that the client does send the actual secrets.)
This implies a flow of secrets
Key server --\ +-> libvirt client -> libvirt daemon -> qemu MGMT server --/
This does not in fact guarentee that secrets for a particular node are only used on the node for which they are intended, because the key server cannot be sure of what libvirt daemons the libvirt client is connected to.
A client in this case is the central, fully trusted, management system (e.g. oVirt), there is no need to protect against it. A more likely flow is
MGMT client (no knowledge of secrets) | v MGMT server + key server (integrated or separate but cooperating) | v libvirt daemon | v qemu
What I am suggesting is that libvirt daemon should communicate with the key server directly in all cases, and take the client out of the loop. The client should merely indicate whether it wants encryption or not, and never be able to directly access any key material itself. With a direct trust relationship between the key server and each libvirtd daemon, you do now have a guarentee that keys are only ever used on the node for which they are intended. You also have the additional guarentee that no libvirt client can ever see any key secrets or passphrases since it has been taken completely out of the loop.
As far as I understand it, the whole point of virtual machine encryption is that the nodes are _not_ trusted, and different encryption keys protect data on different nodes.
I did not mean to imply that libvirtd on a node should have access to *all* secrets. Each libvirtd daemon has its own identity, and when talking to a key server it would authenticate, and the key server would only allow it access to some subset of keys - ie, only the keys necessary for the VMs it needs to run.

If you include the secrets in the XML for a guest, that does not ensure those secrets are only accessible to that host. For example, when performing migration, the XML doc for a guest is passed directly from the source libvirtd to the destination libvirtd. So an admin could ssh into a node, run 'virsh migrate', and the guest & secrets would be transferred to another host. If you have each libvirtd requesting secrets directly from the keystore at the time it starts a guest, then should an admin issue a migrate command manually, the destination libvirtd would still be unable to access the secrets of the incoming VM, since the keystore will not have been configured to allow it access.

We also have to bear in mind how the MGMT server communicates with the libvirt daemon. One likely option is using a messaging service, which offers authenticity but not secrecy - ie, libvirt receiving a request off the message bus can be sure the request came from the mgmt server, but cannot be sure it wasn't seen by other users during transport. Thus by including secrets in the XML, you could be exposing them to the message bus administrator.

Taking your diagram, I think I would generalize it still further to allow for the mgmt server to be optionally separate from the key server, and to introduce a "message bus" between the MGMT server & libvirt. Really pushing the limits of ASCII art...

  MGMT client
      |
      V
  MGMT server <---> Key server
      |                ^
      V                |
  message bus          |
      |                |
      V                |
  libvirt daemon <-----/
      |
      V
    QEMU

To start a VM, the sequence of steps would be:

 1. MGMT client says 'boot vm X on node Y' to MGMT server
 2. MGMT server says 'allow node Y access to secrets for VM X' to key server
 3. MGMT server puts 'start vm X' message on the message bus to node Y
 4. libvirt on node Y says 'give secrets for vm X' to key server
 5. libvirt spawns QEMU, passing secrets

If node E intercepts the 'start vm X' message from the bus, it cannot see the secrets directly. If node E asks the key server for the secrets for VM X, it will be refused, since the MGMT server has not authorized it to see them. If a sysadmin on node N tries to migrate the VM to node E, node E will not be able to start the VM, since it again has not been authorized to fetch the secrets.

To actually allow a migration, the sequence of steps would be:

 1. MGMT client says 'migrate vm X from node Y to node Z' to MGMT server
 2. MGMT server says 'allow node Z access to secrets S' to key server
 3. MGMT server puts 'migrate vm X to node Z' message on the message bus to node Y
 4. libvirt on node Y issues the migration operation, passing XML for the VM to node Z
 5. libvirt on node Z says 'give secrets for vm X' to key server
 6. libvirt on node Z spawns QEMU, passing secrets
 7. MGMT server says 'deny node Y access to secrets for vm X' to key server

I believe this deals with all the use cases, specifically:

 - A node can only see secrets for VMs it is configured to run
 - The MGMT server does not ever need to see the secrets itself. It merely controls which nodes can see them, and can request generation of new secrets
 - Messages between the MGMT server & libvirtd do not need to be encrypted, since they don't include secrets
 - Other users who authenticate to libvirt on a node cannot move a VM to an unauthorized node, since it can't see the secrets
 - VM save memory images (which include the full XML) do not ever expose the secrets, so a VM save image cannot be restored on an unauthorized node
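A minimal sketch of the grant table that makes these guarantees hold (all names invented for illustration): the MGMT server adds and removes (node, vm) grants, and the key server checks them on every fetch, so an intercepted 'start vm X' message or a manually issued migration is not enough to obtain a key:

  #include <stdbool.h>
  #include <string.h>

  struct grant { char node[64]; char vm[64]; };

  static struct grant grants[1024];
  static size_t ngrants;

  /* MGMT server: 'allow node Y access to secrets for VM X' (step 2) */
  void grant_access(const char *node, const char *vm)
  {
      strncpy(grants[ngrants].node, node, sizeof(grants[ngrants].node) - 1);
      strncpy(grants[ngrants].vm, vm, sizeof(grants[ngrants].vm) - 1);
      ngrants++;
  }

  /* Key server: checked when libvirtd says 'give secrets for vm X'
   * (step 4). An unauthorized node E is simply refused. */
  bool may_fetch_secret(const char *node, const char *vm)
  {
      for (size_t i = 0; i < ngrants; i++)
          if (strcmp(grants[i].node, node) == 0 &&
              strcmp(grants[i].vm, vm) == 0)
              return true;
      return false;
  }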
> If all nodes are trusted, what additional functionality does volume encryption with per-volume keys provide? If the nodes are trusted to read only data from the domains they currently host, they could just as well use an encrypted local hard drive to store all images, or share a single key to encrypt all images stored on an NFS/SAN.
I think the outline above should address your concern here.
> Yes, libvirt will grow the concept of a user in the future, as well as access controls on (user, object, operation) tuples.
>
> Overall, it seems this all boils down to one thing - is libvirt intended to be a simple "virtualization server" that locally performs requested operations, managed by a separate "virtualization management" client that has full knowledge about the site (nodes, storage, secrets, access rights), with clients connecting to the "virtualization management" software instead of libvirtd, or a complete "virtualization management solution" that integrates all site-wide knowledge, accessed by dumb clients using the libvirt protocol?
libvirt is intended to deal with resources on a single node, and provide a means for command & control of that node. I don't think anything I have said suggests/requires that libvirt needs full knowledge of the whole site. What we're dealing with here is a layered architecture that can be deployed in a variety of different scenarios.

The MGMT server authenticates & authorizes connections from MGMT clients. libvirtd can be made to authenticate & authorize connections from the MGMT server and possibly other client apps - that's a deployment decision of the specific MGMT application/administrator.

libvirtd does not need to know the secrets of all VMs for all nodes; it only needs the ability to use a key service which can provide the secrets it needs. That key service may be local to the node, it may be on storage shared by several nodes, or it may be over the network on a remote node. A remote key service should of course authenticate/authorize each libvirtd that connects and requests access to keys, to ensure a node cannot access keys for VMs on another node.

By providing a pluggable key service backend for libvirt I believe we can satisfy all the different deployment scenarios that have been discussed thus far, without needing to favour one. I think it is very important that all these deployment scenarios can be supported without including the secrets in the XML config, since there are far too many ways in which the XML config itself may be exposed to undesirable places.

Daniel
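Such a pluggable backend might amount to a small vtable inside libvirtd (hypothetical; libvirt had no such driver interface for secrets at the time), with one implementation per keystore type - local store, shared storage, or remote key service:

  /* Hypothetical pluggable key-service backend for libvirtd; names
   * are invented for this sketch. */
  #include <stddef.h>

  typedef struct _virKeystoreDriver {
      const char *name;

      /* authenticate this libvirtd to the keystore, if remote */
      int (*open)(const char *uri);

      /* fetch the secret for a volume; used only inside the daemon,
       * never handed back to libvirt clients */
      int (*getSecret)(const char *volumeId,
                       unsigned char **value, size_t *valueLen);

      /* store/erase secrets as volumes are created and deleted */
      int (*putSecret)(const char *volumeId,
                       const unsigned char *value, size_t valueLen);
      int (*deleteSecret)(const char *volumeId);
  } virKeystoreDriver;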

Replying to my own message...

On Fri, Jul 24, 2009 at 02:04:33PM +0100, Daniel P. Berrange wrote:
> We also have to bear in mind how the MGMT server communicates with the libvirt daemon. One likely option is using a messaging service, which offers authenticity but not secrecy - ie, libvirt receiving a request off the message bus can be sure the request came from the mgmt server, but cannot be sure it wasn't seen by other users during transport. Thus by including secrets in the XML, you could be exposing them to the message bus administrator.
[snip]
> - A node can only see secrets for VMs it is configured to run
> - The MGMT server does not ever need to see the secrets itself. It merely controls which nodes can see them, and can request generation of new secrets
> - Messages between the MGMT server & libvirtd do not need to be encrypted, since they don't include secrets
> - Other users who authenticate to libvirt on a node cannot move a VM to an unauthorized node, since it can't see the secrets
> - VM save memory images (which include the full XML) do not ever expose the secrets, so a VM save image cannot be restored on an unauthorized node
[snip]
> ... without needing to favour one. I think it is very important that all these deployment scenarios can be supported without including the secrets in the XML config, since there are far too many ways in which the XML config itself may be exposed to undesirable places.
Now that I look back on this, the implications of storing any passphrases or secrets or keys in the XML are just horrific. There are so many places in which this data would leak out to untrusted sources.

Just look at virt-manager, virt-install, virsh, libvirtd, and indeed libvirt.so itself. All of these tools have logging/debugging options which cause the full XML docs of guest domains and storage volumes to be sent to log files, or syslog. When debugging issues reported in bugzilla we pretty much require that people provide logs from libvirt.so and the apps involved.

This presents such an unacceptably high risk of compromising secrets that IMHO we should not add any support in libvirt for storing secrets in the XML whatsoever. We should go straight for one of two options:

 - an API for clients to directly create/delete/list/generate secrets

and/or

 - a libvirtd backend that talks to a key server to indirectly fetch secrets

and in the XML docs always reference keys based on a unique identifier of some form.

Regards,
Daniel