Re: [libvirt] [PATCH 0/9] Add support for (qcow*) volume encryption

Daniel, thanks for the review.

----- "Daniel P. Berrange" <berrange@redhat.com> wrote:
> > New XML tags are defined to represent encryption parameters (currently
> > format and passphrase, more can be added in the future), e.g.
> >   <encryption format='qcow'>
> >     <passphrase>c2lsbHk=</passphrase>
> >   </encryption>
> > (passphrase content uses base64)
>
> I don't think we need 'format=qcow' in there - the guest XML already
> has ability to have a <driver name='qemu' type='qcow2'/> element,
> and the storage vol XML also has a <format type='qcow2'/> element.

The format of encrypted data is conceptually separate from the format of the image: for example, a future LVM crypt implementation is planned that will support on-line encryption key change (re-encrypting the data without needing to stop its users). This can be naturally used on partitions, but it can also be used on a ("loopback-mounted") file stored on another filesystem - which is useful because this gives us on-line encryption key change capability for both partitions and files without creating two separate implementations of the functionality. (The same loopback setup could be done with LUKS today, although it is rather pointless.)

So the fact that qemu treats the unencrypted file using the "raw" format does not imply that no encryption is used, or that any particular encryption implementation is used. For these reasons I think the encryption format should be explicitly specified.

(The same information could be represented in the <driver> element, but there is currently no equivalent generic functionality for the <volume> element. It is also better to keep this information "within the <encryption> node", because this allows the "dumb" client discussed below to simply store the <encryption> node for each volume, and attach it inside <domain> descriptions, without any further understanding of the encryption format.)

> I think it would be good to change the naming of the inner element
> a little too.
> I think having it called 'secret' and then a 'type'
> attribute would be a little nicer. eg
>
>   <encryption>
>     <secret type='passphrase'>c2lsbHk=</secret>
>   </encryption>

OK, I'll change that.

> It might be desirable to add encryption algorithm, but that can probably
> wait since qcow doesn't support multiple algorithms at this time.

Exactly. The expected content of the <encryption> element is "format"-specific.

> > The <encryption> tag can be added to a <volume> node passed to
> > virStorageVolCreateXML() to create an encrypted volume, or to a
> > <disk> node inside a <domain> to specify what encryption parameters to
> > use for a domain. If the domain is persistent, the parameters
> > (including the passphrase) will be saved unencrypted in /etc/libvirtd;
> > the primary use case is to store the parameters outside of libvirtd,
> > (perhaps by virt-manager in a GNOME keyring).
> > - Support for "dumb" clients that don't know anything about encryption
> >   formats and the required parameters: adding an encryption format to libvirt
> >   would automatically make it supported in all clients.
> >
> >   Such a client would only request that a volume should be created when
> >   creating it, and libvirt would choose an appropriate format, parameters
> >   and passphrase/key and return it to the client, who could later pass it
> >   unmodified inside a <domain>.
> >
> >   This requires public API additions to let libvirt return the encryption
> >   information as one of the results of a volume creation operation.
> >
> > - Support for storing the passphrases/keys used by persistent domains
> >   outside of the main XML files, e.g. in a separate passphrase-encrypted
> >   file that must be entered on libvirtd startup.
>
> I think these two points overlap quite a lot.

Not quite: the main case of a "dumb" client would be a large-scale virtualization management software that contains a primary store of encryption information, and gives each node access only to those keys that are currently needed by the node to run its domains; the fact that each node has access to only a limited set of keys prevents an attacker that compromises a single node from reading disk images of all domains managed in the entire site, even if the disk image storage (e.g. unencrypted NFS) does not allow managing access by each node separately.

Such a client must be able to transfer the actual secrets, not only identifiers, to libvirt. (The idea of a "dumb" client, that does not know the specifics of the format, is an additional feature on top, but one that implies that the client does send the actual secrets.)

Storage of secrets in a separate keystore is more important for "local" libvirt deployments, where libvirt manages the primary, long-term, store of the secrets.

> As you say there are two initial approaches to persistence of secrets
>
>  - Keep the keys in the domain XML files
>  - Use a separate keystore
>
> For the separate keystore, there are probably 3 interesting targets
> to consider
>
>  1. A simple text (or pkcs11) file managed by libvirtd on the host
>
>     This would be useful for the privileged libvirtd to use in
>     ad-hoc, small scale deployments. Perhaps allowing it to be
>     shared between a small number of hosts on NFS, or GFS etc
>
>  2. A desktop key agent (eg gnome-keyring)
>
>     This would be useful for the unprivileged libvirtd instances
>     that run in the context of the desktop login session. Users
>     already have SSH, GPG, website keys stored here, so having
>     keys for their VM disks is obviously desirable

Another option here is to let the client store the secrets in gnome-keyring, and transfer them to libvirt only when starting a domain (especially when there are no persistent domains). That doesn't affect the design in any way, though.

>  3. An off-node key management server
>
>     This would be useful for a large scale data center / cluster
>     cloud deployment. This is good to allow management scalability
>     and better separation of responsibilities of administration.
>
> If no keystore is in use, then clearly all keys must go in & out
> of libvirt using the XML, which is pretty much what you're doing
> in this series. I would say though, there is no point in clearing
> the secret from the virStorageVolDefPtr instance/XML after volume
> creation, since the secret is going to be kept in memory forever
> for any guest using the volume. By not clearing the secret, an
> app could create a volume, requesting automatic key generation,
> then just do virStorageVolDumpXML(vol, VIR_STORAGE_VOL_XML_SECURE)
> to extract it and pass it on to the XML for the guest that's created

That is unreliable with current implementation, because a pool refresh creates all information about volumes anew by reading the volume files; therefore, after any pool refresh libvirt "forgets" the secrets directly associated with a volume (not secrets associated with a use of a volume in a domain). If client A creates a volume, client B can refresh the pool before client A is able to read the automatically generated secrets. One of the reasons the patch is clearing the information immediately is to make sure a {virStorageVolCreateXML; virStorageVolGetXMLDesc} always fails and no client is written depending on this racy operation sequence.

There is another, more important, reason why the node should never return encryption data to anyone who can connect to it. Consider the above-described situation of a large-scale virtualization deployment that uses volume image encryption to restrict access of nodes to volume data: If domain migration is supported, the nodes (all of them, or at least nodes within some groups) must be able to connect "read-write" to other nodes.
Yet, if the volume image encryption is to restrict access to volume data, such connections must not allow extracting encryption secrets, or any compromised node could simply ask the other nodes for their secrets and read the data. Thus, even a read-write connection should not allow any client to read secrets. I did not realize this when I started writing the original patches; I'll modify them not to allow returning the secrets using virDomainGetXMLDesc(), which is currently possible.

The only potential exception is volume creation with secret auto-generation (not implemented in these patches), and in this case it is important to ensure that only the client that initiated the volume creation, not other clients, can get the generated secrets. The most natural implementation would be a new operation (virStorageVolCreateXMLFromAndReturn???) that generates the secrets, creates the volume, and returns the secrets in a single step. In this scenario virStorageVolDumpXML does not need to return the secrets.

> If a keystore is in use, then I'd suggest we should have explicit
> APIs for secret management,

As argued below, that might not be necessary, depending on the intended scope of libvirt.

> and forbid the use of the actual secrets
> in the XML everywhere.

I wouldn't necessarily be that strict - if a centralized key management server is used by libvirt, it might still make sense for users to create one-off virtual machines that are encrypted, but not managed by the centralized infrastructure.

> Instead, either pass in a pre-created key UUID when creating a volume
> or defining a guest eg, the XML fragment would look like this
>
>   <encryption>
>     <secret type='keyuuid'>123456-1234-1245-1234-124512345</secret>
>   </encryption>
>
> Internally libvirt would then get the real passphrase from the keystore
> by looking up the associated UUID.
>
> Or request auto-creation of a key by leaving out the data
>
>   <encryption>
>     <secret type='keyuuid'/>
>   </encryption>
>
> Internally libvirt would add a key to its database, and fill in the UUID
> which you could then see via DumpXML.
>
> With a keystore we'd likely need a handful of APIs
>
>  - create a secret providing a passphrase
>  - list all known secret UUIDs
>  - get the passphrase associated with a secret
>  - delete a secret based on UUID
>  - lookup a secret UUID based on a disk path

We might not need all of these. There are three main use cases for storing keys outside of XML:

1) This is a "small-scale deployment", where libvirt is the primary key store, and anyone that is able to connect to libvirtd is implicitly allowed to use the secrets. In this case the keystore can be managed completely automatically - creating a volume, or perhaps using it for the first time, implies storing a secret, and deleting a volume implies deleting a secret. Secrets are identified using a volume-unique object (is that a path, or a UUID?), which is completely transparent to the client. As long as the volume is "known" to libvirt, the client does not even need to specify any <encryption> element when creating a domain.

(1a, a "small-scale deployment" where there are multiple clients that should be protected against each other, and libvirt is the primary key store, does not make sense unless a client account system is added to libvirt. If libvirt can not distinguish between different client accounts, it can only restrict access to secrets by insisting that the client specifies a keyuuid - but the client will need to store the keyuuid values and protect them exactly as if they were the "real" secrets; the client can just as easily store the "real" secrets and skip using keyuuids altogether.)

2) The secrets are primarily stored outside libvirt (to restrict node access to image data), and libvirt stores only secrets for currently defined persistent domains (to support domain autostart). In this case the keystore can be managed completely automatically - persistent domain definition implies storing a secret, deleting the domain implies deleting a secret. Secrets are identified using a (domain, volume) pair, and this identifier is never exposed to the client.

3) An external key store (such as a KMIP server) is used. In this case the management of access rights, listing and deleting secrets, would be performed by interacting with the external key store directly. Secrets are identified by using the external key store identifiers, and the client and libvirtd send these identifiers to each other.

The above seems to indicate the only operations libvirt needs from a keystore are:
 - generate a UUID (in a keystore-specific format)
 - store a secret using a specified UUID
 - get a secret associated with a UUID
 - delete a secret associated with a UUID.
The first two operations could perhaps be combined, depending on the specifics of the external key store (e.g. the client might want to allocate a UUID so that the client, not the node, is the "owner" of the UUID, grant the node access to the UUID, and let the node create a volume and store the secret in the UUID).

> The nice thing about separating the passphrases out of the XML completely
> is that we would then have the ability to do fine grained access control.
> eg, you could give people ability to create new guests / volumes, using
> encryption without them ever specifying, or having access to the secrets.
> A separate role could be given ability to create/list/delete secrets.
> This would also let us associate one secret with many disks, which
> might be a useful scenario for some people.

AFAICS libvirt does not currently have the concept of a "user"; as described above, such features can be handled in an external key store. If you plan to add the "user" concept to libvirt, this would definitely make sense.
    Mirek

On Thu, Jul 23, 2009 at 11:43:09PM -0400, Miloslav Trmac wrote:
The <encryption> tag can be added to a <volume> node passed to virStorageVolCreateXML() to create an encrypted volume, or to a <disk> node inside a <domain> to specify what encryption parameters to use for a domain. If the domain is persistent, the parameters (including the passphrase) will be saved unencrypted in /etc/libvirtd; the primary use case is to store the parameters outside of libvirtd, (perhaps by virt-manager in a GNOME keyring).
- Support for "dumb" clients that don't know anything about encryption formats and the required parameters: adding an encryption format to libvirt would automatically make it supported in all clients.
Such a client would only request that a volume should be created when creating it, and libvirt would choose an appropriate format, parameters and passphrase/key and return it to the client, who could later pass it unmodified inside a <domain>.
This requires public API additions to let libvirt return the encryption information as one of the results of a volume creation operation.
- Support for storing the passphrases/keys used by persistent domains outside of the main XML files, e.g. in a separate passphrase-encrypted file that must be entered on libvirtd startup.
I think these two points overlap quite a lot.
Not quite: the main case of a "dumb" client would be a large-scale virtualization management software that contains a primary store of encryption information, and gives each node access only to those keys that are currently needed by the node to run its domains; the fact that each node has access to only a limited set of keys prevents an attacker that compromises a single node from reading disk images of all domains managed in the entire site, even if the disk image storage (e.g. unencrypted NFS) does not allow managing access by each node separately.
Such a client must be able to transfer the actual secrets, not only identifiers, to libvirt. (The idea of a "dumb" client, that does not know the specifics of the format, is an additional feature on top, but one that implies that the client does send the actual secrets.)
This implies a flow of secrets

   Key server --\
                 +-> libvirt client -> libvirt daemon -> qemu
  MGMT server --/

This does not in fact guarantee that secrets for a particular node are only used on the node for which they are intended, because the key server cannot be sure of what libvirt daemons the libvirt client is connected to. What I am suggesting is that libvirt daemon should communicate with the key server directly in all cases, and take the client out of the loop. The client should merely indicate whether it wants encryption or not, and never be able to directly access any key material itself. With a direct trust relationship between the key server and each libvirtd daemon, you do now have a guarantee that keys are only ever used on the node for which they are intended. You also have the additional guarantee that no libvirt client can ever see any key secrets or passphrases since it has been taken completely out of the loop.

                                     Key server
                                          |
                                          V
  MGMT server -> libvirt client -> libvirt daemon -> qemu
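A minimal sketch of the second arrangement, where the daemon has its own trust relationship with the key server and the client only ever names a key. Everything here (the KeyServer and NodeDaemon classes, the per-node allow list, the method names) is a hypothetical illustration, not an existing libvirt or key-server API.

```python
class KeyServer:
    """Toy key server that releases a key only to nodes authorized for it."""

    def __init__(self):
        self._keys = {}     # key id -> secret bytes
        self._allowed = {}  # key id -> set of node names allowed to read it

    def add_key(self, key_id, secret, nodes):
        self._keys[key_id] = secret
        self._allowed[key_id] = set(nodes)

    def fetch(self, key_id, node):
        # The server decides per node; the libvirt client is never consulted.
        if node not in self._allowed.get(key_id, set()):
            raise PermissionError(f"{node} may not read key {key_id}")
        return self._keys[key_id]


class NodeDaemon:
    """Stands in for one libvirtd instance with its own key-server identity."""

    def __init__(self, name, key_server):
        self.name = name
        self._key_server = key_server

    def start_guest(self, guest, key_id):
        # The client passed only an identifier; the daemon fetches the secret
        # itself and would hand it straight to the hypervisor (assumed step).
        secret = self._key_server.fetch(key_id, self.name)
        return f"{guest} started on {self.name} with a {len(secret)}-byte key"
```

With a key registered for "node-a" only, NodeDaemon("node-a", server).start_guest(...) succeeds, while the same request through a daemon named "node-b" raises PermissionError, and in neither case does the caller see the key bytes.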
Storage of secrets in a separate keystore is more important for "local" libvirt deployments, where libvirt manages the primary, long-term, store of the secrets.
Nope, I think that having a keystore for all scenarios is the desirable end goal. Passing secrets around in the XML should ideally never be done at all - we should aim to always have a keystore that can be used. Secrets in the XML would just be a fallback for rarely used niche cases, disaster recovery, or experimentation.
As you say there are two initial approaches to persistence of secrets
- Keep the keys in the domain XML files
- Use a separate keystore
For the separate keystore, there are probably 3 interesting targets to consider
1. A simple text (or pkcs11) file managed by libvirtd on the host
This would be useful for the privileged libvirtd to use in ad-hoc, small scale deployments. Perhaps allowing it to be shared between a small number of hosts on NFS, or GFS etc
2. A desktop key agent (eg gnome-keyring)
This would be useful for the unprivileged libvirtd instances that run in the context of the desktop login session. Users already have SSH, GPG, website keys stored here, so having keys for their VM disks is obviously desirable
Another option here is to let the client store the secrets in gnome-keyring, and transfer them to libvirt only when starting a domain (especially when there are no persistent domains). That doesn't affect the design in any way, though.
This is undesirable because it requires that any client which wishes to start the guest must have access to the secrets. We really need to be able to have separation of this, so that when we introduce fine grained access control, you can set up separate roles for users who can access / work with secrets, vs users who can start/define guests.
3. An off-node key management server
This would be useful for a large scale data center / cluster cloud deployment. This is good to allow management scalability and better separation of responsibilities of administration.
If no keystore is in use, then clearly all keys must go in & out of libvirt using the XML, which is pretty much what you're doing in this series. I would say though, there is no point in clearing the secret from the virStorageVolDefPtr instance/XML after volume creation, since the secret is going to be kept in memory forever for any guest using the volume. By not clearing the secret, an app could create a volume, requesting automatic key generation, then just do virStorageVolDumpXML(vol, VIR_STORAGE_VOL_XML_SECURE) to extract it and pass it on to the XML for the guest that's created
That is unreliable with current implementation, because a pool refresh creates all information about volumes anew by reading the volume files; therefore, after any pool refresh libvirt "forgets" the secrets directly associated with a volume (not secrets associated with a use of a volume in a domain). If client A creates a volume, client B can refresh the pool before client A is able to read the automatically generated secrets. One of the reasons the patch is clearing the information immediately is to make sure a {virStorageVolCreateXML; virStorageVolGetXMLDesc} always fails and no client is written depending on this racy operation sequence.
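The race described here can be modelled with a toy pool, purely as an illustration. The Pool class and its methods are hypothetical and only mimic the behaviour stated above: a refresh rebuilds volume information from the files, and the files do not contain the secret.

```python
class Pool:
    """Toy storage pool: secrets live only in memory, never in volume files."""

    def __init__(self):
        self.on_disk = {}    # volume name -> format, all a refresh can rediscover
        self.in_memory = {}  # volume name -> {"format": ..., "secret": ...}

    def create_volume(self, name, secret):
        self.on_disk[name] = "qcow2"
        self.in_memory[name] = {"format": "qcow2", "secret": secret}

    def refresh(self):
        # Rebuild every volume definition from the files; the secret is lost
        # because it was never written to the volume file.
        self.in_memory = {
            name: {"format": fmt, "secret": None}
            for name, fmt in self.on_disk.items()
        }

    def get_secret(self, name):
        return self.in_memory[name]["secret"]


pool = Pool()
pool.create_volume("vol1", "auto-generated")  # client A creates the volume
pool.refresh()                                # client B refreshes the pool
lost = pool.get_secret("vol1")                # client A now reads None
```

Whether client A sees the secret depends only on whether B's refresh lands between A's two calls, which is why a create-then-read sequence cannot be relied upon.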
This is not a scenario we need to address with this work. In the future we will have fine grained access control per (user, object, operation) which will allow you to restrict which clients can access which data. We shouldn't try to hack in pseudo access control just for disk encryption. It is simply a documented limitation of the current libvirt access control model that all clients which have authenticated to a libvirtd daemon can access the same data.
There is another, more important, reason why the node should never return encryption data to anyone who can connect to it. Consider the above-described situation of a large-scale virtualization deployment that uses volume image encryption to restrict access of nodes to volume data: If domain migration is supported, the nodes (all of them, or at least nodes within some groups) must be able to connect "read-write" to other nodes.
This is just another argument for taking clients out of the loop completely and having libvirt daemon always talk directly with keystores.
and forbid the use of the actual secrets in the XML everywhere.
I wouldn't necessarily be that strict - if a centralized key management server is used by libvirt, it might still make sense for users to create one-off virtual machines that are encrypted, but not managed by the centralized infrastructure.
True, it might even be useful in some disaster recovery scenarios to be able to pass secrets directly if a keystore is not accessible for some reason.
Instead, either pass in a pre-created key UUID when creating a volume or defining a guest eg, the XML fragment would look like this
  <encryption>
    <secret type='keyuuid'>123456-1234-1245-1234-124512345</secret>
  </encryption>
Internally libvirt would then get the real passphrase from the keystore by looking up the associated UUID.
Or request auto-creation of a key by leaving out the data
  <encryption>
    <secret type='keyuuid'/>
  </encryption>
Internally libvirt would add a key to its database, and fill in the UUID which you could then see via DumpXML.
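The two keyuuid cases (look up a pre-created key, or generate one when the element is empty) can be sketched as follows. This is only an illustration under stated assumptions: the keystore is a plain dict, resolve_encryption is a made-up helper name, and the 16-byte key size is an arbitrary choice rather than anything the thread specifies.

```python
import secrets
import uuid
import xml.etree.ElementTree as ET


def resolve_encryption(xml_fragment, keystore):
    """Return (updated XML, key bytes) for an <encryption> fragment.

    keystore is a dict mapping UUID strings to key bytes; it stands in for
    whatever database libvirt would keep (hypothetical).
    """
    root = ET.fromstring(xml_fragment)
    secret = root.find("secret")
    if secret is None or secret.get("type") != "keyuuid":
        raise ValueError("expected a <secret type='keyuuid'> child")

    key_id = (secret.text or "").strip()
    if key_id:
        # A pre-created key was named: look it up, failing loudly if unknown.
        if key_id not in keystore:
            raise KeyError(f"no such key: {key_id}")
    else:
        # No data given: generate a key, store it, and fill in the UUID so
        # that a later XML dump shows the identifier (never the key itself).
        key_id = str(uuid.uuid4())
        keystore[key_id] = secrets.token_bytes(16)  # key size is an assumption
        secret.text = key_id

    return ET.tostring(root, encoding="unicode"), keystore[key_id]
```

Calling it with an empty <secret type='keyuuid'/> adds one entry to the dict and returns XML carrying the new UUID; feeding that XML back in returns the same key without creating another.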
With a keystore we'd likely need a handful of APIs
- create a secret providing a passphrase
- list all known secret UUIDs
- get the passphrase associated with a secret
- delete a secret based on UUID
- lookup a secret UUID based on a disk path
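As a rough illustration of that handful of operations, here is an in-memory stand-in. The KeyStore class and its method names are hypothetical and are not libvirt API; the associate() helper is an extra added only to show one secret serving several disks.

```python
import uuid


class KeyStore:
    """In-memory stand-in for the proposed keystore operations."""

    def __init__(self):
        self._secrets = {}  # UUID string -> passphrase
        self._by_path = {}  # disk path -> UUID string

    def create(self, passphrase, disk_path=None):
        """Create a secret from a passphrase and return its UUID."""
        key_id = str(uuid.uuid4())
        self._secrets[key_id] = passphrase
        if disk_path is not None:
            self._by_path[disk_path] = key_id
        return key_id

    def associate(self, key_id, disk_path):
        """Attach an existing secret to another disk (one secret, many disks)."""
        if key_id not in self._secrets:
            raise KeyError(key_id)
        self._by_path[disk_path] = key_id

    def list_uuids(self):
        """List all known secret UUIDs."""
        return sorted(self._secrets)

    def get(self, key_id):
        """Get the passphrase associated with a secret."""
        return self._secrets[key_id]

    def delete(self, key_id):
        """Delete a secret and forget every disk path that pointed at it."""
        del self._secrets[key_id]
        self._by_path = {p: k for p, k in self._by_path.items() if k != key_id}

    def lookup_by_path(self, disk_path):
        """Look up a secret UUID based on a disk path."""
        return self._by_path[disk_path]
```

Keeping the path index separate from the secrets is a design choice of this sketch: it lets access to get() be restricted to one role while lookup_by_path() and list_uuids() stay available to others.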
We might not need all of these. There are three main use cases for storing keys outside of XML:
1) This is a "small-scale deployment", where libvirt is the primary key store, and anyone that is able to connect to libvirtd is implicitly allowed to use the secrets. In this case the keystore can be managed completely automatically - creating a volume, or perhaps using it for the first time, implies storing a secret, and deleting a volume implies deleting a secret. Secrets are identified using a volume-unique object (is that a path, or a UUID?), which is completely transparent to the client. As long as the volume is "known" to libvirt, the client does not even need to specify any <encryption> element when creating a domain.
(1a, a "small-scale deployment" where there are multiple clients that should be protected against each other, and libvirt is the primary key store, does not make sense unless a client account system is added to libvirt.)
2) The secrets are primarily stored outside libvirt (to restrict node access to image data), and libvirt stores only secrets for currently defined persistent domains (to support domain autostart). In this case the keystore can be managed completely automatically - persistent domain definition implies storing a secret, deleting the domain implies deleting a secret. Secrets are identified using a (domain, volume) pair, and this identifier is never exposed to the client.
Deleting the domain does not imply deleting a secret, since secrets are really associated with disks, not domains. A disk may be shared by multiple domains. In addition, when you delete a domain, libvirt does not delete the disk. It is the client's responsibility to delete disks after the fact.
In any case items 1 & 2 are really just 2 different implementations of the same concept. There is a keystore, and libvirt talks to it. Whether the keystore is local, or remote from the node is a minor impl detail. Use of a keystore of some format should be our primary goal here, since it takes the client out of the loop. When libvirt does ACLs on clients this is even more important, because when you revoke a client's access to libvirt you can be sure they don't have a record of any of the secrets, since they never had any opportunity to see / use them during the time they were authenticated.
3) An external key store (such as a KMIP server) is used. In this case the management of access rights, listing and deleting secrets, would be performed by interacting with the external key store directly. Secrets are identified by using the external key store identifiers, and the client and libvirtd send these identifiers to each other.
As long as we allow for secrets to be passed in the XML it will always be possible for a client to talk to a keystore, obtain the secrets and pass them in the XML. This is really not a desirable model to aim for by default though, because it requires clients to know about all the different types of keystore. If a client does not know about a particular type of keystore, then it becomes unable to manage guests on that libvirtd. Not to mention the issue of trust of the client & revocation of access.
The nice thing about separating the passphrases out of the XML completely is that we would then have the ability to do fine grained access control. eg, you could give people ability to create new guests / volumes, using encryption without them ever specifying, or having access to the secrets. A separate role could be given ability to create/list/delete secrets. This would also let us associate one secret with many disks, which might be a useful scenario for some people.
AFAICS libvirt does not currently have the concept of a "user"; as described above, such features can be handled in an external key store. If you plan to add the "user" concept to libvirt, this would definitely make sense.
Yes, libvirt will grow the concept of a user in the future, as well as access controls on (user, object, operation) tuples

Regards,
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|