[libvirt] [PATCH] storage: add RBD support to disk source pool translation

Hello,

We needed this to avoid redundancies when defining domains with RBD backends, because the monitor list and authentication secret could not be pulled from the storage pool. I didn't find any bit of documentation to update; it already implies that disks with source type 'volume' work with all kinds of backing pools. But mpath, sheepdog, and gluster would still require an implementation, if that is possible.

Using RBD volumes this way still has a limitation: we cannot list snapshots in libvirt pools, whereas they can be accessed read-only if the domain uses a network source explicitly with a 'snapshot' attribute. How could we implement this so that snapshots are seen as pool volumes? There are real use-cases for this, like attaching to a VM a backup of its own disks. It would also work the same way with ZFS snapshots, but AFAIK not for LVM pools.

Cheers

Thibault VINCENT (1):
  storage: add RBD support to generic disk source pool translation

 src/storage/storage_driver.c | 49 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 48 insertions(+), 1 deletion(-)

--
2.1.4
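For illustration, this is the compact kind of definition the series aims to enable - a disk that merely references a libvirt RBD pool, with monitors and secret pulled from the pool definition at translation time (the pool and volume names here are invented):

    <disk type='volume' device='disk'>
      <driver name='qemu' type='raw'/>
      <source pool='rbdpool' volume='image-1'/>
      <target dev='vda' bus='virtio'/>
    </disk>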

Domains can now reference disks of type 'volume' with an underlying RBD pool. It won't allow mapping snapshots, since pools don't list them yet, only COW clones.

- virStorageTranslateDiskSourcePool: add case to copy RBD attributes
- virStorageAddRBDPoolSourceHost: new helper to copy monitor hosts
---
 src/storage/storage_driver.c | 49 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 48 insertions(+), 1 deletion(-)

diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
index 394e4d4..8c27f4b 100644
--- a/src/storage/storage_driver.c
+++ b/src/storage/storage_driver.c
@@ -3173,6 +3173,39 @@ virStorageAddISCSIPoolSourceHost(virDomainDiskDefPtr def,
 
 
 static int
+virStorageAddRBDPoolSourceHost(virDomainDiskDefPtr def,
+                               virStoragePoolDefPtr pooldef)
+{
+    int ret = -1;
+    size_t i;
+
+    if (pooldef->source.nhost > 0) {
+        def->src->nhosts = pooldef->source.nhost;
+
+        if (VIR_ALLOC_N(def->src->hosts, def->src->nhosts) < 0)
+            goto cleanup;
+
+        for (i = 0; i < def->src->nhosts; i++) {
+            if (VIR_STRDUP(def->src->hosts[i].name,
+                           pooldef->source.hosts[i].name) < 0)
+                goto cleanup;
+
+            if (pooldef->source.hosts[i].port) {
+                if (virAsprintf(&def->src->hosts[i].port, "%d",
+                                pooldef->source.hosts[i].port) < 0)
+                    goto cleanup;
+            }
+        }
+    }
+
+    ret = 0;
+
+ cleanup:
+    return ret;
+}
+
+
+static int
 virStorageTranslateDiskSourcePoolAuth(virDomainDiskDefPtr def,
                                       virStoragePoolSourcePtr source)
 {
@@ -3324,8 +3357,22 @@ virStorageTranslateDiskSourcePool(virConnectPtr conn,
         }
         break;
 
-    case VIR_STORAGE_POOL_MPATH:
     case VIR_STORAGE_POOL_RBD:
+        if (!(def->src->path = virStorageVolGetPath(vol)))
+            goto cleanup;
+
+        def->src->srcpool->actualtype = VIR_STORAGE_TYPE_NETWORK;
+        def->src->protocol = VIR_STORAGE_NET_PROTOCOL_RBD;
+
+        if (virStorageTranslateDiskSourcePoolAuth(def, &pooldef->source) < 0)
+            goto cleanup;
+
+        if (virStorageAddRBDPoolSourceHost(def, pooldef) < 0)
+            goto cleanup;
+
+        break;
+
+    case VIR_STORAGE_POOL_MPATH:
     case VIR_STORAGE_POOL_SHEEPDOG:
     case VIR_STORAGE_POOL_GLUSTER:
     case VIR_STORAGE_POOL_LAST:
--
2.1.4
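In effect, the new VIR_STORAGE_POOL_RBD case rewrites such a 'volume' disk into the network form at runtime: the actual type becomes VIR_STORAGE_TYPE_NETWORK, the protocol is set to rbd, and the monitor hosts and auth secret are copied from the pool. The translated source is roughly what one would get from writing the explicit form by hand - a sketch, with the Ceph pool, monitors, and secret usage invented for illustration:

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw'/>
      <source protocol='rbd' name='libvirt-pool/image-1'>
        <host name='mon1.example.org' port='6789'/>
        <host name='mon2.example.org' port='6789'/>
      </source>
      <auth username='libvirt'>
        <secret type='ceph' usage='ceph-secret'/>
      </auth>
      <target dev='vda' bus='virtio'/>
    </disk>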

Hello,

Any idea about this one? Not sure if it's bad or just getting lost in the backlog, and I'd like to see it in the next release.

Cheers
--
Thibault

On 09/06/2015 01:12, Thibault VINCENT wrote:
Any idea about this one? Not sure if it's bad or just getting lost in the backlog, and I'd like to see it in the next release.
Actually I'm going to submit a new version, please don't merge. The latest patch from John Ferlan about 'secrettype' for iSCSI revealed the same problem for RBD.

Cheers
--
Thibault

On 06/02/2015 09:56 AM, Thibault VINCENT wrote:
Domains can now reference disks of type 'volume' with an underlying RBD pool. It won't allow mapping snapshots, since pools don't list them yet, only COW clones.
- virStorageTranslateDiskSourcePool: add case to copy RBD attributes
- virStorageAddRBDPoolSourceHost: new helper to copy monitor hosts
---
 src/storage/storage_driver.c | 49 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 48 insertions(+), 1 deletion(-)
Started writing this before your most recent response - figured I'd finish so you'd get my current thoughts...

Your cover letter indicates you didn't find any bit of documentation, but I'll point out that formatdomain.html.in describes "<disk type='volume'.../>" and how the "<source..." elements are used for a disk of type volume.

There are no tests in this patch to "show" or "prove" that by simply adding this code libvirt will generate the correct qemu command in order to find the disk and its auth information.

Using gitk to find the commits that added the code it seems you copied from the iSCSI handling in virStorageTranslateDiskSourcePool, you'll find a series of 3 patches - commit ids c00b2f0dd, 1f49b05a8, and 1b4eaa619 - which allowed the direct access. Patches since then have moved the sources around a bit and changed the logic, but the intentions are the same. That might be a good place to start to ensure you have a way to have the domain XML recognize what it is you want and the qemu command to include/find the disk for the domain.

Additionally, I just posted a patch from a recent bz regarding the auth and 'secrettype' setting (or lack thereof), see http://www.redhat.com/archives/libvir-list/2015-June/msg00329.html and its follow-up. Curiously that bz was generated because the domain XML didn't have the 'secrettype' defined, so when formatting for a snapshot there was an error. (OK - so you found this already...)

John
diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
index 394e4d4..8c27f4b 100644
--- a/src/storage/storage_driver.c
+++ b/src/storage/storage_driver.c
@@ -3173,6 +3173,39 @@ virStorageAddISCSIPoolSourceHost(virDomainDiskDefPtr def,
 
 
 static int
+virStorageAddRBDPoolSourceHost(virDomainDiskDefPtr def,
+                               virStoragePoolDefPtr pooldef)
+{
+    int ret = -1;
+    size_t i;
+
+    if (pooldef->source.nhost > 0) {
+        def->src->nhosts = pooldef->source.nhost;
+
+        if (VIR_ALLOC_N(def->src->hosts, def->src->nhosts) < 0)
+            goto cleanup;
+
+        for (i = 0; i < def->src->nhosts; i++) {
+            if (VIR_STRDUP(def->src->hosts[i].name,
+                           pooldef->source.hosts[i].name) < 0)
+                goto cleanup;
+
+            if (pooldef->source.hosts[i].port) {
+                if (virAsprintf(&def->src->hosts[i].port, "%d",
+                                pooldef->source.hosts[i].port) < 0)
+                    goto cleanup;
+            }
+        }
+    }
+
+    ret = 0;
+
+ cleanup:
+    return ret;
+}
+
+
+static int
 virStorageTranslateDiskSourcePoolAuth(virDomainDiskDefPtr def,
                                       virStoragePoolSourcePtr source)
 {
@@ -3324,8 +3357,22 @@ virStorageTranslateDiskSourcePool(virConnectPtr conn,
         }
         break;
 
-    case VIR_STORAGE_POOL_MPATH:
     case VIR_STORAGE_POOL_RBD:
+        if (!(def->src->path = virStorageVolGetPath(vol)))
+            goto cleanup;
+
+        def->src->srcpool->actualtype = VIR_STORAGE_TYPE_NETWORK;
+        def->src->protocol = VIR_STORAGE_NET_PROTOCOL_RBD;
+
+        if (virStorageTranslateDiskSourcePoolAuth(def, &pooldef->source) < 0)
+            goto cleanup;
+
+        if (virStorageAddRBDPoolSourceHost(def, pooldef) < 0)
+            goto cleanup;
+
+        break;
+
+    case VIR_STORAGE_POOL_MPATH:
     case VIR_STORAGE_POOL_SHEEPDOG:
     case VIR_STORAGE_POOL_GLUSTER:
     case VIR_STORAGE_POOL_LAST:
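On the tests John asks for above: a qemuxml2argv-style test would assert that the translated disk comes out as an rbd drive string on the qemu command line, along these lines (a sketch of the expected shape only, with the monitor names and user invented, not output from this patch):

    -drive file=rbd:libvirt-pool/image-1:id=libvirt:auth_supported=cephx\;none:\
    mon_host=mon1.example.org\:6789\;mon2.example.org\:6789,if=virtio,format=raw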

On 09/06/2015 14:45, John Ferlan wrote:
Your cover letter indicates you didn't find any bit of documentation, but I'll point out that formatdomain.html.in describes "<disk type='volume'.../>" and how the "<source..." elements are used for a disk of type volume.
It's a misunderstanding: I never meant to imply there was a lack of information on how to use the feature. Actually it was that very page that led me to test type 'volume' with RBD backends and find out it didn't work yet. Hence supporting other pool types would not require a documentation update, which was my point. Although it's true we could mention what's not supported yet in that area.
There are no tests in this patch to "show" or "prove" that by simply adding this code libvirt will generate the correct qemu command in order to find the disk and its auth information. [...] That might be a good place to start to ensure you have a way to have the domain XML recognize what it is you want and the qemu command to include/find the disk for the domain.
Understood, working on new tests and a proper implementation. It turns out I overlooked a lot of problems.
Curiously that bz was generated because the domain XML didn't have the 'secrettype' defined, so when formatting for a snapshot there was an error. (OK - so you found this already...)
Indeed. The lack of 'secrettype' prevented migrations because of this, and your post came just in time.

Thanks John
--
Thibault

On 06/09/2015 10:47 AM, Thibault VINCENT wrote:
On 09/06/2015 14:45, John Ferlan wrote:
Your cover letter indicates you didn't find any bit of documentation, but I'll point out that formatdomain.html.in describes "<disk type='volume'.../>" and how the "<source..." elements are used for a disk of type volume.
It's a misunderstanding: I never meant to imply there was a lack of information on how to use the feature. Actually it was that very page that led me to test type 'volume' with RBD backends and find out it didn't work yet. Hence supporting other pool types would not require a documentation update, which was my point. Although it's true we could mention what's not supported yet in that area.
There are no tests in this patch to "show" or "prove" that by simply adding this code libvirt will generate the correct qemu command in order to find the disk and its auth information. [...] That might be a good place to start to ensure you have a way to have the domain XML recognize what it is you want and the qemu command to include/find the disk for the domain.
Understood, working on new tests and a proper implementation. It turns out I overlooked a lot of problems.
Curiously that bz was generated because the domain XML didn't have the 'secrettype' defined, so when formatting for a snapshot there was an error. (OK - so you found this already...)
Indeed. The lack of 'secrettype' prevented migrations because of this, and your post came just in time.
Thanks John
I still think I need a tweak on what I posted as an update - the libvirtd restart path is always a bit tricky. The 'pooltype' isn't filled in at XML processing time and virStorageTranslateDiskSourcePool is only called after XML processing (sigh), so relying on it won't work. Working to figure out something generic and will repost my patch.

John

On 09/06/2015 17:33, John Ferlan wrote:
I still think I need a tweak on what I posted as an update - the libvirtd restart path is always a bit tricky. The 'pooltype' isn't filled in at XML processing time and virStorageTranslateDiskSourcePool is only called after XML processing (sigh), so relying on it won't work.
Working to figure out something generic and will repost my patch.
John,

I got stuck on this and would appreciate some advice to deal with the restart. With a revised patch it's still impossible to get libvirtd to reconnect to VMs using an RBD source pool, because the network pool is never active that early. It goes like this:

  qemuProcessReconnect
    virStorageTranslateDiskSourcePool
      virStoragePoolIsActive
        storagePoolIsActive = 0
    goto error
    qemuProcessStop

But seconds later it is seen as active.

Isn't it a bad idea to consider a failure to translate the source pool an error in this code path? It looks quite dangerous whatever the pool protocol may be; after all, the domain is still happily running anyway.

Cheers
--
Thibault

On 06/25/2015 06:21 AM, Thibault VINCENT wrote:
On 09/06/2015 17:33, John Ferlan wrote:
I still think I need a tweak on what I posted as an update - the libvirtd restart path is always a bit tricky. The 'pooltype' isn't filled in at XML processing time and virStorageTranslateDiskSourcePool is only called after XML processing (sigh), so relying on it won't work.
Working to figure out something generic and will repost my patch.
John,
I got stuck on this and would appreciate some advice to deal with the restart. With a revised patch it's still impossible to get libvirtd to reconnect to VMs using an RBD source pool, because the network pool is never active that early.
It goes like this:

  qemuProcessReconnect
    virStorageTranslateDiskSourcePool
      virStoragePoolIsActive
        storagePoolIsActive = 0
    goto error
    qemuProcessStop

But seconds later it is seen as active.
Isn't it a bad idea to consider a failure to translate the source pool an error in this code path? It looks quite dangerous whatever the pool protocol may be; after all, the domain is still happily running anyway.
Cheers
Black magic - you have to go further up the call stack, to what calls qemuProcessReconnect, to fully understand, but along the way I'll bet storagePoolUpdateState is called.

My assumption, without doing too much thinking, is that at some point virStorageBackendRBDRefreshPool will reconnect for you in the storageDriverAutostart path...

While not exactly the same, consider how the iSCSI pool uses the checkPool callback. During storagePoolUpdateState, if the 'checkPool' callback determines the pool is active, then storagePoolUpdateState will call refreshPool with a NULL conn; otherwise, we wait for autostart. So perhaps the RBD backend could have a checkPool that only did some sort of cursory check to ensure there was the infrastructure necessary for the ensuing refreshPool to work.

Of course in the UpdateState path 'conn == NULL', so virStorageBackendRBDOpenRADOSConn won't be happy if it has authdata, but that's a decision you can make in your refreshPool callback... I don't have a conn, thus I cannot do "something".

HTH,
John
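To make John's suggestion concrete, a minimal hypothetical sketch of such a cursory checkPool for the RBD backend might look like this - this is not the posted patch, and the callback signature should be double-checked against the tree:

    static int
    virStorageBackendRBDCheckPool(virConnectPtr conn ATTRIBUTE_UNUSED,
                                  virStoragePoolObjPtr pool,
                                  bool *isActive)
    {
        /* Cursory check only: without a conn we cannot fetch the auth
         * secret, so merely verify the definition carries enough
         * information (monitors and a source pool name) for the
         * ensuing refreshPool to stand a chance. */
        *isActive = pool->def->source.nhost > 0 &&
                    pool->def->source.name != NULL;
        return 0;
    }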

On 06/25/2015 09:16 PM, John Ferlan wrote:
My assumption, without doing too much thinking, is that at some point virStorageBackendRBDRefreshPool will reconnect for you in the storageDriverAutostart path...
While not exactly the same, consider how the iSCSI pool uses the checkPool callback.
During storagePoolUpdateState, if the 'checkPool' callback determines the pool is active, then storagePoolUpdateState will call refreshPool with a NULL conn; otherwise, we wait for autostart.
So perhaps the RBD backend could have a checkPool that only did some sort of cursory check to ensure there was the infrastructure necessary for the ensuing refreshPool to work.
Of course in the UpdateState path 'conn == NULL', so virStorageBackendRBDOpenRADOSConn won't be happy if it has authdata, but that's a decision you can make in your refreshPool callback... I don't have a conn, thus I cannot do "something".
Thanks a lot, that really helped.

Indeed the iSCSI pool does not require a conn because session state is kept outside of libvirt, so a secret is not required by refreshPool when called during storageStateInitialize. The RBD backend, on the other hand, requires the RADOS connection for pretty much anything, and has to pull a secret. Hence during storageStateInitialize we cannot implement a checkPool or refreshPool without a conn handle.

Now I can understand why we can't have secret access until the *Initialize loop completes, and that during *Autostart it's fine - only that it's then too late to recover domains. Maybe we need a new path here. We can't autostart the qemu driver before attempting to reconnect, right, to prevent VMs running twice. But then how could we get a callback in the RBD backend invoked with a valid conn before that? Has this problem been addressed in another way already? I cannot see where the current infrastructure would allow it. It would be nice if some drivers were considered 'essential' and initialized first (or even going dependency-based). How do you think this could evolve?

For now I'll dive into other drivers' code to search for a similar case, and hardcode part of the init for in-house use... time constraints.

Cheers,
--
Thibault VINCENT - Lead System Engineer, Infrastructure - Arkena
Phone: +33 1 58 68 62 38 | Mobile: +33 6 78 97 01 08
27 Blvd Hippolyte Marquès, 94200 Ivry-sur-Seine, France | www.arkena.com
Arkena - Ready to Play