[libvirt] ZFS backend does fail if used on non top level pools

Hi, by discussing on bug [1] we found an issue in the ZFS storage backend. When one defines a zfs pool through XML (virsh pool-create --build) a top level ZFS pool will be created and works fine. The virsh pool-define-as counterpart to this is: $ fallocate -l 100M /tmp/Mzfs $ sudo zpool create Mzfs /tmp/Mzfs $ virsh pool-define-as --name zfs --source-name Mzfs --type zfs $ virsh pool-start zfs $ virsh vol-create-as --pool zfs --name vol1 --capacity 10M $ virsh vol-list --pool zfs vol1 /dev/zvol/Mzfs/vol1 $ virsh pool-refresh zfs Pool zfs refreshed $ virsh vol-list --pool zfs vol1 /dev/zvol/Mzfs/vol1 Ok, all fine so far. But if one wants to use a zfs-FS as sub-pool things work a while but fail as soon as pool info is refreshed. $ fallocate -l 100M /tmp/Xzfs $ sudo zpool create Xzfs /tmp/Xzfs $ sudo zfs create Xzfs/images $ virsh pool-define-as --name Xzfs --source-name Xzfs/images --type zfs I ended with the pool not being started (expected after pool-define-as), but then $ virsh pool-start Xzfs $ virsh vol-create-as --pool Xzfs --name vol1 --capacity 10M $ virsh vol-list --pool Xzfs vol1 /dev/zvol/Xzfs/images/vol1 $ virsh pool-refresh zfs Pool zfs refreshed $ virsh vol-list --pool Xzfs # no output anymore We happened to find this in the libvirt log: error : virCommandWait:2601 : internal error: Child process (/sbin/zpool get -Hp health,size,free,allocated Xzfs/images) unexpected exit status 1: cannot open 'Xzfs/images': invalid character '/' in pool name So it seems the data refresh would need to become aware of filesytems and strip this to the basename before doing the call. This comes from src/storage/storage_backend_zfs.c cmd = virCommandNewArgList(ZPOOL, "get", "-Hp", "health,size,free,allocated", def->source.name, NULL); # fails zfs list -Hp -t volume -r -o name,volsize,refreservation Xzfs/images # would work zfs list -Hp -t volume -r -o name,volsize,refreservation Xzfs But there might be more to it than what I see, either more changes needed to fully work with subpools or intentionally not working with them. For the latter we should still consider improving the error handling, like refusing right away with a reasonable message on "virsh pool-define-as". I wondered if one with more experience in that area of libvirt would be willing to pick this up or has an opinion why it would not need to be changed that I might be missing? [1]: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1767997 -- Christian Ehrhardt Software Engineer, Ubuntu Server Canonical Ltd

On Wed, May 02, 2018 at 09:36:10AM +0200, Christian Ehrhardt wrote:
Hi, by discussing on bug [1] we found an issue in the ZFS storage backend. When one defines a zfs pool through XML (virsh pool-create --build) a top level ZFS pool will be created and works fine. The virsh pool-define-as counterpart to this is:
$ fallocate -l 100M /tmp/Mzfs $ sudo zpool create Mzfs /tmp/Mzfs $ virsh pool-define-as --name zfs --source-name Mzfs --type zfs $ virsh pool-start zfs $ virsh vol-create-as --pool zfs --name vol1 --capacity 10M $ virsh vol-list --pool zfs vol1 /dev/zvol/Mzfs/vol1 $ virsh pool-refresh zfs Pool zfs refreshed $ virsh vol-list --pool zfs vol1 /dev/zvol/Mzfs/vol1
Ok, all fine so far. But if one wants to use a zfs-FS as sub-pool things work a while but fail as soon as pool info is refreshed.
$ fallocate -l 100M /tmp/Xzfs $ sudo zpool create Xzfs /tmp/Xzfs $ sudo zfs create Xzfs/images $ virsh pool-define-as --name Xzfs --source-name Xzfs/images --type zfs I ended with the pool not being started (expected after pool-define-as), but then $ virsh pool-start Xzfs $ virsh vol-create-as --pool Xzfs --name vol1 --capacity 10M $ virsh vol-list --pool Xzfs vol1 /dev/zvol/Xzfs/images/vol1 $ virsh pool-refresh zfs Pool zfs refreshed $ virsh vol-list --pool Xzfs # no output anymore
We happened to find this in the libvirt log: error : virCommandWait:2601 : internal error: Child process (/sbin/zpool get -Hp health,size,free,allocated Xzfs/images) unexpected exit status 1: cannot open 'Xzfs/images': invalid character '/' in pool name
So it seems the data refresh would need to become aware of filesytems and strip this to the basename before doing the call. This comes from src/storage/storage_backend_zfs.c
cmd = virCommandNewArgList(ZPOOL, "get", "-Hp", "health,size,free,allocated", def->source.name, NULL); # fails zfs list -Hp -t volume -r -o name,volsize,refreservation Xzfs/images # would work zfs list -Hp -t volume -r -o name,volsize,refreservation Xzfs
But there might be more to it than what I see, either more changes needed to fully work with subpools or intentionally not working with them. For the latter we should still consider improving the error handling, like refusing right away with a reasonable message on "virsh pool-define-as".
I wondered if one with more experience in that area of libvirt would be willing to pick this up or has an opinion why it would not need to be changed that I might be missing?
I don't really know anything about the ZFS pool in libvirt, but it superficially looks like a genuine libvirt bug we should try to fix. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 05/02/2018 03:36 AM, Christian Ehrhardt wrote:
Hi, by discussing on bug [1] we found an issue in the ZFS storage backend. When one defines a zfs pool through XML (virsh pool-create --build) a top level ZFS pool will be created and works fine. The virsh pool-define-as counterpart to this is:
$ fallocate -l 100M /tmp/Mzfs $ sudo zpool create Mzfs /tmp/Mzfs $ virsh pool-define-as --name zfs --source-name Mzfs --type zfs $ virsh pool-start zfs $ virsh vol-create-as --pool zfs --name vol1 --capacity 10M $ virsh vol-list --pool zfs vol1 /dev/zvol/Mzfs/vol1 $ virsh pool-refresh zfs Pool zfs refreshed $ virsh vol-list --pool zfs vol1 /dev/zvol/Mzfs/vol1
Ok, all fine so far. But if one wants to use a zfs-FS as sub-pool things work a while but fail as soon as pool info is refreshed.
$ fallocate -l 100M /tmp/Xzfs $ sudo zpool create Xzfs /tmp/Xzfs
If one digs into the virStorageBackendZFSBuildPool they will see libvirt pool create/build processing would "zpool create $name $path[0...n]" where $name is the "source.name" (in your case Xzfs/images) and $path[0...n] would be the various paths (in your case tmp/Xzfs). So the source.name is probably wrong and should be def->name instead. When a volume is created in virStorageBackendZFSCreateVol the source.name and volume name are concatenated - so that seems right.
$ sudo zfs create Xzfs/images
This one is the "unknown" for me. What happens if you create Xzfs/images/vol1 (or your command below) without first creating Xzfs/images? What probably needs to happen, is the zfs backend needs to have a pool create added which does the zpool command and then the existing pool build command would check if def->name != def->source.name and then do the zfs create @def->source.name... Of course that also assumes that the def->name is STRPREFIX os the def->source.name. e.g., in this example the pool def->name is "Xzfs" and that's what starts the source.name of "Xzfs/image".
$ virsh pool-define-as --name Xzfs --source-name Xzfs/images --type zfs I ended with the pool not being started (expected after pool-define-as), but then $ virsh pool-start Xzfs $ virsh vol-create-as --pool Xzfs --name vol1 --capacity 10M $ virsh vol-list --pool Xzfs vol1 /dev/zvol/Xzfs/images/vol1 $ virsh pool-refresh zfs Pool zfs refreshed $ virsh vol-list --pool Xzfs # no output anymore
We happened to find this in the libvirt log: error : virCommandWait:2601 : internal error: Child process (/sbin/zpool get -Hp health,size,free,allocated Xzfs/images) unexpected exit status 1: cannot open 'Xzfs/images': invalid character '/' in pool name
So it seems the data refresh would need to become aware of filesytems and strip this to the basename before doing the call. This comes from src/storage/storage_backend_zfs.c
cmd = virCommandNewArgList(ZPOOL, "get", "-Hp", "health,size,free,allocated", def->source.name <http://source.name>,
I think instead of def->source.name here, def->name should be used. I would think def->source.name would *only* be used in conjunction with Storage Pool "Volume" API's while def->name would be used for anything in Storage Pool as a whole API's.
NULL);
# fails zfs list -Hp -t volume -r -o name,volsize,refreservation Xzfs/images # would work zfs list -Hp -t volume -r -o name,volsize,refreservation Xzfs
But there might be more to it than what I see, either more changes needed to fully work with subpools or intentionally not working with them. For the latter we should still consider improving the error handling, like refusing right away with a reasonable message on "virsh pool-define-as".
I wondered if one with more experience in that area of libvirt would be willing to pick this up or has an opinion why it would not need to be changed that I might be missing?
I don't have a zfs pool to play with, but if someone proposes patches I'll look. I think there's a few hints above. Looks like Roman Bogorodskiy was the original author and Richard Laager has also made other improvements. John
[1]: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1767997 <https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1767997>
-- Christian Ehrhardt Software Engineer, Ubuntu Server Canonical Ltd
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
participants (3)
-
Christian Ehrhardt
-
Daniel P. Berrangé
-
John Ferlan