> This one is the "unknown" for me.  What happens if you create
> Xzfs/images/vol1 (or your command below) without first creating Xzfs/images?

Answer: it fails, unless you give the '-p' flag.

           -p

               Creates all the non-existing parent datasets. Datasets created in this  manner
               are  automatically mounted according to the mountpoint property inherited from
               their parent. Any property specified on the command line using the  -o  option
               is  ignored.  If the target filesystem already exists, the operation completes
               successfully.

Example: given an existing zfs pool called "zfs":

# zfs create zfs/foo/bar
cannot create 'zfs/foo/bar': parent does not exist
# zfs create -p zfs/foo/bar
# zfs list zfs/foo
NAME      USED  AVAIL  REFER  MOUNTPOINT
zfs/foo   192K  23.5G    96K  /zfs/foo
# zfs list -r zfs/foo
NAME          USED  AVAIL  REFER  MOUNTPOINT
zfs/foo       192K  23.5G    96K  /zfs/foo
zfs/foo/bar    96K  23.5G    96K  /zfs/foo/bar

However, I don't see this as a problem for libvirt.  The parent should already exist when you define the pool, and I expect libvirt will only create immediate children.

> If one digs into the virStorageBackendZFSBuildPool they will see libvirt
> pool create/build processing would "zpool create $name $path[0...n]"
> where $name is the "source.name" (in your case Xzfs/images) and
> $path[0...n] would be the various paths (in your case tmp/Xzfs)

Just to be clear, creating a zpool ("zpool create") is different to creating a zfs dataset ("zfs create").

By analogy to LVM: a zpool is like a volume group, and a zfs dataset/zvol is like a logical volume.

A zpool (or VG) is created from a collection of block devices - or something which looks like a block device, e.g. a partition or a loopback-mounted file.  Those are $path[0...n] in the above, and would be called "physical volumes" in LVM.

"zfs create" then creates a dataset (filesystem) or zvol (block device) which draws space out of the zpool.  The analogous operation in LVM is "lvcreate", although it will only give you a block device - it's up to you to make a filesystem within it.

In summary:

    zpool create  ==> vgcreate (*)
    zfs create -V ==> lvcreate
    zfs create    ==> (no single LVM equivalent: roughly lvcreate + mkfs)

(*) LVM also requires you to label the block devices with "pvcreate" before you can add them to a volume group. zpool create doesn't require this.
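To make the parallel concrete, here is a sketch of the two command sequences side by side (the device names /dev/sdb and /dev/sdc, the pool/VG names, and the sizes are all just placeholders):

    # ZFS: create the pool directly from block devices, then a 10G zvol
    zpool create tank /dev/sdb /dev/sdc
    zfs create -V 10G tank/vol1        # block device at /dev/zvol/tank/vol1

    # LVM: label the devices first, then build the VG and a 10G LV
    pvcreate /dev/sdb /dev/sdc
    vgcreate vg0 /dev/sdb /dev/sdc
    lvcreate -L 10G -n vol1 vg0        # block device at /dev/vg0/vol1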

From my point of view, as a libvirt user: I *could* dedicate an entire zpool to libvirt, but I don't want to.  It would mean libvirt has full ownership of that set of physical disks, and I may want to use the space for other things as well.

What I want to do is to allow libvirt to use an existing zpool, with a parent dataset which it can allocate underneath, like this:

zfs create zfs/libvirt
virsh pool-define-as --name zfs --source-name zfs/libvirt --type zfs


(instead of using pool-create/pool-build).  This not only makes it clear which datasets belong to libvirt, but allows me to do things like storage accounting at the parent dataset level.
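For instance, with everything libvirt-owned living under zfs/libvirt, space usage and limits can be managed at that one level (the properties are standard ZFS; the quota value here is made up):

    # total space consumed by libvirt's datasets, recursively
    zfs get used zfs/libvirt

    # cap the total space libvirt can allocate
    zfs set quota=100G zfs/libvirt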

And actually, this almost works. It's just the pool refresh which fails, because it tries to treat the full dataset name ("zfs/libvirt") as if it were a zpool. Passing only the part before the first slash to "zpool get" would fix this.
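The string handling involved is trivial; in shell terms (just a sketch of the idea, not libvirt's actual C code):

```shell
# A libvirt source name may be a dataset path rather than a bare pool name...
source_name="zfs/libvirt"

# ...but the zpool name is simply everything before the first slash.
pool_name="${source_name%%/*}"

echo "$pool_name"    # prints: zfs
```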

Arguably this uncovers a couple of other related issues to do with error handling:

- to the end user, "virsh pool-refresh" appears to succeed (unless you dig deep into the logs), even though the underlying "zpool get" exits with an error

- by this stage, pool-refresh has already destroyed all existing libvirt volumes which were previously in the pool

Regards,

Brian.