[Libvir] Storage manager initial requirements and thoughts

Hi all. At the Red Hat virtualization team meeting last week we spent some time talking about the problem of remote storage management, which is a requirement for, at the very least, creating remote guests. Remote storage management also delivers a lot of benefits for managing guest storage, provisioning, and a host of other issues.
Note that we don't (I think) believe that storage management should be part of libvirt. However, we might want to be able to grab storage information with libvirt when creating a guest, for example.
Here are the tentative requirements we came up with, along with several questions that came up.
We have two important use cases for remote storage management:
--Create a new guest against an existing physical device
--Create a new guest against a file on an existing physical device (SELinux context must match up on both)
Requirements:
2 key requirements: enumerate devices, create files
* Enumerate devices (unallocated disks)
    * storage pool
    * Unallocated space
    * Allocated volumes
        -- free
        -- in use
    * Read only / Read-write
    * Host availability
    * Volume
        -- Global unique name
        -- Local device name
        -- Usage
            -- Mounted locally
            -- Assigned to guest
                read only
                shared
                exclusive
            -- Inactive guest
* Create a backing store for a guest
    * need to know what storage is available
    * Can be plain file
    * Can be a physical partition (local scsi/IDE, SAN)
    * Can be a logical volume
    * Can be network - iscsi/nbd
Cases:
Physical device --> create partitions
Volume group --> create logical volumes
Directory --> create files
Todos:
Investigate gparted, one of the partition management tools we already have (apis? remote accessibility?) (I believe Jim Meyering volunteered to take a look at this?)
Identify what other scenarios we need to address.
Please comment; I'm hoping we can get these into a more definite form sooner rather than later.
-- Red Hat Virtualization Group http://redhat.com/virtualization Hugh Brock | virt-manager http://virt-manager.org hbrock@redhat.com | virtualization library http://libvirt.org

On Fri, 2007-02-09 at 16:18 -0500, Hugh Brock wrote:
Todos: Investigate gparted, one of the partition management tools we already have (apis? remote accessibility?) (I believe Jim Meyering volunteered
* Investigate Conga's cluster and non-cluster remotely-accessible LVM management, which sounds like it would fit the bill?
APIs are all XMLRPC, IIRC, so they're extensible and flexible.
http://sourceware.org/cluster/conga/
-- Lon

Lon Hohberger wrote:
On Fri, 2007-02-09 at 16:18 -0500, Hugh Brock wrote:
Todos: Investigate gparted, one of the partition management tools we already have (apis? remote accessibility?) (I believe Jim Meyering volunteered
* Investigate Conga's cluster and non-cluster remotely-accessible LVM management, which sounds like it would fit the bill?
APIs are all XMLRPC, IIRC, so they're extensible and flexible.
http://sourceware.org/cluster/conga/
-- Lon
Sorry, yes, I meant to add looking at Conga as another possibility. --Hugh

Lon Hohberger wrote:
On Fri, 2007-02-09 at 16:18 -0500, Hugh Brock wrote:
Todos: Investigate gparted, one of the partition management tools we already have (apis? remote accessibility?) (I believe Jim Meyering volunteered
* Investigate Conga's cluster and non-cluster remotely-accessible LVM management, which sounds like it would fit the bill?
APIs are all XMLRPC, IIRC, so they're extensible and flexible.
http://sourceware.org/cluster/conga/
-- Lon
It does general partition and file system creation tasks as well. The next release will even help you set up iscsi initiator and target. -j

On Mon, 2007-02-12 at 12:29 -0500, James Parsons wrote:
Lon Hohberger wrote:
It seems that that should be http://www.sourceware.org/cluster/conga/ - I get a 503 without the www. BTW, where is the conga CVS ? It doesn't seem to be linked from that page. David

On Mon, 2007-02-12 at 11:23 -0800, David Lutterkort wrote:
On Mon, 2007-02-12 at 12:29 -0500, James Parsons wrote:
Lon Hohberger wrote:
It seems that that should be http://www.sourceware.org/cluster/conga/ - I get a 503 without the www. BTW, where is the conga CVS ? It doesn't seem to be linked from that page.
Hrm, I thought it was on sources.redhat.com... Jim? -- Lon

On Mon, 2007-02-12 at 16:21 -0500, Lon Hohberger wrote:
On Mon, 2007-02-12 at 11:23 -0800, David Lutterkort wrote:
On Mon, 2007-02-12 at 12:29 -0500, James Parsons wrote:
Lon Hohberger wrote:
It seems that that should be http://www.sourceware.org/cluster/conga/ - I get a 503 without the www. BTW, where is the conga CVS ? It doesn't seem to be linked from that page.
Hrm, I thought it was on sources.redhat.com... Jim?
http://sources.redhat.com/cgi-bin/cvsweb.cgi/conga/?cvsroot=cluster -- Lon

Lon Hohberger wrote:
On Mon, 2007-02-12 at 11:23 -0800, David Lutterkort wrote:
On Mon, 2007-02-12 at 12:29 -0500, James Parsons wrote:
Lon Hohberger wrote:
It seems that that should be http://www.sourceware.org/cluster/conga/ - I get a 503 without the www. BTW, where is the conga CVS ? It doesn't seem to be linked from that page.
Hrm, I thought it was on sources.redhat.com... Jim?
-- Lon
It is on sources.redhat.com/cluster/conga

On Fri, Feb 09, 2007 at 04:39:03PM -0500, Lon Hohberger wrote:
On Fri, 2007-02-09 at 16:18 -0500, Hugh Brock wrote:
Todos: Investigate gparted, one of the partition management tools we already have (apis? remote accessibility?) (I believe Jim Meyering volunteered
* Investigate Conga's cluster and non-cluster remotely-accessible LVM management, which sounds like it would fit the bill?
APIs are all XMLRPC, IIRC, so they're extensible and flexible.
There unfortunately is a bit of an impedance mismatch between libvirt and Conga. libvirt is a low-level library written with the goal that if you have a host running Xen / QEMU / KVM, you can just drop in the libvirt library and get a set of APIs for managing the system. Experience with developing virt-inst, virt-manager & cobbler/koan has shown that we need a simple API for enumerating available storage volumes, and allocating new volumes. In providing such an API though, we don't want to have to mandate that everyone using libvirt also install Conga. While Conga is indeed a very capable tool, requiring install / setup of another web service is going to put up a significant barrier to entry for people wanting to use libvirt, particularly for developers who are just experimenting with virtualization on a laptop / desktop / couple of machines.
Hence our initial goal is to find a suitable C library we can call into to perform our simple set of storage management tasks. Now in keeping with the libvirt model of pluggable hypervisor drivers, I'd expect the underlying libvirt impl of any storage APIs to also be pluggable. So while the initial impl might be based on GParteD, we would have the option of also providing a Conga-based backend at a later date.
Regards, Dan -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
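[Editor's sketch] The pluggable-backend idea described above could look roughly like the following. This is only an illustration under stated assumptions: the names StorageBackend, Volume, InMemoryBackend, register_backend and open_backend are hypothetical and are not libvirt APIs, and the in-memory backend merely stands in for a real parted- or Conga-based driver.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Volume:
    """Hypothetical description of one storage volume."""
    name: str
    capacity: int  # size in bytes


class StorageBackend(ABC):
    """Hypothetical driver interface: enumerate and allocate volumes."""

    @abstractmethod
    def list_volumes(self) -> List[Volume]:
        ...

    @abstractmethod
    def create_volume(self, name: str, capacity: int) -> Volume:
        ...


class InMemoryBackend(StorageBackend):
    """Toy backend used only to exercise the interface."""

    def __init__(self, free_bytes: int) -> None:
        self.free_bytes = free_bytes
        self._volumes: Dict[str, Volume] = {}

    def list_volumes(self) -> List[Volume]:
        return list(self._volumes.values())

    def create_volume(self, name: str, capacity: int) -> Volume:
        if name in self._volumes:
            raise ValueError(f"volume {name!r} already exists")
        if capacity > self.free_bytes:
            raise ValueError("not enough free space")
        self.free_bytes -= capacity
        volume = Volume(name, capacity)
        self._volumes[name] = volume
        return volume


# Registry mapping a backend name to a factory, so that callers never
# depend on a concrete backend class.
_BACKENDS: Dict[str, Callable[..., StorageBackend]] = {}


def register_backend(name: str, factory: Callable[..., StorageBackend]) -> None:
    _BACKENDS[name] = factory


def open_backend(name: str, *args) -> StorageBackend:
    return _BACKENDS[name](*args)
```

The point of the registry is that a second backend could be added later without touching any code that calls open_backend().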

On Mon, 2007-02-12 at 17:53 +0000, Daniel P. Berrange wrote:
Hence our initial goal is to find a suitable C library we can call into to perform our simple set of storage management tasks. Now in keeping with the libvirt model of pluggable hypervisor drivers, I'd expect the underlying libvirt impl of any storage APIs to also be pluggable. So while the initial impl might be based on GParteD, we would have the option of also providing a Conga based backend at a later date.
Thanks for the clarification! I'm comfortable with that explanation. My initial response was without a full understanding of the context & requirements. -- Lon

On Fri, 2007-02-09 at 16:39 -0500, Lon Hohberger wrote:
* Investigate Conga's cluster and non-cluster remotely-accessible LVM management, which sounds like it would fit the bill?
Scott did a lot of work to add storage mgmt capabilities to puppet based on conga. Those features haven't been embraced by the puppet community. Feedback I have seen has been (a) conga is not in Fedora, let alone any other Linux distros (e.g., debian) (b) conga is perceived as too RH specific. David

Hugh Brock <hbrock@redhat.com> wrote:
Todos: Investigate gparted, one of the partition management tools we already have (apis? remote accessibility?) (I believe Jim Meyering volunteered to take a look at this?)
Hi Hugh, I am indeed looking at GNU parted. Currently just in getting-to-know-you mode, but a few of my patches have already gone in. FYI, here are some of the relevant resources:
Sources:
    Upstream source repository: http://git.debian.org/?p=parted/parted.git;a=summary
    Check out sources on the trunk via: cg-clone http://git.debian.org/git/parted/parted.git
Bug tracking and development plans:
    http://parted.alioth.debian.org/cgi-bin/trac.cgi
    http://parted.alioth.debian.org/cgi-bin/trac.cgi/wiki/PlanningEdge
Mailing lists:
    bug-parted:
        subscribe: http://mail.gnu.org/mailman/listinfo/bug-parted
        read: http://news.gmane.org/gmane.comp.gnu.parted.bugs
    parted-devel: http://lists.alioth.debian.org/mailman/listinfo/parted-devel
    parted-edge (discuss next major version, 2.0): http://lists.alioth.debian.org/mailman/listinfo/parted-edge
IRC: #parted on irc.freenode.net

Jim Meyering <jim@meyering.net> wrote:
Hugh Brock <hbrock@redhat.com> wrote:
Todos: Investigate gparted, one of the partition management tools we already have (apis? remote accessibility?) (I believe Jim Meyering volunteered to take a look at this?)
Recently, I've been spending a good bit of time getting to know and improve GNU parted: http://git.debian.org/?p=parted/parted.git;a=summary
Note that GNU parted builds the command-line tool, "parted", and an underlying library, libparted, while "gparted" is a full-featured GUI-based tool that depends on lots of libraries (including libparted) and command-line oriented tools.
Currently, other than my work on the trunk and David Cantrell's work on the upcoming parted-1.8.3, not much is happening on the Parted development front, afaik. Well, there have been a few patches to add FreeBSD support. Note that the previous parted maintainer stepped down in mid-January.
Some of the things I've done: modernized parted's build infrastructure, made some of its interfaces "const correct", and tracked down the source of the Gparted-ISO/CD segfault. That the latter led to my finding three buffer-overrun bugs was a big surprise. I had the impression that parted was more mature. I don't know the history, but expect it's just fall-out from a recent change to let parted handle larger-than-512B logical sectors. In its defense, Parted does issue a big warning when it has to deal with a disk having a logical sector size of 2048:
    ... Not all parts of GNU Parted support this at the moment, and the working code is HIGHLY EXPERIMENTAL.
Parted has what looks like a reasonable testing framework, but it is suffering from bit-rot: all tests fail, though it might be easy to fix. Actually bit-rot is a problem with another big part of parted: since most of the fs-specific code is from snapshots of other projects, it is probably old. Both removing that fs/ code (and replacing it with uses of external libraries, where possible) and adding unit tests are high priority.
Another pretty high priority item seems to be adding LVM support. Several people have expressed interest in the last few months. However, there is no API at all for it[*]. I've heard that Alasdair Kergon is amenable to the idea of someone else adding a real API, and that there are already some requirements. I've just begun to look at device-mapper and lvm/lvm2.
How important do you guys think having LVM support will be to ET projects? And when will you need it?
FYI, as for my plans, top of the list is getting some parted tests running (and passing), and then checking in fixes for the few problems I uncovered but haven't yet been able to test. Then I'll finish reviewing the public interfaces and fix some link-related problems (among other things, libparted maintains some static state). Then I'll look into LVM.
Jim
[*] liblvm2cmd, a wrapper around the command line tools, doesn't count :-)

Jim Meyering wrote:
How important do you guys think having LVM support will be to ET projects? And when will you need it?
From my point of view, as a former sysadmin, virtualisation and LVM are such a natural fit for each other that I can hardly imagine _not_ provisioning new virtual servers from space in a VG. I did look at the API for libparted a few months ago (actually from the rather ancient released version on gnu.org) and it didn't look to me like there was any way to express LVM notions through the API, so I guess this will require a lot of new API calls and structures? Some more open questions to everyone else: Do we need Python bindings? Should libvirt's C API use/expose libparted structures directly? (And how would this affect the remote case?) Rich. -- Emerging Technologies, Red Hat http://et.redhat.com/~rjones/ 64 Baker Street, London, W1U 7DF Mobile: +44 7866 314 421 "[Negative numbers] darken the very whole doctrines of the equations and make dark of the things which are in their nature excessively obvious and simple" (Francis Maseres FRS, mathematician, 1759)

On Wed, Mar 14, 2007 at 09:17:37AM +0000, Richard W.M. Jones wrote:
Jim Meyering wrote:
How important do you guys think having LVM support will be to ET projects? And when will you need it?
From my point of view, as a former sysadmin, virtualisation and LVM are such a natural fit for each other that I can hardly imagine _not_ provisioning new virtual servers from space in a VG.
Yes, I'd see LVM as the primary backend for the majority of production-level virtual machine deployments, particularly with Xen. Closely following that I'd see file-based virtual disks as the next most likely backend. Traditional block device partitions I think will really be very much a niche case, except for people fortunate enough to have SAN storage. In the SAN case, though, the SAN administrator carves up the storage into chunks with the SAN-specific management tools, so libparted wouldn't be involved there.
I did look at the API for libparted a few months ago (actually from the rather ancient released version on gnu.org) and it didn't look to me like there was any way to express LVM notions through the API, so I guess this will require a lot of new API calls and structures?
The other option is to simply call the LVM commands directly from libvirt, which is what pretty much every app seems to do when it needs to talk to LVM. We already do this in the network driver backend to deal with iptables and it isn't all that evil. If the libparted developers are working on LVM APIs we should encourage them, but it's not clear to me that it's worth expending our own resources to develop full LVM support in libparted when libvirt will only ever need a tiny number of LVM operations to be invoked.
Some more open questions to everyone else: Do we need Python bindings?
Yes, in so much as any libvirt API needs python (or other language) bindings.
Should libvirt's C API use/expose libparted structures directly? (And how would this affect the remote case?)
I'd say we should definitely not expose libparted via libvirt APIs. I view libparted as an internal implementation detail. We're not seeking to turn libvirt into a general purpose partitioning tool, but rather just providing a minimal set of APIs for enumerating, creating and assigning virtual disks to machines. Such an API would be operating at a more abstract, higher level than the libparted API, so exposing libparted would be a mistake in this respect. Dan.

"Daniel P. Berrange" <berrange@redhat.com> wrote:
On Wed, Mar 14, 2007 at 09:17:37AM +0000, Richard W.M. Jones wrote: ...
I did look at the API for libparted a few months ago (actually from the rather ancient released version on gnu.org) and it didn't look to me like there was any way to express LVM notions through the API, so I guess this will require a lot of new API calls and structures?
The other option is to simply call the LVM commands directly from libvirt, which is what pretty much every app seems to do when it needs to talk to LVM. We already do this in the network driver backend to deal with iptables and it isn't all that evil. If the libparted developers are working on LVM APIs we should encourage them, but it's not clear to me that it's worth expending our own resources to develop full LVM support in libparted when libvirt will only ever need a tiny number of LVM operations to be invoked.
If the set of LVM operations required by libvirt is really so small, then I'd say it's worth investing in doing the right thing, if only to create the tiny API right away. With a real library API, libvirt stands a much better chance of properly diagnosing, and even detecting, partitioning failures. Of course, when everything works well, there's very little difference, but it's when things go wrong that users will thank us. Can you give me (an LVM newbie) an idea of what the few libvirt-required LVM operations are?

Jim Meyering wrote:
"Daniel P. Berrange" <berrange@redhat.com> wrote:
On Wed, Mar 14, 2007 at 09:17:37AM +0000, Richard W.M. Jones wrote: ...
I did look at the API for libparted a few months ago (actually from the rather ancient released version on gnu.org) and it didn't look to me like there was any way to express LVM notions through the API, so I guess this will require a lot of new API calls and structures?
The other option is to simply call the LVM commands directly from libvirt, which is what pretty much every app seems to do when it needs to talk to LVM. We already do this in the network driver backend to deal with iptables and it isn't all that evil. If the libparted developers are working on LVM APIs we should encourage them, but it's not clear to me that it's worth expending our own resources to develop full LVM support in libparted when libvirt will only ever need a tiny number of LVM operations to be invoked.
If the set of LVM operations required by libvirt is really so small, then I'd say it's worth investing in doing the right thing, if only to create the tiny API right away.
With a real library API, libvirt stands a much better chance of properly diagnosing, and even detecting, partitioning failures. Of course, when everything works well, there's very little difference, but it's when things go wrong that users will thank us.
Can you give me (an LVM newbie) an idea of what the few libvirt-required LVM operations are?
Quoting almost directly from the operations manual at my former workplace, the LVM-related operations needed to create a new VM were:
    lvcreate -L 3G -n newroot raidvg
    lvcreate -L 1G -n newswap raidvg
    dd if=/dev/raidvg/oldroot of=/dev/raidvg/newroot
    mkswap /dev/raidvg/newswap
(That's right - we used to create new machines by dupping old ones). Rich.

Richard W.M. Jones wrote:
    lvcreate -L 3G -n newroot raidvg
    lvcreate -L 1G -n newswap raidvg
I should add, in a libvirt context it's probably going to be useful to also:
* list available volume groups (vgscan)
* list space available in each VG (vgdisplay name-of-vg)
* show how VGs relate to PVs (pvscan)
The LVM tools use quite a sophisticated internal API which isn't exported - but see for example tools/tools.h and include/*.h in the LVM2 source code. <libdevmapper.h> is also available, but much more lowlevel. Rich.
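[Editor's sketch] If the LVM tools are invoked directly, as discussed above, the command line is best built as an argument list rather than a shell string. The sketch below assembles the same `lvcreate -L SIZE -n NAME VG` invocation quoted earlier in the thread; the function names are hypothetical, and actually running the command needs root and an installed LVM, so only the argument construction is exercised here.

```python
import subprocess
from typing import List


def lvcreate_argv(vg: str, name: str, size: str) -> List[str]:
    """Build the argument list for `lvcreate -L SIZE -n NAME VG`."""
    for value in (vg, name, size):
        # Reject empty values and anything that could be read as an option.
        if not value or value.startswith("-"):
            raise ValueError(f"invalid argument: {value!r}")
    return ["lvcreate", "-L", size, "-n", name, vg]


def create_logical_volume(vg: str, name: str, size: str) -> None:
    """Run lvcreate; raises CalledProcessError if the tool reports failure."""
    # Passing a list (no shell=True) avoids shell quoting problems.
    subprocess.run(lvcreate_argv(vg, name, size), check=True)
```

For example, lvcreate_argv("raidvg", "newroot", "3G") yields the first command from the operations manual quoted above.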

On Thu, 2007-03-15 at 14:31 +0000, Richard W.M. Jones wrote:
Richard W.M. Jones wrote:
    lvcreate -L 3G -n newroot raidvg
    lvcreate -L 1G -n newswap raidvg
I should add, in a libvirt context it's probably going to be useful to also:
* list available volume groups (vgscan)
* list space available in each VG (vgdisplay name-of-vg)
* show how VGs relate to PVs (pvscan)
I think that list is a pretty good subset; I would add one more: expanding the storage in an existing LV. For non-libvirt uses, what people really want here is
    lvextend -L +20G /dev/vg00/music && ext2online /data/music/
To make that automatable, you'd also need functionality similar to lvdisplay, i.e. a way to query how big an LV currently is. But those things together would go a very long way towards more manageable storage. David

Richard W.M. Jones wrote:
Richard W.M. Jones wrote:
    lvcreate -L 3G -n newroot raidvg
    lvcreate -L 1G -n newswap raidvg
I should add, in a libvirt context it's probably going to be useful to also:
* list available volume groups (vgscan)
* list space available in each VG (vgdisplay name-of-vg)
* show how VGs relate to PVs (pvscan)
The LVM tools use quite a sophisticated internal API which isn't exported - but see for example tools/tools.h and include/*.h in the LVM2 source code. <libdevmapper.h> is also available, but much more lowlevel.
I talked to Alasdair Kergon about all of this on Tuesday, and there are a few clarifications.
Firstly, vgscan & vgdisplay are apparently deprecated (who knew?) and there is a new command (vgs) which replaces both. The good news is that vgs is designed so that the output can be parsed:
    # /usr/sbin/vgs
      VG         #PV #LV #SN Attr   VSize   VFree
      Home         1   1   0 wz--n- 298.09G  8.09G
      VolGroup00   1   2   0 wz--n- 152.56G 32.00M
versus the old way:
    # /usr/sbin/vgdisplay Home
      --- Volume group ---
      VG Name               Home
      System ID
      Format                lvm2
      Metadata Areas        1
      Metadata Sequence No  2
    [etc]
Similarly, pvs instead of pvscan/pvdisplay and lvs instead of lvscan/lvdisplay.
Secondly there is an API of sorts for lvm2. I think Alasdair called it "libcmd", but maybe I got that wrong because Google doesn't seem to turn up anything. In any case, all it is is a wrapper around the command line tools, so it seems doubtful that this is going to be any better than just invoking the command line tools ourselves.
Thirdly, Alasdair is quite keen to see a proper programming API for LVM2. However I don't think anyone has the time to do it, and he has the usual concerns for backwards binary compatibility and making sure the API mirrors the command line tools. LVM2 is still evolving at a fair rate. There are few other candidates to use the API at the moment, but apparently Anaconda does some really nasty stuff directly on LVM2 structures so it should be changed to use an API instead.
My current opinion is that we can use libparted as it is to both manage virt-on-partition scenarios and to find orphan PVs as candidates to add to a VG, and the command line tools like vgs to manage and allocate LVM2 partitions. Rich.
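[Editor's sketch] To illustrate why the columnar vgs output is easy to consume, here is a minimal parser run against the sample output shown in the message above. It makes some assumptions that should be checked against the installed LVM2 version: that columns are whitespace-separated with a header row, and that the size suffixes are multiples of 1024. The function names are hypothetical. In practice one would probably ask vgs for a more machine-friendly format (the thread mentions --noheadings; options such as --units and --separator may also help).

```python
from typing import Dict, List, Union

# Sample taken from the vgs output quoted in the thread.
SAMPLE = """\
  VG         #PV #LV #SN Attr   VSize   VFree
  Home         1   1   0 wz--n- 298.09G  8.09G
  VolGroup00   1   2   0 wz--n- 152.56G 32.00M
"""

# Assumption: suffixes denote binary multiples.
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def size_to_bytes(text: str) -> int:
    """Convert a size such as '32.00M' to a byte count."""
    number, unit = text[:-1], text[-1].upper()
    return round(float(number) * _UNITS[unit])


def parse_vgs(output: str) -> List[Dict[str, Union[str, int]]]:
    """Parse default-format vgs output into one dict per volume group."""
    lines = [line.split() for line in output.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    groups = []
    for fields in rows:
        row = dict(zip(header, fields))
        groups.append({
            "name": row["VG"],
            "pv_count": int(row["#PV"]),
            "lv_count": int(row["#LV"]),
            "size": size_to_bytes(row["VSize"]),
            "free": size_to_bytes(row["VFree"]),
        })
    return groups
```

Calling parse_vgs(SAMPLE) returns two entries, for Home and VolGroup00, with the sizes converted to bytes under the assumption stated above.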

On Thu, Mar 22, 2007 at 11:22:21AM +0000, Richard W.M. Jones wrote:
Secondly there is an API of sorts for lvm2. I think Alasdair called it "libcmd", but maybe I got that wrong because Google doesn't seem to turn up anything. In any case, all it is is a wrapper around the command line tools, so it seems doubtful that this is going to be any better than just invoking the command line tools ourselves.
If there is some longer-term commitment to the C API then it might be simpler to start using it now, even if it currently just uses the commands. Pushing the parsing of command output into one place sounds like a good idea, especially if it will eventually go away (the parsing of command output, that is). dme.

"Richard W.M. Jones" <rjones@redhat.com> wrote:
Secondly there is an API of sorts for lvm2. I think Alasdair called it "libcmd", but maybe I got that wrong because Google doesn't seem to turn up anything. In any case, all it is is a wrapper around the command line tools, so it seems doubtful that this is going to be any better than just invoking the command line tools ourselves.
lvm2cmd is what you're looking for. It is only slightly higher-level than "system". Here's an example of that interface, from the lvm2 repo, doc/example_cmdlib.c:
    #include <stdio.h>
    #include "lvm2cmd.h"

    /* All output gets passed to this function line-by-line */
    void test_log_fn(int level, const char *file, int line, const char *format)
    {
        /* Extract and process output here rather than printing it */
        if (level != 4)
            return;
        printf("%s\n", format);
        return;
    }

    int main(int argc, char **argv)
    {
        void *handle;
        int r;

        lvm2_log_fn(test_log_fn);
        handle = lvm2_init();
        lvm2_log_level(handle, 1);
        r = lvm2_run(handle, "vgs --noheadings vg1");
        /* More commands here */
        lvm2_exit(handle);
        return r;
    }

I'm reviving this old thread to give you all an update.
    http://thread.gmane.org/gmane.comp.emulators.libvirt/826
I've been working on parted for some time now,
    http://git.debian.org/?p=parted/parted.git;a=summary
    http://news.gmane.org/gmane.comp.gnu.parted.devel
and have just recently discovered a few problems with its ext2 file system support. I fixed one bug that I just stumbled upon, and another one or two have proposed fixes that look good. However, after investigating those, I am convinced we need to dump libparted's ext2 FS support ASAP, in favor of the e2fslibs (libext2fs) library that is used by e2fsprogs. It appears to be very well maintained and robust, while the ext2/FS code in libparted is obviously in bad shape. For example, mkfs would fail for many sizes, and resize would fail for even more, given an initial partition size in certain ill-fated ranges. So far, I haven't seen too many file-system-corrupting bugs, but I think that's mainly luck.
In addition, there is the fact that Parted's partition-table (aka what it calls "label") support is currently tied to a 512-byte sector size for many label types. BTW, do any of you know which are the partition types that matter the most to us? MSDOS and GPT seem like the top priority ones, and I've fixed most parts of those two, but have only lightly tested the GPT changes. Also, with >512-byte sector devices becoming more and more common (e.g., ipods, CDs, new-and-bigger disks), I wonder how important it is to make Parted work for them, now. Fixing Parted for the few most common partition types isn't a big deal, but fixing all of them would require more time and testing resources than I expect to have. I plan to leave most of the others in their current, works-only-for-512-byte-sectors state.
Also in line with limiting scope, I'm concentrating only on the few file system types that are likely to be useful here: obviously, we care about ext2 and ext3 (which Parted will get for free, with the e2fsprogs graft), probably FAT32, too. Does anyone think some other FS type deserves attention right away?
Since my initial handful of patches in LVM/device-mapper land, I've done little on that front, and I don't expect to resume work on it until some time in July.
Jim

Jim Meyering wrote: [...] Thanks for looking into this.
In addition, there is the fact that Parted's partition-table (aka what it calls "label") support is currently tied to a 512-byte sector size for many label types. BTW, do any of you know which are the partition types that matter the most to us? MSDOS and GPT seem like the top priority ones, and I've fixed most parts of those two, but have only lightly tested the GPT changes. Also, with >512-byte sector devices becoming more and more common (e.g., ipods, CDs, new-and-bigger disks), I wonder how important it is to make Parted work for them, now. Fixing Parted for the few most common partition types isn't a big deal, but fixing all of them would require more time and testing resources than I expect to have. I plan to leave most of the others in their current, works-only-for-512-byte-sectors state.
From the virt-manager/libvirt p.o.v. it seems to me the important operations are:
(1) Find attached drives.
(2) Find partitions available & their sizes.
(3) Allocate logical volumes.
(4) Find out how much free space is available on a partition, and carve out a file.
Correct me if I'm wrong (I usually am), but:
Nothing can do (1) except doing a brute force scan over /dev and looking for likely block devices (this is what vgscan does).
Parted can do (2), with several limitations including sector size. It can't do (3) at all, but then neither can anything else except forking the LVM command line tools.
And (4) can be done by libvirtd using ordinary POSIX calls, so no external library support is needed, just some work to remote those operations (which is mostly done).
Rich. -- Emerging Technologies, Red Hat - http://et.redhat.com/~rjones/ Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 03798903
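[Editor's sketch] Operation (4) above, finding free space and carving out a file, can be illustrated with the statvfs and ftruncate calls as exposed by Python's os module. This is a sketch only: the function names are hypothetical, it assumes a Unix-like host, and it creates a sparse file, so the space is not actually reserved at creation time.

```python
import os


def free_bytes(path: str) -> int:
    """Bytes available to unprivileged users on the file system holding path."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def carve_file(path: str, size: int) -> None:
    """Create a new sparse file of the given size to back a guest disk."""
    directory = os.path.dirname(os.path.abspath(path))
    if size > free_bytes(directory):
        raise OSError("not enough free space for requested image")
    # O_EXCL makes creation fail rather than clobber an existing image.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        # Extends the file to `size` bytes without writing data blocks.
        os.ftruncate(fd, size)
    finally:
        os.close(fd)
```

The free-space check and the file creation are not atomic, so a real implementation would still need to handle running out of space later, when the guest writes to the sparse file.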

On Thu, Jun 14, 2007 at 03:50:28PM +0100, Richard W.M. Jones wrote:
Jim Meyering wrote: [...]
Thanks for looking into this.
In addition, there is the fact that Parted's partition-table (aka what it calls "label") support is currently tied to a 512-byte sector size for many label types. BTW, do any of you know which are the partition types that matter the most to us? MSDOS and GPT seem like the top priority ones, and I've fixed most parts of those two, but have only lightly tested the GPT changes. Also, with >512-byte sector devices becoming more and more common (e.g., ipods, CDs, new-and-bigger disks), I wonder how important it is to make Parted work for them, now. Fixing Parted for the few most common partition types isn't a big deal, but fixing all of them would require more time and testing resources than I expect to have. I plan to leave most of the others in their current, works-only-for-512-byte-sectors state.
From the virt-manager/libvirt p.o.v. it seems to me the important operations are:
(1) Find attached drives.
(2) Find partitions available & their sizes.
(3) Allocate logical volumes.
(4) Find out how much free space is available on a partition, and carve out a file.
Correct me if I'm wrong (I usually am), but:
Nothing can do (1) except doing a brute force scan over /dev and looking for likely block devices (this is what vgscan does).
HAL can do this for physical devices.
Parted can do (2), with several limitations including sector size. It can't do (3) at all, but then neither can anything else except forking the LVM command line tools.
And (4) can be done by libvirtd using ordinary POSIX calls, so no external library support is needed, just some work to remote those operations (which is mostly done).
Regards, Dan.

"Richard W.M. Jones" <rjones@redhat.com> wrote: ...
From the virt-manager/libvirt p.o.v. it seems to me the important operations are:
(1) Find attached drives.
(2) Find partitions available & their sizes.
(3) Allocate logical volumes.
(4) Find out how much free space is available on a partition, and carve out a file.
How about creating a regular partition, or resizing an existing (non-LVM) file system?
Correct me if I'm wrong (I usually am), but:
Nothing can do (1) except doing a brute force scan over /dev and looking for likely block devices (this is what vgscan does).
Parted can do (2), with several limitations including sector size. It can't do (3) at all, but then neither can anything else except forking the LVM command line tools.
And (4) can be done by libvirtd using ordinary POSIX calls, so no external library support is needed, just some work to remote those operations (which is mostly done).
Isn't doing #4 portably pretty tricky? There's still too much variation, because many of the details aren't covered by POSIX. At least for GNU df, it was -- it uses the mountlist module from gnulib:
http://www.gnu.org/software/gnulib/MODULES.html
the big pieces:
http://cvs.sv.gnu.org/viewcvs/gnulib/lib/mountlist.c?root=gnulib&view=markup
http://cvs.sv.gnu.org/viewcvs/gnulib/m4/ls-mntd-fs.m4?root=gnulib&view=markup
Of course, if your target is just Linux, then it is easier.

On Thu, Jun 14, 2007 at 05:55:33PM +0200, Jim Meyering wrote:
"Richard W.M. Jones" <rjones@redhat.com> wrote: ...
From the virt-manager/libvirt p.o.v. it seems to me the important operations are:
(1) Find attached drives.
(2) Find partitions available & their sizes.
(3) Allocate logical volumes.
(4) Find out how much free space is available on a partition, and carve out a file.
How about creating a regular partition, or resizing an existing (non-LVM) file system?
Correct me if I'm wrong (I usually am), but:
Nothing can do (1) except doing a brute force scan over /dev and looking for likely block devices (this is what vgscan does).
Parted can do (2), with several limitations including sector size. It can't do (3) at all, but then neither can anything else except forking the LVM command line tools.
And (4) can be done by libvirtd using ordinary POSIX calls, so no external library support is needed, just some work to remote those operations (which is mostly done).
Isn't doing #4 portably pretty tricky? There's still too much variation, because many of the details aren't covered by POSIX. At least for GNU df, it was -- it uses the mountlist module from gnulib:
We don't need to enumerate all the mount points. The admin will simply configure particular directories (e.g. /var/lib/xen/images) as storage repositories. So we only need to be able to call statfs/statvfs on the particular paths where we want to create a new image.
Of course, if your target is just Linux, then it is easier.
Minimally we have to target Solaris too, since we know they already use libvirt. Dan.
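The per-path check described in this message might look roughly like the Python sketch below, which calls statvfs on a configured repository directory instead of enumerating mount points. The names `REPOSITORIES`, `free_bytes` and `can_hold_image` are assumptions made for this example; only the /var/lib/xen/images path comes from the thread.

```python
import os

# Directories the admin has configured as storage repositories.
# The list itself is an assumption; the path is the one named in the thread.
REPOSITORIES = ["/var/lib/xen/images"]


def free_bytes(path):
    """Bytes available to an unprivileged caller on the filesystem holding path."""
    st = os.statvfs(path)
    # f_bavail counts blocks usable by non-root users, in units of f_frsize.
    return st.f_bavail * st.f_frsize


def can_hold_image(repo_dir, image_size):
    """True if a fully allocated image of image_size bytes would currently fit."""
    return free_bytes(repo_dir) >= image_size


if __name__ == "__main__":
    for repo in REPOSITORIES:
        if os.path.isdir(repo):
            print(repo, free_bytes(repo), "bytes free")
```

The answer is only a snapshot: another process can consume the space between the check and the image creation, so the create step still has to handle running out of space.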

"Daniel P. Berrange" <berrange@redhat.com> wrote:
And (4) can be done by libvirtd using ordinary POSIX calls, so no external library support is needed, just some work to remote those operations (which is mostly done).
Isn't doing #4 portably pretty tricky? There's still too much variation, because many of the details aren't covered by POSIX. At least for GNU df, it was -- it uses the mountlist module from gnulib:
We don't need to enumerate all the mount points. The admin will simply
Lucky you :)
configure particular directories (e.g. /var/lib/xen/images) as storage repositories. So we only need to be able to call statfs/statvfs on particular paths where we want to create a new image.
Of course, if your target is just Linux, then it is easier.
Minimally we have to target Solaris too, since we know they already use libvirt.
Ok. Then this (also used by df) might help, if you ever need portability to e.g., older Solaris, *BSD, AIX, HP-UX, etc. http://cvs.sv.gnu.org/viewcvs/gnulib/lib/fsusage.c?root=gnulib&view=markup It provides a thin wrapper around statfs/statvfs, which helps handle some of the bogus values (sometimes negative or UINTMAX_MAX) that can arise in statvfs.f_* values. That last bit might even be useful on Linux.
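To make the point about bogus statvfs values concrete, here is a small Python sketch of the kind of sanity filtering such a wrapper could apply. It is not a port of gnulib's fsusage.c: the heuristics (treating negative or all-ones counters as "unknown", clamping available space to the total) and the names `sane_count` and `usage` are assumptions for illustration only.

```python
import os

UINT32_MAX = 2 ** 32 - 1
UINT64_MAX = 2 ** 64 - 1


def sane_count(value):
    """Return value, or None if it looks like an 'unknown' marker.

    Heuristic only: a filesystem could in principle report exactly
    UINT32_MAX blocks legitimately.
    """
    if value < 0 or value in (UINT32_MAX, UINT64_MAX):
        return None
    return value


def usage(path):
    """Return total/available bytes for path, or None if the data looks bogus."""
    st = os.statvfs(path)
    frsize = sane_count(st.f_frsize) or sane_count(st.f_bsize)
    blocks = sane_count(st.f_blocks)
    avail = sane_count(st.f_bavail)
    if not frsize or blocks is None or avail is None:
        return None
    # Never report more available space than the filesystem holds in total.
    return {"total": blocks * frsize, "available": min(avail, blocks) * frsize}


if __name__ == "__main__":
    print(usage("."))
```

Callers then have one place to handle the "unknown" case instead of doing arithmetic on a huge or negative number.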

On Thu, Jun 14, 2007 at 07:16:21PM +0200, Jim Meyering wrote:
"Daniel P. Berrange" <berrange@redhat.com> wrote:
And (4) can be done by libvirtd using ordinary POSIX calls, so no external library support is needed, just some work to remote those operations (which is mostly done).
Isn't doing #4 portably pretty tricky? There's still too much variation, because many of the details aren't covered by POSIX. At least for GNU df, it was -- it uses the mountlist module from gnulib:
We don't need to enumerate all the mount points. The admin will simply
Lucky you :)
configure particular directories (e.g. /var/lib/xen/images) as storage repositories. So we only need to be able to call statfs/statvfs on particular paths where we want to create a new image.
Of course, if your target is just Linux, then it is easier.
Minimally we have to target Solaris too, since we know they already use libvirt.
Ok. Then this (also used by df) might help, if you ever need portability to e.g., older Solaris, *BSD, AIX, HP-UX, etc.
http://cvs.sv.gnu.org/viewcvs/gnulib/lib/fsusage.c?root=gnulib&view=markup
Unfortunately we can't use that. The license is GPL, while libvirt needs to be LGPL :-( Dan.

Daniel P. Berrange wrote:
Unfortunately we can't use that. The license is GPL, while libvirt needs to be LGPL :-(
I'm sure we can use it as guidance for possible problems though. I'm quite surprised there are portability problems with statfs. Isn't it a v7 call? I didn't think there would be much that could go wrong :-) Rich. -- Emerging Technologies, Red Hat - http://et.redhat.com/~rjones/ Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 03798903

On Thu, Jun 14, 2007 at 06:39:01PM +0100, Richard W.M. Jones wrote:
I'm sure we can use it as guidance for possible problems though. I'm quite surprised there are portability problems with statfs. Isn't it a v7 call? I didn't think there would be much that could go wrong :-)
It's worse than you think :-)

    CONFORMING TO
        The Linux statfs() was inspired by the 4.4BSD one (but they do not use the same structure).

And Solaris is different again, calling it statvfs with different structs.
Dan.

"Daniel P. Berrange" <berrange@redhat.com> wrote: ...
http://cvs.sv.gnu.org/viewcvs/gnulib/lib/fsusage.c?root=gnulib&view=markup
Unfortunately we can't use that. The license is GPL, while libvirt needs to be LGPL :-(
Ugh. That, again. Would you be seriously interested if it were LGPL? Since I own most of the pieces (the gnulib "fsusage" module does depend on several others) this might be enough to make me change it.

On Thu, Jun 14, 2007 at 08:07:09PM +0200, Jim Meyering wrote:
"Daniel P. Berrange" <berrange@redhat.com> wrote: ...
http://cvs.sv.gnu.org/viewcvs/gnulib/lib/fsusage.c?root=gnulib&view=markup
Unfortunately we can't use that. The license is GPL, while libvirt needs to be LGPL :-(
Ugh. That, again. Would you be seriously interested if it were LGPL?
Potentially, yes - hard to say for certain until we actually come to write the code of course :-)
Since I own most of the pieces (the gnulib "fsusage" module does depend on several others) this might be enough to make me change it.
Or as the copyright holder you could submit the necessary patch for libvirt yourself when the time comes.... Let's remember this code for the future. Dan.

"Daniel P. Berrange" <berrange@redhat.com> wrote:
On Thu, Jun 14, 2007 at 08:07:09PM +0200, Jim Meyering wrote:
"Daniel P. Berrange" <berrange@redhat.com> wrote: ...
http://cvs.sv.gnu.org/viewcvs/gnulib/lib/fsusage.c?root=gnulib&view=markup
Unfortunately we can't use that. The license is GPL, while libvirt needs to be LGPL :-(
Ugh. That, again. Would you be seriously interested if it were LGPL?
Potentially, yes - hard to say for certain until we actually come to write the code of course :-)
Since I own most of the pieces (the gnulib "fsusage" module does depend on several others) this might be enough to make me change it.
Or as the copyright holder you could submit the necessary patch for libvirt yourself when the time comes.... Let's remember this code for the future.
I've just switched the relevant gnulib modules to LGPL: http://thread.gmane.org/gmane.comp.lib.gnulib.bugs/10583 Let me know when/if you're interested in using it, and I'll be happy to lend a hand.

On Wed, 2007-03-14 at 11:29 +0000, Daniel P. Berrange wrote:
On Wed, Mar 14, 2007 at 09:17:37AM +0000, Richard W.M. Jones wrote:
Should libvirt's C API use/expose libparted structures directly? (And how would this affect the remote case?)
I'd say definitely do not expose libparted via libvirt APIs. I view libparted as an internal implementation detail. We're not seeking to turn libvirt into a general purpose partitioning tool, but rather just providing a minimal set of APIs for enumerating, creating and assigning virtual disks to machines. Such an API would be operating at a more abstract, higher level than the libparted API, so exposing libparted would be a mistake in this respect.
I still haven't gone off the notion of a virtual storage pool :-) https://www.redhat.com/archives/libvir-list/2007-February/msg00057.html Cheers, Mark.

On Tue, Mar 20, 2007 at 12:06:39PM +0000, Mark McLoughlin wrote:
On Wed, 2007-03-14 at 11:29 +0000, Daniel P. Berrange wrote:
On Wed, Mar 14, 2007 at 09:17:37AM +0000, Richard W.M. Jones wrote:
Should libvirt's C API use/expose libparted structures directly? (And how would this affect the remote case?)
I'd say definitely do not expose libparted via libvirt APIs. I view libparted as an internal implementation detail. We're not seeking to turn libvirt into a general purpose partitioning tool, but rather just providing a minimal set of APIs for enumerating, creating and assigning virtual disks to machines. Such an API would be operating at a more abstract, higher level than the libparted API, so exposing libparted would be a mistake in this respect.
I still haven't gone off the notion of a virtual storage pool :-)
https://www.redhat.com/archives/libvir-list/2007-February/msg00057.html
I like the idea of storage pools too, but not the impl described in that thread :-) Creating a loopback-mounted sparse file and running LVM on it is utterly disastrous for both performance and data integrity. It will also not be portable to any non-Linux systems which don't have LVM. It is really unnecessary too, since most host machines already have plenty of other ways for us to deal with storage - and importantly these are consistent with current manual approaches to managing storage, so we have good compatibility with non-libvirt managed storage.

 - There are a couple of different types of storage pool
     - An LVM volume group
     - Block devices
     - A directory on a filesystem
 - Each storage pool can have zero or more storage volumes allocated
     - LVM volume group has multiple logical volumes
     - Block device has multiple partitions
     - A directory has multiple files (maybe sparse)
 - Each storage pool has some measure of free space
     - LVM volume group has unallocated physical extents
     - Block device has unpartitioned sectors
     - A directory has free space from underlying filesystem
 - Every host has at least one storage pool with free space - i.e. a directory on a filesystem. Some hosts may also have free LVM space, or unpartitioned block devices, but we can't assume their presence in general.

This lets us manage all existing VMs which are either device (LVM/block) or file based (/var/lib/xen/images) with the new APIs, so gives a good back compatibility story. The performance & reliability are good, since we're avoiding extra layers of loopback.

There are only a handful of operations we need to track to get an initially useful API:

 - Enumerate storage pools
 - Enumerate volumes within a pool
 - Extract metadata about pools (free space, UUID?)
 - Extract metadata about volumes (logical size, physical allocation, UUID)
 - Create volume
 - Delete volume

That's pretty much it. I'd be inclined to implement a regular file based pool in terms of /var/lib/xen/images (or /var/lib/libvirt/images?) as a first target. It's by far the easiest since it merely requires use of POSIX APIs, and is also completely cross-platform portable (which LVM isn't).

Dan.
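The handful of operations listed in this message can be sketched for the directory-backed case in a few lines of Python. This is only an illustration under stated assumptions: the class and method names (`DirectoryPool`, `VolumeInfo`, `create_volume`, and so on) are invented here and are not a proposed libvirt API, volumes are created as sparse files, and physical allocation assumes `st_blocks` is counted in 512-byte units, as it is on Linux.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class VolumeInfo:
    name: str
    logical_size: int  # bytes the guest would see
    allocated: int     # bytes actually backed by disk blocks


class DirectoryPool:
    """A storage pool backed by a plain directory (hypothetical sketch)."""

    def __init__(self, path):
        self.path = path

    def _volume_path(self, name):
        # Refuse names that would escape the pool directory.
        if not name or os.path.basename(name) != name:
            raise ValueError("invalid volume name: %r" % name)
        return os.path.join(self.path, name)

    def free_space(self):
        st = os.statvfs(self.path)
        return st.f_bavail * st.f_frsize

    def list_volumes(self):
        return sorted(e.name for e in os.scandir(self.path)
                      if e.is_file(follow_symlinks=False))

    def volume_info(self, name):
        st = os.stat(self._volume_path(name))
        # Assumption: st_blocks is in 512-byte units (true on Linux).
        return VolumeInfo(name, st.st_size, st.st_blocks * 512)

    def create_volume(self, name, size):
        # Mode "x" fails instead of overwriting an existing volume.
        with open(self._volume_path(name), "xb") as f:
            f.truncate(size)  # sparse: no data blocks are written
        return self.volume_info(name)

    def delete_volume(self, name):
        os.remove(self._volume_path(name))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        pool = DirectoryPool(d)
        print(pool.create_volume("guest1.img", 10 * 1024 * 1024))
        print(pool.list_volumes(), pool.free_space())
        pool.delete_volume("guest1.img")
```

Keeping logical size and physical allocation as separate fields matters for sparse files, where the two can differ widely.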

On Tue, Mar 20, 2007 at 02:32:10PM +0000, Daniel P. Berrange wrote:
 - There are a couple of different types of storage pool
     - An LVM volume group
     - Block devices
     - A directory on a filesystem
 - Each storage pool can have zero or more storage volumes allocated
     - LVM volume group has multiple logical volumes
     - Block device has multiple partitions
     - A directory has multiple files (maybe sparse)
 - Each storage pool has some measure of free space
     - LVM volume group has unallocated physical extents
     - Block device has unpartitioned sectors
     - A directory has free space from underlying filesystem
 - Every host has at least one storage pool with free space - ie a directory on a filesystem. Some hosts may also have free LVM space, or unpartitioned block devices but we can't assume their presence in general.
ZFS takes a slightly different view:

 - ZFS storage pools are collections of physical devices (including data replication),
 - ZFS datasets are contained within ZFS storage pools and are either filesystems, volumes or snapshots.
 - ZFS filesystems are, well, filesystems,
 - ZFS volumes are available as block devices,
 - ZFS volumes can contain multiple partitions.

Currently we anticipate using both file-based images (inside ZFS and other filesystems) and ZFS volumes (to provide the impression of a dedicated physical device) for VMs, as well as dedicating real physical volumes, obviously.

Overall this fits with your model, I think.

dme.

On Tue, Mar 20, 2007 at 02:58:16PM +0000, David Edmondson wrote:
ZFS takes a slightly different view:
 - ZFS storage pools are collections of physical devices (including data replication),
 - ZFS datasets are contained within ZFS storage pools and are either filesystems, volumes or snapshots.
 - ZFS filesystems are, well, filesystems,
 - ZFS volumes are available as block devices,
 - ZFS volumes can contain multiple partitions.
That all makes sense - the ZFS storage pools sounds like they provide equivalent volume management capabilities to what you'd get in LVM.
Currently we anticipate using both file-based images (inside ZFS and other filesystems) and ZFS volumes (to provide the impression of a dedicated physical device) for VMs, as well as dedicating real physical volumes, obviously.
Overall this fits with your model, I think.
Yes, sounds just fine. On this subject, does ZFS come with any library API for doing all the volume pool management tasks, or is it all just a set of command line tools as we'd get with LVM ? Regards, Dan.

On Tue, Mar 20, 2007 at 04:00:21PM +0000, Daniel P. Berrange wrote:
On this subject, does ZFS come with any library API for doing all the volume pool management tasks, or is it all just a set of command line tools as we'd get with LVM ?
There's no Committed C API for ZFS management tasks today, so it's the command line. dme.

On Tue, 2007-03-20 at 14:32 +0000, Daniel P. Berrange wrote:
On Tue, Mar 20, 2007 at 12:06:39PM +0000, Mark McLoughlin wrote:
I still haven't gone off the notion of a virtual storage pool :-)
https://www.redhat.com/archives/libvir-list/2007-February/msg00057.html
I like the idea of storage pools too, but not the impl described in that thread :-)
/me too. But right now, we are not just lacking storage pools, we are also lacking reasonable local means to deal with storage. And robust support for handling storage locally is needed as the foundation for doing things like pools and non-local management. In addition, for pools we'd want to abstract away some of the differences between the various storage 'backends', i.e. you allocate a 10GB block device from pool 'foo' without having to worry whether that is backed by a container file or by an LV. That should probably be a layer that sits on top of the local API.
- There are a couple of different types of storage pool - An LVM volume group - Block devices
I don't like putting block devices at this level; after all an LVM LV is a block device, too. And using partitions of a block device in analogy to LV's in a VG adds additional constraints, e.g. because the number of partitions is restricted. If a physical block device should become part of a pool, it should just be added to a volume group.
- A directory on a filesystem
And then there's stuff like SANs, which is another way to carve physical storage into block devices, though I have no idea how we would deal with that at this level.
That's pretty much it. I'd be inclined to implement a regular file based pool in terms of /var/lib/xen/images (or /var/lib/libvirt/images?) as a first target. It's by far the easiest since it merely requires use of POSIX APIs, and is also completely cross-platform portable (which LVM isn't).
That would be a really good first step for storage pools. David

"Richard W.M. Jones" <rjones@redhat.com> wrote:
Jim Meyering wrote:
How important do you guys think having LVM support will be to ET projects? And when will you need it?
From my point of view, as a former sysadmin, virtualisation and LVM are such a natural fit for each other that I can hardly imagine _not_ provisioning new virtual servers from space in a VG.
I did look at the API for libparted a few months ago (actually from the rather ancient released version on gnu.org) and it didn't look to me like there was any way to express LVM notions through the API, so I guess this will require a lot of new API calls and structures?
Exposing all of the LVM command-line tool functionality via an API will involve a *lot* of work. I'm hoping we can get by with a small subset.
A more open question to everyone else: Do we need Python bindings?
About two weeks ago, I heard that someone started working on hand-crafted Python and C++ bindings for parted.

On Thu, 2007-03-15 at 14:45 +0100, Jim Meyering wrote:
A more open question to everyone else: Do we need Python bindings?
About two weeks ago, I heard that someone started working on hand-crafted Python and C++ bindings for parted.
There are several existing sets of python bindings for parted -- it seems to be relatively popular to just write your own rather than use an existing one ;-) One of the longest existing ones is pyparted which was originally written by Matt Wilson and is used extensively within anaconda and continues to be maintained by dcantrell. Jeremy

On Tue, 2007-03-13 at 17:22 +0100, Jim Meyering wrote:
How important do you guys think having LVM support will be to ET projects?
I've used lvm commands from python code (for stacaccli) and it pained me to do that, especially from an error handling POV - anaconda is another example of code that does this. So, a C library for LVM with python bindings would be great, but it's a big job ... lvm's internal API as used by the lvm command just isn't suitable for wider consumption IMHO. Cheers, Mark.
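For readers who have not had to do this, driving the LVM command line tools from Python tends to look something like the sketch below, and most of the code is error handling. Everything here is an assumption made for illustration: the names `ToolError`, `run_tool` and `lvcreate_argv` are invented, and the only LVM detail relied on is that `lvcreate` accepts `--name`, `--size` and a volume group argument. Nothing in the demo actually invokes LVM.

```python
import subprocess
import sys


class ToolError(Exception):
    """Raised when an external command cannot be run or exits non-zero."""

    def __init__(self, argv, returncode, stderr):
        super().__init__("%s exited with %d: %s"
                         % (argv[0], returncode, stderr.strip()))
        self.argv = argv
        self.returncode = returncode
        self.stderr = stderr


def run_tool(argv):
    """Run argv, returning stdout; turn every failure into ToolError."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True)
    except FileNotFoundError as exc:
        # 127 mirrors the shell's "command not found" convention.
        raise ToolError(argv, 127, str(exc)) from exc
    if proc.returncode != 0:
        raise ToolError(argv, proc.returncode, proc.stderr)
    return proc.stdout


def lvcreate_argv(vg, name, size_mib):
    """Build (but do not run) an lvcreate command line."""
    return ["lvcreate", "--name", name, "--size", "%dM" % size_mib, vg]


if __name__ == "__main__":
    print(lvcreate_argv("vg0", "guest1", 4096))
    print(run_tool([sys.executable, "-c", "print('tool output')"]))
```

The only structured information a caller gets back is an exit status and free-form stderr text, which is the error-handling pain described in this message and the main argument for a proper library.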

Hey, I know I said this already ... but it's probably worth fleshing out the idea a bit "on paper" to see how it works out.

The idea is that perhaps we should support the concept of a "Virtual Storage Pool" in the same way we now support Virtual Networks. A virtual storage pool would basically be an area of storage on a physical machine from which virtual disks can be allocated for guests. It would be backed by e.g. an LVM Volume Group.

On each host machine, there would be a default storage pool consisting of a large (e.g. 10G) sparse file loopback mounted and containing an LVM VG. Users can add other PVs to that VG in order to allocate more storage to the pool.

So, what would the XML format be like for the simple case of a loopback sparse file?

  <storage_pool>
    <name>TestStorage</name>
    <uuid>1dd053b2-a068-4f6a-aaae-d7d88ecb504d</uuid>
    <pool type='lvm'>
      <physical_volume type='file'>
        <file size='20971520'>/var/lib/xen/test-storage.img</file>
      </physical_volume>
    </pool>
  </storage_pool>

When you create a pool, libvirtd would create the file, associate a loopback device with it, initialise it as a PV using pvcreate and create a volume group corresponding to the name of the pool using vgcreate. On subsequent boots, libvirtd would find that the file exists, associate a loopback device with the file, run vgscan and check that the VG exists[1].

(We could allow specifying an alternative VG name and having multiple physical volumes. Also note that you'd only need to list physical volumes which actually need to be "activated" ... i.e. if the VG was just on /dev/hda3 or something, it wouldn't need to be listed.)

Of course, you then need to be able to carve out a chunk for a guest, so perhaps:

  int virStorageAllocateVolume(virStoragePtr storage, const char *name, unsigned long size);
  int virStorageDeallocateVolume(virStoragePtr storage, const char *name);

Volumes could be allocated using the XML format too:

  <pool type='lvm'>
    <volume>
      <name>TestVolume</name>
      <size>4194304</size>
    </volume>
  </pool>

And once allocated, you could create a guest with e.g.

  <disk type='volume'>
    <source volume='TestVolume' />
    <target dev='hda' />
  </disk>

which would cause libvirtd to first look up the device path for the volume and use that when starting the guest.

Of course, we'd need to think about other types of physical volumes. The simple one is just a plain block device:

  <physical_volume type='device'>
    <file>/dev/hda3</file>
  </physical_volume>

An iSCSI target:

  <physical_volume type='iscsi'>
    <host>storage.devel.redhat.com</host>
    <port>3260</port>
    <target>iqn.1994-06.com.redhat.devel:markmc.test1</target>
    <lun>0</lun>
  </physical_volume>

Or a file on an NFS mount:

  <physical_volume type='nfs'>
    <remote>storage.devel.redhat.com:/mnt/storage/test</remote>
    <file>test-storage.img</file>
  </physical_volume>

Of course, not all guests need to fit into this model; they can continue using a file/device directly rather than a storage pool.

Cheers, Mark.

[1] - One thing to think about is that the VG contains the canonical list of physical volumes and allocated logical volumes ... so e.g. libvirt would be confused if you re-named a volume.
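To see how a tool might consume the pool description proposed in this message, here is a small Python sketch that parses the loopback-file example with the standard library. The XML format is only the proposal from this thread, not an existing libvirt schema, and the function name `parse_pool` and the shape of the returned dictionary are assumptions for illustration. The unit of the `size` attribute is not stated in the proposal, so the value is kept as a bare integer.

```python
import xml.etree.ElementTree as ET

# The example pool description from the proposal above.
POOL_XML = """
<storage_pool>
  <name>TestStorage</name>
  <uuid>1dd053b2-a068-4f6a-aaae-d7d88ecb504d</uuid>
  <pool type='lvm'>
    <physical_volume type='file'>
      <file size='20971520'>/var/lib/xen/test-storage.img</file>
    </physical_volume>
  </pool>
</storage_pool>
"""


def parse_pool(xml_text):
    """Extract name, UUID, pool type and physical volumes (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    pool_el = root.find("pool")
    volumes = []
    for pv in root.findall("./pool/physical_volume"):
        # Not every PV type has a <file> child (e.g. the iSCSI example).
        file_el = pv.find("file")
        size = file_el.get("size") if file_el is not None else None
        has_path = file_el is not None and file_el.text
        volumes.append({
            "type": pv.get("type"),
            "path": file_el.text.strip() if has_path else None,
            "size": int(size) if size is not None else None,
        })
    return {
        "name": root.findtext("name"),
        "uuid": root.findtext("uuid"),
        "pool_type": pool_el.get("type") if pool_el is not None else None,
        "physical_volumes": volumes,
    }


if __name__ == "__main__":
    print(parse_pool(POOL_XML))
```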
participants (10): Daniel P. Berrange, David Edmondson, David Lutterkort, Hugh Brock, James Parsons, Jeremy Katz, Jim Meyering, Lon Hohberger, Mark McLoughlin, Richard W.M. Jones