Chris Lalancette <clalance(a)redhat.com> wrote:
To support LVM partitioning in oVirt, one of the things we need is
the ability
to tell what kind of label is currently on a block device. Here, a 'label' is
used in the same sense that it is used in parted; namely, it defines which kind
of partition table is on the disk, whether it be DOS, LVM2, SUN, BSD, etc. Note
that this is different than the partition type; those are things like Linux,
FAT16, FAT32, etc.
This actually turns out to be fairly easy to implement; there are really only a
few labels that are in common use, and they all have an easy signature to
recognize (but see comments in the code about pc98). This patch implements
label detection on block devices in virStorageBackendUpdateVolInfoFD, and hooks
it up to the iSCSI backend so it works there.
To keep code duplication down, I moved some of the enum's from
storage_backend_disk.c into a common place. Note, however, that there is a
slight semantic change because of this. Previously, if no label was found on a
disk in storage_backend_disk.c, it would always return "dos" as the label
type.
That's not actually true, though; if it's a completely zeroed disk, for
instance, it really just has label type of 'unknown'. This patch changes to the
new semantic of 'unknown' for label types we don't understand. I don't
think
this will be a huge issue for compatibility, but there could be something I'm
missing.
Hi Chris,
I like the patch.
Some minor suggestions below.
Otherwise, this patch has been tested by me to work, and now when you
do:
# virsh vol-dumpxml --pool iscsitest lun-1
you'll get:
<volume>
...
<target>
...
<format type='dos' />
Which should be sufficient for oVirt to do it's detection.
Signed-off-by: Chris Lalancette <clalance(a)redhat.com>
Index: src/storage_backend.c
===================================================================
RCS file: /data/cvs/libvirt/src/storage_backend.c,v
retrieving revision 1.21
diff -u -r1.21 storage_backend.c
--- src/storage_backend.c 5 Sep 2008 12:03:45 -0000 1.21
+++ src/storage_backend.c 16 Oct 2008 07:29:46 -0000
@@ -192,6 +192,30 @@
return ret;
}
+struct diskType disk_types[] = {
+ { "lvm2", VIR_STORAGE_POOL_DISK_LVM2, 0x218, 8, 0x31303020324D564CUL },
+ { "gpt", VIR_STORAGE_POOL_DISK_GPT, 0x200, 8, 0x5452415020494645UL },
+ { "dvh", VIR_STORAGE_POOL_DISK_DVH, 0x0, 4, 0x41A9E50BUL },
+ { "mac", VIR_STORAGE_POOL_DISK_MAC, 0x0, 2, 0x5245UL },
+ { "bsd", VIR_STORAGE_POOL_DISK_BSD, 0x40, 4, 0x82564557UL },
+ { "sun", VIR_STORAGE_POOL_DISK_SUN, 0x1fc, 2, 0xBEDAUL },
Some partition tables are deliberately created to be dual GPT/DOS-based.
Here, it gets the first match (as you've deliberately ordered them that
way, I see). We might need a mechanism to let the caller prefer one or
the other. Also, realize that this can be wrong if there was once a GPT
table, and it has since been replaced with a DOS one that still happens
to have the GPT signature bytes. Or maybe it was dual GPT/DOS long ago
but has since been maintained only as DOS (in which case all GPT-related
headers could be gone). This suggests that pronouncing a partition
table to have type GPT based solely on the first 1KB is a little risky.
But it's probably just fine for now.
+ /*
+ * NOTE: pc98 is funky; the actual signature is 0x55AA (just like dos), so
+ * we can't use that. At the moment I'm relying on the "dummy"
IPL
+ * bootloader data that comes from parted. Luckily, the chances of running
+ * into a pc98 machine running libvirt are approximately nil.
I agree.
Why not just exclude it?
+ */
+ /*{ "pc98", 0x1fe, 2, 0xAA55UL },*/
+ { "pc98", VIR_STORAGE_POOL_DISK_PC98, 0x0, 8, 0x314C5049000000CBUL },
+ /*
+ * NOTE: the order is important here; some other disk types (like GPT and
+ * and PC98) also have 0x55AA at this offset. For that reason, the DOS
+ * one must be the last one.
+ */
+ { "dos", VIR_STORAGE_POOL_DISK_DOS, 0x1fe, 2, 0xAA55UL },
+ { NULL, -1, 0x0, 0, 0x0UL },
+};
+
int
virStorageBackendUpdateVolInfoFD(virConnectPtr conn,
virStorageVolDefPtr vol,
@@ -245,6 +269,40 @@
if (withCapacity) vol->capacity = end;
}
+ /* make sure to set the target format "unknown" to begin with */
+ vol->target.format = VIR_STORAGE_POOL_DISK_UNKNOWN;
+
+ if (S_ISBLK(sb.st_mode)) {
+ off_t start;
+ int i;
+ unsigned char buffer[1024];
+ ssize_t bytes;
+
+ start = lseek(fd, 0, SEEK_SET);
+ if (start < 0) {
+ virStorageReportError(conn, VIR_ERR_INTERNAL_ERROR,
+ _("cannot seek to beginning of file
'%s':%s"),
+ vol->target.path, strerror(errno));
+ return -1;
+ }
+ memset(buffer, 0, 1024);
You can remove the above memset, given one of the suggestions below.
+ bytes = saferead(fd, buffer, 1024);
Better not to repeat constants like that. use sizeof:
bytes = saferead(fd, buffer, sizeof buffer);
+ if (bytes < 0) {
+ virStorageReportError(conn, VIR_ERR_INTERNAL_ERROR,
+ _("cannot read beginning of file
'%s':%s"),
+ vol->target.path, strerror(errno));
+ return -1;
+ }
At worst, memset just the probably-empty portion of the buffer
that wasn't set by the read:
memset(buffer + bytes, 0, sizeof buffer - bytes);
However, I prefer to drop the memset altogether and check
offset+len against "bytes" below:
+ for (i = 0; disk_types[i].name != NULL; i++) {
It'd be good to check that offset+len is within range, regardless
of whether the buffer is zeroed. In case someone decides to shrink
buffer or to add a type with magic bytes past the 1KB mark.
if (disk_types[i].offset + disk_types[i].length > bytes)
continue;
+ if (memcmp(buffer+disk_types[i].offset,
&disk_types[i].magic,
+ disk_types[i].length) == 0) {
+ vol->target.format = disk_types[i].label;
+ break;
+ }
+ }
+ }
+
...
Index: src/storage_backend.h
===================================================================
RCS file: /data/cvs/libvirt/src/storage_backend.h,v
retrieving revision 1.7
diff -u -r1.7 storage_backend.h
--- src/storage_backend.h 2 Sep 2008 14:15:42 -0000 1.7
+++ src/storage_backend.h 16 Oct 2008 07:29:46 -0000
@@ -56,6 +56,30 @@
VIR_STORAGE_BACKEND_POOL_SOURCE_NAME = (1<<4),
};
+enum diskLabel {
+ VIR_STORAGE_POOL_DISK_DOS = 0,
I like to start with 1 or larger, so that using an uninitialized
variable (often 0) is more likely to trigger an error.
+ VIR_STORAGE_POOL_DISK_DVH,
+ VIR_STORAGE_POOL_DISK_GPT,
+ VIR_STORAGE_POOL_DISK_MAC,
+ VIR_STORAGE_POOL_DISK_BSD,
+ VIR_STORAGE_POOL_DISK_PC98,
+ VIR_STORAGE_POOL_DISK_SUN,
+ VIR_STORAGE_POOL_DISK_LVM2,
+ /* the "unknown" disk is 1 billion (and not, for instance, -1), to make
+ sure it doesn't run afoul of error checking */
+ VIR_STORAGE_POOL_DISK_UNKNOWN = 1 * 1024 * 1024 * 1024,
+};
+
+struct diskType {
+ const char *name;
+ /* the 'label' terminology comes from parted */
i find "label" (meaning "partition table") terminology confusing, if
only
because each partition can have a volume-label string. If it weren't so
ensconced in Parted's code, directory structure (libparted/labels/*.c)
and documentation, I would have excised it long ago. I much prefer
"partition table".
In a way, I don't mean to make you change names, but I'd hate
to see Parted's poorly chosen name propagate any more than it has.
+ enum diskLabel label;
+ unsigned short offset;
+ unsigned short length;
+ unsigned long long magic;
+};
+extern struct diskType disk_types[];
+
typedef struct _virStorageBackendPoolOptions virStorageBackendPoolOptions;
typedef virStorageBackendPoolOptions *virStorageBackendPoolOptionsPtr;
struct _virStorageBackendPoolOptions {