On 12/22/2011 02:00 PM, David Mansfield wrote:
On 12/22/2011 12:44 PM, Daniel P. Berrange wrote:
> On Thu, Dec 22, 2011 at 09:20:01AM -0500, David Mansfield wrote:
>>
>> On 12/21/2011 05:41 PM, Daniel P. Berrange wrote:
>>> On Wed, Dec 21, 2011 at 05:23:33PM -0500, David Mansfield wrote:
>>>> Hi All.
>>>>
>>>> I have a dell system with a H700 raid. Within the hardware RAID
>>>> config I've created a "virtual disk" which I have assigned to one of
>>>> my guests. On the host the device is "/dev/sdb", on the guest it's
>>>> "/dev/vdb".
>>>>
>>>> This works fine.
>>>>
>>>> Within the guest, we have created lvm PV on /dev/vdb (using the
>>>> whole disk - no partitions) and created a volume group. The guest's
>>>> hostname is "argo" and the vg is called "vg_argo_bkup".
>>>>
>>>> When I reboot the host, it does a vgscan and finds the volume group
>>>> and activates it in the _host_, which I need to prevent (I think??).
>>>>
>>>> I have successfully done this by filtering "/dev/sdb" in
>>>> /etc/lvm/lvm.conf (which does NOT work as advertised BTW), but
>>>> referencing the extremely volatile SCSI "sd*" names seems a terrible
>>>> way to do this. If I fiddle around in the HW raid config, the
>>>> /dev/sd? may change.
>>>>
>>>> I plan on creating about 10 more VM's spread over a number of
>>>> machines over the next weeks with a very similar setup, and the
>>>> admin overhead seems like it'll be onerous and error-prone.
>>>>
>>>> I'd love to be able to filter the volume groups by VG name instead
>>>> of pv device node. The host's hostname is "narnia" and I'd love to
>>>> say, 'vgscan --include-regex "vg_narnia.*"' or something similar, if
>>>> you get my drift.
>>>>
>>>> Does anyone have a best practice for this? I'm sure iSCSI
>>>> enthusiasts must have the exact same issue all the time.
>>> The recommended approach is not to assign the entire disk to the
>>> guest. Partition the host disk, to contain 1 single partition
>>> consuming all space, then assign the partition to the guest. Worst
>>> case is you lose a few KB of space due to partition alignment, but
>>> this is a small price to pay to avoid the LVM problems you describe
>>> all too well.
>> I don't really understand. The host still scans the partitions,
>> right? And the partition "dev" names change dynamically if the
>> whole-disk changes its "dev" name. Won't I still have to list
>> specific volatile names in the /etc/lvm/lvm.conf on the host?
> The host will see '/dev/sda' and '/dev/sda1', you'll assign
> '/dev/sda1' to the guest, and it will appear as /dev/vda.
> In the guest you'll create '/dev/vda1' and format it as the
> PV. So while the host will see /dev/sda1, it won't see the
> nested partition table, and thus won't see the PV
Ahhh. Brilliant. Thanks. Now the only problem is getting the underlying
stripe alignment right ;-)
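For anyone following along, the disk stanza in the guest's libvirt XML ends
up looking roughly like this (a sketch based on my setup: the host partition
/dev/sdb1 handed to the guest as a virtio disk; your device and target names
will differ):
<disk type='block' device='disk'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/sdb1'/>
  <target dev='vdb' bus='virtio'/>
</disk>
Inside the guest that shows up as /dev/vdb, which then gets its own
partition table and the PV on /dev/vdb1.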
In case anyone googles this (or is on this list and curious ;-) and gets
to the last part, about aligning the partitions, I thought I'd mention
the completely bizarre behavior I'm seeing.
First of all, with a 3TB "disk" you have to use a GPT label, which is
available in 'parted' but not in 'fdisk'. Anyway, my particular "disk"
is a RAID5 (on my H700 raid controller) of 4 disks with a 64k stripe
element, so each full stripe is 3 data disks x 64k = 192k, which is 384
512-byte sectors. So I used this line in parted:
mkpart primary 384s 5857345502s
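For completeness, the session looked roughly like this (a sketch, not an
exact transcript; the device node and end sector are from my box and will
differ on yours):
# parted /dev/sda
(parted) mklabel gpt
(parted) mkpart primary 384s 5857345502s
(parted) print
(parted) quit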
And the performance on /dev/sda1 sucked compared to /dev/sda. I tried
many different alignments, then finally tried:
mkpart primary 383s 5857345502s
And voila!
I can't explain this, because I can verify that the partition start sector
is indeed 383 according to /sys/block/sda/sda1/start, so it's not a 'parted'
bug or anything.
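For anyone who wants to double-check their own layout: the partition start
sector and the kernel's computed alignment offset (relative to whatever I/O
geometry the device reports) are visible in sysfs, and a crude sequential
read comparison can be done with dd. A sketch, not the exact benchmarks I
ran:
# cat /sys/block/sda/sda1/start
# cat /sys/block/sda/sda1/alignment_offset
# dd if=/dev/sda1 of=/dev/null bs=1M count=4096 iflag=direct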
The difference in the host was significant, but I went further and assigned
the partition, with different alignments, to the VM and tested there, and
the difference was magnified. When aligned, the VM came very close to bare
metal speed; when misaligned, it took a further hit (roughly 50% of the
aligned figure) on top of bare metal itself already being slower because of
the misalignment.
So thanks to Daniel, who gave me the trick, and a reverse-thanks to the gods
of the computers for coming up with this bizarre, unexpected result.
David Mansfield
Cobite, INC.