In doing more testing, I noticed that the resource listed by 'sanlock
client status' disappears when the VM is paused. I haven't found
anything about lock versions yet, though.
However, even if sanlock behaved as you said, preventing a VM from being
unpaused once another VM has used its disk, this could still damage the
filesystem if the VM is paused during write operations and never allowed
to finish them (because a subsequent VM using the same storage has taken
over). I would think the ideal behavior would be to keep the lock until
the VM is completely off. I don't see why this would get in the way of
migration, as the filesystem shouldn't need to be accessed until the RAM
contents have been migrated to the other machine, but then again I'm not
writing the code.
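
For future reference, here is a rough toy model of the behavior Daniel
describes in the quoted text below (the lease version is recorded on
pause and compared on resume). It is only a sketch: the names Lease,
acquire, pause and resume are made up for illustration and are not
sanlock's or libvirt's API, and bumping the version on every fresh
acquire is my assumption about how a mismatch would come about.

```python
class LeaseError(Exception):
    """Raised when a lease cannot be (re)acquired in this toy model."""


class Lease:
    """Hypothetical stand-in for a lease on shared storage (not sanlock's real structure)."""

    def __init__(self):
        self.version = 0    # assumed to increase on every fresh acquire
        self.holder = None  # name of the VM instance currently holding the lease


def acquire(lease, owner):
    """Fresh start of a VM: take the lease and bump its version."""
    if lease.holder is not None:
        raise LeaseError(f"lease already held by {lease.holder}")
    lease.version += 1
    lease.holder = owner
    return lease.version


def pause(lease, owner):
    """Pause a VM: drop the lease, but remember the version that was held."""
    if lease.holder != owner:
        raise LeaseError(f"{owner} does not hold the lease")
    recorded = lease.version
    lease.holder = None
    return recorded


def resume(lease, owner, recorded):
    """Resume a VM: only allowed if nobody acquired the lease in the meantime."""
    if lease.holder is not None:
        raise LeaseError(f"lease already held by {lease.holder}")
    if lease.version != recorded:
        raise LeaseError(
            f"version mismatch: recorded {recorded}, lease is now at {lease.version}"
        )
    lease.holder = owner


def replay_second_scenario():
    """Replay steps 5-9 of the second test; return True if step 9 is refused."""
    lease = Lease()
    acquire(lease, "node1")            # step 5: start the VM on node 1
    recorded = pause(lease, "node1")   # step 6: pause it, remembering the version
    acquire(lease, "node2")            # step 7: start the VM on node 2
    pause(lease, "node2")              # step 8: pause it
    try:
        resume(lease, "node1", recorded)  # step 9: bring node 1's VM back
    except LeaseError:
        return True
    return False


if __name__ == "__main__":
    print("step 9 refused:", replay_second_scenario())
```

In this model, step 9 is refused because step 7 moved the version on.
Note that it does nothing about my concern above: the paused VM's
unfinished writes are still lost, they are just never replayed on top
of the other VM's changes.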
Thanks,
Michael
On 1/31/2013 3:56 PM, Michael Rodrigues wrote:
Just for follow-up and future reference:
https://bugzilla.redhat.com/show_bug.cgi?id=906590
Thanks,
Michael
On 1/31/2013 12:03 PM, Daniel P. Berrange wrote:
> On Thu, Jan 31, 2013 at 11:02:15AM -0800, Michael Rodrigues wrote:
>> On 1/31/2013 10:40 AM, Daniel P. Berrange wrote:
>>> On Thu, Jan 31, 2013 at 10:35:17AM -0800, Michael Rodrigues wrote:
>>>> Hi Daniel,
>>>>
>>>> I thought migration might be the reason, but I'm still not seeing
>>>> the behavior you describe with regard to pausing. I saw the
>>>> following behavior:
>>>>
>>>> 1. Created VM on node 1
>>>> 2. Started VM on node 1
>>>> 3. Migrated VM to node 2, node 1 is now shutdown, node 2 is running
>>>> 4. I paused node 2
>>>> 5. I started node 1, no error
>>>> 6. Paused node 1
>>>> 7. Unpaused node 2, no error
>>>>
>>>> I thought maybe the original VM had to be paused first, so I tried
>>>> that as well:
>>>>
>>>> 1. Created VM on node 1
>>>> 2. Started VM on node 1
>>>> 3. Migrated to node 2, node 1 is now shutdown, node 2 is running
>>>> 4. I shutdown node 2 instead of pausing
>>>> 5. I started node 1
>>>> 6. I paused node 1
>>>> 7. Started node 2
>>>> 8. Paused node 2
>>>> 9. Started node 1
>>> Hmm, that isn't supposed to be possible. When you paused node 1
>>> in step 6, it was supposed to record the lease version number.
>>> When you resume in step 9, the version number should mismatch
>>> due to step 7, and thus sanlock ought to have caused an error
>>> at step 9. If that didn't happen, then I believe we have a bug.
>> Should I file a report? I'm not really a developer but I can provide
>> whatever information is necessary for a proper report. I don't have
>> RHEL or a bugzilla account.
> Yes, please do file a bug report, including this scenario to
> reproduce it.
>
> Daniel
--
Michael Rodrigues
Interim Help Desk Manager
Gevirtz Graduate School of Education
Education Building 4203
(805) 893-8031
help(a)education.ucsb.edu