On 2/23/19 5:06 PM, Eric Blake wrote:
On 2/22/19 6:06 PM, John Snow wrote:
> This adds a simple test that ensures the busy bit works for push backups,
> as well as doubling as bonus test for incremental backups that get interrupted
> by EIO errors.
This makes 124 longer to run, but it is not in the 'quick' group, so
that's okay.
The easy part: I compiled the series, and validated that ./check still
passes with this applied, so:
Tested-by: Eric Blake <eblake(a)redhat.com>
Now for the fun part (I know from IRC that you had some interesting
challenges coming up with scenarios to even provoke the states you
wanted shown in the output):
>
> Recording bit tests are already handled sufficiently by 236.
>
> Signed-off-by: John Snow <jsnow(a)redhat.com>
> ---
> tests/qemu-iotests/124 | 110 +++++++++++++++++++++++++++++++++++++
> tests/qemu-iotests/124.out | 4 +-
> 2 files changed, 112 insertions(+), 2 deletions(-)
>
> diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
> index 5aa1bf1bd6..30f12a2202 100755
> --- a/tests/qemu-iotests/124
> +++ b/tests/qemu-iotests/124
> @@ -634,6 +634,116 @@ class TestIncrementalBackupBlkdebug(TestIncrementalBackupBase):
> self.vm.shutdown()
> self.check_backups()
>
> + def test_incremental_pause(self):
> + """
> + Test an incremental backup that errors into a pause and is resumed.
> + """
> +
> + drive0 = self.drives[0]
> + result = self.vm.qmp('blockdev-add',
> + node_name=drive0['id'],
> + driver=drive0['fmt'],
> + file={
> + 'driver': 'blkdebug',
> + 'image': {
> + 'driver': 'file',
> + 'filename': drive0['file']
> + },
> + 'set-state': [{
> + 'event': 'flush_to_disk',
> + 'state': 1,
> + 'new_state': 2
> + },{
> + 'event': 'read_aio',
> + 'state': 2,
> + 'new_state': 3
> + }],
> + 'inject-error': [{
> + 'event': 'read_aio',
> + 'errno': 5,
> + 'state': 3,
> + 'immediately': False,
> + 'once': True
> + }],
Yeah, it's hairy to come up with sequences that will pause at the right
place, and this may be sensitive enough that we have to revisit it in
the future, but for now it works and I don't have any better suggestions.
The problem in this case was that the drive_add itself provokes a
read_aio. I didn't dig too far to see why, but I needed an extra
read_aio state in the sequence there so that the error would occur
during the actual backup job.
I think I'll document this in the test, with at least a comment.
I wonder if there's some syntactic sugar we could write that makes this
easier to read, like:
'inject-error': [{ 'event': 'flush_to_disk->read_aio->read_aio' }]
or some such, but that's probably at odds with the existing interface and
not worth my time to go on a crusade about.
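For illustration only, here is a minimal sketch of how such a chain could be expanded on the test side into the rules blkdebug already accepts. The name expand_event_chain and its errno parameter are hypothetical; nothing like this exists in iotests or in the blkdebug interface today. It assumes the initial blkdebug state is 1 and that the error should fire once, on the last event in the chain.

```python
def expand_event_chain(chain, errno=5):
    """Expand 'ev1->ev2->...->evN' into blkdebug-style rule lists.

    Hypothetical helper: every event but the last advances the state by
    one (starting from state 1), and the last event injects the error
    once the final state has been reached.
    """
    events = [ev.strip() for ev in chain.split('->')]
    if not all(events):
        raise ValueError('empty event name in chain: %r' % chain)

    # One set-state rule per intermediate event: state i -> i + 1.
    set_state = [
        {'event': ev, 'state': i, 'new_state': i + 1}
        for i, ev in enumerate(events[:-1], start=1)
    ]
    # The error is armed only in the state reached after all transitions.
    inject_error = [{
        'event': events[-1],
        'errno': errno,
        'state': len(events),
        'immediately': False,
        'once': True,
    }]
    return {'set-state': set_state, 'inject-error': inject_error}


rules = expand_event_chain('flush_to_disk->read_aio->read_aio')
```

With the chain above, the result carries the same 'set-state' and 'inject-error' entries that the test spells out by hand, so it could be merged into the 'file' dictionary passed to blockdev-add.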
> + })
> + self.assert_qmp(result, 'return', {})
> + self.create_anchor_backup(drive0)
> + bitmap = self.add_bitmap('bitmap0', drive0)
> +
> + # Emulate guest activity
> + self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
> + ('0xfe', '16M', '256k'),
> + ('0x64', '32736k', '64k')))
> +
> + # For the purposes of query-block visibility of bitmaps, add a drive
> + # frontend after we've written data; otherwise we can't use hmp-io
> + result = self.vm.qmp("device_add",
> + id="device0",
> + drive=drive0['id'],
> + driver="virtio-blk")
> + self.assert_qmp(result, 'return', {})
> +
> + # Bitmap Status Check
> + query = self.vm.qmp('query-block')
> + ret = [bmap for bmap in query['return'][0]['dirty-bitmaps']
> + if bmap.get('name') == bitmap.name][0]
> + self.assert_qmp(ret, 'count', 458752)
> + self.assert_qmp(ret, 'granularity', 65536)
> + self.assert_qmp(ret, 'status', 'active')
> + self.assert_qmp(ret, 'busy', False)
> + self.assert_qmp(ret, 'recording', True)
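As a cross-check of the expected count of 458752 in the status check above, the arithmetic can be sketched as follows. This assumes the reported count is the number of dirty bytes rounded out to the 64k bitmap granularity, and that only the three HMP writes dirty the bitmap. The helpers parse_size and dirty_bytes are hypothetical names used only for this calculation.

```python
GRANULARITY = 65536  # bitmap granularity asserted by the test


def parse_size(value):
    """Convert an int or a string such as '16M' or '256k' to bytes."""
    if isinstance(value, int):
        return value
    units = {'k': 1024, 'M': 1024 ** 2}
    if value[-1] in units:
        return int(value[:-1]) * units[value[-1]]
    return int(value)


def dirty_bytes(writes, granularity=GRANULARITY):
    """Return the bytes covered by every bitmap chunk a write touches."""
    dirty = set()
    for offset, length in writes:
        start = parse_size(offset)
        end = start + parse_size(length)
        first = start // granularity
        last = (end - 1) // granularity
        dirty.update(range(first, last + 1))
    return len(dirty) * granularity


# Offsets and lengths of the three writes issued by the test.
writes = [(0, 512), ('16M', '256k'), ('32736k', '64k')]
print(dirty_bytes(writes))  # 458752
```

The 512-byte write dirties one chunk, the 256k write at 16M dirties four, and the 64k write at 32736k straddles a chunk boundary and dirties two, for 7 * 65536 = 458752 bytes.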
So far, nothing too different from what we've seen elsewhere, but then we
get to the fun part:
> +
> + # Start backup
> + parent, _ = bitmap.last_target()
> + target = self.prepare_backup(bitmap, parent)
> + res = self.vm.qmp('drive-backup',
> + job_id=bitmap.drive['id'],
> + device=bitmap.drive['id'],
> + sync='incremental',
> + bitmap=bitmap.name,
> + format=bitmap.drive['fmt'],
> + target=target,
> + mode='existing',
> + on_source_error='stop')
> + self.assert_qmp(res, 'return', {})
> +
> + # Wait for the error
> + event = self.vm.event_wait(name="BLOCK_JOB_ERROR",
> + match={"data":{"device":bitmap.drive['id']}})
> + self.assert_qmp(event, 'data', {'device': bitmap.drive['id'],
> + 'action': 'stop',
> + 'operation': 'read'})
> +
> + # Bitmap Status Check
> + query = self.vm.qmp('query-block')
> + ret = [bmap for bmap in query['return'][0]['dirty-bitmaps']
> + if bmap.get('name') == bitmap.name][0]
> + self.assert_qmp(ret, 'count', 458752)
> + self.assert_qmp(ret, 'granularity', 65536)
> + self.assert_qmp(ret, 'status', 'frozen')
> + self.assert_qmp(ret, 'busy', True)
> + self.assert_qmp(ret, 'recording', True)
"status":"frozen", "busy":true - yay, you provoked the other QMP states
that we have not seen elsewhere in the testsuite.
It also tests that you can query a bitmap during an error-paused backup
job, which is new.
> +
> + # Resume and check incremental backup for consistency
> + res = self.vm.qmp('block-job-resume', device=bitmap.drive['id'])
> + self.assert_qmp(res, 'return', {})
> + self.wait_qmp_backup(bitmap.drive['id'])
> +
> + # Bitmap Status Check
> + query = self.vm.qmp('query-block')
> + ret = [bmap for bmap in query['return'][0]['dirty-bitmaps']
> + if bmap.get('name') == bitmap.name][0]
> + self.assert_qmp(ret, 'count', 0)
> + self.assert_qmp(ret, 'granularity', 65536)
> + self.assert_qmp(ret, 'status', 'active')
> + self.assert_qmp(ret, 'busy', False)
> + self.assert_qmp(ret, 'recording', True)
and even nicer, after the transient error goes away, it does not get in
the way of further bitmap edits.
And that the job can successfully finish, the bitmap is correct, etc. It
does not, sadly, test further guest writes because of the thorny issue
with the drive frontend prohibiting writes from HMP.
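As an aside, the three bitmap status checks in the test repeat the same query-block lookup. A minimal sketch of a helper that could factor it out is below. find_bitmap is a hypothetical name, not an existing iotests function, and it assumes the query-block reply has the shape the test already indexes into.

```python
def find_bitmap(query, name, index=0):
    """Return the dirty-bitmap entry called `name` from a query-block reply.

    `index` selects the entry in the 'return' list; the test always
    looks at entry 0.
    """
    bitmaps = query['return'][index].get('dirty-bitmaps', [])
    for bmap in bitmaps:
        if bmap.get('name') == name:
            return bmap
    raise KeyError('no dirty bitmap named %r' % name)


# Stand-in reply with the fields the test asserts on.
sample = {'return': [{'dirty-bitmaps': [
    {'name': 'bitmap0', 'count': 458752, 'granularity': 65536,
     'status': 'frozen', 'busy': True, 'recording': True},
]}]}
bmap = find_bitmap(sample, 'bitmap0')
```

Each check in the test would then shrink to one lookup followed by the assert_qmp calls.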
> +
> + # Finalize / Cleanup
> + self.make_reference_backup(bitmap)
> + self.vm.shutdown()
> + self.check_backups()
> +
>
> if __name__ == '__main__':
> iotests.main(supported_fmts=['qcow2'])
> diff --git a/tests/qemu-iotests/124.out b/tests/qemu-iotests/124.out
> index e56cae021b..281b69efea 100644
> --- a/tests/qemu-iotests/124.out
> +++ b/tests/qemu-iotests/124.out
> @@ -1,5 +1,5 @@
> -...........
> +............
> ----------------------------------------------------------------------
> -Ran 11 tests
> +Ran 12 tests
Pre-existing, but this output is a bear to debug if the test ever starts
failing. Not something you have to worry about today, though.
Reviewed-by: Eric Blake <eblake(a)redhat.com>
Thanks!