Hello,
On Fri, Feb 25, 2011 at 03:41:47PM +0100, Dominik Klein wrote:
> See attached logs of another run.
> sysctl -w kernel.sysrq=1
> echo blk > /sys/kernel/debug/tracing/current_tracer
> echo 1 > /sys/block/sdb/trace/enable
> echo workqueue_queue_work >> /sys/kernel/debug/tracing/set_event
> echo workqueue_activate_work >> /sys/kernel/debug/tracing/set_event
> echo workqueue_execute_start >> /sys/kernel/debug/tracing/set_event
> echo workqueue_execute_end >> /sys/kernel/debug/tracing/set_event
> That makes attachment trace_pipe5.gz
> echo 8 > /proc/sysrq-trigger
> echo t > /proc/sysrq-trigger
> That makes attachment console.gz
So, the following work item never finished. The last line shows that
pid 549 started executing it.
<idle>-0 [017] 1497.601733: workqueue_queue_work: work struct=ffff880809f3fe70 function=blk_throtl_work workqueue=ffff88102c8ba700 req_cpu=17 cpu=17
<idle>-0 [017] 1497.601736: workqueue_activate_work: work struct ffff880809f3fe70
<...>-549 [017] 1497.601754: workqueue_execute_start: work struct ffff880809f3fe70: function blk_throtl_work
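In case it helps with going through a long trace, here is a rough sketch of how such unfinished items could be picked out automatically. The function name find_unfinished_work is made up for illustration. It assumes each event sits on a single line as in the raw trace_pipe output, that workqueue_execute_start lines look like the one above, and that workqueue_execute_end lines carry the same "work struct <address>" text (that last part is an assumption, as no such line is quoted here).

```shell
#!/bin/sh
# Sketch only: print every workqueue_execute_start event that has no
# later workqueue_execute_end for the same work struct address.
# Reads the files given as arguments, or stdin if there are none.
find_unfinished_work() {
    awk '
        # Return the token following the word "struct", minus a trailing colon.
        function work_addr(    i, addr) {
            for (i = 1; i < NF; i++) {
                if ($i == "struct") {
                    addr = $(i + 1)
                    sub(/:$/, "", addr)
                    return addr
                }
            }
            return ""
        }
        /workqueue_execute_start:/ {
            a = work_addr()
            if (a != "") started[a] = $0      # remember the start line
        }
        /workqueue_execute_end:/ {
            a = work_addr()
            if (a != "") delete started[a]    # item completed, forget it
        }
        END {
            for (a in started) print started[a]
        }
    ' "$@"
}
```

Something like "zcat trace_pipe5.gz | find_unfinished_work" should then list the start events that were never followed by an end. Items still legitimately running when the trace was cut off will show up too, so the output is only a list of candidates.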
And the stack trace of pid 549 is...
[ 1522.220046] kworker/17:1 D ffff88202fc53600 0 549 2 0x00000000
[ 1522.220046] ffff88082c5bd7c0 0000000000000046 ffff88180a822600 ffff88082c578000
[ 1522.220046] 0000000000013600 ffff88080afaffd8 0000000000013600 0000000000013600
[ 1522.220046] ffff88082c5bda98 ffff88082c5bdaa0 ffff88082c5bd7c0 0000000000013600
[ 1522.220046] Call Trace:
[ 1522.220046] [<ffffffff810395c6>] ? __wake_up+0x35/0x46
[ 1522.220046] [<ffffffff81315de3>] ? io_schedule+0x68/0xa7
[ 1522.220046] [<ffffffff81182168>] ? get_request_wait+0xee/0x17d
[ 1522.220046] [<ffffffff810604f1>] ? autoremove_wake_function+0x0/0x2a
[ 1522.220046] [<ffffffff811826b6>] ? __make_request+0x313/0x45d
[ 1522.220046] [<ffffffff81180ebd>] ? generic_make_request+0x30d/0x385
[ 1522.220046] [<ffffffff8105cc79>] ? queue_delayed_work_on+0xfc/0x10a
[ 1522.220046] [<ffffffff8118c607>] ? blk_throtl_work+0x312/0x32b
[ 1522.220046] [<ffffffff8118c2f5>] ? blk_throtl_work+0x0/0x32b
[ 1522.220046] [<ffffffff8105b754>] ? process_one_work+0x1d1/0x2ee
[ 1522.220046] [<ffffffff8105d1e3>] ? worker_thread+0x12d/0x247
[ 1522.220046] [<ffffffff8105d0b6>] ? worker_thread+0x0/0x247
[ 1522.220046] [<ffffffff8105d0b6>] ? worker_thread+0x0/0x247
[ 1522.220046] [<ffffffff8106009f>] ? kthread+0x7a/0x82
[ 1522.220046] [<ffffffff8100a824>] ? kernel_thread_helper+0x4/0x10
[ 1522.220046] [<ffffffff81060025>] ? kthread+0x0/0x82
[ 1522.220046] [<ffffffff8100a820>] ? kernel_thread_helper+0x0/0x10
The '?'s appear because frame pointers are disabled, which means the
stack trace is guesswork. Can you please turn on
CONFIG_FRAME_POINTER just to be sure? At any rate, it looks like
blk_throtl_work() got stuck trying to allocate a request (see
get_request_wait() in the trace). I don't think the workqueue is
causing any problem here. It seems like a resource deadlock on
requests. Vivek, any ideas?
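A quick way to see whether a given kernel config already has frame pointers enabled might look like the sketch below. The helper name check_frame_pointer is hypothetical, and the default path /boot/config-$(uname -r) is only a common distro convention, so it is an assumption here. Pass the path to your build's .config if it lives elsewhere.

```shell
#!/bin/sh
# Sketch only: report whether CONFIG_FRAME_POINTER=y is set in a kernel
# config file. The default location is an assumption (distro convention).
check_frame_pointer() {
    cfg=${1:-/boot/config-$(uname -r)}
    if grep -q '^CONFIG_FRAME_POINTER=y' "$cfg" 2>/dev/null; then
        echo "frame pointers enabled"
    else
        echo "frame pointers not enabled (or config not readable)"
    fi
}
```

With frame pointers enabled and the kernel rebuilt, the entries in the call trace should no longer all be marked with '?'.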
Thanks.
--
tejun