From 31dab97ff138fe413355fcf23f5d72149243e0fa Mon Sep 17 00:00:00 2001
Message-Id: <31dab97ff138fe413355fcf23f5d72149243e0fa.1601675152.git.zanussi@kernel.org>
In-Reply-To: <5b5a156f9808b1acf1205606e03da117214549ea.1601675151.git.zanussi@kernel.org>
References: <5b5a156f9808b1acf1205606e03da117214549ea.1601675151.git.zanussi@kernel.org>
From: Mikulas Patocka
Date: Mon, 13 Nov 2017 12:56:53 -0500
Subject: [PATCH 150/333] locking/rt-mutex: fix deadlock in device mapper /
 block-IO

Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.19/older/patches-4.19.148-rt64.tar.xz

When some block device driver creates a bio and submits it to another
block device driver, the bio is added to current->bio_list (in order to
avoid unbounded recursion).

However, this queuing of bios can cause deadlocks. In order to avoid
them, device mapper registers a function flush_current_bio_list. This
function is called when the device mapper driver blocks. It redirects
bios queued on current->bio_list to helper workqueues, so that these
bios can proceed even if the driver is blocked.

The problem with CONFIG_PREEMPT_RT_FULL is that when the device mapper
driver blocks, it won't call flush_current_bio_list (because
tsk_is_pi_blocked returns true in sched_submit_work), so deadlocks in
the block device stack can happen.

Note that we can't call blk_schedule_flush_plug if tsk_is_pi_blocked
returns true - that would cause
BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
task_blocks_on_rt_mutex when flush_current_bio_list attempts to take a
spinlock.

So the proper fix is to call blk_schedule_flush_plug in
rt_mutex_fastlock, when the fast acquire has failed and the task is
about to block.

CC: stable-rt@vger.kernel.org
[bigeasy: The deadlock is not device-mapper specific, it can also occur
          in plain EXT4]
Signed-off-by: Mikulas Patocka
Signed-off-by: Sebastian Andrzej Siewior
---
 kernel/locking/rtmutex.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 1f2dc2dfe2e7..b38c3a92dce8 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -24,6 +24,7 @@
 #include
 #include
 #include
+#include
 
 #include "rtmutex_common.h"
 
@@ -1919,6 +1920,15 @@ rt_mutex_fastlock(struct rt_mutex *lock, int state,
 	if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
 		return 0;
 
+	/*
+	 * If rt_mutex blocks, the function sched_submit_work will not call
+	 * blk_schedule_flush_plug (because tsk_is_pi_blocked would be true).
+	 * We must call blk_schedule_flush_plug here, if we don't call it,
+	 * a deadlock in I/O may happen.
+	 */
+	if (unlikely(blk_needs_flush_plug(current)))
+		blk_schedule_flush_plug(current);
+
 	return slowfn(lock, state, NULL, RT_MUTEX_MIN_CHAINWALK, ww_ctx);
 }
 
@@ -1936,6 +1946,9 @@ rt_mutex_timed_fastlock(struct rt_mutex *lock, int state,
 	    likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
 		return 0;
 
+	if (unlikely(blk_needs_flush_plug(current)))
+		blk_schedule_flush_plug(current);
+
 	return slowfn(lock, state, timeout, chwalk, ww_ctx);
 }
 
-- 
2.17.1
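
The ordering problem described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: every `model_*` name, the `flush_before_block` switch, and the integer bio counters are hypothetical stand-ins invented for illustration. It shows why a flush attempted from the scheduler hook is skipped once the task is marked as PI-blocked, and why flushing in the fast-lock path, before entering the slow path, lets the queued bios proceed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these types are the kernel's. */
struct model_task {
	bool pi_blocked;   /* stands in for tsk_is_pi_blocked() being true */
	int  plugged_bios; /* bios still queued on the task's plug/bio_list */
	int  flushed_bios; /* bios handed off so they can make progress */
};

struct model_lock {
	struct model_task *owner; /* NULL means the lock is free */
};

static bool model_needs_flush_plug(const struct model_task *t)
{
	return t->plugged_bios > 0;
}

static void model_flush_plug(struct model_task *t)
{
	/* Flushing may take locks itself, so it must not run while the
	 * task is already marked as blocked (models the BUG_ON case). */
	assert(!t->pi_blocked);
	t->flushed_bios += t->plugged_bios;
	t->plugged_bios = 0;
}

/* Models the scheduler hook: it refuses to flush for a PI-blocked task. */
static void model_sched_submit_work(struct model_task *t)
{
	if (t->pi_blocked)
		return;
	if (model_needs_flush_plug(t))
		model_flush_plug(t);
}

/* Models the slow path: the task is marked blocked before it schedules. */
static int model_slowlock(struct model_task *t)
{
	t->pi_blocked = true;
	model_sched_submit_work(t); /* too late: the flush is skipped here */
	t->pi_blocked = false;      /* the model does not really sleep */
	return -1;                  /* report contention instead of waiting */
}

/* Models the fast path, with the flush placed before the slow path. */
static int model_fastlock(struct model_lock *lock, struct model_task *t,
			  bool flush_before_block)
{
	if (lock->owner == NULL) {
		lock->owner = t;    /* uncontended acquire, nothing to flush */
		return 0;
	}
	if (flush_before_block && model_needs_flush_plug(t))
		model_flush_plug(t);
	return model_slowlock(t);
}
```

In this model, calling `model_fastlock()` on a contended lock with `flush_before_block` set to `false` leaves the waiter's bios queued, which corresponds to the situation that can deadlock. With `flush_before_block` set to `true`, the bios are handed off before the task is marked as blocked.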