Date: Thu, 20 May 2021 15:30:15 +0200
From: Jan Kara
To: Andreas Gruenbacher
Cc: Alexander Viro, cluster-devel@redhat.com, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, Jan Kara
Subject: Re: [PATCH 6/6] gfs2: Fix mmap + page fault deadlocks (part 2)
Message-ID: <20210520133015.GC18952@quack2.suse.cz>
References: <20210520122536.1596602-1-agruenba@redhat.com>
	<20210520122536.1596602-7-agruenba@redhat.com>
In-Reply-To: <20210520122536.1596602-7-agruenba@redhat.com>

On Thu 20-05-21 14:25:36, Andreas Gruenbacher wrote:
> Now that we handle self-recursion on the inode glock in gfs2_fault and
> gfs2_page_mkwrite, we need to take care of more complex deadlock
> scenarios like the following (example by Jan Kara):
>
> Two independent processes P1, P2. Two files F1, F2, and two mappings M1,
> M2 where M1 is a mapping of F1, M2 is a mapping of F2. Now P1 does DIO
> to F1 with M2 as a buffer, P2 does DIO to F2 with M1 as a buffer.
> They can race like:
>
> P1                               P2
> read()                           read()
> gfs2_file_read_iter()            gfs2_file_read_iter()
> gfs2_file_direct_read()          gfs2_file_direct_read()
> locks glock of F1                locks glock of F2
> iomap_dio_rw()                   iomap_dio_rw()
> bio_iov_iter_get_pages()         bio_iov_iter_get_pages()
>
> gfs2_fault()                     gfs2_fault()
> tries to grab glock of F2        tries to grab glock of F1
>
> Those kinds of scenarios are much harder to reproduce than
> self-recursion.
>
> We deal with such situations by using the LM_FLAG_OUTER flag to mark
> "outer" glock taking. Then, when taking an "inner" glock, we use the
> LM_FLAG_TRY flag so that locking attempts that don't immediately succeed
> will be aborted. In case of a failed locking attempt, we "unroll" to
> where the "outer" glock was taken, drop the "outer" glock, and fault in
> the first offending user page. This will re-trigger the "inner" locking
> attempt but without the LM_FLAG_TRY flag. Once that has happened, we
> re-acquire the "outer" glock and retry the original operation.
>
> Reported-by: Jan Kara
> Signed-off-by: Andreas Gruenbacher

...

> diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
> index 7d88abb4629b..8b26893f8dc6 100644
> --- a/fs/gfs2/file.c
> +++ b/fs/gfs2/file.c
> @@ -431,21 +431,30 @@ static vm_fault_t gfs2_page_mkwrite(struct vm_fault *vmf)
>  	vm_fault_t ret = VM_FAULT_LOCKED;
>  	struct gfs2_holder gh;
>  	unsigned int length;
> +	u16 flags = 0;
>  	loff_t size;
>  	int err;
>
>  	sb_start_pagefault(inode->i_sb);
>
> -	gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, &gh);
> +	if (current_holds_glock())
> +		flags |= LM_FLAG_TRY;
> +
> +	gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, flags, &gh);
>  	if (likely(!outer_gh)) {
>  		err = gfs2_glock_nq(&gh);
>  		if (err) {
>  			ret = block_page_mkwrite_return(err);
> +			if (err == GLR_TRYFAILED) {
> +				set_current_needs_retry(true);
> +				ret = VM_FAULT_SIGBUS;
> +			}

I've checked to make sure but do_user_addr_fault() indeed calls
do_sigbus() which raises the SIGBUS signal.
So if the application does not ignore SIGBUS, your retry will be visible
to the application and can cause all sorts of interesting results... So
you probably need to add a new VM_FAULT_ return code that will behave like
VM_FAULT_SIGBUS except it will not raise the signal.

Otherwise it seems to me your approach should work.

								Honza
-- 
Jan Kara
SUSE Labs, CR
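
To illustrate the concern about SIGBUS being visible to the application, here is a minimal userspace sketch. It does not exercise gfs2 or the patch. It assumes a Linux/POSIX system with a writable /tmp, and it provokes the signal in a different, well-known way: touching a page of a shared file mapping that lies entirely beyond end-of-file. The names fault_delivers_sigbus() and on_sigbus() are made up for this example. The point is only that a page fault answered with SIGBUS reaches any handler the process has installed, so a fault handler that returns VM_FAULT_SIGBUS as an internal "retry" marker would not stay internal.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf sigbus_env;
static volatile sig_atomic_t sigbus_seen;

/* Hypothetical handler: note the signal, then jump past the faulting access. */
static void on_sigbus(int sig)
{
	(void)sig;
	sigbus_seen = 1;
	siglongjmp(sigbus_env, 1);
}

/*
 * Returns 1 if the faulting access delivered SIGBUS to our handler,
 * 0 if the access completed without a signal, -1 on setup failure.
 */
int fault_delivers_sigbus(void)
{
	char path[] = "/tmp/sigbus-demo-XXXXXX";
	struct sigaction sa, old_sa;
	long page_size = sysconf(_SC_PAGESIZE);
	volatile char *map;
	int fd;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	/* The file is empty, so the whole mapped page lies beyond EOF. */
	map = mmap(NULL, (size_t)page_size, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigbus;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGBUS, &sa, &old_sa) < 0) {
		munmap((void *)map, (size_t)page_size);
		close(fd);
		return -1;
	}

	sigbus_seen = 0;
	if (sigsetjmp(sigbus_env, 1) == 0) {
		/* This read page-faults; the fault is answered with SIGBUS. */
		char c = map[0];
		(void)c;
	}

	/* Restore the previous disposition and clean up. */
	sigaction(SIGBUS, &old_sa, NULL);
	munmap((void *)map, (size_t)page_size);
	close(fd);
	return sigbus_seen ? 1 : 0;
}
```

Calling fault_delivers_sigbus() from a small main() and printing the result is enough to try it. Without the handler installed, the default action for SIGBUS would terminate the process instead.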