Date: Wed, 5 Feb 2020 08:49:16 +0900
From: Naohiro Aota
To: "Darrick J. Wong"
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, Andrew Morton, Christoph Hellwig
Subject: Re: [PATCH] mm, swap: unlock inode in error path of claim_swapfile
Message-ID: <20200204234916.s6zx6i2ko4mvxim2@naota.dhcp.fujisawa.hgst.com>
References: <20200204095943.727666-1-naohiro.aota@wdc.com> <20200204154229.GC6874@magnolia>
In-Reply-To: <20200204154229.GC6874@magnolia>

On Tue, Feb 04, 2020 at 07:42:29AM -0800, Darrick J. Wong wrote:
>On Tue, Feb 04, 2020 at 06:59:43PM +0900, Naohiro Aota wrote:
>> claim_swapfile() currently keeps the inode locked when it is successful, or
>> the file is already swapfile (with -EBUSY). And, on the other error cases,
>> it does not lock the inode.
>>
>> This inconsistency of the lock state and return value is quite confusing
>> and actually causing a bad unlock balance as below in the "bad_swap"
>> section of __do_sys_swapon().
>>
>> This commit fixes this issue by unlocking the inode on the error path. It
>> also reverts blocksize and releases bdev, so that the caller can safely
>> forget about the inode.
>>
>> =====================================
>> WARNING: bad unlock balance detected!
>> 5.5.0-rc7+ #176 Not tainted
>> -------------------------------------
>> swapon/4294 is trying to release lock (&sb->s_type->i_mutex_key) at:
>> [] __do_sys_swapon+0x94b/0x3550
>> but there are no more locks to release!
>>
>> other info that might help us debug this:
>> no locks held by swapon/4294.
>>
>> stack backtrace:
>> CPU: 5 PID: 4294 Comm: swapon Not tainted 5.5.0-rc7-BTRFS-ZNS+ #176
>> Hardware name: ASUS All Series/H87-PRO, BIOS 2102 07/29/2014
>> Call Trace:
>>  dump_stack+0xa1/0xea
>>  ? __do_sys_swapon+0x94b/0x3550
>>  print_unlock_imbalance_bug.cold+0x114/0x123
>>  ? __do_sys_swapon+0x94b/0x3550
>>  lock_release+0x562/0xed0
>>  ? kvfree+0x31/0x40
>>  ? lock_downgrade+0x770/0x770
>>  ? kvfree+0x31/0x40
>>  ? rcu_read_lock_sched_held+0xa1/0xd0
>>  ? rcu_read_lock_bh_held+0xb0/0xb0
>>  up_write+0x2d/0x490
>>  ? kfree+0x293/0x2f0
>>  __do_sys_swapon+0x94b/0x3550
>>  ? putname+0xb0/0xf0
>>  ? kmem_cache_free+0x2e7/0x370
>>  ? do_sys_open+0x184/0x3e0
>>  ? generic_max_swapfile_size+0x40/0x40
>>  ? do_syscall_64+0x27/0x4b0
>>  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>  ? lockdep_hardirqs_on+0x38c/0x590
>>  __x64_sys_swapon+0x54/0x80
>>  do_syscall_64+0xa4/0x4b0
>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> RIP: 0033:0x7f15da0a0dc7
>>
>> Fixes: 1638045c3677 ("mm: set S_SWAPFILE on blockdev swap devices")
>> Signed-off-by: Naohiro Aota
>> ---
>>  mm/swapfile.c | 29 ++++++++++++++++++++++-------
>>  1 file changed, 22 insertions(+), 7 deletions(-)
>>
>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>> index bb3261d45b6a..dd5d7fa42282 100644
>> --- a/mm/swapfile.c
>> +++ b/mm/swapfile.c
>> @@ -2886,24 +2886,37 @@ static int claim_swapfile(struct swap_info_struct *p, struct inode *inode)
>>  		p->old_block_size = block_size(p->bdev);
>>  		error = set_blocksize(p->bdev, PAGE_SIZE);
>>  		if (error < 0)
>> -			return error;
>> +			goto err;
>>  		/*
>>  		 * Zoned block devices contain zones that have a sequential
>>  		 * write only restriction. Hence zoned block devices are not
>>  		 * suitable for swapping. Disallow them here.
>>  		 */
>> -		if (blk_queue_is_zoned(p->bdev->bd_queue))
>> -			return -EINVAL;
>> +		if (blk_queue_is_zoned(p->bdev->bd_queue)) {
>> +			error = -EINVAL;
>> +			goto err;
>> +		}
>>  		p->flags |= SWP_BLKDEV;
>>  	} else if (S_ISREG(inode->i_mode)) {
>>  		p->bdev = inode->i_sb->s_bdev;
>>  	}
>>
>>  	inode_lock(inode);
>> -	if (IS_SWAPFILE(inode))
>> -		return -EBUSY;
>> +	if (IS_SWAPFILE(inode)) {
>> +		inode_unlock(inode);
>> +		error = -EBUSY;
>> +		goto err;
>> +	}
>>
>>  	return 0;
>> +
>> +err:
>> +	if (S_ISBLK(inode->i_mode)) {
>> +		set_blocksize(p->bdev, p->old_block_size);
>> +		blkdev_put(p->bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL);
>> +	}
>> +
>> +	return error;
>>  }
>>
>>
>> @@ -3157,10 +3170,12 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
>>  	mapping = swap_file->f_mapping;
>>  	inode = mapping->host;
>>
>> -	/* If S_ISREG(inode->i_mode) will do inode_lock(inode); */
>> +	/* do inode_lock(inode); */
>
>What if we made this function responsible for calling inode_lock (and
>unlock) instead of splitting it between sys_swapon and claim_swapfile?

I think we cannot take inode_lock() before claim_swapfile(), because that
would give us a circular locking dependency, as follows:

claim_swapfile()
 -> blkdev_get()
  -> __blkdev_get()
   -> mutex_lock(&bdev->bd_mutex)
   -> bd_set_size()
    -> inode_lock(&bdev->bd_inode);

So, one thing we can do is to move inode_lock() and "if (IS_SWAPFILE(..))
..." out of claim_swapfile(). In that case, the "bad_swap" section must
check inode_is_locked() before calling inode_unlock().

>
>--D
>
>>  	error = claim_swapfile(p, inode);
>> -	if (unlikely(error))
>> +	if (unlikely(error)) {
>> +		inode = NULL;
>>  		goto bad_swap;
>> +	}
>>
>>  	/*
>>  	 * Read the swap header.
>> --
>> 2.25.0
>>
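
To make the alternative from my reply concrete, here is a rough user-space sketch of moving inode_lock() and the IS_SWAPFILE() check out of claim_swapfile(). It is only a model and not a kernel patch: struct inode and its fields, the simplified claim_swapfile() signature and swapon_model() are hypothetical stand-ins, and the real swapon path does much more between taking and dropping the lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kernel's struct inode. */
struct inode {
	bool is_blk;		/* models S_ISBLK(inode->i_mode) */
	bool zoned;		/* models blk_queue_is_zoned() on the bdev */
	bool is_swapfile;	/* models IS_SWAPFILE(inode) */
	bool locked;		/* models the inode lock state */
	bool bdev_claimed;	/* models a successful blkdev_get() */
};

static void inode_lock(struct inode *inode)
{
	assert(!inode->locked);
	inode->locked = true;
}

static void inode_unlock(struct inode *inode)
{
	assert(inode->locked);	/* an unbalanced unlock trips here */
	inode->locked = false;
}

static bool inode_is_locked(const struct inode *inode)
{
	return inode->locked;
}

/* In this variant claim_swapfile() never touches the inode lock. */
static int claim_swapfile(struct inode *inode)
{
	if (inode->is_blk) {
		inode->bdev_claimed = true;
		if (inode->zoned) {
			inode->bdev_claimed = false;	/* undo the claim */
			return -EINVAL;
		}
	}
	return 0;
}

/* The caller takes the lock after claim_swapfile() and owns the unlock. */
static int swapon_model(struct inode *inode)
{
	int error = claim_swapfile(inode);

	if (error)
		goto bad_swap;	/* lock was never taken */

	inode_lock(inode);
	if (inode->is_swapfile) {
		error = -EBUSY;
		goto bad_swap;
	}
	inode->is_swapfile = true;
	inode_unlock(inode);
	return 0;

bad_swap:
	/* Only unlock when the lock was actually taken. */
	if (inode_is_locked(inode))
		inode_unlock(inode);
	return error;
}
```

In this model every error return from swapon_model() leaves the inode unlocked, whichever step failed. One caveat for a real patch: as far as I know, the kernel's inode_is_locked() only reports that the lock is held by someone, not that the current task holds it, so a local flag in swapon might be the safer check.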