From: "Huang, Ying"
To: Miaohe Lin
Subject: Re: [PATCH v2 5/5] mm/shmem: fix shmem_swapin() race with swapoff
Date: Mon, 19 Apr 2021 15:41:28 +0800
Message-ID: <877dky4t7b.fsf@yhuang6-desk1.ccr.corp.intel.com>
In-Reply-To: <41a33c84-f878-8dab-a1d0-4aea3a1fc739@huawei.com>
References: <20210417094039.51711-1-linmiaohe@huawei.com>
 <20210417094039.51711-6-linmiaohe@huawei.com>
 <87r1j7kok3.fsf@yhuang6-desk1.ccr.corp.intel.com>
 <87h7k24uxg.fsf@yhuang6-desk1.ccr.corp.intel.com>
 <41a33c84-f878-8dab-a1d0-4aea3a1fc739@huawei.com>

Miaohe Lin writes:

> On 2021/4/19 15:04, Huang, Ying wrote:
>> Miaohe Lin writes:
>>
>>> On 2021/4/19 10:15, Huang, Ying wrote:
>>>> Miaohe Lin writes:
>>>>
>>>>> When I was investigating the swap code, I found the below possible race
>>>>> window:
>>>>>
>>>>> CPU 1                                   CPU 2
>>>>> -----                                   -----
>>>>> shmem_swapin
>>>>>   swap_cluster_readahead
>>>>>     if (likely(si->flags & (SWP_BLKDEV | SWP_FS_OPS))) {
>>>>>                                         swapoff
>>>>>                                           si->flags &= ~SWP_VALID;
>>>>>                                           ..
>>>>>                                           synchronize_rcu();
>>>>>                                           ..
>>>>
>>>> You have removed these code in the previous patches of the series.  And
>>>> they are not relevant in this patch.
>>>
>>> Yes, I should change these. Thanks.
>>>
>>>>
>>>>>                                           si->swap_file = NULL;
>>>>>     struct inode *inode = si->swap_file->f_mapping->host;[oops!]
>>>>>
>>>>> Close this race window by using get/put_swap_device() to guard against
>>>>> concurrent swapoff.
>>>>>
>>>>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>>>>
>>>> No.  This isn't the commit that introduces the race condition.  Please
>>>> recheck your git blame result.
>>>>
>>>
>>> I think this is really hard to find exact commit. I used git blame and found
>>> this race should be existed when this is introduced. Any suggestion ?
>>> Thanks.
>>
>> I think the commit that introduces the race condition is commit
>> 8fd2e0b505d1 ("mm: swap: check if swap backing device is congested or
>> not")
>>
>
> Thanks.
> The commit log only describes one race condition. And for that one, this should be correct
> Fixes tag. But there are still many other race conditions inside swap_cluster_readahead,
> such as swap_readpage() called from swap_cluster_readahead. This tag could not cover the
> all race windows.

No.  swap_readpage() in swap_cluster_readahead() is OK, because
__read_swap_cache_async() is called before it, so the swap entry is
already marked with SWAP_HAS_CACHE and the page is locked.

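To make the ordering argument concrete, here is a rough user-space toy
model, not kernel code.  Every toy_* name and the TOY_HAS_CACHE flag are
made up for illustration, and the "refuse while pinned" behaviour of
toy_swapoff() is only an assumption standing in for swapoff having to
wait for pinned entries.  It just shows why a reader that pins the entry
first can safely touch the backing file afterwards:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are kernel types or functions. */
#define TOY_HAS_CACHE 0x40	/* plays the role of SWAP_HAS_CACHE */

struct toy_swap_dev {
	bool valid;		/* cleared once swapoff starts */
	unsigned char map[4];	/* per-entry state, like a tiny swap_map */
	const char *backing;	/* plays the role of si->swap_file */
};

/* Reader, step 1: pin the entry before doing anything else with it. */
static bool toy_pin_entry(struct toy_swap_dev *d, size_t i)
{
	if (!d->valid || (d->map[i] & TOY_HAS_CACHE))
		return false;
	d->map[i] |= TOY_HAS_CACHE;
	return true;
}

/* Reader, step 2: only legal while the entry is pinned. */
static const char *toy_read_entry(struct toy_swap_dev *d, size_t i)
{
	assert(d->map[i] & TOY_HAS_CACHE);
	return d->backing;
}

/* Reader, step 3: drop the pin once the read is done. */
static void toy_unpin_entry(struct toy_swap_dev *d, size_t i)
{
	d->map[i] &= (unsigned char)~TOY_HAS_CACHE;
}

/* swapoff side: do not tear down while any entry is still pinned. */
static bool toy_swapoff(struct toy_swap_dev *d)
{
	d->valid = false;	/* no new pins from now on */
	for (size_t i = 0; i < sizeof(d->map); i++)
		if (d->map[i] & TOY_HAS_CACHE)
			return false;	/* a real swapoff would wait here */
	d->backing = NULL;
	return true;
}
```

In this model the only unsafe window is a reader that looks at
d->backing before toy_pin_entry() has succeeded, which is the kind of
window the get/put_swap_device() pair in the patch is meant to close.
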
Best Regards,
Huang, Ying

>> Best Regards,
>> Huang, Ying
>>
>>>> Best Regards,
>>>> Huang, Ying
>>>>
>>>>> Signed-off-by: Miaohe Lin
>>>>> ---
>>>>>  mm/shmem.c | 6 ++++++
>>>>>  1 file changed, 6 insertions(+)
>>>>>
>>>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>>>> index 26c76b13ad23..936ba5595297 100644
>>>>> --- a/mm/shmem.c
>>>>> +++ b/mm/shmem.c
>>>>> @@ -1492,15 +1492,21 @@ static void shmem_pseudo_vma_destroy(struct vm_area_struct *vma)
>>>>>  static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
>>>>>  			struct shmem_inode_info *info, pgoff_t index)
>>>>>  {
>>>>> +	struct swap_info_struct *si;
>>>>>  	struct vm_area_struct pvma;
>>>>>  	struct page *page;
>>>>>  	struct vm_fault vmf = {
>>>>>  		.vma = &pvma,
>>>>>  	};
>>>>>
>>>>> +	/* Prevent swapoff from happening to us. */
>>>>> +	si = get_swap_device(swap);
>>>>> +	if (unlikely(!si))
>>>>> +		return NULL;
>>>>>  	shmem_pseudo_vma_init(&pvma, info, index);
>>>>>  	page = swap_cluster_readahead(swap, gfp, &vmf);
>>>>>  	shmem_pseudo_vma_destroy(&pvma);
>>>>> +	put_swap_device(si);
>>>>>
>>>>>  	return page;
>>>>>  }
>>>> .
>>>>
>> .
>>