From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH 2/5] swap: fix do_swap_page() race with swapoff
To: Miaohe Lin, akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, mhocko@suse.com, iamjoonsoo.kim@lge.com,
 vbabka@suse.cz, alex.shi@linux.alibaba.com, willy@infradead.org,
 minchan@kernel.org, richard.weiyang@gmail.com, ying.huang@intel.com,
 hughd@google.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20210408130820.48233-1-linmiaohe@huawei.com>
 <20210408130820.48233-3-linmiaohe@huawei.com>
From: Tim Chen
Message-ID: <7684b3de-2824-9b1f-f033-d4bc14f9e195@linux.intel.com>
Date: Thu, 8 Apr 2021 14:34:30 -0700
In-Reply-To: <20210408130820.48233-3-linmiaohe@huawei.com>

On 4/8/21 6:08 AM, Miaohe Lin wrote:
> When I was investigating the swap code, I found the below possible race
> window:
>
> CPU 1                                   CPU 2
> -----                                   -----
> do_swap_page
>   synchronous swap_readpage
>     alloc_page_vma
>                                         swapoff
>                                           release swap_file, bdev, or ...

Perhaps I'm missing something. The release of swap_file, bdev etc.
happens after we have cleared the SWP_VALID bit in si->flags in
destroy_swap_extents, if I read the swapoff code correctly.

>       swap_readpage
>         check sis->flags is ok
>           access swap_file, bdev...[oops!]
>                                           si->flags = 0

This happens after we clear the si->flags
                                        synchronize_rcu()
                                        release swap_file, bdev, in destroy_swap_extents()

So I think if we have get_swap_device/put_swap_device in do_swap_page,
it should fix the race you've pointed out here.
Then synchronize_rcu() will wait till we have completed do_swap_page and
called put_swap_device.

>
> Using current get/put_swap_device() to guard against concurrent swapoff for
> swap_readpage() looks terrible because swap_readpage() may take really long
> time. And this race may not be really pernicious because swapoff is usually
> done when system shutdown only. To reduce the performance overhead on the
> hot-path as much as possible, it appears we can use the percpu_ref to close
> this race window(as suggested by Huang, Ying).

I think it is better to break this patch into two. One patch is to fix the
race between do_swap_page and swapoff by adding
get_swap_device/put_swap_device in do_swap_page.

The second patch is to modify get_swap_device and put_swap_device to use
percpu_ref. But swapoff is a relatively rare event.

I am not sure making the percpu_ref change for performance is really
beneficial. Did you encounter a real use case where you see a problem with
swapoff? The delay in swapoff is primarily in try_to_unuse, to bring all
the swapped-out pages back into memory.
Synchronizing with other CPUs for paging in is probably a small component
in the overall scheme of things.

Thanks.

Tim
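
To make the ordering argued above concrete, here is a minimal userspace sketch of the "clear the valid flag, wait for readers, then release" pattern. It is not kernel code: struct swap_dev and the swap_dev_* helpers are hypothetical stand-ins for swap_info_struct and get_swap_device/put_swap_device, an explicit reader counter stands in for the RCU read-side section, and a non-blocking swap_dev_try_release() stands in for synchronize_rcu() followed by the release in destroy_swap_extents().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct swap_info_struct. */
struct swap_dev {
	atomic_bool valid;   /* plays the role of SWP_VALID in si->flags */
	atomic_int readers;  /* readers currently between get and put */
	void *backing;       /* stands in for swap_file, bdev, ... */
};

static void swap_dev_init(struct swap_dev *d, void *backing)
{
	atomic_init(&d->valid, true);
	atomic_init(&d->readers, 0);
	d->backing = backing;
}

/*
 * Reader side (the do_swap_page role): announce ourselves first, then
 * check the flag. On success, d->backing stays usable until
 * swap_dev_put() is called.
 */
static bool swap_dev_get(struct swap_dev *d)
{
	atomic_fetch_add(&d->readers, 1);
	if (!atomic_load(&d->valid)) {
		atomic_fetch_sub(&d->readers, 1);
		return false;
	}
	return true;
}

static void swap_dev_put(struct swap_dev *d)
{
	int prev = atomic_fetch_sub(&d->readers, 1);

	assert(prev > 0); /* unbalanced put would be a bug */
	(void)prev;
}

/* Teardown step 1 (the swapoff role): stop new readers from entering. */
static void swap_dev_invalidate(struct swap_dev *d)
{
	atomic_store(&d->valid, false);
}

/*
 * Teardown step 2: release the backing resources only once no reader is
 * left. The kernel would block in synchronize_rcu() here; this sketch
 * just reports whether the release could be done yet.
 */
static bool swap_dev_try_release(struct swap_dev *d)
{
	if (atomic_load(&d->readers) != 0)
		return false;
	d->backing = NULL;
	return true;
}
```

Because the reader increments the counter before checking the flag, and teardown clears the flag before checking the counter, either the reader sees the cleared flag and backs off, or teardown sees the reader and defers the release. That is the same guarantee the reply expects from wrapping do_swap_page in get_swap_device/put_swap_device.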