From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28CC3C54E41 for ; Wed, 6 Mar 2024 08:10:46 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A165F6B007D; Wed, 6 Mar 2024 03:10:45 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 9C6066B007E; Wed, 6 Mar 2024 03:10:45 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 88E326B0080; Wed, 6 Mar 2024 03:10:45 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 78CA86B007D for ; Wed, 6 Mar 2024 03:10:45 -0500 (EST) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 7D31980A79 for ; Wed, 6 Mar 2024 08:10:44 +0000 (UTC) X-FDA: 81865892808.02.ED1FF92 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf22.hostedemail.com (Postfix) with ESMTP id 48A19C0016 for ; Wed, 6 Mar 2024 08:10:42 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=none; spf=pass (imf22.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1709712643; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OI0CDxIs+lEvmrGxx5Q7H1+oru1UroP7Ky+PH0VSST0=; b=qKBwrnKk2l7KM1mQAS306zYkWVnLcAcLte2giRKcSmLb8hxatQSJJMmKXLyLZTySoeDfjh OJi3WwCzh/+SNtKET6MASTqjHiHihI/Intk8EhPoddz+eGyS5SzhJzMRGxDI13EqI8USzM +VDAUzYQb+mvw/ylE737EGs5IlkciXs= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1709712643; a=rsa-sha256; cv=none; b=FfImdHGUoAazcf+vVxIhrpWnalxPwFOqsRPoHbpET4Xyb83jHWAIaQAoSgpBKyGstqC6un W6cJ/yh2l5DoxKLdGLeIWQSn5uPD0Fs2rK1KTKIQdk011GFxr5vUDTPz9jpYRCO6Sa7uI4 UAPwCD0IWhTsOgIsGgso+xaSgs7wY7E= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=none; spf=pass (imf22.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id A5B4C1FB; Wed, 6 Mar 2024 00:11:17 -0800 (PST) Received: from [10.57.68.241] (unknown [10.57.68.241]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6C8943F762; Wed, 6 Mar 2024 00:10:39 -0800 (PST) Message-ID: Date: Wed, 6 Mar 2024 08:10:37 +0000 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v1] mm: swap: Fix race between free_swap_and_cache() and swapoff() Content-Language: en-GB To: David Hildenbrand , Andrew Morton , "Huang, Ying" Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org References: <20240305151349.3781428-1-ryan.roberts@arm.com> <8989df79-84f5-488c-bd74-c11d2241eff1@arm.com> <017414bc-78cd-4aa1-9edf-6ce947b9e4e4@redhat.com> From: Ryan Roberts In-Reply-To: <017414bc-78cd-4aa1-9edf-6ce947b9e4e4@redhat.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Rspamd-Queue-Id: 48A19C0016 X-Rspam-User: X-Rspamd-Server: rspam11 X-Stat-Signature: bowubxjccf3jyktutckn3jczyg4yxo93 X-HE-Tag: 1709712642-8914 X-HE-Meta: U2FsdGVkX1+Y59mADXvM7s3qyvcTk5cIvjQT4JgIRL0Lk8zAK/YcRp0jEGlvc06U/oh22XKIkTKBeSHtiktpmsvhu5ieW5Biq8NOctVAi3GhahdVUR/qzFpqpfzZLAQqQDV2YkeJljyQhMIILgjytqFL7el2TJmYG2IQJZX/0PedsbyoESH+KS4FQA0UXc5NutpdM9Te1d/rCtxq552KcEovUSZ3g8gJvfuyfuVGLemo8+zCVIlMFCRKNvkMSR8W11mrolxsxbPt5NZu2dG9huc1cyUM+SpuIjyabs4nReSLz2Ja7w0d4TYJy4zsKwSAwBTjJKqu907vgrPixEmnBhPQFt9Ao43lAr0qrcrUGZua++bShOyWIoQFgrvsDQYGQlF+FKFVse9NgWm6nYkgNz0EifyEB/KiQ0Y4GxN6Soyvtc9S6LkWU+GiNx3ANKj+fjMJSEM5AZ3+vsHUcpyf5Od5BFdGn4O/3tY1s6ddFEW0UuztRUtZY7hWnsybebSEeF3YwWMQONz7diMDEvfF3fOisX8sULvwmPWVNjI6wfDEk3CpmnjauSvLJeaLtrvEDyJ7UvjIeRHl1RDJtVUN5CV7RHGX4YKimEkuIVrQDZn9gqH8GuNGvnxUUBOrIBmi+aRiWXFrC7/N5nnwjr747kXkp2OZydnSZ7y99h1EDFQtJGjVJ1Osizu1oOyEoYPgxg21+q+egZzZdWKjDUAo1qA5FEUnnugBIsGg4yZfIzKm6v1OxnQjHWM/kKLrOf+/5NVnQVBIDJBdPuFGCzApmJlvVW8r7lSzLoHHyk9hN5zMCJmMbfVU7GRV7u71CuvvGTZk/mvjaNK9Axhq3bc6JcO8n9Ggd8+3cdlL+9qmMctopnwpK1UCPev/8T/WdAN2ncGIlfz35mY= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 05/03/2024 22:05, David Hildenbrand wrote: > On 05.03.24 17:33, Ryan Roberts wrote: >> On 05/03/2024 15:50, David Hildenbrand wrote: >>> On 05.03.24 16:13, Ryan Roberts wrote: >>>> There was previously a theoretical window where swapoff() could run and >>>> teardown a swap_info_struct while a call to free_swap_and_cache() was >>>> running in another thread. This could cause, amongst other bad >>>> possibilities, swap_page_trans_huge_swapped() (called by >>>> free_swap_and_cache()) to access the freed memory for swap_map. >>>> >>>> This is a theoretical problem and I haven't been able to provoke it from >>>> a test case. But there has been agreement based on code review that this >>>> is possible (see link below). >>>> >>>> Fix it by using get_swap_device()/put_swap_device(), which will stall >>>> swapoff(). There was an extra check in _swap_info_get() to confirm that >>>> the swap entry was valid. This wasn't present in get_swap_device() so >>>> I've added it. I couldn't find any existing get_swap_device() call sites >>>> where this extra check would cause any false alarms. >>>> >>>> Details of how to provoke one possible issue (thanks to David Hilenbrand >>>> for deriving this): >>> >>> Almost >>> >>> "s/Hilenbrand/Hildenbrand/" :) >> >> Ahh sorry... I even explicitly checked it against your name on emails... fat >> fingers... > > No need to be sorry. Even your average German person would get it wrong, > because there are other (more common) variants :) > > [...] > >>>> >>> >>> LGTM >>> >>> Are you planning on sending a doc extension for get_swap_device()? >> >> I saw your comment: >> >> We should likely update the documentation of get_swap_device(), that after >> decrementing the refcount, the SI might become stale and should not be touched >> without a prior get_swap_device(). >> >> But when I went to make the changes, I saw the documentation already said: >> >> ...we need to enclose all swap related functions with get_swap_device() and >> put_swap_device()... Notice that swapoff ... can still happen before the >> percpu_ref_tryget_live() in get_swap_device() or after the percpu_ref_put() in >> put_swap_device()... The caller must be prepared for that. >> >> I thought that already covered it? I'm sure as usual, I've misunderstood your >> point. Happy to respin if you have something in mind? > > No need to respin, we could clarify on top, if we decide it makes sense. > > I was thinking about something like this, making it clearer that the PTL > discussion above does not express the corner case we discovered: > > diff --git a/mm/swapfile.c b/mm/swapfile.c > index 2b3a2d85e350b..646a436581eee 100644 > --- a/mm/swapfile.c > +++ b/mm/swapfile.c > @@ -1232,6 +1232,11 @@ static unsigned char __swap_entry_free_locked(struct > swap_info_struct *p, >   * with get_swap_device() and put_swap_device(), unless the swap >   * functions call get/put_swap_device() by themselves. >   * > + * Note that when only holding the PTL, swapoff might succeed immediately > + * after freeing a swap entry. Therefore, immediately after > + * __swap_entry_free(), the swap info might become stale and should not > + * be touched without a prior get_swap_device(). > + * Are yes, this is useful. I'm going to have to respin anyway, so will include it in the next version. Thanks! >   * Check whether swap entry is valid in the swap device.  If so, >   * return pointer to swap_info_struct, and keep the swap entry valid >   * via preventing the swap device from being swapoff, until > >