Message-ID: <748cb10a-c4c4-4517-b1df-72ccb7d55c60@arm.com>
Date: Sat, 9 Mar 2024 12:33:44 +0000
Subject: Re: [PATCH v3 10/18] mm: Allow non-hugetlb large folios to be batch processed
From: Ryan Roberts
To: Matthew Wilcox, Andrew Morton
Cc: David Hildenbrand, Zi Yan, linux-mm@kvack.org, Yang Shi, Huang Ying
References: <7415b36c-b5d3-4655-92e1-b303104bf4a9@arm.com> <644c2f60-dbb0-4fdb-8505-96f8101b2399@arm.com> <20240308203415.3379391ec00d5a66cf66dd5b@linux-foundation.org>

On 09/03/2024 08:05, Ryan Roberts wrote:
> On 09/03/2024 04:52, Matthew Wilcox wrote:
>> On Fri, Mar 08, 2024 at 08:34:15PM -0800, Andrew Morton wrote:
>>>
>>> We seem to be coming down to the wire on this one - Linus might release
>>> 6.8 this weekend.
>>>
>>> Will simply dropping "mm: allow non-hugetlb large folios to be batch
>>> processed" from mm-stable get us out of trouble?
>>
>> We can add a fix patch which re-narrows the race to the point where it's
>> no longer observable. Obviously we need to figure out what the real
>> problem is, but we could be going back a long way. We've definitely
>> found two bugs in the process of investigating the problem (of arguable
>> import; the migration one merely wastes memory temporarily and it's not
>> entirely clear that the wrong-lock problem definitely causes a crash)
>>
>> diff --git a/mm/swap.c b/mm/swap.c
>> index 6b697d33fa5b..7b1d3144391b 100644
>> --- a/mm/swap.c
>> +++ b/mm/swap.c
>> @@ -1012,6 +1012,8 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs)
>>  			free_huge_folio(folio);
>>  			continue;
>>  		}
>> +		if (folio_test_large(folio) && folio_test_large_rmappable(folio))
>> +			folio_undo_large_rmappable(folio);
>>
>>  		__page_cache_release(folio, &lruvec, &flags);
>>
>
> I agree this is likely to re-hide the problems. But I haven't actually tested it
> on it's own without the other fixes. I'll do some more testing with your latest
> patch and if that doesn't lead anywhere, I'll test with this on its own to check
> that I can no longer reproduce the crashes. If it hides them, I think this is
> the best short-term solution we have right now.

I've tested this workaround immediately on top of commit f77171d241e3 ("mm: allow
non-hugetlb large folios to be batch processed") and can't reproduce any
problem. I've run the test 32 times. Without the workaround, the biggest number
of test repeats I've managed before seeing a problem is ~5. So I'm confident
this will be sufficient as a short-term solution.