From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 27 Feb 2024 09:10:13 +0000
Subject: Re: [PATCH] madvise:madvise_cold_or_pageout_pte_range(): allow split while folio_estimated_sharers = 0
To: Barry Song <21cnbao@gmail.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Barry Song, Yin Fengwei, Yu Zhao, David Hildenbrand, Kefeng Wang, Matthew Wilcox, Minchan Kim, Vishal Moola, Yang Shi
References: <20240221085036.105621-1-21cnbao@gmail.com> <71fa4302-2df6-4e55-a5a8-7609476c41d4@arm.com>
From: Ryan Roberts
Content-Type: text/plain; charset=UTF-8
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On 26/02/2024 21:17, Barry Song wrote:
> On Tue, Feb 27, 2024 at 2:46 AM Ryan Roberts wrote:
>>
>> On 21/02/2024 08:50, Barry Song wrote:
>>> From: Barry Song
>>>
>>> The purpose is stopping splitting large folios whose mapcount are 2 or
>>> above. Folios whose estimated_shares = 0 should be still perfect and
>>> even better candidates than estimated_shares = 1.
>>>
>>> Consider a pte-mapped large folio with 16 subpages, if we unmap 1-15,
>>> the current code will split folios and reclaim them while madvise goes
>>> on this folio; but if we unmap subpage 0, we will keep this folio and
>>> break. This is weird.
>>>
>>> For pmd-mapped large folios, we can still use "= 1" as the condition
>>> as anyway we have the entire map for it. So this patch doesn't change
>>> the condition for pmd-mapped large folios.
>>> This also explains why we had been using "= 1" for both pmd-mapped and
>>> pte-mapped large folios before commit 07e8c82b5eff ("madvise: convert
>>> madvise_cold_or_pageout_pte_range() to use folios"), because in the
>>> past, we used the mapcount of the specific subpage, since the subpage
>>> had pte present, its mapcount wouldn't be 0.
>>>
>>> The problem can be quite easily reproduced by writing a small program,
>>> unmapping the first subpage of a pte-mapped large folio vs. unmapping
>>> anyone other than the first subpage.
>>>
>>> Fixes: 2f406263e3e9 ("madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check")
>>> Cc: Yin Fengwei
>>> Cc: Yu Zhao
>>> Cc: Ryan Roberts
>>> Cc: David Hildenbrand
>>> Cc: Kefeng Wang
>>> Cc: Matthew Wilcox
>>> Cc: Minchan Kim
>>> Cc: Vishal Moola (Oracle)
>>> Cc: Yang Shi
>>> Signed-off-by: Barry Song
>>> ---
>>>  mm/madvise.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/mm/madvise.c b/mm/madvise.c
>>> index cfa5e7288261..abde3edb04f0 100644
>>> --- a/mm/madvise.c
>>> +++ b/mm/madvise.c
>>> @@ -453,7 +453,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
>>>  		if (folio_test_large(folio)) {
>>>  			int err;
>>>
>>> -			if (folio_estimated_sharers(folio) != 1)
>>> +			if (folio_estimated_sharers(folio) > 1)
>>>  				break;
>>>  			if (pageout_anon_only_filter && !folio_test_anon(folio))
>>>  				break;
>>
>> I wonder if we should change all the instances:
>>
>> folio_estimated_sharers() != 1 -> folio_estimated_sharers() > 1
>> folio_estimated_sharers() == 1 -> folio_estimated_sharers() <= 1
>>
>> It shouldn't cause a problem for the pmd case, and there are definitely other
>> cases where it will help. e.g. madvise_free_pte_range().
>
> right. My test case covered PAGEOUT only and I agree madvise_free and
> others have exactly the same issue.
> for pmd case, it doesn't matter whether we change the condition
> or not because we have already pmd-mapped in the page table.
>
> And good to know David will have a wrapper in folio_mapped_shared() to more
> widely address this issue.
>
>>
>> Regardless:
>>
>> Reviewed-by: Ryan Roberts
>>
>
> Thanks though we might have missed your tag as this one has been
> in mm-stable.

No problem! I've been out on holiday so a bit behind on where everything is.

>
> Best regards,
> Barry
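
For readers following the thread, the effect of the one-line change can be sketched in a small userspace model. This is not kernel code: `struct toy_folio` and the `toy_*` helpers are hypothetical names invented for illustration, and the model assumes, as the commit message implies, that the sharers estimate samples only the first subpage of the folio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model for illustration only; none of this is kernel code. */
struct toy_folio {
	size_t nr_pages;      /* number of subpages in the large folio */
	const int *mapcount;  /* hypothetical per-subpage map counts */
};

/*
 * Assumption: the estimate looks only at the first subpage, which is what
 * the thread implies when it says unmapping subpage 0 changes the outcome.
 */
static int toy_estimated_sharers(const struct toy_folio *folio)
{
	assert(folio->nr_pages > 0);
	return folio->mapcount[0];
}

/* Check before the patch: any estimate other than exactly 1 skips the folio. */
static bool toy_skip_before_patch(const struct toy_folio *folio)
{
	return toy_estimated_sharers(folio) != 1;
}

/* Check after the patch: only an estimate of 2 or more skips the folio. */
static bool toy_skip_after_patch(const struct toy_folio *folio)
{
	return toy_estimated_sharers(folio) > 1;
}
```

With per-subpage counts `{0, 1, 1, 1}` (first subpage unmapped), the old check skips the folio while the new one lets it proceed to the split; with `{1, 0, 1, 1}` both checks let it proceed, which is the asymmetry the commit message calls weird. A folio whose first subpage has a count of 2 is skipped by both checks.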