Message-ID: <2d64ca09-06fe-a32f-16f9-c277b7033b57@arm.com>
Date: Wed, 2 Aug 2023 12:51:28 +0100
Subject: Re: [PATCH 0/2] don't use mapcount() to check large folio sharing
To: David Hildenbrand, Yin Fengwei, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, stable@vger.kernel.org,
 akpm@linux-foundation.org, willy@infradead.org, vishal.moola@gmail.com,
 wangkefeng.wang@huawei.com, minchan@kernel.org, yuzhao@google.com,
 shy828301@gmail.com
References: <20230728161356.1784568-1-fengwei.yin@intel.com>
 <3bbfde16-ced1-dca8-6a3f-da893e045bc5@arm.com>
 <31093c49-5baa-caed-9871-9503cb89454b@redhat.com>
 <20419779-b5f5-7240-3f90-fe5c4b590e4d@arm.com>
 <2722c9ad-370a-70ff-c374-90a94eca742a@redhat.com>
From: Ryan Roberts <ryan.roberts@arm.com>
In-Reply-To: <2722c9ad-370a-70ff-c374-90a94eca742a@redhat.com>

On 02/08/2023 12:36, David Hildenbrand wrote:
> On 02.08.23 13:20, Ryan Roberts wrote:
>> On 02/08/2023 11:48, David Hildenbrand wrote:
>>> On 02.08.23 12:27, Ryan Roberts wrote:
>>>> On 28/07/2023 17:13, Yin Fengwei wrote:
>>>>> In madvise_cold_or_pageout_pte_range() and madvise_free_pte_range(),
>>>>> folio_mapcount() is used to check whether the folio is shared. But it's
>>>>> not correct as folio_mapcount() returns total mapcount of large folio.
>>>>>
>>>>> Use folio_estimated_sharers() here as the estimated number is enough.
>>>>>
>>>>> Yin Fengwei (2):
>>>>>     madvise: don't use mapcount() against large folio for sharing check
>>>>>     madvise: don't use mapcount() against large folio for sharing check
>>>>>
>>>>>    mm/huge_memory.c | 2 +-
>>>>>    mm/madvise.c     | 6 +++---
>>>>>    2 files changed, 4 insertions(+), 4 deletions(-)
>>>>>
>>>>
>>>> As a set of fixes, I agree this is definitely an improvement, so:
>>>>
>>>> Reviewed-By: Ryan Roberts <ryan.roberts@arm.com>
>>>>
>>>>
>>>> But I have a couple of comments around further improvements;
>>>>
>>>> Once we have the scheme that David is working on to be able to provide precise
>>>> exclusive vs shared info, we will probably want to move to that. Although that
>>>> scheme will need access to the mm_struct of a process known to be mapping the
>>>
>>> There are probably ways to work around lack of mm_struct, but it would not be
>>> completely for free. But passing the mm_struct should probably be an easy
>>> refactoring.
>>>
>>>> folio. We have that info, but it's not passed to folio_estimated_sharers() so we
>>>> can't just reimplement folio_estimated_sharers() - we will need to rework these
>>>> call sites again.
>>>
>>> We should probably just have a
>>>
>>> folio_maybe_mapped_shared()
>>>
>>> with proper documentation. Nobody should care about the exact number.
>>>
>>>
>>> If my scheme for anon pages makes it in, that would be precise for anon pages
>>> and we could document that. Once we can handle pagecache pages as well to get a
>>> precise answer, we could change to folio_mapped_shared() and adjust the
>>> documentation.
>>
>> Makes sense to me. I'm assuming your change would allow us to get rid of
>> PG_anon_exclusive too? In which case we would also want a precise API
>> specifically for anon folios for the CoW case, without waiting for pagecache
>> page support.
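
To make the cover letter's point concrete, here is a minimal userspace sketch in C. It is a toy model, not kernel code: `struct model_folio` and the `model_*` helpers are hypothetical names, it assumes a large folio mapped only by PTEs, and it assumes folio_estimated_sharers() can be approximated as the mapcount of the folio's first page.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_PAGES 512

/* Hypothetical userspace model of a PTE-mapped large folio (not a kernel type). */
struct model_folio {
	size_t nr_pages;                    /* number of base pages in the folio */
	int page_mapcount[MODEL_MAX_PAGES]; /* PTE mappings per base page */
};

/* Record that one process maps pages [first, first + nr) of the folio. */
static void model_map_range(struct model_folio *f, size_t first, size_t nr)
{
	assert(f->nr_pages <= MODEL_MAX_PAGES);
	assert(first + nr <= f->nr_pages);
	for (size_t i = first; i < first + nr; i++)
		f->page_mapcount[i]++;
}

/* Models folio_mapcount(): the sum of the mappings of every page. */
static int model_total_mapcount(const struct model_folio *f)
{
	int total = 0;

	for (size_t i = 0; i < f->nr_pages; i++)
		total += f->page_mapcount[i];
	return total;
}

/* Models folio_estimated_sharers() under the assumption stated above:
 * only the first page's mapcount is sampled. */
static int model_estimated_sharers(const struct model_folio *f)
{
	return f->page_mapcount[0];
}
```

With a single process mapping all 16 pages of a 16-page folio, the modelled total is 16 while the estimate is 1, so a "> 1" test on the total would wrongly report sharing. If a second process then maps only page 5, the total becomes 17 but the estimate stays at 1, so the estimate misses that sharer.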
>
> Not necessarily and I'm currently not planning that
>
> On the COW path, I'm planning on using it only when PG_anon_exclusive is clear
> for a compound page, combined with a check that there are no other page
> references besides from mappings: all mappings from me and #refs == #mappings ->
> reuse (set PG_anon_exclusive). That keeps the default (no fork) as fast as
> possible and simple.
>
>>>
>>> I just saw
>>>
>>> https://lkml.kernel.org/r/20230802095346.87449-1-wangkefeng.wang@huawei.com
>>>
>>> that converts a lot of code to folio_estimated_sharers().
>>>
>>>
>>> That patchset, for example, also does
>>>
>>> total_mapcount(page) > 1 -> folio_estimated_sharers(folio) > 1
>>>
>>> I'm not 100% sure what to think about that at this point. We eventually add
>>> false negatives (actually shared but we fail to detect it) all over the place,
>>> instead of having false positives (actually exclusive, but we fail to detect
>>> it).
>>>
>>> And that patch set doesn't even spell that out.
>>>
>>>
>>> Maybe it's as good as we will get, especially if my scheme doesn't make it in.
>>
>> I've been working on the assumption that your scheme is plan A, and I'm waiting
>> for it to unblock forward progress on large anon folios. Is this the right
>> approach, or do you think your scheme is sufficiently risky and/or far out that
>> I should aim not to depend on it?
>
> It is plan A. IMHO, it does not feel too risky and/or far out at this point --
> and the implementation should not end up too complicated. But as always, I
> cannot promise anything before it's been implemented and discussed upstream.

OK, good we are on the same folio... (stolen from Hugh; if a joke is worth
telling once, it's worth telling 1000 times ;-)

>
> Hopefully, we know more soon. I'll get at implementing it fairly soon.
>
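
The reuse rule David describes for the COW path can be sketched in userspace C as below. This is a hypothetical model, not the planned implementation: `struct cow_state`, its fields, and `cow_can_reuse()` are invented for illustration, and the counts are assumed to be a consistent snapshot taken while handling the write fault.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of what a write fault would consult (illustrative names). */
struct cow_state {
	bool anon_exclusive; /* models PG_anon_exclusive */
	int refs;            /* all references held on the folio */
	int mappings;        /* all page table mappings of the folio */
	int my_mappings;     /* mappings owned by the faulting process */
};

/* Returns true if the folio may be reused in place instead of being copied. */
static bool cow_can_reuse(struct cow_state *s)
{
	assert(s->my_mappings <= s->mappings);

	/* Default case (no fork): the flag is already set, nothing to count. */
	if (s->anon_exclusive)
		return true;

	/* Slow path: every mapping is ours and no reference exists beyond
	 * the mappings, so nobody else can observe the folio. */
	if (s->my_mappings == s->mappings && s->refs == s->mappings) {
		s->anon_exclusive = true; /* models setting PG_anon_exclusive */
		return true;
	}

	return false; /* possibly shared or pinned: copy instead */
}
```

In this model an extra reference that is not a mapping (for example a pin) makes refs exceed mappings, so the folio is copied rather than reused, and the counting only happens when the flag is clear.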