From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4e1a1878-4133-4d78-90fa-1d5bc99d179c@arm.com>
Date: Thu, 27 Jun 2024 10:27:14 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH 2/2] kpageflags: fix wrong KPF_THP on non-pmd-mappable compound pages
Content-Language: en-GB
To: Barry Song <21cnbao@gmail.com>
Cc: Zi Yan, ran xiaokai, akpm@linux-foundation.org, willy@infradead.org,
 vbabka@suse.cz, svetly.todorov@memverge.com, ran.xiaokai@zte.com.cn,
 peterx@redhat.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, David Hildenbrand, Baolin Wang, Kefeng Wang,
 Lance Yang
References: <20240626024924.1155558-1-ranxiaokai627@163.com>
 <20240626024924.1155558-3-ranxiaokai627@163.com>
 <1907a8c0-9860-4ca0-be59-bec0e772332b@arm.com>
From: Ryan Roberts
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On 27/06/2024 10:16, Barry Song wrote:
> On Thu, Jun 27, 2024 at 8:39 PM Ryan Roberts wrote:
>>
>> On 27/06/2024 05:10, Barry Song wrote:
>>> On Thu, Jun 27, 2024 at 2:40 AM Zi Yan wrote:
>>>>
>>>> On Wed Jun 26, 2024 at 7:07 AM EDT, Ryan Roberts wrote:
>>>>> On 26/06/2024 04:06, Zi Yan wrote:
>>>>>> On Tue Jun 25, 2024 at 10:49 PM EDT, ran xiaokai wrote:
>>>>>>> From: Ran Xiaokai
>>>>>>>
>>>>>>> KPF_COMPOUND_HEAD and KPF_COMPOUND_TAIL are set on "common" compound
>>>>>>> pages, which means of any order, but KPF_THP should only be set
>>>>>>> when the folio is a 2M pmd mappable THP.
>>>>>
>>>>> Why should KPF_THP only be set on 2M THP? What problem does it cause as it is
>>>>> currently configured?
>>>>>
>>>>> I would argue that mTHP is still THP so should still have the flag. And since
>>>>> these smaller mTHP sizes are disabled by default, only mTHP-aware user space
>>>>> will be enabling them, so I'll naively state that it should not cause compat
>>>>> issues as is.
>>>>>
>>>>> Also, the script at tools/mm/thpmaps relies on KPF_THP being set for all mTHP
>>>>> sizes to function correctly. So that would need to be reworked if making this
>>>>> change.
>>>>
>>>> + more folks working on mTHP
>>>>
>>>> I agree that mTHP is still THP, but we might want different
>>>> stats/counters for it, since people might want to keep the old THP counters
>>>> consistent. See recent commits on adding mTHP counters:
>>>> ec33687c6749 ("mm: add per-order mTHP anon_fault_alloc and anon_fault_fallback
>>>> counters"), 1f97fd042f38 ("mm: shmem: add mTHP counters for anonymous shmem")
>>>>
>>>> and changes to make the THP counters only count PMD THP:
>>>> 835c3a25aa37 ("mm: huge_memory: add the missing folio_test_pmd_mappable() for
>>>> THP split statistics")
>>>>
>>>> In this case, I wonder if we want a new KPF_MTHP bit for mTHP and some
>>>> adjustment on tools/mm/thpmaps.
>>>
>>> It seems we have to do this, though I think keeping KPF_THP and adding a
>>> separate bit like KPF_PMD_MAPPED makes more sense. But those tools
>>> relying on KPF_THP need to realize this and check the new bit, which is
>>> not done now.
>>> Whether the mTHP's name is mTHP or THP will make no difference for
>>> this case :-)
>>
>> I don't quite follow your logic for that last part; if there are 2 separate
>> bits, KPF_THP and KPF_MTHP, and KPF_THP is only set for PMD-sized THP, that
>> would be a safe/compatible approach, right? Whereas your suggestion requires
>> changes to existing tools to work.
>
> Right, my point is that mTHP and THP are both types of THP. The only difference
> is whether they are PMD-mapped or PTE-mapped. Adding a bit to describe how
> the page is mapped would more accurately reflect reality. However, this change
> would disrupt tools that assume KPF_THP always means PMD-mapped THP.
> Therefore, we would still need separate bits for THP and mTHP in this case.

I think perhaps PTE- vs PMD-mapped is a separate issue. The issue at hand is
whether KPF_THP implies a fixed size (and alignment). If compat is an issue,
then KPF_THP must continue to imply PMD-size. If compat is not an issue, then
size can be determined by iterating over the entries.

Having a mechanism to determine the level at which a block is mapped would
potentially be a useful feature, but seems orthogonal to me.

>
> I saw Willy complain about mTHP being called "mTHP," but in this case, calling
> it "mTHP" or just "THP" doesn't change anything if old tools continue to assume
> that KPF_THP means PMD-mapped THP.

I think Willy was just ribbing me because he preferred calling it "anonymous
large folios". That's how I took it anyway.

>
>>
>> Thinking about this a bit more, I wonder if KPF_MTHP is the right name for a new
>> flag; we don't currently expose the term "mTHP" to user space. I can't think of
>> a better name though.
>
> Yes. If "compatibility" is a requirement, we cannot disregard it.
>
>> I'd still like to understand what is actually broken that this change is fixing.
>> Is the concern that a user could see KPF_THP and advance forward by
>> "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size / getpagesize()" entries?
>>
>
> Maybe we need an example of a tool that assumes KPF_THP means PMD-mapped.

Yes, that would help.

>
>>>
>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Yan, Zi
>>>>
>>>
>
> Thanks
> Barry
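
For illustration only, here is a minimal sketch of the kind of consumer the
question above has in mind. It is not taken from any existing tool. The bit
positions are the ones in include/uapi/linux/kernel-page-flags.h; everything
else (the helper names, the synthetic entries, the 512-page PMD size for 2M
THP on 4K pages, and the assumption that KPF_THP is set on both head and tail
pages of any large folio) is hypothetical. A real tool would read 8-byte
entries from /proc/kpageflags instead of building them in memory.

```python
# Bit positions from include/uapi/linux/kernel-page-flags.h.
KPF_COMPOUND_HEAD = 15
KPF_COMPOUND_TAIL = 16
KPF_THP = 22


def has(flags, bit):
    """Return True if the given bit is set in a kpageflags entry."""
    return bool((flags >> bit) & 1)


def make_folio(nr_pages):
    """Synthetic entries for one large folio (hypothetical helper).

    Assumes KPF_THP is reported on the head and on every tail page,
    regardless of folio order.
    """
    head = (1 << KPF_COMPOUND_HEAD) | (1 << KPF_THP)
    tail = (1 << KPF_COMPOUND_TAIL) | (1 << KPF_THP)
    return [head] + [tail] * (nr_pages - 1)


def naive_thp_pages(entries, pmd_pages):
    """Hypothetical old-style tool: treat every KPF_THP head as PMD-sized
    and jump ahead by hpage_pmd_size / getpagesize() entries."""
    total = 0
    i = 0
    while i < len(entries):
        flags = entries[i]
        if has(flags, KPF_THP) and has(flags, KPF_COMPOUND_HEAD):
            total += pmd_pages
            i += pmd_pages
        else:
            i += 1
    return total


def walked_thp_pages(entries):
    """Size-agnostic alternative: count each entry that carries KPF_THP."""
    return sum(1 for flags in entries if has(flags, KPF_THP))


PMD_PAGES = 512  # assumed: 2M hpage_pmd_size / 4K page size

# One 16-page (64K) mTHP, 496 small pages, then one PMD-sized THP.
entries = make_folio(16) + [0] * 496 + make_folio(512)

print("naive :", naive_thp_pages(entries, PMD_PAGES))
print("walked:", walked_thp_pages(entries))
```

With these synthetic entries the naive walker attributes a full PMD's worth of
pages to the 16-page folio and over-counts, while iterating over the entries
gives the actual number of pages flagged KPF_THP. Whether any deployed tool
really behaves like naive_thp_pages() is exactly the open question.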